RTK Security Labs · v0.7.0

A pre-deployment adversarial evidence artifact that states its own boundary — and verifies in your hands.

RTK-1 runs an adversarial campaign against an AI system before deployment and produces a single cryptographically signed artifact. That artifact records exactly what was tested, exactly what was not, and a binary verdict bounded strictly to the tested surface.

It is signed with ECDSA P-256 over the SHA-256 of its RFC 8785 canonical bytes, and it verifies against a published key with no cooperation from RTK Security Labs required. The verification procedure is public. Trust sits in the math, not in us.

What RTK-1 produces today

One artifact. Bounded claims. Independently checkable.

RTK-1 is the evidence-production layer of an AI safety case: the component that generates pre-deployment adversarial evidence under the Claims–Arguments–Evidence structure. It does not enforce anything at runtime, and it makes no claim about behavior after deployment. What it does do is built so that a hostile reviewer can confirm every assertion without taking our word for it.

What a signed RTK-1 artifact asserts

  • A binary verdict — empirical, not architectural. C1 means no unauthorized execution was observed across the declared surface at the time of testing.
  • A declared coverage surface — the exact providers, techniques, sequence and turn counts, system version, and objective that were exercised.
  • An explicit untested complement — everything the artifact does not cover, named in the artifact itself.
  • A validity horizon, after which present-state reliance lapses while the historical record of what was observed at test time still stands.
  • An independently verifiable signature — ECDSA P-256 over RFC 8785 + SHA-256, checkable against a published key.
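
As a rough illustration of how these five assertions could sit together in one record, here is a minimal sketch. The class name, field names, dates, and the 90-day window are all hypothetical and chosen only for illustration; they are not RTK-1's actual schema. The one behavior shown is the validity horizon: present-state reliance holds only between the time of testing and the horizon.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EvidenceArtifact:
    """Hypothetical shape; field names are illustrative, not RTK-1's schema."""
    verdict: str            # "C1" or "C2", an empirical observation
    coverage: dict          # declared coverage surface
    not_covered: tuple      # explicit untested complement
    tested_at: datetime     # T0, the time of testing
    valid_until: datetime   # validity horizon
    signature_b64: str      # ECDSA P-256 signature, base64-encoded

    def supports_present_reliance(self, now: datetime) -> bool:
        """True only between T0 and the validity horizon (inclusive)."""
        return self.tested_at <= now <= self.valid_until


t0 = datetime(2026, 1, 15, tzinfo=timezone.utc)  # made-up test date
artifact = EvidenceArtifact(
    verdict="C1",
    coverage={"providers": ["pyrit"], "sequences": 10, "executions": 80},
    not_covered=("runtime / post-deployment execution",),
    tested_at=t0,
    valid_until=t0 + timedelta(days=90),  # made-up 90-day window
    signature_b64="",  # placeholder; a real artifact carries a signature
)

print(artifact.supports_present_reliance(t0 + timedelta(days=30)))   # True
print(artifact.supports_present_reliance(t0 + timedelta(days=120)))  # False
```

After the horizon the record is still a valid account of what was observed at T0; only the inference about the present state lapses.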

Here is the coverage statement from a real signed artifact, reproduced verbatim. Notice that the second half is as important as the first — the artifact volunteers its own limits.

Coverage statement · verbatim from a signed envelope

Observed across 10 attack sequences totaling 80 attack executions, via provider pyrit, exercising MITRE ATLAS techniques AML.T0051 and AML.T0054, against the target system, with objective: extract system prompt. No unauthorized execution was observed within this surface at the time of testing (T0). This verdict is bounded strictly to the declared surface above.

NOT covered by this evidence: attack vectors, techniques, objectives, model versions, or deployment configurations other than those enumerated; behavior after the freshness window; and all runtime / post-deployment execution, which is outside RTK-1's pre-deployment scope and is ceded to downstream runtime and consequence layers.
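
One way to picture "bounded strictly to the declared surface" is as a membership check: the evidence speaks to a question only if every dimension of that question was enumerated. The sketch below uses the provider, techniques, and objective from the statement above. The `DeclaredSurface` class and its `covers` method are hypothetical names for illustration, and the second objective is invented to show the negative case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeclaredSurface:
    """Hypothetical model of a declared coverage surface."""
    providers: frozenset
    techniques: frozenset
    objectives: frozenset

    def covers(self, provider: str, technique: str, objective: str) -> bool:
        # Evidence applies only if every dimension was explicitly enumerated.
        return (
            provider in self.providers
            and technique in self.techniques
            and objective in self.objectives
        )


surface = DeclaredSurface(
    providers=frozenset({"pyrit"}),
    techniques=frozenset({"AML.T0051", "AML.T0054"}),
    objectives=frozenset({"extract system prompt"}),
)

# Inside the declared surface: the verdict applies.
print(surface.covers("pyrit", "AML.T0051", "extract system prompt"))  # True
# Invented objective that was never tested: the evidence is silent.
print(surface.covers("pyrit", "AML.T0051", "exfiltrate user data"))   # False
```

A `False` result does not mean the system is vulnerable on that point. It means this artifact offers no evidence either way.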

The verdict, stated honestly

C1 and C2 are empirical observations, not guarantees.

The verdict describes what was observed under defined adversarial pressure across a defined surface at a defined moment. It does not claim that no execution path could ever exist — only that none was observed within what was tested. That distinction is the difference between evidence and overstatement.

C1

No unauthorized execution observed

Across the declared coverage surface, at the time of testing, no unauthorized execution path was triggered under the adversarial pressure applied. Bounded strictly to what was tested.

C2

No execution path found under sustained pressure

The stronger observation, where applicable — no unauthorized path surfaced even under sustained pressure across the tested vectors. Still bounded to the declared surface; still empirical.

Verification anyone can run

You do not need us to confirm the artifact.

A procurement reviewer, an auditor, opposing counsel, or a peer can verify a signed RTK-1 artifact with only public libraries and our published key. The verifier imports nothing from RTK-1's code and contacts nothing beyond the published key URL.

Install the public dependencies

pip install rfc8785 cryptography

Both are public, open-source libraries available from PyPI.

Run the standalone verifier against the artifact

python verify_rtk1.py evidence.json

The verifier recomputes the SHA-256 hash of the RFC 8785 canonical bytes and checks the ECDSA P-256 signature.

The verifier fetches the published key itself

It pulls the public key directly from the published URL — no key handed to you by us, no infrastructure trust required.

It returns VERIFIED only if both hold

Hash integrity (the body is unaltered) and signature validity (signed by the RTK-1 key). Either failure returns NOT VERIFIED.

Published key · https://rtksecuritylabs.com/keys/rtk-key-2026-01.pem

What is validated, and what is not

Honest scope over false coverage.

RTK-1's validated capability is the adversarial path that has been confirmed by running it against known truth and watching every claim hold. Other attack providers and the multi-agent composition work exist in development and are being validated one at a time. We will not present a capability as live until a signed artifact can stand behind it. This is a deliberate constraint, not a limitation we are hiding.

Validated today

  • Crescendo multi-turn adversarial campaign (LIVE)
  • System-prompt-extraction objective testing
  • C1/C2 empirical verdict with declared coverage
  • ECDSA P-256 signing over RFC 8785 + SHA-256
  • Standalone independent verification
  • Published-key trust anchor

In development — not yet validated

  • Additional attack providers (ROADMAP)
  • Multi-agent composition validation (ROADMAP)
  • Continuous post-deployment monitoring (ROADMAP)
  • Broader objective and vector coverage (ROADMAP)
  • Region-specific compliance mappings (ROADMAP)

Each item in development becomes "validated today" only after it has been run against known truth and a signed artifact can be independently verified for it. Breadth is earned one provider at a time.

The principle behind every artifact

Truth that volunteers its own boundary.

"An artifact that states what it proves, what it does not prove, where it sits, and what it must connect to — and that any party can verify without trusting the party that produced it."

— The design constraint RTK-1 is built to satisfy

A claim that cannot be checked is an assertion, not evidence. A claim that overstates its reach collapses the moment a reviewer finds the seam. RTK-1 is built so that neither failure mode is available to it: every artifact is bounded to what was observed, and every artifact is independently verifiable.

Engagement

Direct engagement tiers.

Pricing for direct engagements is fixed. An engagement produces a signed, coverage-bounded evidence artifact scoped to your deployment. Retainer and channel-specific arrangements are handled separately and are not priced here.

Starter · $25,000
Single point-in-time adversarial validation, signed artifact, one revision round.

Safety Case Evidence Pack · $35,000
Single safety-case-aligned engagement mapped to the Claims–Arguments–Evidence structure.

Retainer / managed engagements · By arrangement
Recurring engagement arrangements, scoped to need.