Before any AI is cleared for production, it must be evaluated, and the evaluation environment is itself a place where an AI can do damage if it is not bounded. AUTHREX-SANDBOX governs that test environment: it caps what an AI under evaluation may do, ensures every action is reversible and logged, and resets the world between runs, so evaluation is bounded, auditable, and safe.
Everyone agrees AI should be tested before deployment. Far less attention is paid to a quieter problem: the test environment is itself a system an AI is acting inside, and a capable AI under evaluation can escape its bounds, corrupt the test harness, take an irreversible action, or simply behave one way under test and another in production. If the sandbox is not governed, the evaluation that is supposed to make deployment safe can itself be unsafe or untrustworthy.
AUTHREX-SANDBOX applies the governance pipeline to the sandbox, not to production. Every action the AI-under-test takes is capped at a test tier so it can never exceed the authority granted for evaluation (HMAA). No irreversible step is allowed to run without review (FLAME). The environment is reset to a known-good state between runs so one test cannot contaminate the next (CARA). And every single action is logged to the ledger (ERAM), so the evaluation is fully reconstructable afterward. The result is a bounded, audited, reversible test, and an AI that demonstrably behaved within its limits during evaluation.
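The four controls above can be sketched as a single gate that every action passes through. This is a minimal illustration under stated assumptions, not the AUTHREX implementation: the names `Sandbox`, `Action`, `TEST_TIER`, the numeric tier scale, and the verdict strings are all hypothetical, and the real HMAA, FLAME, CARA, and ERAM components are only mirrored here in spirit.

```python
from dataclasses import dataclass, field

TEST_TIER = 1  # hypothetical numeric ceiling for evaluation authority


@dataclass
class Action:
    name: str
    required_tier: int          # authority the action would need
    irreversible: bool = False  # True if the step cannot be undone


@dataclass
class Sandbox:
    baseline: dict                              # known-good state
    state: dict = field(default_factory=dict)   # live state for this run
    ledger: list = field(default_factory=list)  # append-only action log

    def __post_init__(self):
        self.reset()

    def reset(self):
        # CARA-style step (assumed): restore the known-good state between runs.
        self.state = dict(self.baseline)

    def submit(self, action: Action) -> str:
        if action.required_tier > TEST_TIER:
            verdict = "denied: exceeds test tier"   # HMAA-style cap (assumed)
        elif action.irreversible:
            verdict = "held: needs review"          # FLAME-style gate (assumed)
        else:
            self.state[action.name] = "done"
            verdict = "allowed"
        # ERAM-style step (assumed): log every action, whatever the verdict.
        self.ledger.append((action.name, verdict))
        return verdict


# Example run: one in-bounds action, one over-tier attempt, one irreversible step.
box = Sandbox(baseline={"harness": "intact"})
results = [
    box.submit(Action("read_sensor", required_tier=1)),
    box.submit(Action("rewrite_harness", required_tier=3)),
    box.submit(Action("wipe_disk", required_tier=1, irreversible=True)),
]
box.reset()  # the next run starts from the baseline; the ledger is kept
```

In this sketch the ledger deliberately survives `reset()`, so the record of a run remains available for reconstruction even after the world has been restored.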
SANDBOX is the controlled setting an AI must pass through before AUTHREX-ASSURE can issue its production clearance. Where ASSURE governs the decision to deploy, SANDBOX governs the evaluation that informs it. Together they bracket the most dangerous transition in an AI's life: from test to live.
Test-and-evaluation teams get a sandbox where the AI under test cannot exceed its evaluation authority, cannot take an irreversible action, and cannot contaminate the next run. The evaluation itself becomes trustworthy.
Developers get a reproducible, logged test environment that resets between runs, so evaluation results are clean and comparable rather than polluted by state left over from a previous test.
Red teams can push an AI hard inside a sandbox that guarantees containment: even an AI actively trying to escape its bounds is capped at the test tier, so adversarial evaluation is safe to run.
Federal law now directs work on AI sandbox environments, and SANDBOX is a governance reference for exactly that.
NDAA §1534 directs a DoD task force on AI sandbox environments, with a milestone of 1 April 2026. SANDBOX is a concrete governance pattern for those environments: bounded authority, reversibility, and a complete audit trail.
NDAA §1533's AI assessment framework (due June 2027) needs a trustworthy evaluation step. SANDBOX governs that step so the assessment rests on a bounded, auditable test rather than an uncontrolled one.
An assessment is only as good as the environment it ran in. By guaranteeing the sandbox is bounded and reset between runs, SANDBOX makes the government's evaluation results defensible and reproducible.
SANDBOX is anchored to the Rover and UAV testbeds, the physical platforms where test and evaluation actually happen, rather than to an abstraction, so the governance pattern is tied to where evaluation really occurs.
Pick what the AI under evaluation tries to do, then run it. The sandbox caps authority at the test tier, blocks irreversible steps, and resets between runs. A well-behaved test completes and hands off to ASSURE; an out-of-bounds attempt is contained. This is an illustrative simulation of the containment logic, not operational validation.
Every AUTHREX application shares one verified core. The HMAA authority state machine is specified in TLA+ and exhaustively model-checked: 48,751 reachable states verified, with 8 of 9 safety properties holding (among them no skip-ahead, monotonic downgrade, and no zombie tier). The ninth, the MAIVA CriticalSafe invariant, is flagged as a known violation in the issue register rather than hidden, which is the honest state of the work. During development the model checker also caught a real S5 view-change regression, evidence that the method finds defects rather than rubber-stamping the design.
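To make "exhaustively model-checked" concrete, here is a toy sketch of the idea in Python: enumerate every reachable state of a small authority machine and test each transition against safety properties. It is not the TLA+ specification and says nothing about the real 48,751-state model. The four-tier scale, the `locked` flag, the transition rules, and the definitions given to "no skip-ahead" and "monotonic downgrade" are all this sketch's own assumptions about what those names could mean.

```python
from collections import deque

MAX_TIER = 3  # hypothetical: tiers 0..3


def successors(state):
    """Yield every state reachable in one step from (tier, locked)."""
    tier, locked = state
    if not locked and tier < MAX_TIER:
        yield (tier + 1, locked)   # upgrade by exactly one tier
    for lower in range(tier):
        yield (lower, True)        # any downgrade locks out later upgrades


def no_skip_ahead(old, new):
    # Assumed meaning: authority never rises by more than one tier per step.
    return new[0] - old[0] <= 1


def monotonic_downgrade(old, new):
    # Assumed meaning: once locked by a downgrade, the tier never rises again.
    return not old[1] or new[0] <= old[0]


def check(initial, properties):
    """Breadth-first search over all reachable states, checking each transition."""
    seen, queue, violations = {initial}, deque([initial]), []
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            for prop in properties:
                if not prop(state, nxt):
                    violations.append((prop.__name__, state, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, violations


reachable, violations = check((0, False), [no_skip_ahead, monotonic_downgrade])
```

Because the search visits every reachable state rather than sampling runs, an empty `violations` list means the properties hold for this toy machine as a whole. A regression in `successors` (for example, allowing a two-tier jump) would show up as a concrete offending transition.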
Federal anchors: NDAA §1534 (DoD AI sandbox-environments task force, milestone 1 April 2026); NDAA §1533 (AI assessment framework, due June 2027). Hands off downstream to AUTHREX-ASSURE.