Attestable is a constitution-bound training pipeline that provably runs its safety rewards — and ships with the signed receipts proving the model was hardened. The proof is measured in-band, from the running system, and reported honestly, gaps and all.
Every safety claim on the market is out-of-band: a document next to the model, or a reward the training can game. Attestable measures and enforces in-band — the control is part of the running system, so it cannot be bypassed, gamed, or faked.
A model card asserted beside the model · a reward oracle divorced from the real output · a gate off the serving path. Assertions, not controls.
The reward registered on every serving path · within-group σ read from the live run · the gate in the render path · coverage computed from running state. The measurement is part of the system.
The fraction of declared safety controls provably wired, active, and exerting force in the live run — measured, not asserted. Uncovered controls stay visibly marked. The number is not 100%. That is the product, not a bug in it.
A signed per-run proof: bindings → gate verdicts → eval receipts → findings ledger. An SBOM for training.
The lease-disciplined, single-GPU reference pipeline that emits attestations. Production-hardened.
The classification surface — where our own gate caught us. Current heads tested 0/64 under strict ablation; disclosed, not shipped.
w1tch — an exhaustively-audited small model. Provenance, not capability, is its claim.
Binding coverage is an actuarial input. It turns "we can't price AI risk because it's a black box" into a number an underwriter's desk can work with — and hands them the honest denominator on purpose. A vendor claiming zero residual risk is not underwritable; disclosed residual risk is what makes the exposure priceable.