Graveyard — preserved as evidence, demoted to tried-and-found-theatre

This Decision Cockpit is not working oversight. It is kept as evidence of a control we tried and found to be theatre. A dashboard that summarizes agent work for a human to approve — when the human cannot independently check the summary, and the summary is written by the untrusted agent — does not produce oversight. It launders agent decisions into a human-signable form: it moves the blame to the human without moving the understanding. Full accounting in docs/verification-theater-in-ai-agent-work.md and README.md.

What survived the dogfood instead is small and lives elsewhere: a handful of deterministic, human-approved gates (in gates/) that a human can read in full, run on inputs they choose, and confirm by the consequence rather than the printed verdict, plus a human who refuses to trust the agent's self-report. The one surviving rule: satisfied is not approval. The snapshot below is preserved unchanged as the artifact under report; read it as the thing we are demoting, not as a recommendation.

Human role right now

No decision needed.

Manual relay only. You are forwarding an agent-to-agent audit request because automatic handoff is not built yet.

Attention: Low / Quick scan only / Routine relay / Not a request to approve work

Next action

Send the audit request to Claude Code.

Paste target

Claude audit thread

Review depth

Quick scan for obvious wrong direction.

Lifecycle

implemented locally / audit pending / not approved / not merged / not released

Approval rule

Human approval is only needed when an exact named consequence is requested.

Codex implemented / Claude audits / Sami decides

Exact next action

Paste this to Claude Code

CLAUDE - AUDIT E6-ROUTING-COCKPIT-001 IMPLEMENTATION ITER 3

Audit the Codex Iteration 3 implementation against Sami's Iteration 3 authorization and the preserved implementation packet.

Read:
- .agent-handoff/COLLAB.md
- .agent-handoff/DASHBOARD.md
- .agent-handoff/DASHBOARD.html
- .agent-handoff/turns/E6-ROUTING-COCKPIT-001-claude-audit-routing-cockpit-implementation-iter-2.md
- .agent-handoff/turns/E6-ROUTING-COCKPIT-001-codex-routing-cockpit-implementation-iter-3.md

Verify:
- top of page distinguishes routine manual relay from a real human decision
- human role, attention level, review depth, exact next action, and paste target are visible
- quick-scan checklist is visible
- verification basis separates replayable checks, environment-dependent checks, visible artifacts, agent judgment, and human judgment
- replayable factual claims cite a command/result or tell the reader exactly what to rerun
- non-checkable claims are marked as agent judgment, not fact
- route strip and lifecycle stage are present
- done ≠ audited ≠ satisfied ≠ approved ≠ merged ≠ released remains visible
- Ask Coordinator and Pause remain visible as valid options
- slow-down triggers remain visible
- static dashboard only: no executable page code, inline event handlers, browser storage, external assets, hidden state, automation, notification layer, approval control, public claim, protocol edit, kit edit, global config, or scratch change
- localhost rendered browser QA evidence is real, or any gap is recorded honestly
- no no-touch files changed
- seven pre-existing duplicate-noise files remain untouched
- scratch dirs remain untouched

Do not implement, edit, stage, commit, branch, push, PR, merge, clean scratch, clean noise files, preserve, or broaden scope.

Return blockers, nits, missing controls, rendered-QA result, result state, and exact fixes if needed.

Verification basis

What is checkable, what is judgment?

Polished audit prose is not self-validating. Facts should point to a cheap replay path; judgment should be labeled as judgment.

Anyone-replayable deterministic checks

Claim type	Replay path	Current interpretation
Working tree shape	`git status --short --branch --untracked-files=all`	Run before relying; status changes as audit and preservation files are added.
Patch hygiene	`git diff --check`	Latest builder note records the actual result.
Static self-containment	Search the dashboard source for executable page code, inline event handlers, browser storage, external asset refs, timers, approval controls, and forbidden approval-framing text.	Latest builder note records exact searches and outputs.
Artifact size / identity	`wc -l .agent-handoff/DASHBOARD.html` plus a local hash command.	Run before relying; identity checks do not prove correctness.
No-touch boundary	Diff the no-touch paths named in the implementation authorization.	Latest builder note records the actual check.
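
The static self-containment and artifact identity rows can be replayed with a short script a human can read in full. The following is a minimal Python sketch, not the builder's actual searches: the pattern set `FORBIDDEN` and the function names `scan_static` and `artifact_identity` are assumptions made for illustration, and the patterns deliberately over-match (an ordinary outbound link is flagged as an external reference).

```python
import hashlib
import re
from pathlib import Path

# Hypothetical pattern set: one regex per forbidden construct.
# Illustrative only; these are not the searches recorded in the builder note.
FORBIDDEN = {
    "script element": re.compile(r"<script\b", re.IGNORECASE),
    "inline event handler": re.compile(r"\son[a-z]+\s*=", re.IGNORECASE),
    "browser storage": re.compile(r"\b(?:localStorage|sessionStorage|indexedDB)\b"),
    "timer": re.compile(r"\b(?:setTimeout|setInterval)\b"),
    "external asset ref": re.compile(
        r"""\b(?:src|href)\s*=\s*["']?(?:https?:)?//""", re.IGNORECASE
    ),
}


def scan_static(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, label, line text) for every forbidden match."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in FORBIDDEN.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


def artifact_identity(path: Path) -> tuple[int, str]:
    """Return (newline count, SHA-256 hex digest), comparable to `wc -l` plus a hash."""
    data = path.read_bytes()
    return data.count(b"\n"), hashlib.sha256(data).hexdigest()
```

An empty result from `scan_static` is only a printed verdict. To confirm by consequence, feed it a file you have deliberately dirtied (for example, one with an added `onclick` attribute) and check that the hit appears. As the table notes, a matching hash shows identity, not correctness.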

Environment-dependent checks

Local render QA requires serving the handoff folder locally and opening the dashboard in Chrome.
Console and network observations must say whether they were actually captured after tooling was attached.
If rendered QA did not happen, the audit must say so.

Visible artifacts

DASHBOARD.md
DASHBOARD.html
COLLAB.md
Codex Iteration 3 builder note
Claude Iteration 2 audit note
PR metadata, only when a PR exists

Agent judgment

Useful but not self-validating: layout is clearer, risk framing is appropriate, human cognitive load is reduced, the quick-scan model is more humane, and this handoff is low attention.

Human judgment

Only Sami can authorize exact named consequences such as commit, PR, merge, release, cleanup, public claim, scope expansion, protocol change, kit change, credential/global config change, or durable behavior change.

Slow down if...

The relay turns into a decision

The named action is irreversible.
The named action includes approval, merge, PR creation, preservation, publication, release, public claim, credential, global config, or scratch cleanup.
The scope expands beyond this bounded dashboard convergence pass.
Evidence is unclear, missing, stale, or conflicts across agents.
Agent outputs disagree about whether the work passed.

There is pressure to approve quickly.
The exact action text is missing.
The human approver is uncertain.
The request would create hidden state, automation, memory, skills, subagents, scheduled checks, global config, network services, or runtime behavior.

Valid options: Ask Coordinator, Pause Pending, Reject / Redo, Reject / Close, or Authorize Exact Action only when Sami names the exact consequence.
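
The named-action triggers above can be read as a deterministic text check. The following is a minimal Python sketch under stated assumptions: `TRIGGER_TERMS` and `slow_down_reasons` are hypothetical names, the term list is copied from the named-action trigger above, and nothing like this runs in the static dashboard. A match can only add caution; an empty result authorizes nothing.

```python
# Hypothetical term list, copied from the named-action trigger in this section.
TRIGGER_TERMS = (
    "approval",
    "merge",
    "pr creation",
    "preservation",
    "publication",
    "release",
    "public claim",
    "credential",
    "global config",
    "scratch cleanup",
)


def slow_down_reasons(action_text: str) -> list[str]:
    """Return reasons to slow down; an empty list is not permission to proceed."""
    if not action_text.strip():
        # Missing exact action text is itself a listed trigger.
        return ["exact action text is missing"]
    lowered = action_text.lower()
    # Plain substring matching: crude on purpose, so a human can predict it.
    return [f"names '{term}'" for term in TRIGGER_TERMS if term in lowered]
```

Substring matching is kept crude so that a reader can predict every result by eye. It cannot detect the judgment-based triggers (pressure to approve, unclear evidence, an uncertain approver); those stay with the human.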

Approval boundary

Do not collapse these states

done ≠ audited ≠ satisfied ≠ approved ≠ merged ≠ released

Drafted text is not approval. satisfied is not approval. Auditor pass is not approval. Model consensus is not approval. Sami is the only approver.

Irreversible, approval, scope-expanding, permission-changing, public, or durable behavior actions route to Sami. A classifier, dashboard, auditor, coordinator, or model consensus cannot waive human approval.
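
The non-collapse rule can be stated as code in which no state implies any other and only the human approver, with exact action text, can produce an approval. The following is a minimal Python sketch: `State`, `implies`, and `can_enter_approved` are hypothetical names for illustration, not an implemented subsystem, and the check trusts the `actor` string it is given, so it cannot by itself establish who is actually approving.

```python
from enum import Enum


class State(Enum):
    DONE = "done"
    AUDITED = "audited"
    SATISFIED = "satisfied"
    APPROVED = "approved"
    MERGED = "merged"
    RELEASED = "released"


HUMAN_APPROVER = "Sami"  # the only approver named in this document


def implies(have: State, want: State) -> bool:
    """A state stands only for itself; no state implies any other."""
    return have is want


def can_enter_approved(actor: str, exact_action: str) -> bool:
    """Approval needs the human approver and non-empty exact action text."""
    return actor == HUMAN_APPROVER and bool(exact_action.strip())
```

For example, `implies(State.SATISFIED, State.APPROVED)` is `False`, and `can_enter_approved("Claude", "merge")` is `False` regardless of how many auditors or models agree.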

This screen authorizes

Audit relay only

Claude may audit the Iteration 3 local static dashboard implementation.
Claude may read the listed evidence files.
Claude may report blockers, nits, missing controls, rendered-QA result, and result state.
Codex stops after the builder report and waits for audit.

This screen does not authorize

No durable consequence

No commit, push, branch, PR, merge, preservation, or release.
No public claim, launch, protocol edit, kit edit, trust-layer work, credentials, global config, memory creation, skill creation, automation, or subagents.
No scratch cleanup or duplicate-noise cleanup.
No approval without exact named consequence.

Standard pattern mapping

These are routing metaphors and evidence inputs, not implemented subsystems.

Standard pattern	Harness use	Boundary
Reviewer gates	Claude audits Codex output and may upgrade route risk.	Auditor pass is evidence, not approval.
Policy checks	Allowed files, no-touch lists, stale/as-of state, verification commands.	Checks can block or inform; they do not approve.
Risk tiers	Routine manual relay is low attention; irreversible/public/config work is high attention.	Higher attention routes to Sami; tier labels do not authorize action.
CODEOWNERS / branch protection	Human owns consequences; auditor owns critique; builder owns scoped implementation.	Role ownership is not approval unless the role is the human approver.
CI/status checks	Diff hygiene, static searches, browser QA, and changed-file lists.	Passing checks are evidence inputs, not approval.
Escalation on ambiguity	`UNCLEAR` routes to Coordinator unless a human-required trigger is primary.	Ambiguity is not permission to proceed.
Human-in-the-loop review	Human decision actions use exact text.	Drafted text is not approval.

Burden baseline and deferred work

This implementation makes no burden-reduction claim. The table below records a baseline so that later cockpit work can be measured instead of asserted.

Metric	Baseline capture	Claim status
Manual routing prompts / exact authorizations	Multiple exact Sami authorization prompts were required across Stage A packet preservation, Stage A execution, Stage A result preservation, Stage B proposal/result preservation, routing scope-lock preservation, implementation packet preservation, Iteration 1 implementation, Iteration 2 authorization, and Iteration 3 convergence. Exact count should be audited before use as a metric.	Baseline only; no reduction claim.
Ambiguous handoff moments	The routing scope-lock, implementation packet, and Iteration 3 convergence pass exist because the Stage A/B to preservation arc exposed repeated actor-routing friction and audit-trust friction. Exact count is unknown from repo-visible evidence alone.	Unknown fields cannot support a reduction claim.
Handoffs by actor class	Codex builder, Claude auditor, GPT coordinator synthesis, and Sami approval all appeared in the arc. Exact copy/paste count remains unknown without manual transcript counting.	Baseline only.

Deferred

No automatic handoff; this is still manual relay.
No dashboard runtime, live routing engine, notification, or wakeup layer.
No automation, scheduled checks, subagents, memory, or skills.
No trust-layer implementation, public-proof run, release, kit cleanup, Stage B retry, or duplicate-noise cleanup.