Journey Verification

Journey verification is the core verification mechanism. An AI exercises the system as a real user would, and the resulting trace shows what actually happened.

What a Journey Is

A journey is the arc a real actor travels to reach user-meaningful value. One actor, one outcome, a sequence of steps. Not a test, not a spec, but a complete user arc verified against the running system.
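To make the shape concrete, a journey could be modeled as a small record: one actor, one outcome, an ordered sequence of steps. This is only an illustrative sketch. The names `Journey`, `Step`, and the example checkout journey are hypothetical and not part of any defined format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One action the actor takes against the running system (hypothetical shape)."""
    description: str


@dataclass(frozen=True)
class Journey:
    """One actor, one outcome, an ordered sequence of steps."""
    actor: str
    outcome: str
    steps: Tuple[Step, ...]

    def __post_init__(self) -> None:
        # A journey with no steps cannot lead the actor to any outcome.
        if not self.steps:
            raise ValueError("a journey needs at least one step")


# Hypothetical example journey, for illustration only.
checkout = Journey(
    actor="returning customer",
    outcome="order placed and confirmed",
    steps=(Step("sign in"), Step("add item to cart"), Step("pay")),
)
```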

The Walk Cycle

Design - write the journey: actor, goal, steps, evaluation criteria.
Pre-walk - check tool inventory, inspect schemas, verify preconditions.
Walk - AI calls the surface step by step. After each call, analyze the trace.
Evaluate - check against criteria: technical, security, business, surface quality.
Fix - implement approved fixes.
Re-walk - verify the fix. Repeat until zero violations.
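
The walk, evaluate, fix, re-walk portion of the cycle can be sketched as a loop that stops at zero violations. This is a minimal sketch under assumptions: `walk_until_clean`, its `walk` and `fix` callables, and the `max_rounds` safety cap are all hypothetical names introduced here, and `fix` stands in for whatever process implements the approved fixes.

```python
from typing import Callable, List


def walk_until_clean(
    walk: Callable[[], List[str]],
    fix: Callable[[List[str]], None],
    max_rounds: int = 5,  # assumed safety cap, not part of the cycle itself
) -> int:
    """Walk and evaluate, fix, then re-walk until a walk yields zero violations.

    `walk` performs one walk plus evaluation and returns the violations found.
    `fix` applies the approved fixes for those violations.
    Returns the number of walks performed.
    """
    for round_number in range(1, max_rounds + 1):
        violations = walk()
        if not violations:
            return round_number
        fix(violations)
    raise RuntimeError(f"still violating after {max_rounds} walks")


# Simulated system: each fix resolves one open issue.
open_issues = ["DB query budget exceeded", "missing auth check"]


def demo_walk() -> List[str]:
    return list(open_issues)


def demo_fix(violations: List[str]) -> None:
    open_issues.remove(violations[0])


walks = walk_until_clean(demo_walk, demo_fix)
```

In the simulation, two issues take two fix rounds, so the third walk is the first clean one.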

Black Box vs White Box

Black box: AI sees only the callable surface, public docs, and traces. No source code. Tests what a real client experiences.

White box: AI sees everything: source, docs, tests, config. Used after black box walks for root-cause analysis and implementation.

The same system needs both perspectives, in separate walks. Mixing them biases the results: an AI that has seen the source no longer tests what a real client experiences.
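
One way to keep the two perspectives apart is to filter what the AI is given before a walk starts. The sketch below is an assumption about how that might look. `Mode`, `visible_artifacts`, and the artifact keys are hypothetical names chosen to mirror the lists above.

```python
from enum import Enum
from typing import Dict


class Mode(Enum):
    BLACK_BOX = "black_box"
    WHITE_BOX = "white_box"


# Black box: only the callable surface, public docs, and traces.
BLACK_BOX_VISIBLE = frozenset({"callable_surface", "public_docs", "traces"})


def visible_artifacts(mode: Mode, artifacts: Dict[str, str]) -> Dict[str, str]:
    """Return the subset of artifacts the AI may see in the given mode."""
    if mode is Mode.WHITE_BOX:
        # White box sees everything: source, docs, tests, config.
        return dict(artifacts)
    return {k: v for k, v in artifacts.items() if k in BLACK_BOX_VISIBLE}


# Hypothetical artifact inventory, for illustration only.
inventory = {
    "callable_surface": "tool list",
    "public_docs": "user guide",
    "traces": "walk traces",
    "source": "repository",
    "tests": "test suite",
    "config": "deployment config",
}

black_box_view = visible_artifacts(Mode.BLACK_BOX, inventory)
white_box_view = visible_artifacts(Mode.WHITE_BOX, inventory)
```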

Evaluation Criteria

Criteria are human-defined and have binary outcomes: violated or not. There is no subjective AI scoring. For example, if a criterion sets the DB query budget at 5 and the trace shows 14, that is a violation. No interpretation required.

Criteria are organized by scope: technical, security, business, surface quality.
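
A binary criterion can be expressed as a predicate over the trace. The sketch below encodes the DB query budget of 5 as an example. `Criterion`, `evaluate`, and the `db_queries` trace field are hypothetical names, and treating exactly 5 queries as within budget is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Criterion:
    """A human-defined check with a binary outcome: violated or not."""
    name: str
    scope: str  # technical, security, business, or surface quality
    is_violated: Callable[[Dict[str, int]], bool]


def evaluate(trace: Dict[str, int], criteria: List[Criterion]) -> List[str]:
    """Return the names of all criteria the trace violates."""
    return [c.name for c in criteria if c.is_violated(trace)]


# Budget of 5 DB queries; assumes exactly 5 is still within budget.
db_budget = Criterion(
    name="db-query-budget",
    scope="technical",
    is_violated=lambda trace: trace["db_queries"] > 5,
)

violations = evaluate({"db_queries": 14}, [db_budget])
within_budget = evaluate({"db_queries": 5}, [db_budget])
```

Because the predicate returns only true or false, the evaluation step reduces to a lookup against the trace, with no scoring involved.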