Runs, Snapshots, and Replay

Definitions

Run

A single execution of the assessment engine against all requirements in a pack. A run:

  • Evaluates uploaded evidence against each requirement
  • Produces individual assessments with status, confidence, and citations
  • Records audit information including timing and model details
  • Captures configuration versions for reproducibility

Run status

Status        Description
PENDING       Run created but not yet started
IN_PROGRESS   Assessment engine is processing requirements
COMPLETED     All requirements assessed successfully
FAILED        Run terminated due to error

Snapshot

A frozen record of a completed run's results and configuration. Snapshots enable deterministic replay by capturing:

  • All assessment results
  • Generated tasks
  • Export data
  • Configuration versions used

Snapshot status

Status     Description
PENDING    Run not yet completed; no snapshot exists
CAPTURED   Snapshot successfully saved; deterministic replay available
FAILED     Snapshot capture failed; replay uses live data (non-deterministic)
LEGACY     Run predates the snapshot feature; no snapshot exists
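
For reference, the status values above can be modeled as string unions. A minimal TypeScript sketch; the Run shape is illustrative, assembled from the API examples later on this page, not a complete schema:

// Status values as documented in the tables above.
type RunStatus = "PENDING" | "IN_PROGRESS" | "COMPLETED" | "FAILED";
type SnapshotStatus = "PENDING" | "CAPTURED" | "FAILED" | "LEGACY";

// Illustrative run record; field names taken from the API examples below.
interface Run {
  id: string;
  tenantId: string;
  packId: string;
  status: RunStatus;
  snapshotStatus: SnapshotStatus;
  createdAt: string; // ISO 8601 timestamp
}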

Replay

Re-executing or re-viewing a past run's results. Two types:

  1. Deterministic replay: Uses captured snapshot data—guaranteed identical results
  2. Live replay: Re-queries current data—results may differ if configuration changed

Assessment

The result of evaluating one requirement during a run. See Evidence, Requirements, and Assessments for details.

Invariants

These conditions must always be true:

  1. Only one run IN_PROGRESS per tenant at a time: Concurrent runs are prevented to avoid resource contention (see the sketch after this list).

  2. Run results are immutable after completion: Once a run reaches COMPLETED or FAILED status, its data cannot be modified.

  3. Snapshots are captured on successful completion: When a run completes successfully, the system captures an immutable snapshot.

  4. FAILED snapshot runs cannot use live fallback: If snapshot capture failed, replay returns an error rather than potentially inconsistent data.

  5. Version IDs are pinned at run creation: The run records which criteria, corpus, and retrieval configuration versions were active when it started.

  6. Assessments belong to exactly one run: Each assessment record references a single run ID.
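
Invariant 1 is typically enforced before a run is created. A hedged sketch of such a guard, reusing the Run and RunStatus types from the earlier sketch; the data-access helpers here are hypothetical, not this system's actual API:

// Hypothetical persistence helpers; a real implementation would back
// these with the run store (and ideally a database-level constraint).
declare function findRunByTenantAndStatus(
  tenantId: string,
  status: RunStatus
): Promise<Run | null>;
declare function createRun(fields: {
  tenantId: string;
  packId: string;
  status: RunStatus;
}): Promise<Run>;

// Sketch: reject a new run while another is IN_PROGRESS for the tenant.
async function startRun(tenantId: string, packId: string): Promise<Run> {
  const active = await findRunByTenantAndStatus(tenantId, "IN_PROGRESS");
  if (active) {
    throw new Error(`Tenant ${tenantId} already has run ${active.id} IN_PROGRESS`);
  }
  return createRun({ tenantId, packId, status: "PENDING" });
}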

How it shows up in the UI

Portal interface (/cm)

  • Run button: Starts a new assessment run
  • Progress indicator: Shows current run status (PENDING, IN_PROGRESS, COMPLETED, FAILED)
  • Results panel: Displays assessment outcomes for the most recent run
  • Run selector (if available): Switch between past runs to view historical results

Admin console (/admin)

Run History section:

  • List of all runs with status badges
  • Timestamps showing when each run started and completed
  • Click a run to see detailed information

Run Details view:

  • Timeline: Visual progression of run stages
  • Assessment summary: Counts of COMPLETE, PARTIAL, MISSING, FAILED
  • Snapshot indicator: Shows if deterministic replay is available
  • Replay button: Re-view the run's results

Trace Viewer (/admin/traces/:runId):

  • Detailed span-level view of AI operations
  • Token usage and latency metrics
  • Request/response payloads for debugging

Status indicators

Icon               Status
Gray circle        PENDING
Spinning loader    IN_PROGRESS
Green checkmark    COMPLETED
Red X              FAILED
Camera icon        Snapshot CAPTURED
Warning triangle   Snapshot FAILED

How it shows up in the API

Start a run

POST /api/runs/start
Content-Type: application/json

Response:

{
  "run": {
    "id": "run-abc-123",
    "tenantId": "tenant-xyz",
    "packId": "project-pack",
    "status": "PENDING",
    "snapshotStatus": "PENDING",
    "createdAt": "2025-01-15T10:00:00Z"
  }
}
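
A minimal client call against this endpoint, assuming a fetch-capable runtime and reusing the Run type from the earlier sketch; BASE_URL is a placeholder for your deployment, and authentication is omitted:

// Sketch: start a run via POST /api/runs/start.
// BASE_URL is a placeholder; auth headers are omitted for brevity.
const BASE_URL = "https://example.invalid";

async function startRunViaApi(): Promise<Run> {
  const res = await fetch(`${BASE_URL}/api/runs/start`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
  });
  if (!res.ok) throw new Error(`Start run failed: ${res.status}`);
  const body = (await res.json()) as { run: Run };
  return body.run;
}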

Get run details

GET /api/runs/:runId

Response:

{
  "id": "run-abc-123",
  "status": "COMPLETED",
  "snapshotStatus": "CAPTURED",
  "totalRequirements": 70,
  "assessedCount": 70,
  "completeCount": 45,
  "partialCount": 15,
  "missingCount": 10,
  "startedAt": "2025-01-15T10:00:05Z",
  "completedAt": "2025-01-15T10:05:30Z",
  "snapshotCriteriaVersionId": "crit-v2",
  "snapshotCorpusActivationId": "corpus-v1",
  "snapshotRetrievalVersionId": "retr-v3"
}
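
Because a run passes through PENDING and IN_PROGRESS before settling, a client typically polls this endpoint until the status is terminal. A sketch, reusing BASE_URL and Run from above; the polling interval is arbitrary:

// Sketch: poll GET /api/runs/:runId until the run reaches a terminal status.
async function waitForRun(runId: string, intervalMs = 5000): Promise<Run> {
  for (;;) {
    const res = await fetch(`${BASE_URL}/api/runs/${runId}`);
    if (!res.ok) throw new Error(`Get run failed: ${res.status}`);
    const run = (await res.json()) as Run;
    if (run.status === "COMPLETED" || run.status === "FAILED") return run;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}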

Get run assessments

GET /api/runs/:runId/assessments

Response:

{
  "assessments": [
    {
      "id": "asmt-001",
      "runId": "run-abc-123",
      "requirementId": "req-access-001",
      "status": "COMPLETE",
      "confidence": 0.94,
      "citations": [...],
      "reasoning": "..."
    }
  ]
}
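
The summary counts on the run (completeCount, partialCount, missingCount) can be reproduced by tallying this list. A small sketch; the Assessment shape below is illustrative, trimmed to the fields shown above:

// Illustrative assessment record, trimmed to the documented fields.
interface Assessment {
  id: string;
  runId: string;
  requirementId: string;
  status: "COMPLETE" | "PARTIAL" | "MISSING" | "FAILED";
  confidence: number;
}

// Sketch: tally assessments by status, mirroring the run's summary counts.
function countByStatus(assessments: Assessment[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const a of assessments) {
    counts[a.status] = (counts[a.status] ?? 0) + 1;
  }
  return counts;
}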

Replay a run (admin)

GET /api/admin/runs/:runId/replay

Response (with snapshot):

{
  "source": "snapshot",
  "data": {
    "assessments": [...],
    "tasks": [...],
    "exports": [...]
  }
}

Response (without snapshot, legacy run):

{
  "source": "live",
  "warning": "LEGACY run - results from live data, may differ from original",
  "data": {...}
}

Response (failed snapshot):

{
  "error": "REPLAY_UNAVAILABLE",
  "message": "Snapshot capture failed for this run. Deterministic replay is not available."
}
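
A client has to handle all three shapes. One way to model them as a discriminated union, reusing BASE_URL from the earlier sketch; the shapes follow the example payloads above:

// Sketch: the three documented replay outcomes.
type ReplayResponse =
  | { source: "snapshot"; data: unknown }
  | { source: "live"; warning: string; data: unknown }
  | { error: "REPLAY_UNAVAILABLE"; message: string };

async function replayRun(runId: string): Promise<ReplayResponse> {
  const res = await fetch(`${BASE_URL}/api/admin/runs/${runId}/replay`);
  const body = (await res.json()) as ReplayResponse;
  if ("error" in body) {
    console.warn(body.message); // FAILED snapshot: no deterministic replay
  } else if (body.source === "live") {
    console.warn(body.warning); // LEGACY run: live data may differ
  }
  return body;
}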

Flow: run lifecycle

Run created (PENDING)
        ↓
Start processing
        ↓
IN_PROGRESS
        ↓
┌─────────────────────────┐
│ For each requirement:   │
│   - Retrieve evidence   │
│   - Evaluate with LLM   │
│   - Save assessment     │
└─────────────────────────┘
        ↓
  All complete?
    ┌───┴────┐
   Yes       No (error)
    ↓         ↓
COMPLETED   FAILED
    ↓         ↓
 Capture    Set snapshot
 snapshot   status: FAILED
    ↓
Snapshot captured?
    ┌───┴───┐
   Yes      No
    ↓        ↓
 CAPTURED  FAILED
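
The legal status transitions in this flow can be captured in a small map. A sketch derived from the diagram above (not from a published schema), reusing RunStatus from the earlier sketch:

// Sketch: legal run-status transitions, derived from the diagram above.
const RUN_TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  PENDING: ["IN_PROGRESS"],
  IN_PROGRESS: ["COMPLETED", "FAILED"],
  COMPLETED: [], // terminal: run results are immutable
  FAILED: [],    // terminal: start a new run instead
};

function canTransition(from: RunStatus, to: RunStatus): boolean {
  return RUN_TRANSITIONS[from].includes(to);
}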

Flow: replay decision

Replay requested for run
        ↓
Check snapshotStatus
    ┌───┴────────┐
    ↓            ↓
CAPTURED       Other
    ↓            ↓
 Return       FAILED?
snapshot    ┌───┴───────┐
  data     Yes          No (LEGACY/PENDING)
            ↓            ↓
         Return      Return live
          error     data + warning
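
The same decision expressed in code. A hedged sketch of the server-side branch, reusing the Run and ReplayResponse types from earlier sketches; loadSnapshot and queryLiveData are hypothetical helpers:

// Hypothetical data-access helpers.
declare function loadSnapshot(runId: string): Promise<unknown>;
declare function queryLiveData(runId: string): Promise<unknown>;

// Sketch: replay decision mirroring the flow above.
async function resolveReplay(run: Run): Promise<ReplayResponse> {
  if (run.snapshotStatus === "CAPTURED") {
    return { source: "snapshot", data: await loadSnapshot(run.id) };
  }
  if (run.snapshotStatus === "FAILED") {
    return {
      error: "REPLAY_UNAVAILABLE",
      message: "Snapshot capture failed for this run. Deterministic replay is not available.",
    };
  }
  // LEGACY or PENDING: fall back to live data with a warning.
  return {
    source: "live",
    warning: "LEGACY run - results from live data, may differ from original",
    data: await queryLiveData(run.id),
  };
}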

Common misconceptions

1. "Running assessment again updates the previous run"

Reality: Each time you run the assessment, the engine creates a new run. Previous runs are immutable. To see updated results, you must start a new run.

2. "Snapshots include the original documents"

Reality: Snapshots capture results (assessments, tasks, exports) and configuration IDs, not the source documents. Document content is stored separately.

3. "FAILED runs can be resumed"

Reality: A FAILED run cannot be resumed or retried. You must start a new run. The failed run remains in history for debugging.

4. "Replay re-runs the AI"

Reality: Replay retrieves stored results—it does not re-invoke the LLM. This is why snapshots enable deterministic replay without additional AI costs.

5. "All old runs have snapshots"

Reality: Runs created before the snapshot feature was implemented have LEGACY status. They can fall back to live data, but the results may differ from the original run.

6. "Snapshot capture failure means the run failed"

Reality: The run itself may have completed successfully (all assessments done), but the snapshot capture step failed. The run's assessments still exist; they're just not frozen in a snapshot.