# Runs, Snapshots, and Replay

## Definitions

### Run
A single execution of the assessment engine against all requirements in a pack. A run:
- Evaluates uploaded evidence against each requirement
- Produces individual assessments with status, confidence, and citations
- Records audit information including timing and model details
- Captures configuration versions for reproducibility
#### Run status
| Status | Description |
|---|---|
| PENDING | Run created but not yet started |
| IN_PROGRESS | Assessment engine is processing requirements |
| COMPLETED | All requirements assessed successfully |
| FAILED | Run terminated due to error |
### Snapshot
A frozen record of a completed run's results and configuration. Snapshots enable deterministic replay by capturing:
- All assessment results
- Generated tasks
- Export data
- Configuration versions used
#### Snapshot status
| Status | Description |
|---|---|
| PENDING | Run not yet completed—no snapshot exists |
| CAPTURED | Snapshot successfully saved—deterministic replay available |
| FAILED | Snapshot capture failed—replay uses live data (non-deterministic) |
| LEGACY | Pre-snapshot-feature run—no snapshot exists |
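In client code, these two status sets can be modeled as string unions. A minimal TypeScript sketch; the type and function names here are illustrative, not from a published SDK:

```ts
// Status values exactly as documented above; the type names are ours.
type RunStatus = "PENDING" | "IN_PROGRESS" | "COMPLETED" | "FAILED";
type SnapshotStatus = "PENDING" | "CAPTURED" | "FAILED" | "LEGACY";

// Deterministic replay is guaranteed only for CAPTURED snapshots.
function isDeterministicReplayAvailable(status: SnapshotStatus): boolean {
  return status === "CAPTURED";
}
```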
### Replay
Re-executing or re-viewing a past run's results. Two types:
- Deterministic replay: Uses captured snapshot data—guaranteed identical results
- Live replay: Re-queries current data—results may differ if configuration changed
### Assessment
The result of evaluating one requirement during a run. See Evidence, Requirements, and Assessments for details.
## Invariants

These conditions must always be true:

- **Only one run IN_PROGRESS per tenant at a time**: Concurrent runs are prevented to avoid resource contention (see the guard sketch after this list).
- **Run results are immutable after completion**: Once a run reaches COMPLETED or FAILED status, its data cannot be modified.
- **Snapshots are captured on successful completion**: When a run completes successfully, the system captures an immutable snapshot.
- **FAILED snapshot runs cannot use live fallback**: If snapshot capture failed, replay returns an error rather than potentially inconsistent data.
- **Version IDs are pinned at run creation**: The run records which criteria, corpus, and engine versions were active when it started.
- **Assessments belong to exactly one run**: Each assessment record references a single run ID.
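One way the first invariant could be enforced is a tenant-scoped guard at run creation. A minimal sketch, assuming a hypothetical `RunStore` interface; none of these names come from the actual codebase:

```ts
type RunStatus = "PENDING" | "IN_PROGRESS" | "COMPLETED" | "FAILED";

interface Run {
  id: string;
  tenantId: string;
  status: RunStatus;
}

// Hypothetical persistence interface; the real storage layer may differ.
interface RunStore {
  findRunInProgress(tenantId: string): Promise<Run | undefined>;
  createRun(tenantId: string): Promise<Run>;
}

// Refuse to create a second concurrent run for the same tenant.
async function startRun(store: RunStore, tenantId: string): Promise<Run> {
  const active = await store.findRunInProgress(tenantId);
  if (active) {
    throw new Error(`Run ${active.id} is already IN_PROGRESS for ${tenantId}`);
  }
  return store.createRun(tenantId);
}
```

Note that a check-then-create like this is racy under concurrency; a database-level constraint (for example, a unique index on tenant plus an in-progress flag) would enforce the invariant atomically.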
## How it shows up in the UI

### Portal interface (/cm)

- Run button: Starts a new assessment run
- Progress indicator: Shows the current run status (PENDING, IN_PROGRESS, COMPLETED, FAILED)
- Results panel: Displays assessment outcomes for the most recent run
- Run selector (if available): Switch between past runs to view historical results
### Admin console (/admin)

**Run History section:**

- List of all runs with status badges
- Timestamps showing when each run started and completed
- Click a run to see detailed information

**Run Details view:**

- Timeline: Visual progression of run stages
- Assessment summary: Counts of COMPLETE, PARTIAL, MISSING, and FAILED assessments
- Snapshot indicator: Shows whether deterministic replay is available
- Replay button: Re-view the run's results

**Trace Viewer (/admin/traces/:runId):**

- Detailed span-level view of AI operations
- Token usage and latency metrics
- Request/response payloads for debugging
### Status indicators
| Icon | Status |
|---|---|
| Gray circle | PENDING |
| Spinning loader | IN_PROGRESS |
| Green checkmark | COMPLETED |
| Red X | FAILED |
| Camera icon | Snapshot CAPTURED |
| Warning triangle | Snapshot FAILED |
## How it shows up in the API

### Start a run

```
POST /api/runs/start
Content-Type: application/json
```

Response:

```json
{
  "run": {
    "id": "run-abc-123",
    "tenantId": "tenant-xyz",
    "packId": "project-pack",
    "status": "PENDING",
    "snapshotStatus": "PENDING",
    "createdAt": "2025-01-15T10:00:00Z"
  }
}
```
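A minimal client sketch for calling this endpoint, assuming a same-origin deployment and omitting authentication:

```ts
// Sketch: start a run and return its id and initial status.
async function startAssessmentRun(): Promise<{ id: string; status: string }> {
  const res = await fetch("/api/runs/start", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
  });
  if (!res.ok) {
    throw new Error(`POST /api/runs/start failed with ${res.status}`);
  }
  const { run } = await res.json();
  return { id: run.id, status: run.status };
}
```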
### Get run details

```
GET /api/runs/:runId
```

Response:

```json
{
  "id": "run-abc-123",
  "status": "COMPLETED",
  "snapshotStatus": "CAPTURED",
  "totalRequirements": 70,
  "assessedCount": 70,
  "completeCount": 45,
  "partialCount": 15,
  "missingCount": 10,
  "startedAt": "2025-01-15T10:00:05Z",
  "completedAt": "2025-01-15T10:05:30Z",
  "snapshotCriteriaVersionId": "crit-v2",
  "snapshotCorpusActivationId": "corpus-v1",
  "snapshotRetrievalVersionId": "retr-v3"
}
```
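Runs are asynchronous, so clients typically poll this endpoint until the status is terminal. A sketch; the interval and helper name are our own choices, not part of the API:

```ts
interface RunDetails {
  id: string;
  status: "PENDING" | "IN_PROGRESS" | "COMPLETED" | "FAILED";
  snapshotStatus: string;
  assessedCount?: number;
  totalRequirements?: number;
}

// Sketch: poll until the run reaches COMPLETED or FAILED.
async function waitForRun(runId: string, intervalMs = 5000): Promise<RunDetails> {
  for (;;) {
    const res = await fetch(`/api/runs/${encodeURIComponent(runId)}`);
    if (!res.ok) throw new Error(`GET /api/runs/${runId} failed: ${res.status}`);
    const run = (await res.json()) as RunDetails;
    if (run.status === "COMPLETED" || run.status === "FAILED") return run;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```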
### Get run assessments

```
GET /api/runs/:runId/assessments
```

Response:

```json
{
  "assessments": [
    {
      "id": "asmt-001",
      "runId": "run-abc-123",
      "requirementId": "req-access-001",
      "status": "COMPLETE",
      "confidence": 0.94,
      "citations": [...],
      "reasoning": "..."
    }
  ]
}
```
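The per-status counts in the run details can be rebuilt from this list. A small sketch; the `Assessment` shape below mirrors only the fields used:

```ts
interface Assessment {
  id: string;
  status: "COMPLETE" | "PARTIAL" | "MISSING" | "FAILED";
}

// Sketch: tally assessments by status, e.g. to reproduce the
// completeCount/partialCount/missingCount summary client-side.
function countByStatus(assessments: Assessment[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const a of assessments) {
    counts[a.status] = (counts[a.status] ?? 0) + 1;
  }
  return counts;
}
```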
### Replay a run (admin)

```
GET /api/admin/runs/:runId/replay
```

Response (with snapshot):

```json
{
  "source": "snapshot",
  "data": {
    "assessments": [...],
    "tasks": [...],
    "exports": [...]
  }
}
```

Response (without snapshot, legacy run):

```json
{
  "source": "live",
  "warning": "LEGACY run - results from live data, may differ from original",
  "data": {...}
}
```

Response (failed snapshot):

```json
{
  "error": "REPLAY_UNAVAILABLE",
  "message": "Snapshot capture failed for this run. Deterministic replay is not available."
}
```
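Callers have to handle all three shapes above. One way to do that, discriminating on the `error` and `source` fields (the type names are illustrative):

```ts
type ReplayResponse =
  | { source: "snapshot"; data: unknown }
  | { source: "live"; warning: string; data: unknown }
  | { error: "REPLAY_UNAVAILABLE"; message: string };

// Sketch: branch on the three documented replay response shapes.
function handleReplay(body: ReplayResponse): unknown {
  if ("error" in body) {
    // Failed snapshot: deterministic replay is not available.
    throw new Error(body.message);
  }
  if (body.source === "live") {
    // Legacy run: live data may differ from the original results.
    console.warn(body.warning);
  }
  return body.data;
}
```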
## Flow: run lifecycle

```
Run created (PENDING)
        ↓
Start processing
        ↓
  IN_PROGRESS
        ↓
┌─────────────────────────┐
│ For each requirement:   │
│  - Retrieve evidence    │
│  - Evaluate with LLM    │
│  - Save assessment      │
└─────────────────────────┘
        ↓
  All complete?
    /       \
  Yes        No (error)
   ↓          ↓
COMPLETED   FAILED
   ↓          ↓
Capture     Set snapshot
snapshot    status: FAILED
   ↓
Snapshot
captured?
  /    \
Yes     No
 ↓       ↓
CAPTURED FAILED
```
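A compressed sketch of this lifecycle in code. All helper names below are assumptions for illustration, not the engine's real API, and error handling is simplified (we assume `captureSnapshot` reports failure by returning false rather than throwing):

```ts
type RunStatus = "PENDING" | "IN_PROGRESS" | "COMPLETED" | "FAILED";
type SnapshotStatus = "PENDING" | "CAPTURED" | "FAILED" | "LEGACY";

interface Run {
  id: string;
  status: RunStatus;
  snapshotStatus: SnapshotStatus;
}

// Illustrative stubs only.
declare function retrieveEvidence(requirementId: string): Promise<unknown>;
declare function evaluateWithLLM(requirementId: string, evidence: unknown): Promise<unknown>;
declare function saveAssessment(runId: string, assessment: unknown): Promise<void>;
declare function captureSnapshot(runId: string): Promise<boolean>;

async function executeRun(run: Run, requirementIds: string[]): Promise<void> {
  run.status = "IN_PROGRESS";
  try {
    for (const reqId of requirementIds) {
      const evidence = await retrieveEvidence(reqId);
      const assessment = await evaluateWithLLM(reqId, evidence);
      await saveAssessment(run.id, assessment);
    }
    run.status = "COMPLETED";
    // Snapshot capture runs only after success; if it fails, the run
    // stays COMPLETED and only snapshotStatus becomes FAILED.
    run.snapshotStatus = (await captureSnapshot(run.id)) ? "CAPTURED" : "FAILED";
  } catch {
    run.status = "FAILED";
    run.snapshotStatus = "FAILED";
  }
}
```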
## Flow: replay decision

```
Replay requested for run
           ↓
   Check snapshotStatus
           ↓
     ┌─────┴─────┐
     ↓           ↓
  CAPTURED     Other
     ↓           ↓
  Return      FAILED?
  snapshot    /     \
  data      Yes      No (LEGACY/PENDING)
             ↓        ↓
          Return    Return live
          error     data + warning
```
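The same decision expressed as a function; a sketch only, with hypothetical data loaders:

```ts
type SnapshotStatus = "PENDING" | "CAPTURED" | "FAILED" | "LEGACY";

// Illustrative stubs only.
declare function loadSnapshot(runId: string): Promise<unknown>;
declare function loadLiveResults(runId: string): Promise<unknown>;

// Sketch of the replay decision: snapshot data if captured, an error if
// capture failed, otherwise live data with a non-determinism warning.
async function replay(runId: string, status: SnapshotStatus) {
  if (status === "CAPTURED") {
    return { source: "snapshot", data: await loadSnapshot(runId) };
  }
  if (status === "FAILED") {
    throw new Error("REPLAY_UNAVAILABLE: snapshot capture failed for this run");
  }
  return {
    source: "live",
    warning: "Results from live data, may differ from original",
    data: await loadLiveResults(runId),
  };
}
```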
## Common misconceptions

1. "Running assessment again updates the previous run"
Reality: Running the assessment again creates a new, separate run. Previous runs are immutable; to see updated results, you must start a new run.
2. "Snapshots include the original documents"
Reality: Snapshots capture results (assessments, tasks, exports) and configuration IDs, not the source documents. Document content is stored separately.
3. "FAILED runs can be resumed"
Reality: A FAILED run cannot be resumed or retried. You must start a new run. The failed run remains in history for debugging.
4. "Replay re-runs the AI"
Reality: Replay retrieves stored results—it does not re-invoke the LLM. This is why snapshots enable deterministic replay without additional AI costs.
5. "All old runs have snapshots"
Reality: Runs created before the snapshot feature was implemented have LEGACY status. They can fall back to live data, but the results may differ from the original run.
6. "Snapshot capture failure means the run failed"
Reality: The run itself may have completed successfully (all assessments done), but the snapshot capture step failed. The run's assessments still exist; they're just not frozen in a snapshot.
## Related topics
- Evidence, Requirements, and Assessments - What runs produce
- Roles and Tenancy - How runs are scoped
- Portal User Workflow - Starting runs
- Admin Workflow - Monitoring runs
- API Reference - Run endpoints