Testing and Debugging
This document covers how to reproduce runs, locate logs and traces, and debug common issues.
Reproducing a Run
Using Run Replay
For runs with snapshotStatus = CAPTURED:
# API endpoint (verified from server/routes.ts)
GET /api/admin/runs/:runId/replay
Returns frozen snapshot data without re-executing.
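As a minimal sketch of the check described above, the following builds the replay path and decides whether a run can be served from a frozen snapshot. Only the route and the CAPTURED status come from this page; the `RunSummary` shape and both helper names are hypothetical.

```typescript
// Hypothetical shape: only `snapshotStatus` and the route are taken from this page.
interface RunSummary {
  id: string;
  snapshotStatus: string; // e.g. "CAPTURED" or "LEGACY"
}

// Build the replay path for a run, escaping the ID for safe use in a URL.
function buildReplayPath(runId: string): string {
  return `/api/admin/runs/${encodeURIComponent(runId)}/replay`;
}

// Replay serves frozen data only when a snapshot was captured.
function hasFrozenSnapshot(run: RunSummary): boolean {
  return run.snapshotStatus === "CAPTURED";
}

const exampleRun: RunSummary = { id: "run-123", snapshotStatus: "CAPTURED" };
if (hasFrozenSnapshot(exampleRun)) {
  console.log(`GET ${buildReplayPath(exampleRun.id)}`);
}
```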
Manual Reproduction
- Find the run's pinned versions in the runs table: the snapshotCriteriaVersionId, snapshotCorpusActivationId, and other snapshot*VersionId fields
- Ensure bindings match (or use a test tenant with the same bindings)
- Trigger a new run with the same evidence documents
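To confirm that a manually reproduced run used the same pinned versions as the original, the two rows can be compared field by field. This is a sketch under assumptions: `PinnedVersions` is a simplified stand-in for the real row type, and `diffPinnedVersions` is a hypothetical helper, not part of the codebase.

```typescript
// Simplified stand-in for a runs row: field name -> pinned version ID (or null).
type PinnedVersions = Record<string, string | null>;

// Return the names of snapshot* fields whose values differ between two runs.
function diffPinnedVersions(original: PinnedVersions, candidate: PinnedVersions): string[] {
  const allKeys = Array.from(
    new Set(Object.keys(original).concat(Object.keys(candidate)))
  );
  return allKeys
    .filter((key) => key.startsWith("snapshot"))
    .filter((key) => (original[key] ?? null) !== (candidate[key] ?? null))
    .sort();
}

const originalRun: PinnedVersions = {
  snapshotCriteriaVersionId: "cv-1",
  snapshotCorpusActivationId: "ca-7",
};
const reproducedRun: PinnedVersions = {
  snapshotCriteriaVersionId: "cv-2",
  snapshotCorpusActivationId: "ca-7",
};
console.log(diffPinnedVersions(originalRun, reproducedRun));
```

An empty result means the reproduction used the same pinned versions; any listed field points at a binding that changed.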
Where Logs Live
Application Logs
Console output: All services log to stdout/stderr.
Log patterns (verified from codebase):
// Assessment pipeline (server/lib/assessmentPipeline.ts:96)
console.log(`[Assessment] ${step}:`, JSON.stringify(details).slice(0, 200));
// Tracing (server/lib/tracing.ts:94,100,106,122)
console.warn(`[Tracing] Dropping non-allowlisted attribute: ${key}`);
console.error(`[Tracing] BLOCKED forbidden attribute pattern in key: ${key}`);
console.warn(`[Tracing] Scrubbed forbidden content in value for key: ${key}`);
console.log(`[Tracing] Initializing OTLP exporter to: ${otlpEndpoint}`);
// Tenant resolution (server/routes.ts:169)
console.log(`[TenantResolution] Override: session=${...}, resolved=${...}, role=${...}`);
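Because each pattern above starts with a bracketed component tag, captured stdout can be filtered by component. The parser below is a hypothetical helper written for illustration; it assumes one log entry per line and only the `[Component] message` shape shown above.

```typescript
interface ParsedLogLine {
  component: string; // e.g. "Assessment", "Tracing", "TenantResolution"
  message: string;
}

// Split "[Component] rest of line" into its parts; return null for other lines.
function parseLogLine(line: string): ParsedLogLine | null {
  const match = /^\[([A-Za-z]+)\]\s*(.*)$/.exec(line);
  if (match === null) return null;
  return { component: match[1] ?? "", message: match[2] ?? "" };
}

// Keep only the lines emitted by one component.
function filterByComponent(lines: string[], component: string): string[] {
  return lines.filter((line) => parseLogLine(line)?.component === component);
}

const sample = [
  "[Tracing] Dropping non-allowlisted attribute: user.email",
  "[Assessment] RETRIEVAL: {\"chunks\":12}",
  "unrelated output",
];
console.log(filterByComponent(sample, "Tracing"));
```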
Trace Storage
Phoenix (Arize): Self-hosted trace server.
Endpoint (server/lib/tracing.ts:120):
const otlpEndpoint = process.env.PHOENIX_OTLP_ENDPOINT || "http://localhost:6006/v1/traces";
Access traces:
- Navigate to the Phoenix UI (typically http://localhost:6006)
- Search by trace_id from the run record
- View the span hierarchy and attributes
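The endpoint fallback above uses `||`, so an empty PHOENIX_OTLP_ENDPOINT behaves the same as an unset one. The sketch below reproduces that fallback against a plain object instead of `process.env`, and guesses the UI address from the endpoint. The helper names are hypothetical, and the assumption that the UI shares the endpoint's host and port is only suggested by the default values on this page.

```typescript
const DEFAULT_OTLP_ENDPOINT = "http://localhost:6006/v1/traces";

// Same `||` fallback as in tracing.ts: unset and empty values both yield the default.
function resolveOtlpEndpoint(env: Record<string, string | undefined>): string {
  return env["PHOENIX_OTLP_ENDPOINT"] || DEFAULT_OTLP_ENDPOINT;
}

// Assumption: the Phoenix UI is served from the same scheme, host, and port.
function guessPhoenixUiUrl(otlpEndpoint: string): string | null {
  const match = /^(https?:\/\/[^/]+)/.exec(otlpEndpoint);
  return match ? (match[1] ?? null) : null;
}

const endpoint = resolveOtlpEndpoint({ PHOENIX_OTLP_ENDPOINT: "" });
console.log(endpoint, guessPhoenixUiUrl(endpoint));
```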
Run Trace IDs
Every run stores trace identifiers:
| Column | Purpose | Location |
|---|---|---|
| traceId | OpenTelemetry trace ID | runs.trace_id |
| rootSpanId | Root span for run | runs.root_span_id |
Set at run creation (server/lib/assessmentPipeline.ts:148-162):
const runTrace = startRunSpan({ runId, tenantId, packId, triggerType: "manual" });
await db.update(runs)
.set({ traceId: runTrace.traceId, rootSpanId: runTrace.spanId })
.where(eq(runs.id, run.id));
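Before searching Phoenix, it can help to confirm that the stored identifiers are well formed. OpenTelemetry trace IDs are 16 bytes and span IDs are 8 bytes, and an all-zero ID is invalid. The sketch below assumes the columns hold these IDs as lowercase hex strings, which this page does not state; the function names are hypothetical.

```typescript
// 16-byte trace ID as 32 lowercase hex characters, not all zeros.
function isValidTraceId(id: string): boolean {
  return /^[0-9a-f]{32}$/.test(id) && !/^0+$/.test(id);
}

// 8-byte span ID as 16 lowercase hex characters, not all zeros.
function isValidSpanId(id: string): boolean {
  return /^[0-9a-f]{16}$/.test(id) && !/^0+$/.test(id);
}

console.log(isValidTraceId("4bf92f3577b34da6a3ce929d0e0e4736"));
console.log(isValidSpanId("00f067aa0ba902b7"));
```

A run whose trace_id fails this check was most likely created while tracing was disabled or misconfigured, so there is nothing to find in Phoenix.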
Audit Logs
Immutable audit trail in database.
Table: audit_logs (defined in shared/schema.ts)
Query via API:
GET /api/admin/audit-logs?tenantId=<id>&runId=<id>
Service function (server/lib/adminService.ts:45-81):
export async function getAuditLogs(filters: AuditLogFilters = {}): Promise<{ logs: AuditLog[]; total: number }>
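A small helper can assemble the audit-log query shown above. This is a sketch: it assumes both query parameters are optional (suggested by the `filters = {}` default, but not confirmed here), and `AuditLogQuery` and `buildAuditLogPath` are hypothetical names.

```typescript
// Hypothetical query shape; only tenantId and runId appear on this page.
interface AuditLogQuery {
  tenantId?: string;
  runId?: string;
}

// Build the request path, omitting any filter that was not supplied.
function buildAuditLogPath(query: AuditLogQuery): string {
  const pairs: string[] = [];
  if (query.tenantId !== undefined) {
    pairs.push(`tenantId=${encodeURIComponent(query.tenantId)}`);
  }
  if (query.runId !== undefined) {
    pairs.push(`runId=${encodeURIComponent(query.runId)}`);
  }
  const base = "/api/admin/audit-logs";
  return pairs.length > 0 ? `${base}?${pairs.join("&")}` : base;
}

console.log(buildAuditLogPath({ tenantId: "tenant-1", runId: "run-123" }));
```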
Trace Attributes
Allowed Attributes
Only allowlisted attributes are exported to traces.
Allowlist (server/lib/tracing.ts:7-47):
const ALLOWED_ATTRIBUTES = new Set([
"tenant.id",
"pack.id",
"run.id",
"run.trigger_type",
"step.name",
"requirement.codes",
"evidence.document_ids",
"chunk.ids",
"chunk.count",
"assessment.ids",
"task.ids",
"export.ids",
"latency.ms",
"llm.tokens.in",
"llm.tokens.out",
"llm.cost.usd",
"llm.model",
"error.type",
"error.message",
// ... more
]);
Forbidden Patterns
Attribute keys matching these patterns are blocked, and matching content in attribute values is scrubbed:
const FORBIDDEN_PATTERNS = [
/prompt/i,
/evidence.*text/i,
/regulatory.*text/i,
/content/i,
/body/i,
/raw/i,
/message/i,
];
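To predict which log message an attribute key will trigger, the two lists can be combined into one classifier. This is a sketch, not the code in tracing.ts: the allowlist is abbreviated, `classifyAttributeKey` is a hypothetical name, and checking forbidden patterns before the allowlist is an assumption about ordering.

```typescript
// Abbreviated copy of the allowlist, for illustration only.
const ALLOWED_KEYS = new Set<string>([
  "tenant.id", "run.id", "step.name", "llm.model", "error.message",
]);

const FORBIDDEN_KEY_PATTERNS: RegExp[] = [
  /prompt/i, /evidence.*text/i, /regulatory.*text/i,
  /content/i, /body/i, /raw/i, /message/i,
];

type KeyVerdict = "allowed" | "forbidden" | "not_allowlisted";

// Assumed order: forbidden patterns win over the allowlist.
function classifyAttributeKey(key: string): KeyVerdict {
  if (FORBIDDEN_KEY_PATTERNS.some((pattern) => pattern.test(key))) {
    return "forbidden";
  }
  return ALLOWED_KEYS.has(key) ? "allowed" : "not_allowlisted";
}

for (const key of ["tenant.id", "llm.prompt", "user.email", "error.message"]) {
  console.log(key, classifyAttributeKey(key));
}
```

Under this assumed ordering, error.message is classified as forbidden even though it is allowlisted, because it matches /message/i. Whether the real exporter drops it depends on the order of checks in tracing.ts, so verify there if error messages are missing from traces.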
Run Steps
Assessment runs track progress through named steps.
Step Names
Definition (shared/schema.ts):
export const RunStepName = {
INGESTION: "INGESTION",
RETRIEVAL: "RETRIEVAL",
VERIFICATION: "VERIFICATION",
ASSESSMENT: "ASSESSMENT",
TASK_GENERATION: "TASK_GENERATION",
EXPORT: "EXPORT"
} as const;
Step Table
Table: run_steps
| Column | Purpose |
|---|---|
| runId | Parent run |
| stepName | Step identifier |
| stepIndex | Order (0-5) |
| status | NOT_EXECUTED / IN_PROGRESS / COMPLETED / FAILED |
| startedAt | When started |
| completedAt | When finished |
| spanId | Trace span ID |
| inputRefs | Input references (jsonb) |
| outputRefs | Output references (jsonb) |
| errorDetails | Failure reason |
Querying Steps
// server/lib/adminService.ts:152-156
export async function getRunSteps(runId: string): Promise<RunStep[]> {
return db.select()
.from(runSteps)
.where(eq(runSteps.runId, runId))
.orderBy(runSteps.stepIndex);
}
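Once the step rows are loaded, the first step that is not COMPLETED is usually where debugging should start. The helper below is a hypothetical sketch that works on plain objects; `StepRow` is a reduced version of the columns in the step table, not the real `RunStep` type.

```typescript
type StepStatus = "NOT_EXECUTED" | "IN_PROGRESS" | "COMPLETED" | "FAILED";

// Reduced row shape, for illustration only.
interface StepRow {
  stepName: string;
  stepIndex: number;
  status: StepStatus;
  errorDetails?: string | null;
}

// Return the earliest step (by stepIndex) that has not completed, or null if all did.
function firstUnfinishedStep(steps: StepRow[]): StepRow | null {
  const ordered = steps.slice().sort((a, b) => a.stepIndex - b.stepIndex);
  return ordered.find((step) => step.status !== "COMPLETED") ?? null;
}

const rows: StepRow[] = [
  { stepName: "RETRIEVAL", stepIndex: 1, status: "FAILED", errorDetails: "timeout" },
  { stepName: "INGESTION", stepIndex: 0, status: "COMPLETED" },
  { stepName: "VERIFICATION", stepIndex: 2, status: "NOT_EXECUTED" },
];
console.log(firstUnfinishedStep(rows)?.stepName);
```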
Debugging Common Issues
Run Stuck IN_PROGRESS
Check for:
- Timeout - requirement took >90s (REQUIREMENT_TIMEOUT_MS in assessmentPipeline.ts:17)
- LLM API failure
- Vector search returning no results
Inspect the run and its steps:
-- Check run status
SELECT id, status, started_at, failed_at, failure_reason
FROM runs WHERE id = 'run-id';
-- Check step statuses
SELECT step_name, status, error_details
FROM run_steps WHERE run_id = 'run-id' ORDER BY step_index;
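A rough way to tell a slow step from a stuck one is to compare its elapsed time with a budget derived from the 90-second requirement timeout. The sketch below assumes requirements are processed sequentially, so the budget is the requirement count times the timeout; that assumption, the `InProgressStep` shape, and the function name are all hypothetical.

```typescript
// 90 s per requirement, per the timeout referenced above.
const REQUIREMENT_TIMEOUT_MS = 90000;

interface InProgressStep {
  stepName: string;
  startedAt: Date;
}

// Assumption: requirements run one after another, so a step should finish
// within requirementCount * REQUIREMENT_TIMEOUT_MS.
function isStepOverBudget(step: InProgressStep, requirementCount: number, now: Date): boolean {
  const budgetMs = Math.max(1, requirementCount) * REQUIREMENT_TIMEOUT_MS;
  return now.getTime() - step.startedAt.getTime() > budgetMs;
}

const step: InProgressStep = {
  stepName: "ASSESSMENT",
  startedAt: new Date("2024-01-01T00:00:00Z"),
};
// 300 s elapsed against a 180 s budget for two requirements.
console.log(isStepOverBudget(step, 2, new Date("2024-01-01T00:05:00Z")));
```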
Assessment Returns MISSING Unexpectedly
Check for:
- Evidence not in READY status
- Low similarity scores (check SIMILARITY_THRESHOLD = 0.20 at line 14)
- Embedding mismatch
Debug:
-- Check document statuses
SELECT id, original_filename, status FROM documents WHERE tenant_id = 'tenant-id';
-- Check chunk count
SELECT COUNT(*) FROM chunks WHERE tenant_id = 'tenant-id';
-- Check embedding count
SELECT COUNT(*) FROM chunk_embeddings WHERE tenant_id = 'tenant-id';
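To see how the threshold affects retrieval, a candidate score can be computed by hand and compared with 0.20. The sketch below assumes the similarity metric is cosine similarity and that a score equal to the threshold passes; neither detail is confirmed on this page, and both function names are hypothetical.

```typescript
const SIMILARITY_THRESHOLD = 0.2;

// Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Assumption: scores equal to the threshold are kept.
function passesThreshold(score: number): boolean {
  return score >= SIMILARITY_THRESHOLD;
}

const score = cosineSimilarity([1, 0, 0], [0.1, 1, 0]);
console.log(score.toFixed(3), passesThreshold(score));
```

If most chunks for a requirement score just under the threshold, MISSING results are more likely a retrieval-tuning issue than a document-status issue.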
Replay Returns Different Results
For LEGACY runs:
- Expected - these fall back to live queries
- Bindings may have changed since original run
For CAPTURED runs:
- Should not happen - investigate snapshot integrity
- Check that snapshotHash matches the expected value
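This page does not say how snapshotHash is computed. If you recompute a hash over snapshot data yourself, key order in the serialized JSON is a common source of false mismatches. The sketch below shows one way to produce an order-independent serialization to feed into whatever hash function the project uses; `canonicalize` is a hypothetical helper, not the project's method.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize JSON with object keys sorted, so equal data always yields equal text.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  const obj = value;
  const parts = Object.keys(obj)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key] ?? null)}`);
  return `{${parts.join(",")}}`;
}

const a: Json = { b: 1, a: [2, { d: 3, c: 4 }] };
const b: Json = { a: [2, { c: 4, d: 3 }], b: 1 };
console.log(canonicalize(a) === canonicalize(b));
```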
Trace Not Appearing in Phoenix
Check:
- Phoenix is running on the configured endpoint
- PHOENIX_OTLP_ENDPOINT env var is correct
- Network connectivity between backend and Phoenix
Verify tracing init (server/lib/tracing.ts:145):
console.log("[Tracing] OpenTelemetry SDK initialized");
Cross-Tenant Access Denied
Expected error when:
- Query param tenant differs from session tenant
- SUPER_ADMIN_MODE not enabled
- User lacks SUPER_ADMIN role
Check (server/routes.ts:118-131):
if (!SUPER_ADMIN_MODE) {
return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN_MODE not enabled" } };
}
if (sessionRole !== 'SUPER_ADMIN') {
return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN role required" } };
}
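The two checks above can be read as one decision function. The sketch below is a self-contained reconstruction for reasoning about which 403 to expect; the function name, parameter list, and the same-tenant shortcut are assumptions, and only the two error messages are taken from the code above.

```typescript
interface TenantResolution {
  tenantId?: string;
  error?: { status: number; message: string };
}

// Hypothetical reconstruction of the cross-tenant decision.
function resolveTenant(
  sessionTenantId: string,
  queryTenantId: string | undefined,
  sessionRole: string,
  superAdminMode: boolean
): TenantResolution {
  // Assumption: no override, or an override equal to the session tenant, is always allowed.
  if (queryTenantId === undefined || queryTenantId === sessionTenantId) {
    return { tenantId: sessionTenantId };
  }
  if (!superAdminMode) {
    return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN_MODE not enabled" } };
  }
  if (sessionRole !== "SUPER_ADMIN") {
    return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN role required" } };
  }
  return { tenantId: queryTenantId };
}

console.log(resolveTenant("tenant-a", "tenant-b", "ADMIN", true));
```

Note that the mode check runs first, so a user without the SUPER_ADMIN role still sees the "SUPER_ADMIN_MODE not enabled" message when the mode is off.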
Testing Utilities
Force Failure for Testing
Development only (server/lib/assessmentPipeline.ts:19-22):
// Set to a requirement ID to force a timeout on that requirement
const DEV_FORCE_FAIL_REQUIREMENT_ID: string | null = null;
const DEV_FORCE_FAIL_TIMEOUT_MS = 1; // 1ms timeout for forced failure test
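One plausible way these constants interact is that the forced requirement gets the 1 ms timeout while every other requirement keeps the normal one. The sketch below illustrates that reading only; `timeoutForRequirement` is a hypothetical function, and the pipeline's actual selection logic is not shown on this page.

```typescript
const REQUIREMENT_TIMEOUT_MS = 90000; // normal per-requirement timeout (90 s)
const DEV_FORCE_FAIL_TIMEOUT_MS = 1;  // forced-failure timeout (1 ms)

// Assumed behavior: only the requirement named by forceFailId gets the tiny timeout.
function timeoutForRequirement(requirementId: string, forceFailId: string | null): number {
  if (forceFailId !== null && requirementId === forceFailId) {
    return DEV_FORCE_FAIL_TIMEOUT_MS;
  }
  return REQUIREMENT_TIMEOUT_MS;
}

console.log(timeoutForRequirement("req-1", "req-1"));
console.log(timeoutForRequirement("req-2", "req-1"));
console.log(timeoutForRequirement("req-1", null));
```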
SQL Debugging Queries
-- Recent runs with status
SELECT id, status, snapshot_status, started_at, completed_at
FROM runs
WHERE tenant_id = 'tenant-id'
ORDER BY created_at DESC
LIMIT 10;
-- Assessment breakdown for a run
SELECT status, COUNT(*) as count
FROM assessments
WHERE run_id = 'run-id'
GROUP BY status;
-- Check replay snapshot exists
SELECT id, snapshot_hash, captured_at
FROM run_replay_snapshots
WHERE run_id = 'run-id';
-- Recent audit events
SELECT action, object_type, object_id, created_at
FROM audit_logs
WHERE tenant_id = 'tenant-id'
ORDER BY created_at DESC
LIMIT 20;
Related Pages
- Architecture Overview - System design
- Data Model - Schema reference
- Determinism and Replay - Snapshot pinning
- Repository Map - Where to edit