Testing and Debugging

This document covers how to reproduce runs, locate logs and traces, and debug common issues.

Reproducing a Run

Using Run Replay

For runs with snapshotStatus = CAPTURED:

# API endpoint (verified from server/routes.ts)
GET /api/admin/runs/:runId/replay

Returns frozen snapshot data without re-executing.
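A minimal sketch of calling the replay endpoint from a script. Only the route comes from the source; the helper names, the `FetchLike` type, and the base URL are assumptions for illustration, and authentication is omitted because it is not described here.

```typescript
// Hypothetical helpers; only the route itself is taken from server/routes.ts.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Build the replay URL for a run, escaping the run ID for use as a path segment.
function buildReplayUrl(baseUrl: string, runId: string): string {
  const trimmed = baseUrl.replace(/\/+$/, "");
  return `${trimmed}/api/admin/runs/${encodeURIComponent(runId)}/replay`;
}

// Fetch the frozen snapshot. The response shape is not documented here, so it stays `unknown`.
async function fetchReplay(fetchFn: FetchLike, baseUrl: string, runId: string): Promise<unknown> {
  const res = await fetchFn(buildReplayUrl(baseUrl, runId));
  if (!res.ok) {
    throw new Error(`Replay request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In Node 18 or later, the global `fetch` can be passed as `fetchFn`.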

Manual Reproduction

1. Find the run's pinned versions in the runs table:
   - snapshotCriteriaVersionId
   - snapshotCorpusActivationId
   - snapshot*VersionId fields
2. Ensure bindings match (or use a test tenant with the same bindings).
3. Trigger a new run with the same evidence documents.
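To confirm that a reproduction used the same pinned versions as the original run, the two run records can be compared field by field. The sketch below assumes both records have already been loaded as plain objects; `diffPinnedVersions` and `isPinnedField` are hypothetical names, not functions from the codebase.

```typescript
type RunRecord = Record<string, unknown>;

// A field counts as "pinned" if it is snapshotCorpusActivationId or matches snapshot*VersionId.
function isPinnedField(key: string): boolean {
  return key === "snapshotCorpusActivationId" || /^snapshot.*VersionId$/.test(key);
}

// Return the pinned fields whose values differ between the original run and the reproduction.
function diffPinnedVersions(original: RunRecord, candidate: RunRecord): string[] {
  const keys = new Set(
    Object.keys(original).concat(Object.keys(candidate)).filter(isPinnedField),
  );
  return Array.from(keys)
    .filter((key) => original[key] !== candidate[key])
    .sort();
}
```

An empty result means every pinned field matches; any name in the result points at a binding that changed between the two runs.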

Where Logs Live

Application Logs

Console output: All services log to stdout/stderr.

Log patterns (verified from codebase):

// Assessment pipeline (server/lib/assessmentPipeline.ts:96)
console.log(`[Assessment] ${step}:`, JSON.stringify(details).slice(0, 200));

// Tracing (server/lib/tracing.ts:94,100,106,122)
console.warn(`[Tracing] Dropping non-allowlisted attribute: ${key}`);
console.error(`[Tracing] BLOCKED forbidden attribute pattern in key: ${key}`);
console.warn(`[Tracing] Scrubbed forbidden content in value for key: ${key}`);
console.log(`[Tracing] Initializing OTLP exporter to: ${otlpEndpoint}`);

// Tenant resolution (server/routes.ts:169)
console.log(`[TenantResolution] Override: session=${...}, resolved=${...}, role=${...}`);
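Because every pattern above starts with a bracketed tag, captured stdout/stderr can be grouped by subsystem before reading it. This is a sketch over an array of already-captured lines; `logTag` and `groupByTag` are hypothetical helpers.

```typescript
// Extract the leading [Tag] from a log line, or null when the line has no such prefix.
function logTag(line: string): string | null {
  const match = /^\[([A-Za-z]+)\]/.exec(line);
  return match?.[1] ?? null;
}

// Group lines by tag; lines without a tag are collected under "untagged".
function groupByTag(lines: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const line of lines) {
    const tag = logTag(line) ?? "untagged";
    const bucket = groups[tag] ?? [];
    bucket.push(line);
    groups[tag] = bucket;
  }
  return groups;
}
```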

Trace Storage

Phoenix (Arize): Self-hosted trace server.

Endpoint (server/lib/tracing.ts:120):

const otlpEndpoint = process.env.PHOENIX_OTLP_ENDPOINT || "http://localhost:6006/v1/traces";

Access traces:

1. Navigate to the Phoenix UI (typically http://localhost:6006).
2. Search by the trace_id from the run record.
3. View the span hierarchy and attributes.

Run Trace IDs

Every run stores trace identifiers:

Column	Purpose	Location
`traceId`	OpenTelemetry trace ID	`runs.trace_id`
`rootSpanId`	Root span for run	`runs.root_span_id`

Set at run creation (server/lib/assessmentPipeline.ts:148-162):

const runTrace = startRunSpan({ runId, tenantId, packId, triggerType: "manual" });

await db.update(runs)
  .set({ traceId: runTrace.traceId, rootSpanId: runTrace.spanId })
  .where(eq(runs.id, run.id));
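Before searching Phoenix, it can help to check that the stored identifiers are well formed. OpenTelemetry trace IDs are 16 bytes and span IDs are 8 bytes, and neither may be all zeros. The sketch assumes the columns hold the usual lowercase hex encoding, which the source does not state; the function names are hypothetical.

```typescript
// A trace ID is 32 lowercase hex characters and must not be all zeros.
function isValidTraceId(id: string): boolean {
  return /^[0-9a-f]{32}$/.test(id) && !/^0+$/.test(id);
}

// A span ID is 16 lowercase hex characters and must not be all zeros.
function isValidSpanId(id: string): boolean {
  return /^[0-9a-f]{16}$/.test(id) && !/^0+$/.test(id);
}
```

A run whose `traceId` fails this check was most likely created before tracing was initialized, or with a no-op tracer.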

Audit Logs

Immutable audit trail in database.

Table: audit_logs (defined in shared/schema.ts)

Query via API:

GET /api/admin/audit-logs?tenantId=<id>&runId=<id>

Service function (server/lib/adminService.ts:45-81):

export async function getAuditLogs(filters: AuditLogFilters = {}): Promise<{ logs: AuditLog[]; total: number }>
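A small sketch for building the audit-log request path from optional filters. Only `tenantId` and `runId` appear in the source; treating both as optional is an assumption based on the default `filters = {}` argument, and `buildAuditLogsPath` is a hypothetical helper.

```typescript
interface AuditLogQuery {
  tenantId?: string;
  runId?: string;
}

// Build the request path, including only the filters that were provided.
function buildAuditLogsPath(query: AuditLogQuery): string {
  const pairs: string[] = [];
  if (query.tenantId) pairs.push(`tenantId=${encodeURIComponent(query.tenantId)}`);
  if (query.runId) pairs.push(`runId=${encodeURIComponent(query.runId)}`);
  const base = "/api/admin/audit-logs";
  return pairs.length > 0 ? `${base}?${pairs.join("&")}` : base;
}
```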

Trace Attributes

Allowed Attributes

Only allowlisted attributes are exported to traces.

Allowlist (server/lib/tracing.ts:7-47):

const ALLOWED_ATTRIBUTES = new Set([
  "tenant.id",
  "pack.id",
  "run.id",
  "run.trigger_type",
  "step.name",
  "requirement.codes",
  "evidence.document_ids",
  "chunk.ids",
  "chunk.count",
  "assessment.ids",
  "task.ids",
  "export.ids",
  "latency.ms",
  "llm.tokens.in",
  "llm.tokens.out",
  "llm.cost.usd",
  "llm.model",
  "error.type",
  "error.message",
  // ... more
]);

Forbidden Patterns

Content matching these patterns is scrubbed:

const FORBIDDEN_PATTERNS = [
  /prompt/i,
  /evidence.*text/i,
  /regulatory.*text/i,
  /content/i,
  /body/i,
  /raw/i,
  /message/i,
];
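The sketch below shows one way the allowlist and the forbidden patterns could combine. It is not the code in server/lib/tracing.ts: the order of checks is inferred from the three log messages, the `[SCRUBBED]` placeholder is an assumption, and the allowlist is a small subset. Note that under these assumptions `error.message` is dropped even though it is allowlisted, because its key matches `/message/i`; verify the real behavior in the source before relying on that attribute.

```typescript
// Subset of the allowlist, for illustration only.
const ALLOWED = new Set(["run.id", "step.name", "llm.model", "error.message"]);

const FORBIDDEN = [/prompt/i, /evidence.*text/i, /regulatory.*text/i, /content/i, /body/i, /raw/i, /message/i];

type AttributeValue = string | number | boolean;

function matchesForbidden(text: string): boolean {
  return FORBIDDEN.some((pattern) => pattern.test(text));
}

// Assumed order: drop non-allowlisted keys, block forbidden keys, then scrub forbidden string values.
function sanitizeAttributes(attrs: Record<string, AttributeValue>): Record<string, AttributeValue> {
  const out: Record<string, AttributeValue> = {};
  for (const [key, value] of Object.entries(attrs)) {
    if (!ALLOWED.has(key)) continue; // not allowlisted: dropped
    if (matchesForbidden(key)) continue; // forbidden key: blocked
    out[key] = typeof value === "string" && matchesForbidden(value) ? "[SCRUBBED]" : value;
  }
  return out;
}
```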

Run Steps

Assessment runs track progress through named steps.

Step Names

Definition (shared/schema.ts):

export const RunStepName = {
  INGESTION: "INGESTION",
  RETRIEVAL: "RETRIEVAL",
  VERIFICATION: "VERIFICATION",
  ASSESSMENT: "ASSESSMENT",
  TASK_GENERATION: "TASK_GENERATION",
  EXPORT: "EXPORT"
} as const;
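Because the object is declared `as const`, a union type and an ordered list can be derived from it. The sketch assumes that `stepIndex` follows declaration order, which is consistent with the 0-5 range in the step table but not confirmed by the source; `STEP_ORDER`, `stepIndexOf`, and `isRunStepName` are hypothetical names.

```typescript
const RunStepName = {
  INGESTION: "INGESTION",
  RETRIEVAL: "RETRIEVAL",
  VERIFICATION: "VERIFICATION",
  ASSESSMENT: "ASSESSMENT",
  TASK_GENERATION: "TASK_GENERATION",
  EXPORT: "EXPORT",
} as const;

// Union of the six step names.
type RunStepName = (typeof RunStepName)[keyof typeof RunStepName];

// Object.values preserves declaration order for string keys.
const STEP_ORDER: RunStepName[] = Object.values(RunStepName);

// Position of a step in the assumed pipeline order (0 for INGESTION, 5 for EXPORT).
function stepIndexOf(step: RunStepName): number {
  return STEP_ORDER.indexOf(step);
}

// Type guard for step names read from the database or a log line.
function isRunStepName(value: string): value is RunStepName {
  return (STEP_ORDER as string[]).indexOf(value) !== -1;
}
```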

Step Table

Table: run_steps

Column	Purpose
`runId`	Parent run
`stepName`	Step identifier
`stepIndex`	Order (0-5)
`status`	NOT_EXECUTED / IN_PROGRESS / COMPLETED / FAILED
`startedAt`	When started
`completedAt`	When finished
`spanId`	Trace span ID
`inputRefs`	Input references (jsonb)
`outputRefs`	Output references (jsonb)
`errorDetails`	Failure reason

Querying Steps

// server/lib/adminService.ts:152-156
export async function getRunSteps(runId: string): Promise<RunStep[]> {
  return db.select()
    .from(runSteps)
    .where(eq(runSteps.runId, runId))
    .orderBy(runSteps.stepIndex);
}
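Once the steps are loaded, two quick questions usually matter: where did the run stop, and how long did each step take. The sketch below works on a simplified row type (`StepRow` is an assumption; the real `RunStep` type lives in shared/schema.ts and may differ, for example in how `errorDetails` is typed).

```typescript
type StepStatus = "NOT_EXECUTED" | "IN_PROGRESS" | "COMPLETED" | "FAILED";

// Simplified stand-in for the run_steps row.
interface StepRow {
  stepName: string;
  stepIndex: number;
  status: StepStatus;
  startedAt: Date | null;
  completedAt: Date | null;
}

// First step, in pipeline order, that has not completed; null if all completed.
function firstUnfinishedStep(steps: StepRow[]): StepRow | null {
  const ordered = steps.slice().sort((a, b) => a.stepIndex - b.stepIndex);
  return ordered.find((step) => step.status !== "COMPLETED") ?? null;
}

// Wall-clock duration of a finished step, or null if it has not both started and finished.
function stepDurationMs(step: StepRow): number | null {
  if (!step.startedAt || !step.completedAt) return null;
  return step.completedAt.getTime() - step.startedAt.getTime();
}
```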

Debugging Common Issues

Run Stuck IN_PROGRESS

Check for:

- Timeout: a requirement took longer than 90s (REQUIREMENT_TIMEOUT_MS in assessmentPipeline.ts:17)
- LLM API failure
- Vector search returning no results

Debug:

-- Check run status
SELECT id, status, started_at, failed_at, failure_reason 
FROM runs WHERE id = 'run-id';

-- Check step statuses
SELECT step_name, status, error_details 
FROM run_steps WHERE run_id = 'run-id' ORDER BY step_index;

Assessment Returns MISSING Unexpectedly

Check for:

- Evidence documents not in READY status
- Low similarity scores (check SIMILARITY_THRESHOLD = 0.20 at line 14)
- Embedding mismatch

Debug:

-- Check document statuses
SELECT id, original_filename, status FROM documents WHERE tenant_id = 'tenant-id';

-- Check chunk count
SELECT COUNT(*) FROM chunks WHERE tenant_id = 'tenant-id';

-- Check embedding count
SELECT COUNT(*) FROM chunk_embeddings WHERE tenant_id = 'tenant-id';
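To see whether low similarity explains a MISSING result, retrieved scores can be compared against the threshold directly. This sketch assumes that scores at or above the threshold pass (the source does not say whether the comparison is inclusive) and that scores are available as plain numbers; `ScoredChunk` and `explainMissing` are hypothetical.

```typescript
const SIMILARITY_THRESHOLD = 0.2;

interface ScoredChunk {
  chunkId: string;
  similarity: number;
}

// Return a human-readable reason why retrieval would yield nothing, or null if some chunk passes.
function explainMissing(chunks: ScoredChunk[], threshold: number = SIMILARITY_THRESHOLD): string | null {
  if (chunks.length === 0) return "no chunks retrieved";
  const best = Math.max(...chunks.map((chunk) => chunk.similarity));
  if (best >= threshold) return null;
  return `${chunks.length} chunk(s) retrieved, best similarity ${best.toFixed(2)} is below ${threshold.toFixed(2)}`;
}
```

A "no chunks retrieved" result points at ingestion or embeddings; a "below threshold" result points at the evidence content or at an embedding model mismatch.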

Replay Returns Different Results

For LEGACY runs:

- Expected: these fall back to live queries
- Bindings may have changed since the original run

For CAPTURED runs:

- Should not happen: investigate snapshot integrity
- Check that snapshotHash matches the expected value
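An integrity check generally means recomputing a hash over a canonical serialization of the snapshot and comparing it with the stored value. The source does not say how snapshotHash is produced, so the sketch below injects the hash function and uses sorted-key JSON as an assumed canonical form; both choices must be matched to the real implementation before the comparison is meaningful.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted so that key order does not affect the output.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  const entries = Object.entries(value).sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  return `{${entries.map(([key, item]) => `${JSON.stringify(key)}:${canonicalize(item)}`).join(",")}}`;
}

// Compare the recomputed hash with the stored one. hashFn is supplied by the caller.
function snapshotMatches(snapshot: Json, expectedHash: string, hashFn: (input: string) => string): boolean {
  return hashFn(canonicalize(snapshot)) === expectedHash;
}
```

If the real hash turns out to be SHA-256, `hashFn` could be `(s) => createHash("sha256").update(s).digest("hex")` using `createHash` from `node:crypto`.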

Trace Not Appearing in Phoenix

Check:

- Phoenix is running on the configured endpoint
- The PHOENIX_OTLP_ENDPOINT env var is correct
- There is network connectivity between the backend and Phoenix

Verify tracing init (server/lib/tracing.ts:145):

console.log("[Tracing] OpenTelemetry SDK initialized");
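A quick way to rule out a misconfigured endpoint is to resolve it the same way the tracer does and check its shape. The resolution rule mirrors the `otlpEndpoint` line shown under Trace Storage; `endpointProblems` and its checks are hypothetical, and the path check is only a hint, since a proxy may legitimately expose a different path.

```typescript
const DEFAULT_OTLP_ENDPOINT = "http://localhost:6006/v1/traces";

// Same fallback rule as the tracer: use the env var if set and non-empty, else the default.
function resolveOtlpEndpoint(env: Record<string, string | undefined>): string {
  return env["PHOENIX_OTLP_ENDPOINT"] || DEFAULT_OTLP_ENDPOINT;
}

// Return a list of likely configuration problems; an empty list means nothing obvious is wrong.
function endpointProblems(endpoint: string): string[] {
  const problems: string[] = [];
  if (!/^https?:\/\//i.test(endpoint)) {
    problems.push("endpoint should start with http:// or https://");
  }
  if (!endpoint.replace(/\/+$/, "").endsWith("/v1/traces")) {
    problems.push("endpoint path does not end in /v1/traces");
  }
  return problems;
}
```

In the backend process this would be called as `endpointProblems(resolveOtlpEndpoint(process.env))`.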

Cross-Tenant Access Denied

Expected error when the tenant in the query parameter differs from the session tenant and either of the following holds:

- SUPER_ADMIN_MODE is not enabled
- The user lacks the SUPER_ADMIN role

Check (server/routes.ts:118-131):

if (!SUPER_ADMIN_MODE) {
  return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN_MODE not enabled" } };
}
if (sessionRole !== 'SUPER_ADMIN') {
  return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN role required" } };
}
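The two checks can be read as one decision function. The sketch below wraps them in a self-contained form for experimentation; the error messages are copied from the excerpt, while `TenantRequest`, `resolveTenant`, and the same-tenant short circuit are assumptions about the surrounding code.

```typescript
type TenantResolution =
  | { tenantId: string }
  | { error: { status: number; message: string } };

interface TenantRequest {
  sessionTenantId: string;
  requestedTenantId?: string;
  sessionRole: string;
  superAdminMode: boolean;
}

function resolveTenant(req: TenantRequest): TenantResolution {
  // No override, or an override naming the session tenant: nothing cross-tenant to authorize.
  if (!req.requestedTenantId || req.requestedTenantId === req.sessionTenantId) {
    return { tenantId: req.sessionTenantId };
  }
  if (!req.superAdminMode) {
    return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN_MODE not enabled" } };
  }
  if (req.sessionRole !== "SUPER_ADMIN") {
    return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN role required" } };
  }
  return { tenantId: req.requestedTenantId };
}
```

Reading the error message tells you which of the two conditions failed, so the fix is either enabling the mode or granting the role, not both.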

Testing Utilities

Force Failure for Testing

Development only (server/lib/assessmentPipeline.ts:19-22):

// Set to a requirement ID to force a timeout on that requirement
const DEV_FORCE_FAIL_REQUIREMENT_ID: string | null = null;
const DEV_FORCE_FAIL_TIMEOUT_MS = 1; // 1ms timeout for forced failure test
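The following sketch illustrates how a forced-failure ID might shorten the timeout for a single requirement. It is an illustration of the idea, not the pipeline's code: `effectiveTimeoutMs` and `withTimeout` are hypothetical helpers, and the 90000 value is taken from the 90s figure quoted for REQUIREMENT_TIMEOUT_MS.

```typescript
const REQUIREMENT_TIMEOUT_MS = 90000;
const DEV_FORCE_FAIL_TIMEOUT_MS = 1;

// Use the 1ms timeout only for the requirement selected for forced failure.
function effectiveTimeoutMs(requirementId: string, forcedId: string | null): number {
  return forcedId !== null && requirementId === forcedId
    ? DEV_FORCE_FAIL_TIMEOUT_MS
    : REQUIREMENT_TIMEOUT_MS;
}

// Reject if the work does not settle within timeoutMs; otherwise pass its result through.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${timeoutMs}ms`)), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

With a 1ms budget, any requirement that performs real I/O will almost certainly time out, which exercises the failure path without breaking an external dependency.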

SQL Debugging Queries

-- Recent runs with status
SELECT id, status, snapshot_status, started_at, completed_at 
FROM runs 
WHERE tenant_id = 'tenant-id' 
ORDER BY created_at DESC 
LIMIT 10;

-- Assessment breakdown for a run
SELECT status, COUNT(*) as count 
FROM assessments 
WHERE run_id = 'run-id' 
GROUP BY status;

-- Check replay snapshot exists
SELECT id, snapshot_hash, captured_at 
FROM run_replay_snapshots 
WHERE run_id = 'run-id';

-- Recent audit events
SELECT action, object_type, object_id, created_at 
FROM audit_logs 
WHERE tenant_id = 'tenant-id' 
ORDER BY created_at DESC 
LIMIT 20;

Related Pages

- Architecture Overview: System design
- Data Model: Schema reference
- Determinism and Replay: Snapshot pinning
- Repository Map: Where to edit
