Testing and Debugging

This document covers how to reproduce runs, locate logs and traces, and debug common issues.

Reproducing a Run

Using Run Replay

For runs with snapshotStatus = CAPTURED:

# API endpoint (verified from server/routes.ts)
GET /api/admin/runs/:runId/replay

Returns the frozen snapshot data without re-executing the run.
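A minimal sketch of calling the replay endpoint from a script. Only the route path comes from this document; the base URL, the cookie-based authentication, and the helper names are assumptions, and the response is left untyped because its shape is not documented here.

```typescript
// Hypothetical helper: only the route path is taken from this document.
function replayUrl(baseUrl: string, runId: string): string {
  // Strip trailing slashes and encode the ID so it cannot alter the path.
  const root = baseUrl.replace(/\/+$/, "");
  return `${root}/api/admin/runs/${encodeURIComponent(runId)}/replay`;
}

// Assumes session-cookie auth; adjust to whatever the deployment uses.
async function fetchReplay(baseUrl: string, runId: string, cookie: string): Promise<unknown> {
  const res = await fetch(replayUrl(baseUrl, runId), { headers: { cookie } });
  if (!res.ok) {
    throw new Error(`Replay request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

Failing on any non-2xx status keeps an error response from being mistaken for snapshot data.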

Manual Reproduction

  1. Find the run's pinned versions in the runs table:

    • snapshotCriteriaVersionId
    • snapshotCorpusActivationId
    • the other snapshot*VersionId fields
  2. Ensure the current bindings match those pinned versions (or use a test tenant with the same bindings)

  3. Trigger a new run with the same evidence documents
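After the new run finishes, it helps to confirm that it pinned the same versions as the original. The sketch below compares the snapshot fields of two run rows. The RunRow shape and the function name are hypothetical; only the field names come from the list above.

```typescript
// Hypothetical shape: a run row reduced to string-valued columns.
type RunRow = Record<string, string | null | undefined>;

// Returns the names of pinned fields whose values differ between two runs.
function diffPinnedVersions(original: RunRow, candidate: RunRow): string[] {
  const isPinned = (key: string): boolean =>
    key === "snapshotCorpusActivationId" || /^snapshot.*VersionId$/.test(key);
  const keys = new Set(
    Object.keys(original).concat(Object.keys(candidate)).filter(isPinned),
  );
  // Treat a missing field and an explicit null as the same value.
  return Array.from(keys)
    .filter((k) => (original[k] ?? null) !== (candidate[k] ?? null))
    .sort();
}
```

An empty result means both runs pinned identical versions, so any remaining difference in output must come from elsewhere (evidence, model behaviour, and so on).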

Where Logs Live

Application Logs

Console output: All services log to stdout/stderr.

Log patterns (verified from codebase):

// Assessment pipeline (server/lib/assessmentPipeline.ts:96)
console.log(`[Assessment] ${step}:`, JSON.stringify(details).slice(0, 200));

// Tracing (server/lib/tracing.ts:94,100,106,122)
console.warn(`[Tracing] Dropping non-allowlisted attribute: ${key}`);
console.error(`[Tracing] BLOCKED forbidden attribute pattern in key: ${key}`);
console.warn(`[Tracing] Scrubbed forbidden content in value for key: ${key}`);
console.log(`[Tracing] Initializing OTLP exporter to: ${otlpEndpoint}`);

// Tenant resolution (server/routes.ts:169)
console.log(`[TenantResolution] Override: session=${...}, resolved=${...}, role=${...}`);

Trace Storage

Phoenix (Arize): Self-hosted trace server.

Endpoint (server/lib/tracing.ts:120):

const otlpEndpoint = process.env.PHOENIX_OTLP_ENDPOINT || "http://localhost:6006/v1/traces";

Access traces:

  1. Open the Phoenix UI (typically http://localhost:6006)
  2. Search by the trace_id stored on the run record
  3. Inspect the span hierarchy and attributes

Run Trace IDs

Every run stores trace identifiers:

| Column | Purpose | Location |
|--------|---------|----------|
| traceId | OpenTelemetry trace ID | runs.trace_id |
| rootSpanId | Root span for run | runs.root_span_id |

Set at run creation (server/lib/assessmentPipeline.ts:148-162):

const runTrace = startRunSpan({ runId, tenantId, packId, triggerType: "manual" });

await db.update(runs)
  .set({ traceId: runTrace.traceId, rootSpanId: runTrace.spanId })
  .where(eq(runs.id, run.id));

Audit Logs

Immutable audit trail in database.

Table: audit_logs (defined in shared/schema.ts)

Query via API:

GET /api/admin/audit-logs?tenantId=<id>&runId=<id>

Service function (server/lib/adminService.ts:45-81):

export async function getAuditLogs(filters: AuditLogFilters = {}): Promise<{ logs: AuditLog[]; total: number }>
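A small sketch of building the audit-log query URL. It treats both filters as optional, which matches the default empty filter object in the service signature, though whether the HTTP route accepts an unfiltered request is an assumption. AuditLogQuery and auditLogsUrl are hypothetical names.

```typescript
// Hypothetical filter shape; only tenantId and runId appear in this document.
interface AuditLogQuery {
  tenantId?: string;
  runId?: string;
}

function auditLogsUrl(baseUrl: string, query: AuditLogQuery): string {
  // URLSearchParams handles percent-encoding of the values.
  const params = new URLSearchParams();
  if (query.tenantId) params.set("tenantId", query.tenantId);
  if (query.runId) params.set("runId", query.runId);
  const qs = params.toString();
  return `${baseUrl}/api/admin/audit-logs${qs ? `?${qs}` : ""}`;
}
```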

Trace Attributes

Allowed Attributes

Only allowlisted attributes are exported to traces.

Allowlist (server/lib/tracing.ts:7-47):

const ALLOWED_ATTRIBUTES = new Set([
  "tenant.id",
  "pack.id",
  "run.id",
  "run.trigger_type",
  "step.name",
  "requirement.codes",
  "evidence.document_ids",
  "chunk.ids",
  "chunk.count",
  "assessment.ids",
  "task.ids",
  "export.ids",
  "latency.ms",
  "llm.tokens.in",
  "llm.tokens.out",
  "llm.cost.usd",
  "llm.model",
  "error.type",
  "error.message",
  // ... more
]);

Forbidden Patterns

Attribute keys matching these patterns are blocked, and matching content in attribute values is scrubbed:

const FORBIDDEN_PATTERNS = [
  /prompt/i,
  /evidence.*text/i,
  /regulatory.*text/i,
  /content/i,
  /body/i,
  /raw/i,
  /message/i,
];
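To make the interaction between the allowlist and the forbidden patterns concrete, here is a simplified sketch, not the code in server/lib/tracing.ts. It uses shortened copies of both lists, and the check order (allowlist, then key patterns, then value scrubbing) is an assumption inferred from the order of the log messages. The "[SCRUBBED]" placeholder is hypothetical.

```typescript
// Shortened copies of the lists above; the real lists are longer.
const ALLOWED = new Set(["tenant.id", "run.id", "step.name", "latency.ms"]);
const FORBIDDEN = [/prompt/i, /evidence.*text/i, /content/i, /body/i, /raw/i];

type Attrs = Record<string, string | number>;

// ASSUMED order: allowlist check, then key patterns, then value scrubbing.
function sanitizeAttributes(input: Attrs): Attrs {
  const out: Attrs = {};
  for (const [key, value] of Object.entries(input)) {
    if (!ALLOWED.has(key)) continue; // dropped: not allowlisted
    if (FORBIDDEN.some((p) => p.test(key))) continue; // blocked: forbidden key
    if (typeof value === "string" && FORBIDDEN.some((p) => p.test(value))) {
      out[key] = "[SCRUBBED]"; // hypothetical placeholder value
      continue;
    }
    out[key] = value;
  }
  return out;
}
```

When an expected attribute is missing from a span in Phoenix, the three [Tracing] warnings in the application logs indicate which of these checks removed it.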

Run Steps

Assessment runs track progress through named steps.

Step Names

Definition (shared/schema.ts):

export const RunStepName = {
  INGESTION: "INGESTION",
  RETRIEVAL: "RETRIEVAL",
  VERIFICATION: "VERIFICATION",
  ASSESSMENT: "ASSESSMENT",
  TASK_GENERATION: "TASK_GENERATION",
  EXPORT: "EXPORT"
} as const;
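Because the object is declared with as const, a union type of the step names can be derived from it. The sketch below also maps each name to an index, assuming that stepIndex follows declaration order; STEP_ORDER and stepIndexOf are hypothetical helpers.

```typescript
const RunStepName = {
  INGESTION: "INGESTION",
  RETRIEVAL: "RETRIEVAL",
  VERIFICATION: "VERIFICATION",
  ASSESSMENT: "ASSESSMENT",
  TASK_GENERATION: "TASK_GENERATION",
  EXPORT: "EXPORT",
} as const;

// Union of the literal values: "INGESTION" | "RETRIEVAL" | ...
type RunStepName = (typeof RunStepName)[keyof typeof RunStepName];

// ASSUMPTION: stepIndex follows the declaration order of the object.
const STEP_ORDER: readonly RunStepName[] = Object.values(RunStepName);

function stepIndexOf(step: RunStepName): number {
  return STEP_ORDER.indexOf(step);
}
```

With the derived type, a misspelled step name such as stepIndexOf("EXPORTS") fails at compile time rather than returning -1 at runtime.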

Step Table

Table: run_steps

| Column | Purpose |
|--------|---------|
| runId | Parent run |
| stepName | Step identifier |
| stepIndex | Order (0-5) |
| status | NOT_EXECUTED / IN_PROGRESS / COMPLETED / FAILED |
| startedAt | When started |
| completedAt | When finished |
| spanId | Trace span ID |
| inputRefs | Input references (jsonb) |
| outputRefs | Output references (jsonb) |
| errorDetails | Failure reason |

Querying Steps

// server/lib/adminService.ts:152-156
export async function getRunSteps(runId: string): Promise<RunStep[]> {
  return db.select()
    .from(runSteps)
    .where(eq(runSteps.runId, runId))
    .orderBy(runSteps.stepIndex);
}
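The result of getRunSteps can be reduced to the step worth looking at first. The sketch below uses a hypothetical StepRow shape holding only the columns it needs; the real RunStep type is defined in shared/schema.ts.

```typescript
// Hypothetical reduced row shape, based on the run_steps columns above.
interface StepRow {
  stepName: string;
  stepIndex: number;
  status: "NOT_EXECUTED" | "IN_PROGRESS" | "COMPLETED" | "FAILED";
  errorDetails?: string | null;
}

// Returns the earliest step that failed or is still running, or null.
function findBlockingStep(steps: StepRow[]): StepRow | null {
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...steps].sort((a, b) => a.stepIndex - b.stepIndex);
  return ordered.find((s) => s.status === "FAILED" || s.status === "IN_PROGRESS") ?? null;
}
```

The spanId of the returned step can then be looked up in Phoenix to see the corresponding span.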

Debugging Common Issues

Run Stuck IN_PROGRESS

Check for:

  1. Timeout: a requirement took longer than 90s (REQUIREMENT_TIMEOUT_MS in assessmentPipeline.ts:17)
  2. LLM API failure
  3. Vector search returning no results

Inspect the run and its steps:

-- Check run status
SELECT id, status, started_at, failed_at, failure_reason
FROM runs WHERE id = 'run-id';

-- Check step statuses
SELECT step_name, status, error_details
FROM run_steps WHERE run_id = 'run-id' ORDER BY step_index;
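For context on the timeout cause, a per-requirement timeout is commonly enforced by racing the work against a timer. The sketch below shows that general pattern and is not the code in assessmentPipeline.ts; withTimeout is a hypothetical helper, and only the 90s value comes from this document.

```typescript
const REQUIREMENT_TIMEOUT_MS = 90_000; // value stated in this document

// Rejects if `work` has not settled within `ms`; always clears the timer.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

Promise.race does not cancel the losing promise, so with this pattern an LLM call can keep running in the background after the timeout fires. If the rejection is never translated into a FAILED status, the run stays IN_PROGRESS, which is why the step statuses and error_details above are worth checking.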

Assessment Returns MISSING Unexpectedly

Check for:

  1. Evidence documents not in READY status
  2. Low similarity scores (check SIMILARITY_THRESHOLD = 0.20 at line 14)
  3. Embedding mismatch

Debug:

-- Check document statuses
SELECT id, original_filename, status FROM documents WHERE tenant_id = 'tenant-id';

-- Check chunk count
SELECT COUNT(*) FROM chunks WHERE tenant_id = 'tenant-id';

-- Check embedding count
SELECT COUNT(*) FROM chunk_embeddings WHERE tenant_id = 'tenant-id';
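To reason about the similarity threshold, the sketch below scores two embeddings and applies the 0.20 cut-off. Cosine similarity as the metric and the inclusive (>=) comparison are both assumptions; only the threshold value comes from this document.

```typescript
const SIMILARITY_THRESHOLD = 0.2; // value stated in this document

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// ASSUMPTION: a chunk counts as a match when its score is >= the threshold.
function passesThreshold(score: number): boolean {
  return score >= SIMILARITY_THRESHOLD;
}
```

If every chunk for a requirement scores below the threshold, nothing is retrieved, which would be consistent with a MISSING result even though the documents are READY.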

Replay Returns Different Results

For LEGACY runs:

  • Expected - these fall back to live queries
  • Bindings may have changed since original run

For CAPTURED runs:

  • Should not happen - investigate snapshot integrity
  • Check that snapshotHash matches the expected value
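One way to check snapshot integrity is to recompute a digest over the stored snapshot and compare it with snapshotHash. This document does not say how snapshotHash is computed, so the sketch below is purely illustrative: SHA-256 over key-sorted JSON is an assumption, and the comparison is only meaningful if it is adapted to the scheme the codebase actually uses.

```typescript
import { createHash } from "node:crypto";

// Serializes JSON-compatible data with object keys sorted, so that
// equal data always yields identical text.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value);
}

// ASSUMPTION: SHA-256 over canonical JSON; the real scheme may differ.
function snapshotDigest(snapshot: unknown): string {
  return createHash("sha256").update(canonicalJson(snapshot)).digest("hex");
}
```

Sorting the keys matters because jsonb columns do not preserve key order, so hashing the raw JSON.stringify output could report a mismatch for unchanged data.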

Trace Not Appearing in Phoenix

Check:

  1. Phoenix is running at the configured endpoint
  2. PHOENIX_OTLP_ENDPOINT env var is correct
  3. Network connectivity between backend and Phoenix

Verify that tracing initialized by looking for this line in the startup logs (server/lib/tracing.ts:145):

console.log("[Tracing] OpenTelemetry SDK initialized");
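A misconfigured endpoint can be caught before the server starts. The sketch below mirrors the fallback shown in the Trace Storage section and adds two sanity checks; resolveOtlpEndpoint and endpointWarnings are hypothetical helpers, and the path check assumes traces are posted to /v1/traces as in the default value.

```typescript
const DEFAULT_OTLP_ENDPOINT = "http://localhost:6006/v1/traces";

// Mirrors the documented fallback; `new URL` throws on a malformed value.
function resolveOtlpEndpoint(env: Record<string, string | undefined>): URL {
  return new URL(env.PHOENIX_OTLP_ENDPOINT || DEFAULT_OTLP_ENDPOINT);
}

// Returns human-readable warnings for likely misconfigurations.
function endpointWarnings(endpoint: URL): string[] {
  const warnings: string[] = [];
  if (endpoint.protocol !== "http:" && endpoint.protocol !== "https:") {
    warnings.push(`unexpected protocol ${endpoint.protocol}`);
  }
  if (!endpoint.pathname.endsWith("/v1/traces")) {
    warnings.push("path does not end with /v1/traces");
  }
  return warnings;
}
```

Calling endpointWarnings(resolveOtlpEndpoint(process.env)) in a startup script would surface the common mistake of setting the variable to the Phoenix UI address without the trace path.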

Cross-Tenant Access Denied

Expected when the tenant in the query parameter differs from the session tenant and either of the following holds:

  • SUPER_ADMIN_MODE is not enabled
  • The user lacks the SUPER_ADMIN role

Check (server/routes.ts:118-131):

if (!SUPER_ADMIN_MODE) {
  return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN_MODE not enabled" } };
}
if (sessionRole !== 'SUPER_ADMIN') {
  return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN role required" } };
}
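The full decision can be written as a pure function, which makes each outcome easy to test in isolation. This is a sketch of the logic described above, not the code in server/routes.ts: the function name, parameter names, and result type are hypothetical, while the two error messages are copied from the excerpt.

```typescript
type TenantResult =
  | { tenantId: string }
  | { error: { status: number; message: string } };

// Hypothetical pure version of the cross-tenant check.
function resolveTenant(
  sessionTenantId: string,
  queryTenantId: string | undefined,
  sessionRole: string,
  superAdminMode: boolean,
): TenantResult {
  // No override requested, or it matches the session: nothing to check.
  if (!queryTenantId || queryTenantId === sessionTenantId) {
    return { tenantId: sessionTenantId };
  }
  if (!superAdminMode) {
    return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN_MODE not enabled" } };
  }
  if (sessionRole !== "SUPER_ADMIN") {
    return { error: { status: 403, message: "Cross-tenant access denied: SUPER_ADMIN role required" } };
  }
  return { tenantId: queryTenantId };
}
```

The message in a 403 response therefore identifies which condition failed: the mode flag is checked first, so a user without the role sees the role message only once SUPER_ADMIN_MODE is enabled.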

Testing Utilities

Force Failure for Testing

Development only (server/lib/assessmentPipeline.ts:19-22):

// Set to a requirement ID to force a timeout on that requirement
const DEV_FORCE_FAIL_REQUIREMENT_ID: string | null = null;
const DEV_FORCE_FAIL_TIMEOUT_MS = 1; // 1ms timeout for forced failure test
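A sketch of how these constants could select the timeout for each requirement. The selection logic and the timeoutFor helper are assumptions; only the constant names and values come from this document.

```typescript
const REQUIREMENT_TIMEOUT_MS = 90_000;
const DEV_FORCE_FAIL_REQUIREMENT_ID: string | null = null;
const DEV_FORCE_FAIL_TIMEOUT_MS = 1;

// ASSUMED selection logic: the forced requirement gets the 1 ms timeout,
// every other requirement keeps the normal 90 s limit.
function timeoutFor(
  requirementId: string,
  forceFailId: string | null = DEV_FORCE_FAIL_REQUIREMENT_ID,
): number {
  return forceFailId !== null && requirementId === forceFailId
    ? DEV_FORCE_FAIL_TIMEOUT_MS
    : REQUIREMENT_TIMEOUT_MS;
}
```

With the constant left at null, no requirement is forced to fail, so the override is inert unless it is edited for a test.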

SQL Debugging Queries

-- Recent runs with status
SELECT id, status, snapshot_status, started_at, completed_at
FROM runs
WHERE tenant_id = 'tenant-id'
ORDER BY created_at DESC
LIMIT 10;

-- Assessment breakdown for a run
SELECT status, COUNT(*) as count
FROM assessments
WHERE run_id = 'run-id'
GROUP BY status;

-- Check replay snapshot exists
SELECT id, snapshot_hash, captured_at
FROM run_replay_snapshots
WHERE run_id = 'run-id';

-- Recent audit events
SELECT action, object_type, object_id, created_at
FROM audit_logs
WHERE tenant_id = 'tenant-id'
ORDER BY created_at DESC
LIMIT 20;