Evidence, Requirements, and Assessments

Definitions

Evidence (Document)

A file uploaded to demonstrate compliance. Evidence is processed through a pipeline that extracts text, chunks it for indexing, and creates vector embeddings for semantic search.

Attributes:

  • originalFilename: Name of the uploaded file
  • fileType: Format (pdf, docx, xlsx, etc.)
  • status: Processing stage (UPLOADED → EXTRACTED → PARSED → INDEXED → READY, or FAILED)
  • sha256Hash: Content fingerprint for deduplication
  • parentDocumentId: Reference to parent if extracted from ZIP
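The sha256Hash fingerprint can be illustrated with Node's crypto module. A minimal sketch, assuming the pipeline hashes the raw file bytes:

```typescript
// Minimal sketch of computing a sha256Hash fingerprint for deduplication.
// Assumption: the real pipeline hashes the raw uploaded file bytes.
import { createHash } from "node:crypto";

function sha256Hex(data: string | Buffer): string {
  return createHash("sha256").update(data).digest("hex");
}

// Identical content produces identical hashes, so a re-upload of the
// same file within a tenant can be detected before reprocessing.
```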

Evidence status lifecycle

Status      Description
UPLOADED    File received and stored
EXTRACTED   Text content extracted from file format
PARSED      Content split into chunks
INDEXED     Vector embeddings created
READY       Available for assessment
FAILED      Processing failed at some stage

(Verified: shared/schema.ts:160-167 — EvidenceStatus enum)
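For orientation, the status values above can be modeled as a TypeScript literal union. This is a sketch, not the literal declaration in shared/schema.ts, which may use a different style (e.g. a pgEnum):

```typescript
// Sketch of the EvidenceStatus values listed above; declaration style
// is an assumption, not copied from shared/schema.ts.
const EvidenceStatus = {
  UPLOADED: "UPLOADED",
  EXTRACTED: "EXTRACTED",
  PARSED: "PARSED",
  INDEXED: "INDEXED",
  READY: "READY",
  FAILED: "FAILED",
} as const;

type EvidenceStatus = (typeof EvidenceStatus)[keyof typeof EvidenceStatus];

// Per the invariants later in this page, only READY documents
// participate in assessment runs.
function isAssessable(status: EvidenceStatus): boolean {
  return status === "READY";
}
```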

Two status enums

Portal evidence uploads use EvidenceStatus (above). Admin corpus ingestion uses a separate CorpusIngestionStatus enum (shared/schema.ts:1092-1098) with values: PENDING → FETCHING → PARSING → INDEXING → COMPLETED (or FAILED). See Roles and Tenancy for the distinction between Portal and Admin workflows.

Chunk

A segment of extracted text from a document. Chunks are the unit of indexing and retrieval.

Attributes:

  • content: The text content
  • chunkIndex: Position within the document
  • pageNumber: Source page (for PDFs)
  • tokenCount: Approximate token count
  • startOffset / endOffset: Character positions in source
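As a sketch, these attributes map to a record shape like the following; field names come from the list above, but the types are assumptions:

```typescript
// Hypothetical shape of a chunk record; names from the attribute list,
// types assumed.
interface Chunk {
  content: string;
  chunkIndex: number;   // position within the document
  pageNumber?: number;  // source page, present for PDFs
  tokenCount: number;   // approximate token count
  startOffset: number;  // character positions in the source text
  endOffset: number;
}

const example: Chunk = {
  content: "Access control procedures are documented in section 3...",
  chunkIndex: 0,
  pageNumber: 3,
  tokenCount: 12,
  startOffset: 0,
  endOffset: 57,
};
```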

Requirement (Requirement Item)

An individual checklist item that must be satisfied. Requirements are grouped under criteria.

Attributes:

  • requirementId: Unique identifier within the pack
  • criterionId: Parent criterion number
  • criterionTitle: Parent criterion name
  • title: The requirement statement
  • subcheckLabel: Sub-item label (i, ii, iii, etc.) if applicable

Criterion

A category or grouping of related requirements.

Attributes:

  • criterionId: Numeric identifier
  • criterionTitle: Category name
  • criterionDescription: Detailed description

Assessment

The result of evaluating one requirement against uploaded evidence during a run.

Attributes:

  • runId: The assessment run this belongs to
  • requirementId: Which requirement was evaluated
  • status: Evaluation outcome (COMPLETE, PARTIAL, MISSING, FAILED)
  • confidence: AI certainty score (0.0 to 1.0)
  • citations: References to supporting evidence
  • reasoning: AI explanation of the assessment

Assessment status

Status      Meaning
COMPLETE    Evidence fully satisfies the requirement
PARTIAL     Evidence partially addresses the requirement
MISSING     No relevant evidence found
FAILED      Assessment could not be completed (error)

Task

A generated remediation item for requirements not fully satisfied.

Attributes:

  • requirementId: Which requirement this addresses
  • title: Brief task description
  • description: Detailed remediation guidance
  • status: Task lifecycle (PENDING, IN_PROGRESS, COMPLETED)

Invariants

These conditions must always be true:

  1. Documents belong to exactly one tenant: All evidence is tenant-scoped.

  2. Chunks belong to exactly one document: Each chunk references its source document.

  3. Requirements belong to exactly one pack: Requirements are defined per pack.

  4. Assessments belong to exactly one run: Each assessment is part of a specific run.

  5. Only READY documents are assessed: Documents in earlier processing stages are excluded from assessment runs.

  6. SHA256 hash is unique per tenant: Duplicate detection prevents reprocessing identical files.

  7. Parent-child relationships cascade: Deleting a parent document (ZIP) deletes all extracted children.

  8. Assessment status derives from confidence thresholds: The same confidence score may result in different statuses depending on configuration.
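Invariant 8 can be sketched as a simple mapping from confidence to status. The threshold values here are hypothetical; the real configuration lives elsewhere:

```typescript
// Illustrative mapping from confidence to assessment status under
// configurable thresholds (invariant 8). Threshold values are made up.
type AssessmentStatus = "COMPLETE" | "PARTIAL" | "MISSING";

interface Thresholds {
  complete: number; // minimum confidence for COMPLETE
  partial: number;  // minimum confidence for PARTIAL
}

function statusFromConfidence(confidence: number, t: Thresholds): AssessmentStatus {
  if (confidence >= t.complete) return "COMPLETE";
  if (confidence >= t.partial) return "PARTIAL";
  return "MISSING";
}

// The same score maps to different statuses under different configs:
statusFromConfidence(0.75, { complete: 0.7, partial: 0.4 }); // "COMPLETE"
statusFromConfidence(0.75, { complete: 0.8, partial: 0.4 }); // "PARTIAL"
```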

How it shows up in the UI

Portal interface (/cm)

Requirements panel (left):

  • Hierarchical list: Criteria → Requirements
  • Color-coded status indicators after assessment
  • Click to view assessment details

Documents panel (center):

  • List of uploaded files with status badges
  • Upload button for adding new evidence
  • Processing progress indicators
  • Click to preview document content

Assessment details (right):

  • Status badge (COMPLETE/PARTIAL/MISSING)
  • Confidence score display
  • Clickable citations linking to evidence
  • AI reasoning text
  • Generated task (if applicable)

Admin console (/admin)

Run Details view:

  • Assessment summary counts
  • Expandable list of all assessments
  • Filter by status

Trace Viewer:

  • Evidence retrieval operations
  • LLM evaluation calls
  • Citation extraction details

How it shows up in the API

Upload document

POST /api/documents/upload
Content-Type: multipart/form-data

file: [binary data]

Response:

{
  "document": {
    "id": "doc-abc-123",
    "tenantId": "tenant-xyz",
    "originalFilename": "Access-Control-Policy.pdf",
    "fileType": "pdf",
    "status": "UPLOADED",
    "sha256Hash": "a1b2c3...",
    "uploadedAt": "2025-01-15T10:00:00Z"
  }
}
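A hedged client-side sketch of this call using fetch and FormData; auth headers and error shapes are assumptions:

```typescript
// Sketch of uploading evidence via the endpoint above. Auth and retry
// handling are omitted; the response shape follows the example response.
async function uploadDocument(file: Blob, filename: string) {
  const form = new FormData();
  form.append("file", file, filename);
  const res = await fetch("/api/documents/upload", { method: "POST", body: form });
  if (!res.ok) throw new Error(`Upload failed: HTTP ${res.status}`);
  const { document } = await res.json();
  return document; // status is "UPLOADED"; processing continues asynchronously
}
```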

Get document status

GET /api/documents/:id/status

Response:

{
  "id": "doc-abc-123",
  "status": "READY",
  "uploadedAt": "2025-01-15T10:00:00Z",
  "extractedAt": "2025-01-15T10:00:15Z",
  "parsedAt": "2025-01-15T10:00:30Z",
  "indexedAt": "2025-01-15T10:00:45Z",
  "readyAt": "2025-01-15T10:00:50Z"
}
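Because processing is asynchronous, clients typically poll this endpoint until a terminal state. A sketch, with interval and attempt limits as assumptions:

```typescript
// Poll the status endpoint until the document reaches a terminal
// processing state (READY or FAILED). Timing values are assumptions.
async function waitForProcessing(
  id: string,
  intervalMs = 2000,
  maxAttempts = 30
): Promise<"READY" | "FAILED"> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(`/api/documents/${id}/status`);
    const { status } = await res.json();
    if (status === "READY" || status === "FAILED") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Document ${id} still processing after ${maxAttempts} checks`);
}
```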

Get requirements

GET /api/requirements

Response:

{
  "requirements": [
    {
      "id": "req-001",
      "requirementId": "1.1.a",
      "criterionId": 1,
      "criterionTitle": "Access Management",
      "title": "Documented access control policy",
      "subcheckLabel": "a"
    }
  ]
}
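This flat list maps onto the Criteria → Requirements hierarchy shown in the Portal's requirements panel. A grouping sketch, with field names taken from the response above:

```typescript
// Group a flat requirements list by criterionId, as the Portal's
// requirements panel does.
interface RequirementItem {
  requirementId: string;
  criterionId: number;
  criterionTitle: string;
  title: string;
}

function groupByCriterion(items: RequirementItem[]): Map<number, RequirementItem[]> {
  const groups = new Map<number, RequirementItem[]>();
  for (const item of items) {
    const bucket = groups.get(item.criterionId) ?? [];
    bucket.push(item);
    groups.set(item.criterionId, bucket);
  }
  return groups;
}
```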

Get assessments for a run

GET /api/runs/:runId/assessments

Response:

{
  "assessments": [
    {
      "id": "asmt-001",
      "requirementId": "1.1.a",
      "status": "COMPLETE",
      "confidence": 0.94,
      "citations": [
        {
          "documentId": "doc-abc-123",
          "chunkId": "chunk-456",
          "text": "Access control procedures are documented in section 3...",
          "pageNumber": 3
        }
      ],
      "reasoning": "The uploaded Access Control Policy document explicitly describes access control procedures in section 3, covering user provisioning, access reviews, and privileged access management."
    }
  ]
}
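The Admin console's assessment summary counts can be derived from this response. A tallying sketch:

```typescript
// Tally assessments by status, as in the Run Details summary counts.
function summarizeByStatus(assessments: { status: string }[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const a of assessments) {
    counts[a.status] = (counts[a.status] ?? 0) + 1;
  }
  return counts;
}

summarizeByStatus([{ status: "COMPLETE" }, { status: "COMPLETE" }, { status: "MISSING" }]);
// → { COMPLETE: 2, MISSING: 1 }
```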

Flow: evidence to assessment

Document uploaded
        │
        ▼
  UPLOADED status
        │
┌───────────────────┐
│  Text extraction  │
│  (pdf-parse,      │
│   mammoth, etc.)  │
└───────────────────┘
        │
        ▼
  EXTRACTED status
        │
┌───────────────────┐
│  Chunking         │
│  (split by size/  │
│   semantics)      │
└───────────────────┘
        │
        ▼
  PARSED status
        │
┌───────────────────┐
│  Embedding        │
│  (vector store)   │
└───────────────────┘
        │
        ▼
  INDEXED status
        │
        ▼
  READY status
        │
════════════════════
   Assessment run
════════════════════
        │
┌───────────────────┐
│  For each         │
│  requirement:     │
└───────────────────┘
        │
┌───────────────────┐
│  Vector search    │
│  (find relevant   │
│   chunks)         │
└───────────────────┘
        │
┌───────────────────┐
│  LLM evaluation   │
│  (assess status,  │
│   confidence)     │
└───────────────────┘
        │
        ▼
Assessment saved
(status, citations, reasoning)
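The assessment phase above can be sketched as a per-requirement loop. The helpers here (searchChunks, evaluateRequirement) are hypothetical stubs standing in for the real vector search and LLM calls:

```typescript
// Per-requirement assessment loop matching the flow above. searchChunks
// and evaluateRequirement are hypothetical stubs; the real system queries
// a vector store and an LLM in their place.
interface Requirement { requirementId: string; title: string }
interface Citation { documentId: string; chunkId: string; text: string }
interface Evaluation {
  requirementId: string;
  status: "COMPLETE" | "PARTIAL" | "MISSING" | "FAILED";
  confidence: number;
  citations: Citation[];
  reasoning: string;
}

// Stub: vector search over chunks of READY documents.
async function searchChunks(_query: string): Promise<Citation[]> {
  return []; // the real implementation queries the vector store
}

// Stub: LLM evaluation of a requirement against retrieved chunks.
async function evaluateRequirement(req: Requirement, chunks: Citation[]): Promise<Evaluation> {
  if (chunks.length === 0) {
    return {
      requirementId: req.requirementId,
      status: "MISSING",
      confidence: 0,
      citations: [],
      reasoning: "No relevant evidence found",
    };
  }
  // The real call returns status, confidence, and reasoning from the model.
  return {
    requirementId: req.requirementId,
    status: "COMPLETE",
    confidence: 0.9,
    citations: chunks,
    reasoning: "...",
  };
}

async function runAssessments(requirements: Requirement[]): Promise<Evaluation[]> {
  const results: Evaluation[] = [];
  for (const req of requirements) {
    const chunks = await searchChunks(req.title);
    results.push(await evaluateRequirement(req, chunks));
  }
  return results;
}
```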

Common misconceptions

1. "Uploading a document automatically runs assessment"

Reality: Document upload and assessment are separate operations. Upload processes the document to READY status. You must explicitly start an assessment run.

2. "Deleting a document removes its assessments"

Reality: Assessments are immutable records of past evaluations. Deleting a document removes it from future runs but does not alter historical assessment records.

3. "Higher confidence always means COMPLETE status"

Reality: Status thresholds are configurable. A 0.75 confidence might be COMPLETE in one configuration but PARTIAL in another with a higher threshold.

4. "Citations include the entire document"

Reality: Citations reference specific chunks—portions of the document that matched the requirement. Large documents may have many chunks, only some of which are cited.

5. "FAILED assessment means the requirement isn't met"

Reality: FAILED means the AI could not evaluate the requirement (system error). It's distinct from MISSING, which means no evidence was found.

6. "Requirements are the same across tenants"

Reality: While packs define standard requirements, tenants can have different pack bindings with customized criteria versions.

7. "All document types work equally well"

Reality: PDFs with selectable text and DOCX files typically yield better results than scanned images or complex spreadsheets. Evidence quality affects assessment accuracy.