Evidence Submission and Mapping

What it is

Evidence submission is the process of uploading documents that demonstrate compliance with requirements. Evidence mapping is how the platform associates uploaded documents with specific requirements through AI-powered semantic analysis.

This guide covers best practices for preparing evidence, understanding the upload process, and maximizing the accuracy of automatic mapping.

When to use

Use this guide when:

Preparing documents for upload
Improving assessment accuracy
Investigating why certain requirements show as MISSING
Optimizing evidence organization

Do not use when:

Reviewing assessment results (see Reviewing Results)
Running the platform as an administrator (see Admin Workflow)

Prerequisites

Before starting, ensure you have:

Access to the portal interface
Evidence documents in supported formats
Understanding of the requirements you're addressing

Step-by-step

Step 1: Understand supported formats

The platform accepts these file types:

Format	Extension	Notes
PDF	.pdf	Best for formatted documents. Ensure text is selectable (not scanned images).
Word	.docx	Converted to text with formatting preserved.
Excel	.xlsx	Each sheet processed separately. Good for checklists and matrices.
CSV	.csv	Simple tabular data.
Text	.txt	Plain text files.
Images	.png, .jpg, .jpeg	Limited text extraction. Useful mainly when the image contains text that OCR can read.
Archives	.zip	Contents extracted and processed individually.
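
As a quick pre-upload check, the table above can be turned into a lookup. This is a minimal sketch, not the platform's own validation: the function name `supported_format` is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the table above; the dictionary itself is illustrative.
SUPPORTED_EXTENSIONS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".xlsx": "Excel",
    ".csv": "CSV",
    ".txt": "Text",
    ".png": "Images",
    ".jpg": "Images",
    ".jpeg": "Images",
    ".zip": "Archives",
}


def supported_format(filename: str) -> Optional[str]:
    """Return the format name for a filename, or None if its extension is not listed."""
    # Assumption: extension matching ignores case (".PDF" is treated like ".pdf").
    return SUPPORTED_EXTENSIONS.get(Path(filename).suffix.lower())
```

For example, `supported_format("Access-Control-Policy-v2.1.pdf")` returns `"PDF"`, while a legacy `.doc` file returns `None` and would need converting first.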

Step 2: Prepare your documents

For best results:

Use descriptive filenames: Access-Control-Policy-v2.1.pdf is better than doc1.pdf
Ensure text is extractable: For PDFs, verify you can select and copy text
Remove password protection: Protected files cannot be processed
Keep files under the size limit: The default upload limit is 100 MB (deployment-configured). (✅ Verified: server/routes.ts:88)
Prefer text-based files: Run OCR on scanned documents before upload so that their text can be extracted
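
The size check in the list above can be scripted before upload. The sketch below assumes that 100 MB means 100 × 1024 × 1024 bytes; because the limit is deployment-configured, it is passed as a parameter. The function names are hypothetical.

```python
import os

# Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes. Your deployment may differ.
DEFAULT_LIMIT_BYTES = 100 * 1024 * 1024


def exceeds_upload_limit(size_bytes: int, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True when a file of size_bytes is larger than the upload limit."""
    return size_bytes > limit_bytes


def oversized_files(paths, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> list:
    """Return the paths whose size on disk is over the limit."""
    return [p for p in paths if exceeds_upload_limit(os.path.getsize(p), limit_bytes)]
```

Files returned by `oversized_files` are candidates for splitting into smaller documents before upload.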

Step 3: Organize by topic

Structure your evidence strategically:

/Evidence Bundle/
├── Access-Management/
│   ├── Access-Control-Policy.pdf
│   ├── Privileged-Access-Procedure.docx
│   └── Access-Review-Records.xlsx
├── Security-Operations/
│   ├── Incident-Response-Plan.pdf
│   └── Security-Monitoring-Procedure.docx
└── Governance/
    ├── Information-Security-Policy.pdf
    └── Risk-Assessment-Report.pdf

You can upload the entire folder as a ZIP file, and the platform preserves the structure.
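
One way to build such a ZIP while keeping the folder layout is Python's standard `zipfile` module. This is a sketch only; `zip_evidence_folder` is a hypothetical helper, and the output ZIP should be written outside the folder being archived so that it does not include itself.

```python
import zipfile
from pathlib import Path
from typing import List


def zip_evidence_folder(folder: str, zip_path: str) -> List[str]:
    """Zip every file under `folder` and return the archive names that were written.

    Paths are stored relative to the folder's parent, so the top-level
    folder name (for example "Evidence Bundle") is kept inside the archive.
    """
    root = Path(folder)
    written = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(root.parent).as_posix()
                archive.write(path, arcname)
                written.append(arcname)
    return written
```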

Step 4: Upload documents

Navigate to the Portal (served at /cm in this repo)
Click Upload or drag files to the document panel
Monitor processing status for each file
Wait for all documents to reach READY status
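
If you script uploads instead of using the Portal, the "wait for READY" step amounts to a polling loop. The sketch below is deliberately generic: `fetch_status` is a hypothetical callable that you would implement against the endpoints in the API Reference, which this guide does not describe. Treating FAILED as a stopping condition is also an assumption.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns READY or FAILED, then return that status.

    Raises TimeoutError if neither status is seen before timeout_s elapses.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in ("READY", "FAILED"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {status} after {timeout_s} seconds")
        sleep(interval_s)
```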

Step 5: Understand the processing pipeline

Each document goes through these stages:

Stage	What happens
UPLOADED	File received and stored securely
EXTRACTED	Text content pulled from file format
PARSED	Content split into chunks (🧩 Template: chunk size is deployment-configured)
INDEXED	Vector embeddings created for semantic search
READY	Available for assessment runs
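
The ordering of the stages above can be modeled as an enumeration, which makes it easy to see what a document stuck at a given stage has not yet completed. This is an illustrative model built from the table, not the platform's internal representation.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Processing stages in the order listed in the table above."""
    UPLOADED = 1
    EXTRACTED = 2
    PARSED = 3
    INDEXED = 4
    READY = 5


def remaining_stages(current: str) -> list:
    """Return the names of the stages that come after `current`."""
    reached = Stage[current]
    return [stage.name for stage in Stage if stage > reached]
```

For instance, a document showing EXTRACTED still has to be chunked (PARSED) and embedded (INDEXED) before it is READY.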

Step 6: Verify successful processing

Check each document:

Status shows READY (green indicator)
Click the document to see extracted content preview
Verify text was extracted correctly
Note any extraction issues for re-upload

Step 7: Start an assessment

Once documents are READY:

Click Run Assessment
The AI evaluates each requirement against all indexed evidence
For each requirement, the AI:
- Searches for semantically relevant chunks
- Evaluates if the evidence satisfies the requirement
- Assigns a status (COMPLETE, PARTIAL, MISSING)
- Provides citations and reasoning
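
To give a feel for the "searches for semantically relevant chunks" step, here is a toy retrieval sketch. It is not the platform's method: it substitutes simple word counts for real vector embeddings, and all function names are hypothetical. Status assignment and reasoning are done by the AI evaluation and are not modeled here.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: counts of lowercase words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(requirement: str, chunks: list, k: int = 3) -> list:
    """Return up to k (score, chunk) pairs, most similar to the requirement first."""
    query = embed(requirement)
    scored = [(cosine(query, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Real embedding models handle synonyms and paraphrase far better than word counts, so treat this only as an illustration of ranking chunks by similarity to a requirement.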

Example

Scenario: Uploading a ZIP bundle containing access management evidence.

Input:

AccessManagement-Bundle.zip/
├── Access-Control-Policy.pdf (15 pages)
├── Privileged-Access-Procedure.docx (8 pages)
└── Quarterly-Access-Review.xlsx (3 sheets)

Process:

Upload ZIP → Three child documents created
Each document processes independently → All reach READY (🧩 Template: timing varies by deployment)
Run assessment → 10 access-related requirements evaluated

Result:

Access-Control-Policy.pdf cited for 6 requirements
Privileged-Access-Procedure.docx cited for 3 requirements
Quarterly-Access-Review.xlsx cited for 2 requirements (overlapping)
1 requirement still MISSING (requires password policy documentation)

Troubleshooting

Document stuck in EXTRACTED status

Symptom: Processing stops after text extraction.

Causes and fixes:

Very large document: Split into smaller files
Unusual encoding: Re-export as UTF-8
System timeout: Wait and check if it eventually completes
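
For plain-text or CSV evidence, re-exporting as UTF-8 can be done with a few lines of Python. This sketch assumes you already know the file's current encoding (for example `latin-1` or `cp1252`); it does not try to detect it. The helper name is hypothetical.

```python
from pathlib import Path


def reencode_to_utf8(src: str, dst: str, source_encoding: str) -> None:
    """Read src using source_encoding and write the same text to dst as UTF-8."""
    text = Path(src).read_bytes().decode(source_encoding)
    # Writing bytes directly keeps the original line endings unchanged.
    Path(dst).write_bytes(text.encode("utf-8"))
```

If the declared `source_encoding` is wrong, decoding either raises `UnicodeDecodeError` or silently produces wrong characters, so check the output before uploading.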

No text extracted from PDF

Symptom: Document processes but shows empty content.

Causes and fixes:

Scanned image without OCR: Use Adobe Acrobat or similar to add text layer
PDF is image-only: Convert to text first using OCR software
Corrupted file: Re-export the PDF

Excel sheets not indexed

Symptom: Spreadsheet reaches READY, but its content is not found in searches.

Causes and fixes:

Empty cells dominate: Ensure sheets have meaningful text content
Data is numeric only: Add text descriptions or headers
Hidden sheets: Unhide all sheets before upload

ZIP extraction fails

Symptom: ZIP upload fails or creates no child documents.

Causes and fixes:

Archive corrupted: Re-create the ZIP file
Nested too deeply: Maximum 5 levels of nesting
Too many files: Maximum 1000 files per ZIP
Size limit: Maximum 500 MB total extracted size
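
The limits above can be checked locally before upload. The following is a sketch under stated assumptions: "nesting" is interpreted as folder depth inside the archive (the guide does not say whether it means folders or ZIPs inside ZIPs), 500 MB is read as 500 × 1024 × 1024 bytes, and `zip_limit_problems` is a hypothetical helper.

```python
import zipfile

# Limits copied from the list above. Interpreting nesting as folder depth is an assumption.
MAX_DEPTH = 5
MAX_FILES = 1000
MAX_EXTRACTED_BYTES = 500 * 1024 * 1024


def zip_limit_problems(zip_path: str) -> list:
    """Return human-readable descriptions of any limit the archive exceeds."""
    with zipfile.ZipFile(zip_path) as archive:
        files = [info for info in archive.infolist() if not info.is_dir()]

    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files exceeds {MAX_FILES}")

    # file_size is the uncompressed size recorded in the archive.
    total = sum(info.file_size for info in files)
    if total > MAX_EXTRACTED_BYTES:
        problems.append(f"extracted size {total} bytes exceeds {MAX_EXTRACTED_BYTES}")

    # Count the folders enclosing each file: "a/b/c.txt" has depth 2.
    deepest = max((info.filename.count("/") for info in files), default=0)
    if deepest > MAX_DEPTH:
        problems.append(f"nesting depth {deepest} exceeds {MAX_DEPTH}")
    return problems
```

A corrupted archive raises `zipfile.BadZipFile` when opened, which corresponds to the "Archive corrupted" cause.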

Evidence not found during assessment

Symptom: AI marks requirements as MISSING despite relevant evidence.

Causes and fixes:

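Terminology mismatch: Evidence uses different words than the requirement. Where accurate, use the requirement's own terms in the document
Implicit compliance: Evidence implies rather than explicitly states compliance. State the control explicitly
Low similarity: Content is too tangential to match. Upload evidence that addresses the requirement directly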

Duplicate document warning

Symptom: Upload shows "duplicate detected" message.

Causes and fixes:

Exact duplicate: A file with the same SHA256 hash already exists, so the warning is expected
Near-duplicate: A file whose bytes differ, even slightly, has a different hash, so the platform processes it as new
Intentional re-upload: No action needed; the original remains indexed
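
To check in advance whether two files would count as exact duplicates, you can compare their SHA-256 digests locally. This sketch uses Python's standard `hashlib`; it assumes the platform hashes the raw file bytes, which the guide implies but does not spell out.

```python
import hashlib


def sha256_of_file(path: str, block_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Two files with identical bytes produce the same digest regardless of filename, while changing a single byte produces a different one.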

Document shows FAILED status

Symptom: Document processing failed with error.

Causes and fixes:

Unsupported format: Convert to a supported format
File corrupted: Re-export or re-download the original
Password protected: Remove protection before upload

Large document processing timeout

Symptom: Processing never completes for very large files.

Causes and fixes:

Split large documents into sections
Remove unnecessary pages (cover pages, blanks)
Use a more efficient format (DOCX instead of large PDF)

Gotchas and edge cases

Duplicate detection uses SHA256: Two files with identical content share the same hash and won't be re-indexed. Modify the file (even adding a space) to force reprocessing.
Parent-child relationships: Files extracted from ZIPs have a parent. Deleting the parent ZIP deletes all extracted children.
Image files have limited value: PNG/JPG files yield minimal text unless they contain text that OCR can extract.
Assessment is point-in-time: Documents uploaded after a run completes are not included in that run's results. Start a new run to include new evidence.
Chunk size affects matching: Very short documents may become a single chunk, while long documents create many. This affects how precisely the AI can cite evidence.
Filename is metadata: The original filename is preserved and displayed, but content is what matters for matching.
XLSX formulas are not evaluated: Only stored text values are extracted; the result of a formula may not appear if the cell holds only the formula.
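
To illustrate the chunk-size point in the list above, here is a minimal fixed-size chunker. The platform's actual chunking strategy and chunk size are deployment-configured and not described in this guide, so splitting on a fixed number of words is purely an assumption for illustration.

```python
def chunk_words(text: str, chunk_size: int) -> list:
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
```

A three-word document with `chunk_size=10` yields a single chunk, whereas a 25-word document yields three, so citations into the longer document can point at a narrower passage.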

Related links

Portal User Workflow - Complete end-to-end process
Reviewing Results and Responses - Understanding assessment output
Evidence, Requirements, and Assessments - Data model details
Quickstart - Get started quickly
API Reference - Programmatic upload endpoints
Runs, Snapshots, and Replay - How runs process evidence
