For Enterprise & Procurement Teams · v1.3 · June 2026
Evidence Pack
A summary of the system behavior guarantees, a worked failure trace example, and the audit readiness model for the NHID-Clinical reference implementation — for teams evaluating adoption.
1. System Behavior Guarantees
Identical input event + identical policy version → identical trace output. The policy engine is a pure function with no side effects. Output does not depend on wall-clock time, process state, or external calls.
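As an illustration of the purity property, here is a minimal sketch of an IDG-01 check written as a pure function. The names `SessionState` and `evaluate_idg01`, and the `CONTINUE` action, are hypothetical and not taken from the reference implementation; only the rule ID, trigger, and `DISCLOSE_IDENTITY` action mirror the example trace in section 2.

```python
from dataclasses import dataclass

POLICY_VERSION = "nhid-clinical-v1.3"  # version string as shown in the example trace


@dataclass(frozen=True)
class SessionState:
    """Hypothetical immutable view of the reconstructed session."""
    turn_count: int
    disclosure_confirmed: bool


def evaluate_idg01(state: SessionState, policy_version: str = POLICY_VERSION) -> dict:
    """Pure function: no clock reads, no I/O, no global mutable state."""
    if state.turn_count == 0 and not state.disclosure_confirmed:
        return {
            "policy_version": policy_version,
            "rule": "IDG-01",
            "trigger": "FIRST_TURN_NO_DISCLOSURE",
            "action": "DISCLOSE_IDENTITY",
        }
    # "CONTINUE" is an assumed placeholder for the no-violation case.
    return {
        "policy_version": policy_version,
        "rule": "IDG-01",
        "trigger": None,
        "action": "CONTINUE",
    }
```

Because the result depends only on the arguments, calling the function twice with equal inputs yields equal outputs, which is the property replay relies on.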
Any stored trace can be replayed. The policy version is embedded in each event header. Replay with a different policy version is detected and flagged as a mismatch.
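The replay check can be sketched roughly as follows. The event layout (`header`, `input`, `trace` keys), the `replay` function, and the `PASS` / `FAIL` / `MISMATCH` status values are assumptions made for illustration; the real schema may differ.

```python
from typing import Callable


def replay(stored_event: dict, engine_version: str,
           evaluate: Callable[[dict], dict]) -> dict:
    """Re-run a stored event and compare the fresh trace to the recorded one."""
    recorded = stored_event["header"]["policy_version"]
    if recorded != engine_version:
        # Refuse to compare traces produced under different policy versions.
        return {"status": "MISMATCH", "recorded": recorded, "engine": engine_version}
    fresh_trace = evaluate(stored_event["input"])
    status = "PASS" if fresh_trace == stored_event["trace"] else "FAIL"
    return {"status": status, "recorded": recorded, "engine": engine_version}
```

Checking the version before evaluating means a mismatch is reported as such, rather than surfacing as an unexplained trace difference.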
The engine never raises an unhandled exception. Invalid or malformed input returns a deterministic error trace (action: LOG_ONLY, recoverable: true). The caller always gets a response.
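One way to obtain this behavior is a thin wrapper around the evaluation call. The sketch below is an assumption about how such a wrapper might look; `safe_evaluate` and the `error` field are hypothetical names.

```python
from typing import Callable


def safe_evaluate(evaluate: Callable[[dict], dict], payload: dict) -> dict:
    """Return a policy trace, or a deterministic error trace if evaluation fails."""
    try:
        return evaluate(payload)
    except Exception as exc:
        # Record only the exception type: messages can contain
        # run-specific details that would break determinism.
        return {
            "action": "LOG_ONLY",
            "recoverable": True,
            "error": type(exc).__name__,
        }
```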
Submitting the same request_id twice produces the same policy decision. The event store deduplicates on request_id at the PERSIST stage.
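A minimal sketch of deduplication at the PERSIST stage, assuming a SQLite table keyed on `request_id`. The table name, column names, and the `open_store` / `persist` helpers are hypothetical.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the event store; the PRIMARY KEY enforces one row per request_id."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "request_id TEXT PRIMARY KEY, trace TEXT NOT NULL)"
    )
    return conn


def persist(conn: sqlite3.Connection, request_id: str, trace: dict) -> dict:
    """Write the event once; a repeated request_id returns the stored trace."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO events (request_id, trace) VALUES (?, ?)",
            (request_id, json.dumps(trace, sort_keys=True)),
        )
    row = conn.execute(
        "SELECT trace FROM events WHERE request_id = ?", (request_id,)
    ).fetchone()
    return json.loads(row[0])
```

With this shape, a second submission never creates a second row, and the caller always receives the trace recorded for the first submission.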
2. Anonymized Failure Trace Example
The example below is synthetic — constructed from observed behavior patterns, with all identifying information removed. It shows what an IDG-01 (late disclosure) violation looks like in the NHID-Clinical audit trace, and what a payer auditor would see when reviewing it.
Anonymized Failure Trace — IDG-01 Violation
Source: Synthetic example based on observed behavior patterns. No real PHI, no real provider data.
Generated: 2026-06-07 | Policy version: nhid-clinical-v1.3 | Correlation ID: [REDACTED]
t=00:00.000 INGEST POST /voice/process received
session_id: [REDACTED]
call_sid: [REDACTED]
caller_type: ai_agent
t=00:00.084 VALIDATE SpeechResult normalized
turn_count: 0
content_hash: [REDACTED]
t=00:00.091 STATE Session reconstructed
turn_count: 0
disclosure_timestamp: null
disclosure_confirmed: false
t=00:00.098 POLICY IDG-01 evaluated
rule: "Disclose AI identity before any data exchange"
turn_count: 0
disclosure_confirmed: false
trigger: FIRST_TURN_NO_DISCLOSURE
action: DISCLOSE_IDENTITY
── Violation recorded ──────────────────────────────────────────────
t=00:00.103 VIOLATION IDG-01
severity: critical
message: "AI identity not disclosed at call start"
action_taken: DISCLOSE_IDENTITY (forced)
data_exchanged_before_disclosure: false
recoverable: true
────────────────────────────────────────────────────────────────────
t=00:00.109 EXEC TwiML rendered — forced disclosure statement
text: "This call is being handled by an automated system on behalf of [Provider Name Redacted]."
disclosure_forced: true
t=00:00.114 PERSIST Event written
disclosure_timestamp: 00:00.109
boundary_violations: ["IDG-01"]
partial_failure: true
deterministic_hash: [REDACTED]
── What this means ─────────────────────────────────────────────────
The AI agent did not disclose its automated nature at call start.
The policy engine detected a turn_count=0 exchange with no prior
disclosure and forced a disclosure statement before any data could
be shared. The violation is logged as critical but recoverable.
A payer auditing this session would see:
- disclosure_timestamp set 109ms into the call (forced, not voluntary)
- partial_failure: true
- boundary_violations: ["IDG-01"]
────────────────────────────────────────────────────────────────────
3. Failure & Attack Simulation Coverage
The failure injection harness covers the following scenarios:
| Scenario | Expected behavior |
|---|---|
| Empty SpeechResult | Policy evaluated, event written, no 500 |
| Null bytes in input | Sanitized before engine, sanitized text stored |
| Missing CallSid (session binding failure) | 400 returned, no event written, structured error body |
| Late disclosure (IDG-01 + PDX-01) | DENY_DATA action, 2 critical violations logged |
| Escalation path unavailable (EIT-01) | ESCALATE_HUMAN with TwiML fallback, violation logged |
| Deceptive artifact (DBC-01) | LOG_ONLY, partial_failure=true, session continues |
| Missing audit fields (ATR-01) | Violation logged, pipeline continues, gap recorded |
| Bot-to-bot, undisclosed agent | DENY_DATA, stricter gate applied for ai_agent counterparty |
| Replay with external_calls_cached=false | Divergence detected, ATR-01 violation, replay flagged FAIL |
| Duplicate request_id (idempotency) | Identical trace returned, no duplicate event written |
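To make the first two rows concrete, here is a small sketch of input sanitization plus a table-driven check. The function name `sanitize_speech_result` and the choice to map a missing value to an empty string are assumptions, not a description of the actual harness.

```python
from typing import Optional


def sanitize_speech_result(raw: Optional[str]) -> str:
    """Remove null bytes before the text reaches the policy engine."""
    if raw is None:
        return ""
    return raw.replace("\x00", "")


# Hypothetical injection cases: (raw input, expected sanitized text).
INJECTION_CASES = [
    ("", ""),                      # empty SpeechResult
    ("hel\x00lo", "hello"),        # null byte inside the text
    (None, ""),                    # field missing entirely
]


def run_injection_cases() -> bool:
    """Return True when every case sanitizes to the expected text."""
    return all(sanitize_speech_result(raw) == want for raw, want in INJECTION_CASES)
```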
4. Audit Readiness Model
An external auditor reconstructing a session from the event store can determine:
- When the call started and when the first disclosure statement was made
- Whether disclosure preceded any PHI or credential exchange
- Whether opt-out or escalation was requested and how it was handled
- Which policy engine version processed each event
- Whether any partial failures or boundary violations were recorded
Example correlation ID lifecycle:
correlation_id: "auth-2026-05-26-001"
t=00:00.000 INGEST POST /voice/process received
t=00:00.123 VALIDATE SpeechResult normalized
t=00:00.131 STATE Session reconstructed: turn_count=0, disclosure=null
t=00:00.140 POLICY IDG-01: DISCLOSE_IDENTITY triggered (turn_count=0)
t=00:00.145 EXEC TwiML disclosure message rendered
t=00:00.152 PERSIST Event written, disclosure_timestamp set
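The auditor questions listed in this section could be answered from a persisted event along these lines. This is a sketch only: it assumes the fields shown in the section 2 trace are available in a single flat dictionary, and `audit_summary` is a hypothetical helper.

```python
def audit_summary(event: dict) -> dict:
    """Summarize the disclosure-related facts an auditor would look for."""
    return {
        "disclosure_made": event.get("disclosure_timestamp") is not None,
        "disclosure_forced": bool(event.get("disclosure_forced", False)),
        "data_exchanged_before_disclosure": bool(
            event.get("data_exchanged_before_disclosure", False)
        ),
        "policy_version": event.get("policy_version"),
        "partial_failure": bool(event.get("partial_failure", False)),
        "boundary_violations": list(event.get("boundary_violations", [])),
    }
```

Applied to the IDG-01 example, such a summary would report a forced disclosure, no data exchanged beforehand, and one boundary violation.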
5. Architecture & Scale Notes
Current reference implementation: FastAPI + SQLite event store with a stateless policy engine. Suitable for development and self-validation. Not load-tested for production at scale.
Path to production scale: replace SQLite with a Kafka or S3-backed event log. The policy engine is stateless and horizontally scalable, so no changes to the core engine are required.
Distributed replay: store the input payload and policy version with each event so that any node can replay from the event store. Policy version change detection prevents silent audit corruption.
6. Risk Register
| Risk | Mitigation |
|---|---|
| Timestamps break exact replay | deterministic_hash is computed over non-timestamp fields only, so wall-clock values never affect it |
| Policy engine version change between runs | Policy version embedded in every event; replay rejects version mismatches |
| JSON key ordering variance | Canonical JSON (sorted keys) enforced before hashing |
| LLM re-invocation during replay | JSON Schema if/then enforces external_calls_cached=true when replay_mode=cached |
| partial_failure accumulation undetected | boundary_violations[] written per event; partial_failure rate trackable across sessions |
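The hashing mitigations in the table (timestamp exclusion and canonical JSON) can be sketched as follows. The excluded field names in `TIMESTAMP_FIELDS` are hypothetical, the choice of SHA-256 is an assumption, and this sketch only filters top-level keys.

```python
import hashlib
import json

# Hypothetical names for wall-clock fields that must not affect the hash.
TIMESTAMP_FIELDS = frozenset({"received_at", "persisted_at"})


def deterministic_hash(event: dict) -> str:
    """Hash an event over its non-timestamp fields using canonical JSON."""
    stable = {k: v for k, v in event.items() if k not in TIMESTAMP_FIELDS}
    # Sorted keys and fixed separators give one byte sequence per logical value.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two events that differ only in key order or in the excluded timestamp fields hash to the same value, while any change to a policy-relevant field changes the hash.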
7. One-Page Architecture Summary
What it is: A lightweight, stateless service that logs AI voice agent disclosure behavior. Input: call events from Twilio or equivalent. Output: tamper-evident, deterministically reproducible trace with policy decision and boundary violations.
What it is not: A caller identity verifier, a certification body, or a compliance guarantor. Adoption does not confer HIPAA or TCPA compliance.
Event flow:
[AI Voice Agent] → INGEST → VALIDATE → STATE → POLICY → EXEC → PERSIST → [Event Store] → [Auditor / Payer System]