Tampering and replay

Two related threats sit at the heart of any tool that holds long-lived supervisory evidence. Tampering: an actor (an insider, a compromised session, a database operator with direct access) edits an AuditEvent after the fact to remove evidence of a supervisory failing or to fabricate one. Replay: a supervisory act is recorded twice, or a stale state is re-submitted to roll back an event that has already happened.

Lending Agent Oversight addresses both with three layered controls: SHA-256 hash chaining on every AuditEvent, append-only storage at the database layer with an off-platform durable copy under object lock, and a state machine that makes terminal actions idempotent and rejects replays.

The audit chain

Every substantive write emits an AuditEvent synchronously inside the same transaction. The shape (from lib/types.ts):

export interface AuditEvent {
  id: Ulid; tenantId: Ulid; at: IsoTimestamp;
  actorUserId: Ulid | null; actorRole: Role;
  action: string;                       // e.g. "breach.notify-fca"
  subjectType: "ar" | "breach" | "review" | "annual-review" | "mi-return" | "tenant" | "user";
  subjectId: Ulid;
  ip: string | null; userAgent: string | null;
  prevHash: string;                     // hash of the prior event in the tenant's chain
  hash: string;                         // SHA-256 of this event, hash field excluded
}

hash is SHA-256 over the canonical JSON representation of the event excluding the hash field, so the hash covers prevHash and therefore covers the entire chain to that point. The first event in a tenant’s chain (tenant.created) carries prevHash = "0".repeat(64). Every subsequent event carries the prior event’s hash as its prevHash.
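
The hashing rule above can be sketched in a few lines. This is a minimal illustration, not the production code: the source does not define "canonical JSON", so the sketch assumes lexicographically sorted keys with no whitespace, and the names EventFields, canonicalJson, hashEvent and appendEvent are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Illustrative subset of AuditEvent: every hashed field as string or null.
type EventFields = Record<string, string | null>;
type ChainedEvent = EventFields & { prevHash: string; hash: string };

const GENESIS_PREV_HASH = "0".repeat(64);

// Assumption: canonical JSON = keys sorted lexicographically, no whitespace.
function canonicalJson(fields: EventFields): string {
  const members = Object.keys(fields)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${JSON.stringify(fields[key])}`);
  return `{${members.join(",")}}`;
}

// SHA-256 over the event with the hash field excluded (prevHash included).
function hashEvent(withoutHash: EventFields): string {
  return createHash("sha256")
    .update(canonicalJson(withoutHash), "utf8")
    .digest("hex");
}

// Links a new event to the current chain head, appends it, and returns it.
function appendEvent(chain: ChainedEvent[], fields: EventFields): ChainedEvent {
  const head = chain[chain.length - 1];
  const prevHash = head ? head.hash : GENESIS_PREV_HASH;
  const body = { ...fields, prevHash };
  const event: ChainedEvent = { ...body, hash: hashEvent(body) };
  chain.push(event);
  return event;
}
```

Because prevHash is part of the hashed body, changing any earlier event changes every hash that would have to follow it, which is what makes the chain tamper-evident.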

Chaining is per tenant: each tenant has its own chain, anchored at the tenant’s creation event. Per-tenant-year shards were considered and rejected: the operational simplicity of one chain per tenant outweighs the marginal performance benefit, given the realistic event volume (low thousands per tenant per year).

Append-only storage

The audit_events Postgres table grants the application role INSERT only. UPDATE and DELETE are revoked at role grant time. A BEFORE TRUNCATE trigger raises EXCEPTION 'audit_events: TRUNCATE blocked'. Schema:

CREATE TABLE audit_events (
  id           ULID         PRIMARY KEY,
  tenant_id    ULID         NOT NULL,
  at           TIMESTAMPTZ  NOT NULL,
  actor_user_id ULID,
  actor_role   TEXT         NOT NULL,
  action       TEXT         NOT NULL,
  subject_type TEXT         NOT NULL,
  subject_id   ULID         NOT NULL,
  ip           INET,
  user_agent   TEXT,
  prev_hash    CHAR(64)     NOT NULL,
  hash         CHAR(64)     NOT NULL UNIQUE
);

REVOKE UPDATE, DELETE ON audit_events FROM app_role;
GRANT SELECT, INSERT ON audit_events TO app_role;

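-- raise_exception is not a Postgres built-in; a minimal trigger function
-- is assumed here. It raises its first trigger argument as the message.
CREATE FUNCTION raise_exception() RETURNS trigger
  LANGUAGE plpgsql AS $$
BEGIN
  RAISE EXCEPTION '%', TG_ARGV[0];
END;
$$;

CREATE TRIGGER audit_events_no_truncate
  BEFORE TRUNCATE ON audit_events
  FOR EACH STATEMENT
  EXECUTE FUNCTION raise_exception('audit_events: TRUNCATE blocked');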

Row-level security with the policy tenant_id = current_setting('app.tenant_id')::ULID (the cast matches the ULID column type in the schema above) blocks cross-tenant reads on the application path.

Off-platform durable copy

The audit chain ships nightly to S3 (or S3-compatible storage) with Object Lock Mode = COMPLIANCE and Retain Until Date = ts + 10y. Compliance-mode object lock means the bucket owner cannot delete the object until the retention expires; even an AWS root credential cannot remove it.

The roll-up writes JSONL files keyed by (tenantId, date) to a per-tenant prefix. Each file is accompanied by a SHA-256 manifest that lists the included event ids and the chain head at the moment of the roll-up. A regulator-led restoration from the durable copy is the recovery path if the live database is compromised.
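
As a rough sketch of the roll-up output described above: the function below builds one JSONL body and its manifest for a (tenantId, date) pair. The manifest field names, the object key layout, and the reading of "SHA-256 manifest" as "manifest carrying a SHA-256 digest of the JSONL file" are all assumptions; the retention date follows the ts + 10y rule, with ts taken to be the roll-up time.

```typescript
import { createHash } from "node:crypto";

interface RolledUpEvent {
  id: string;
  tenantId: string;
  at: string;
  hash: string;
}

// Hypothetical manifest shape; the source specifies only the event ids
// and the chain head.
interface RollupManifest {
  objectKey: string;
  eventIds: string[];
  chainHead: string;
  sha256: string;      // digest of the JSONL body (assumed meaning)
  retainUntil: string; // Object Lock Retain Until Date, roll-up time + 10 years
}

function buildRollup(
  tenantId: string,
  date: string,
  events: RolledUpEvent[],
  chainHead: string,
  rolledUpAt: Date,
): { jsonl: string; manifest: RollupManifest } {
  // One JSON document per line, in chain order.
  const jsonl = events.map((e) => JSON.stringify(e)).join("\n") + "\n";

  const retainUntil = new Date(rolledUpAt.getTime());
  retainUntil.setUTCFullYear(retainUntil.getUTCFullYear() + 10);

  return {
    jsonl,
    manifest: {
      objectKey: `${tenantId}/${date}.jsonl`, // per-tenant prefix (layout assumed)
      eventIds: events.map((e) => e.id),
      chainHead,
      sha256: createHash("sha256").update(jsonl, "utf8").digest("hex"),
      retainUntil: retainUntil.toISOString(),
    },
  };
}
```

During a restoration, recomputing the digest of the downloaded file and comparing it with the manifest is a cheap first check before the chain itself is re-verified.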

Ten years on the chain against the SYSC 9 floor of seven years gives the operator three years of margin for legal hold, dispute, and the lag between a breach happening and the regulator opening an investigation.

Daily integrity check

A scheduled job runs at 03:30 UTC daily for each tenant. It walks the chain in order and verifies, for every event_n, that event_n.prevHash == sha256(event_(n-1) excluding its hash field) and that event_n.hash == sha256(event_n excluding its hash field); for the first event, prevHash is checked against the 64-zero genesis value. The job reads in batches of 10,000 events through a streaming cursor; at the operational scale (low thousands of events per tenant per year), the full walk completes in under a minute per tenant.
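
The walk itself is small. The following sketch verifies an in-memory array rather than a batched cursor, and it assumes the same canonicalisation used when the events were written (sorted keys, no whitespace); StoredEvent, recomputeHash and verifyChain are hypothetical names.

```typescript
import { createHash } from "node:crypto";

interface StoredEvent {
  id: string;
  prevHash: string;
  hash: string;
  [field: string]: string | null;
}

type VerifyResult =
  | { ok: true }
  | { ok: false; atIndex: number; eventId: string; reason: string };

// Recomputes SHA-256 over the event with its hash field excluded.
// Assumption: canonical JSON = sorted keys, no whitespace.
function recomputeHash(event: StoredEvent): string {
  const members = Object.keys(event)
    .filter((key) => key !== "hash")
    .sort()
    .map((key) => `${JSON.stringify(key)}:${JSON.stringify(event[key])}`);
  return createHash("sha256").update(`{${members.join(",")}}`, "utf8").digest("hex");
}

// Walks the chain in order and reports the first divergence, if any.
function verifyChain(events: StoredEvent[]): VerifyResult {
  let expectedPrev = "0".repeat(64); // genesis value for the first event
  let index = 0;
  for (const event of events) {
    if (event.prevHash !== expectedPrev) {
      return { ok: false, atIndex: index, eventId: event.id, reason: "prevHash does not match the prior event" };
    }
    const recomputed = recomputeHash(event);
    if (event.hash !== recomputed) {
      return { ok: false, atIndex: index, eventId: event.id, reason: "hash does not match the event content" };
    }
    expectedPrev = recomputed;
    index++;
  }
  return { ok: true };
}
```

An edit that leaves the stored hash alone fails the second check at the edited event; an edit that also recomputes the hash fails the first check at the next event, so the divergence point is always reported at or immediately after the tampered row.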

A mismatch is a P1 incident:

The integrity-check job writes a chain.mismatch event to a separate operational log (not into the tenant’s audit chain, since the chain itself is suspect).
The workspace shows a banner to all principal-side users in the affected tenant: “audit chain integrity warning, contact support”.
The off-platform durable copy is read in parallel and compared, identifying the divergence point.
The on-call engineer reconstructs the canonical chain from the durable copy and the operational journal, then writes a chain.repaired event signed by the operator’s keys.

The mismatch banner is shown for legal reasons: a regulator looking at the workspace must know that the chain has been disturbed, even if the underlying cause was a database bug rather than malice.

Why SHA-256 in a chain, not a Merkle tree

SHA-256 in a linear chain gives tamper-evidence (any retroactive edit detaches the chain at the edit point and forward) at minimal operational cost. A Merkle tree would give tamper-evidence with O(log n) proofs against a published root, which is useful for public verifiability but unnecessary for a private supervisory log. The hash chain is the right primitive for the threat profile.

A higher-assurance time anchor (RFC 3161 timestamps from a Time Stamp Authority, or transparency-log style anchoring) is not in scope for v1 and is documented as a possible production hardening step.

Replay defence

Replay attacks at the API layer are addressed by three controls:

Optimistic concurrency on writes. Every PATCH carries If-Match against the prior updatedAt. A replayed write fails with 412 because the entity has moved on.
State-machine gating. The breach state machine, the file review state machine, and the annual review state machine each reject illegal transitions. A replayed notify-fca against a breach already in NotifiedFca returns 409.
Idempotency keys on terminal actions. POST /api/breaches/:id/notify-fca, POST /api/annual-reviews/:id/sign-off, and POST /api/mi-returns/:id/submit accept an optional Idempotency-Key header (a client-generated ULID) and dedupe on it for 24 hours.
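
The three controls can be modelled together in a few lines. This is an in-memory sketch under stated assumptions, not the route handlers themselves: the "Open" state, the response bodies, the per-breach scoping of keys, and the choice to answer a deduplicated retry with the original response are all illustrative; the source confirms only the 412 and 409 outcomes, the NotifiedFca state, and the 24-hour window.

```typescript
type BreachState = "Open" | "NotifiedFca"; // "Open" is a placeholder state

interface Breach {
  id: string;
  state: BreachState;
  updatedAt: string;
}

interface HttpResult {
  status: number;
  body: string;
}

const IDEMPOTENCY_TTL_MS = 24 * 60 * 60 * 1000;
const seenKeys = new Map<string, { result: HttpResult; storedAt: number }>();

// Optimistic concurrency: the If-Match value must equal the current updatedAt.
function patchBreach(breach: Breach, ifMatch: string, now: Date): HttpResult {
  if (ifMatch !== breach.updatedAt) {
    return { status: 412, body: "precondition failed" };
  }
  breach.updatedAt = now.toISOString();
  return { status: 200, body: "updated" };
}

// Terminal action: idempotency key first, then state-machine gating.
function notifyFca(breach: Breach, idempotencyKey: string | undefined, now: Date): HttpResult {
  const cacheKey = idempotencyKey ? `${breach.id}:${idempotencyKey}` : undefined;
  if (cacheKey) {
    const cached = seenKeys.get(cacheKey);
    if (cached && now.getTime() - cached.storedAt < IDEMPOTENCY_TTL_MS) {
      return cached.result; // same key within 24 hours: no second effect
    }
  }
  if (breach.state === "NotifiedFca") {
    return { status: 409, body: "breach already notified" };
  }
  breach.state = "NotifiedFca";
  breach.updatedAt = now.toISOString();
  const result: HttpResult = { status: 200, body: "notified" };
  if (cacheKey) {
    seenKeys.set(cacheKey, { result, storedAt: now.getTime() });
  }
  return result;
}
```

The ordering matters: checking the key before the state machine lets an honest client retry after a network failure without seeing a 409, while a replay that carries no key, or a new one, still hits the state-machine gate.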

The MI return submission is idempotent on (arId, period) regardless of Idempotency-Key: a second submit for the same AR and the same quarter returns the existing record. Submission is the terminal transition for the MI state machine; once submitted, the AR’s metrics are immutable.
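
A sketch of that natural-key idempotency, assuming an in-memory store in place of the database and a hypothetical period label such as "2025-Q1"; the MiReturn fields and the submitMiReturn name are illustrative only.

```typescript
interface MiReturn {
  id: string;
  arId: string;
  period: string;
  metrics: Readonly<Record<string, number>>;
  submittedAt: string;
}

// Stand-in for a table with a unique constraint on (arId, period).
const submittedReturns = new Map<string, MiReturn>();

function submitMiReturn(
  arId: string,
  period: string,
  metrics: Record<string, number>,
  now: Date,
): { created: boolean; record: MiReturn } {
  const key = `${arId}|${period}`;
  const existing = submittedReturns.get(key);
  if (existing) {
    // Second submit for the same AR and period: return the existing record
    // and ignore the incoming metrics.
    return { created: false, record: existing };
  }
  const record: MiReturn = Object.freeze({
    id: `${arId}-${period}`,
    arId,
    period,
    metrics: Object.freeze({ ...metrics }), // immutable once submitted
    submittedAt: now.toISOString(),
  });
  submittedReturns.set(key, record);
  return { created: true, record };
}
```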

Replay as evidence

The same audit chain that defends against tampering serves a positive function: every supervisory act can be replayed against the state that held at the time. A regulator (or a court) can ask, “what did the principal firm see when it signed off this AR’s annual review on this date?” and the answer is reconstructable from the chain.

The annual review packet is built from breachSummaryRefs, fileReviewSummaryRefs, miReturnRefs, and conductEventRefs. Each reference resolves to an audit event id at sign-off time; replay reads those events in chain order and reconstructs the packet’s data exactly as the director saw it. If a file review has been challenged and re-completed since, the replay shows the version as of sign-off, not the latest. This is the basis for a defence under PS22/11 enhanced oversight or a SUP 15 enforcement question.

Because the verbatim text of risk-scoring weights, file-review findings, and director sign-off notes is stored in the audit event (not just a “version: 3” pointer), a future change to the underlying configuration or copy does not corrupt historical records. Each event is self-contained.
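
Replay by reference can be sketched as follows. The payload field, the action names, and the flattening of the four reference lists into one array are assumptions made for illustration; the AuditEvent shape shown earlier does not list a payload field.

```typescript
interface PacketEvent {
  id: string;
  subjectId: string;
  action: string;
  payload: string; // hypothetical: the verbatim text captured in the event
}

// Resolves sign-off references to audit events, preserving chain order.
// Later events for the same subject are ignored unless they were referenced.
function replayPacket(chain: PacketEvent[], refs: string[]): PacketEvent[] {
  const wanted = new Set(refs);
  const found = chain.filter((event) => wanted.has(event.id));
  if (found.length !== wanted.size) {
    throw new Error("a reference does not resolve to an audit event");
  }
  return found;
}

// A file review completed, then signed off, then re-completed after a challenge.
const exampleChain: PacketEvent[] = [
  { id: "e1", subjectId: "review-1", action: "review.complete", payload: "finding v1" },
  { id: "e2", subjectId: "annual-1", action: "annual-review.sign-off", payload: "signed" },
  { id: "e3", subjectId: "review-1", action: "review.complete", payload: "finding v2" },
];

// The packet references e1, so replay shows the review as it stood at sign-off.
const asSigned = replayPacket(exampleChain, ["e1"]);
```

Because the packet pins event ids instead of subject ids, the later e3 never enters the replay unless a new sign-off references it.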

Time integrity

Audit timestamps are server-assigned at the moment of the database write, with millisecond resolution. The Vercel platform synchronises clocks via NTP. For ordering across server boundaries, the millisecond timestamps plus the chain’s prevHash sequence are sufficient at the operational scale.

The clock is not used for authentication of any external party. Token expiry on session cookies is server-checked against server time; clock skew on a client device cannot extend a session.

What this page does not cover

This page covers evidence integrity. The privacy chapter (retention) covers how long records are kept and on what trigger they are purged. The safety chapter’s insider threat page covers the structural defences against principal-firm abuse of the workspace, which depend on the integrity guarantees set out here.