SciDEX writes audit-relevant rows into multiple tables —
task_events, artifact_provenance, senate_alerts,
comment_classifier_runs, sandbox_audit_event, the new
actor_write_trip_event — but each row stands alone. An operator with
DB write access (or a compromised migration script) can DELETE FROM and leave no trace. This task makes
artifact_provenance WHERE id = X
the audit log append-only-by-construction: each new row is
hash-chained to its predecessor, the chain head is published to a
durable external sink, and a verifier detects any retroactive edit or
deletion. The threat model is "one of our own DBAs / agents /
migrations is malicious or compromised", and the integrity guarantee
is "tampering is detectable within one verification cycle even if the
attacker has full DB write authority".
Effort: thorough
migrations/20260428_audit_chain.sql:CREATE TABLE audit_chain (
seq BIGSERIAL PRIMARY KEY,
event_kind TEXT NOT NULL, -- mirrors source table name
event_pk TEXT NOT NULL, -- foreign-row PK as text
event_hash BYTEA NOT NULL, -- SHA-256 of canonical event payload
prev_hash BYTEA NOT NULL, -- previous row's chain_hash, all zeros for genesis
chain_hash BYTEA NOT NULL, -- SHA-256(prev_hash || event_hash || seq::bytea)
appended_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (event_kind, event_pk)
);
CREATE INDEX idx_ac_kind_pk ON audit_chain (event_kind, event_pk);scidex/senate/audit_chain.py:append(conn, event_kind: str, event_pk: str, payload: dict)json.dumps(payload, sort_keys=True,
separators=(',',':'), default=str)), hashes, takes a rowaudit_chain, computeschain_hash, and inserts. UsesSELECT ... FROM audit_chain ORDER BY seq DESC LIMIT 1 FOR
UPDATE to serialise.verify(start_seq: int = 1, end_seq: int | None = None)
-> VerifyReport walks the chain and re-derives eachchain_hash, returning the first divergence (or None).
audit_chain automatically:CREATE TRIGGER trg_artifact_prov_chain
AFTER INSERT ON artifact_provenance
FOR EACH ROW EXECUTE FUNCTION audit_chain_append('artifact_provenance'); The PL/pgSQL function audit_chain_append mirrors
Python's append logic. Apply triggers to:
artifact_provenance, task_events, senate_alerts,
actor_write_trip_event, sandbox_audit_event,
senate_pause, audit_log (if exists; create if not).
(seq, chain_hash, appended_at) to/data/audit/chain_head.log (a separate filesystem path, alsodata/scidex-papers/audit_chain_head.log so itverify() against the previous-day window. Any divergencesenate_alerts row of kind='audit_chain_break'severity='critical' and the offending seq/event_pk.
scripts/backfill_audit_chain.py walks thescidex_audit_ro isSELECT only on audit_chain; the verifiertests/test_audit_chain.py:event_hash) → expects divergence at exactly that seq;pgbench-style load (use psycopg2 ThreadedConnectionPool).q-safety-emergency-pause — supplies senate_alerts table.q-trust-governance-transparency-log — uses the chain to attestq-trust-provenance-integrity-scanner — uses the chain as groundmigrations/20260428_audit_chain.sql — CREATE TABLE audit_chain, CREATE INDEX, audit_log table, audit_chain_append() trigger function using digest() (pgcrypto), triggers on 5 tablesscidex/senate/audit_chain.py — append(), verify(), get_head(), publish_head(), verify_and_alert() with VerifyReport dataclassaudit_chain_append() PostgreSQL function using digest() from pgcrypto (sha256() only accepts bytea). Verified trigger fires correctly on INSERT; audit_chain rows created for senate_alerts inserts.scripts/backfill_audit_chain.py — handles tables with non-standard PKs (artifact_provenance uses action_id), dedup within runtask_events, actor_write_trip_event, senate_pause tables do not exist on current DB; triggers for existing tables only. The Python module provides append() for programmatic use.{
"completion_shas": [
"2bf497372"
],
"completion_shas_checked_at": ""
}