The 2026-04-24 recurring-block incident
(memory/project_orchestra_recurring_block_incident.md) and the
mermaid fence-stripping incident
(memory/project_scidex_mermaid_fence_incident.md) both share a shape:
a single agent / skill emitted bulk writes (165 task updates; ~5,800
wiki edits) before any human noticed. SciDEX has no per-actor rate
limit on artifact creation. This task adds a sliding-window circuit
breaker keyed on (actor_id, artifact_type) that trips when an actor
exceeds a configured rate and pauses further writes pending Senate
review. It uses a simple token bucket with per-pair overrides, so
legitimate batch jobs can be allowlisted while unattended runaway
agents are stopped within seconds.
Effort: deep
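As a rough illustration of the token-bucket accounting described above, the sketch below refills a bucket by elapsed time and then tries to spend one token per write. It is an in-memory stand-in only: the Bucket class and take_token function are hypothetical names, the defaults (capacity 60, refill 1 token/sec) are taken from the default policy in this spec, and the escalation from 'throttle' to 'trip' is deliberately not modeled.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Bucket:
    """Hypothetical in-memory stand-in for one actor_write_bucket row."""
    tokens: float
    last_refill: float  # seconds on any monotonic clock


def take_token(bucket: Bucket, now: float, capacity: float = 60.0,
               refill_per_s: float = 1.0) -> Tuple[str, Optional[int]]:
    """Refill by elapsed time, then try to spend one token.

    Returns ('allow', None) or ('throttle', retry_after_s).
    """
    elapsed = max(0.0, now - bucket.last_refill)
    bucket.tokens = min(capacity, bucket.tokens + elapsed * refill_per_s)
    bucket.last_refill = now
    if bucket.tokens >= 1.0:
        bucket.tokens -= 1.0
        return ("allow", None)
    # Seconds until one whole token is available again, rounded up.
    retry_after_s = math.ceil((1.0 - bucket.tokens) / refill_per_s)
    return ("throttle", retry_after_s)


# A full default bucket absorbs a burst of 60 writes; the 61st is throttled.
burst = Bucket(tokens=60.0, last_refill=0.0)
verdicts = [take_token(burst, now=0.0)[0] for _ in range(61)]
```

In the real implementation the bucket state would live in the database rather than in process memory, so the read-refill-write step would need to be atomic per (actor_id, artifact_type).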
scidex/senate/write_circuit_breaker.py:check(actor_id, artifact_type, *, conn) -> Decision
Decision = ('allow' | 'throttle' | 'trip', reason: str, retry_after_s: int | None).
CREATE TABLE actor_write_bucket (
actor_id TEXT NOT NULL,
artifact_type TEXT NOT NULL,
tokens DOUBLE PRECISION NOT NULL,
last_refill TIMESTAMPTZ NOT NULL,
tripped_at TIMESTAMPTZ,
tripped_reason TEXT,
PRIMARY KEY (actor_id, artifact_type)
);
- Default policy: bucket capacity 60, refill 1 token/sec
(= 60/min sustained, 60 burst). Per-pair overrides in
scidex/senate/write_breaker_policy.yaml:
- *::wiki_page → cap 30, refill 0.1/s (== 6/min) — wiki
edits are slow, expensive, hard to revert.
- *::artifact_comment → cap 120, refill 2/s.
- *::artifact_link → cap 100, refill 1/s.
- senate.*::* → cap 600, refill 10/s (Senate engines
allowlisted for high-throughput sweeps).
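One way the per-pair overrides could be resolved is with glob matching on an "actor_id::artifact_type" key, sketched below. Several things here are assumptions rather than facts from this spec: the Limit and resolve_limit names, the use of fnmatch-style globs, the first-match-wins precedence (the spec does not say which override wins when two patterns match), and the reading of the Senate pattern as senate.*::*.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Tuple


class Limit(NamedTuple):
    capacity: float
    refill_per_s: float


# Default policy from the spec: capacity 60, refill 1 token/sec.
DEFAULT = Limit(60, 1.0)

# Overrides in the order the spec lists them (first match wins: an assumption).
OVERRIDES: List[Tuple[str, Limit]] = [
    ("*::wiki_page", Limit(30, 0.1)),
    ("*::artifact_comment", Limit(120, 2.0)),
    ("*::artifact_link", Limit(100, 1.0)),
    ("senate.*::*", Limit(600, 10.0)),  # assumed reading of the Senate pattern
]


def resolve_limit(actor_id: str, artifact_type: str) -> Limit:
    """Return the first override whose glob matches, else the default."""
    key = f"{actor_id}::{artifact_type}"
    for pattern, limit in OVERRIDES:
        if fnmatchcase(key, pattern):
            return limit
    return DEFAULT
```

With this ordering, a Senate engine writing a wiki_page would hit the wiki_page override before the Senate allowlist entry; whether that is the intended behavior is a policy decision the YAML file would have to settle.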
migrations/20260428_write_circuit_breaker.sql adds
CREATE TABLE actor_write_trip_event (
id BIGSERIAL PRIMARY KEY,
actor_id TEXT NOT NULL,
artifact_type TEXT NOT NULL,
tripped_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
recent_writes INT NOT NULL,
window_seconds INT NOT NULL,
cleared_at TIMESTAMPTZ,
cleared_by TEXT
);
check() from at minimum these write sites:
- scidex/atlas/wiki_writer.py upsert path.
- POST /api/comments in api.py.
- scidex/atlas/artifact_commit.commit_artifact.
- scidex/agora/crosslink_emitter.emit_link.
- scidex/atlas/artifact_registry.register_artifact.
Decision.trip and raises
CircuitTripped(reason, retry_after_s); the API translates to
orchestra senate clear-trip ...".
senate_alerts row of kind actor_write_trip and creates an
actor_id, artifact_type, recent_writes, cleared_at IS NULL) with priority 95 so a
POST /api/senate/circuit_breaker/clear {actor_id, artifact_type} zeroes tripped_at and refills the
cleared_by from the auth context.
orchestra senate breakers list shows the live bucket
orchestra senate breakers clear actor_id type calls the
dashboard_engine.py.
tests/test_write_circuit_breaker.py:
SELECT … FOR UPDATE).
SELECT … FOR UPDATE), and benchmark p99 < 5 ms per check.
with breaker_check(...) context manager so a refactor never
q-safety-emergency-pause: shares the senate_alerts table.
q-safety-suspicious-pattern-detector: feeds N-write surges into
Prior attempts (2660c5ea2, 87e5a7ff4) had working code but were never merged
due to rebase conflicts in api_routes/senate.py (which had been corrupted to
a single-line placeholder by squash merge 9c19eed9d).
This run restored api_routes/senate.py from the last known-good commit
(0c3043394) and applied all circuit breaker changes cleanly on current main:
- scidex/senate/write_circuit_breaker.py: token-bucket core + trip handler
- scidex/senate/write_breaker_policy.yaml: per-type rate overrides
- migrations/20260428_write_circuit_breaker.sql: two tables + indexes
- api_routes/senate.py: restored + /api/senate/circuit_breaker/clear and /api/senate/circuit_breaker/trips endpoints
- api.py: circuit breaker on POST /api/comments and artifact comments
- scidex/atlas/artifact_commit.py: check before git lock
- scidex/atlas/artifact_registry.py: check on register_artifact
- scidex/core/db_writes.py: check on save_wiki_page (+ _actor_id param)
- scidex/senate/crosslink_emitter.py: check on emit_links
- scidex/senate/dashboard_engine.py: circuit_breaker_trips + circuit_breaker_events evaluators
- tests/test_write_circuit_breaker.py: 21 tests, all passing
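For readers unfamiliar with the guard pattern used at the write sites listed above, here is a minimal sketch of how a breaker_check(...) context manager might raise CircuitTripped(reason, retry_after_s) before a write runs. It is not the merged code: the check callable is injected here so the example runs without a database, the fake_check function is purely illustrative, and letting a 'throttle' verdict pass through to the caller is an assumption, since the spec does not state how throttling is surfaced.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Optional, Tuple

# (verdict, reason, retry_after_s), mirroring the Decision shape in the spec.
Decision = Tuple[str, str, Optional[int]]


class CircuitTripped(Exception):
    """Raised when the breaker refuses a write."""

    def __init__(self, reason: str, retry_after_s: Optional[int]) -> None:
        super().__init__(reason)
        self.reason = reason
        self.retry_after_s = retry_after_s


@contextmanager
def breaker_check(check: Callable[[str, str], Decision],
                  actor_id: str, artifact_type: str) -> Iterator[str]:
    """Run the check before the guarded block; raise on a 'trip' verdict."""
    verdict, reason, retry_after_s = check(actor_id, artifact_type)
    if verdict == "trip":
        raise CircuitTripped(reason, retry_after_s)
    # 'allow' and 'throttle' both reach the caller (assumption, see lead-in).
    yield verdict


def fake_check(actor_id: str, artifact_type: str) -> Decision:
    """Illustrative stand-in for check(): trips one hard-coded actor."""
    if actor_id == "runaway-agent":
        return ("trip", "rate exceeded", 30)
    return ("allow", "ok", None)


written = []
with breaker_check(fake_check, "alice", "wiki_page"):
    written.append("page-1")  # the guarded write runs only when not tripped
```

Wrapping the write in a with block means the check cannot be skipped without also removing the write from the block.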