[Atlas] Migrate pure-read GET endpoints to get_db_ro() replica pool

Goal

Route read-only GET endpoints onto the streaming-replica pool to
offload roughly 50-60% of primary-pool pressure. Builds on the RO
pool infrastructure (commit 03f91a907) and the PgBouncer scidex_ro
alias.

Changes

api_shared/db.py

  • _thread_local_db_ro: a second threading.local, parallel to
    _thread_local_db, that holds RO connections.
  • get_db_ro(): now cached per thread, mirroring get_db(). Same
    semantics: the first call in a request checks out a connection,
    subsequent calls reuse it, and the middleware closes it at
    request end.
  • close_thread_local_dbs(): helper the middleware uses to close
    both the primary and RO wrappers in one call.
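
The caching behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in api_shared/db.py: FakeConn, _checkout(), handle_request(), and the conn attribute name are hypothetical stand-ins for the real pool wrappers and middleware.

```python
import threading


class FakeConn:
    """Hypothetical stand-in for a pooled connection wrapper."""

    def __init__(self, pool_name):
        self.pool_name = pool_name
        self.closed = False

    def close(self):
        self.closed = True


_thread_local_db = threading.local()
_thread_local_db_ro = threading.local()


def _checkout(pool_name):
    # Assumption: the real code checks a connection out of a pool here.
    return FakeConn(pool_name)


def get_db():
    conn = getattr(_thread_local_db, "conn", None)
    if conn is None:
        conn = _checkout("primary")
        _thread_local_db.conn = conn
    return conn


def get_db_ro():
    # Same caching rule as get_db(), but backed by the replica pool.
    conn = getattr(_thread_local_db_ro, "conn", None)
    if conn is None:
        conn = _checkout("replica")
        _thread_local_db_ro.conn = conn
    return conn


def close_thread_local_dbs():
    # Close whichever wrappers this thread opened, then clear the cache
    # so the next request checks out fresh connections.
    for holder in (_thread_local_db, _thread_local_db_ro):
        conn = getattr(holder, "conn", None)
        if conn is not None:
            try:
                conn.close()
            finally:
                holder.conn = None


def handle_request(handler):
    # Hypothetical stand-in for the middleware: always clean up at the end.
    try:
        return handler()
    finally:
        close_thread_local_dbs()
```

Under these assumptions, two get_db_ro() calls inside one handler return the same object, and the next request on the same thread gets a new one because the cleanup step resets the cache.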

api.py

  • Middleware close_thread_db_connection now calls
    close_thread_local_dbs() instead of inlining the
    _thread_local_db cleanup.
  • 274 pure-read GET handlers rewritten from get_db() to get_db_ro().

api_routes/admin.py, api_routes/dedup.py, api_routes/epistemic.py

  • 20 additional pure-read GET handlers migrated.

Classification heuristic (safe)

A GET handler migrates ONLY when:
- Its body contains a get_db( call
- Its body contains ZERO of: INSERT INTO, UPDATE ... SET,
  DELETE FROM, CREATE [TEMP] TABLE, DROP TABLE, ALTER TABLE,
  CREATE INDEX, TRUNCATE, UPSERT, INSERT OR REPLACE,
  INSERT OR IGNORE, ON CONFLICT
- Its body contains ZERO calls matching
  record_edit, log_pageview, _write_pageview, write_audit,
  save_*, store_*, record_*, insert_*, upsert_*,
  update_*, delete_*, *.commit()
- (whitelist for false positives: commit_attribution, __save__,
  save_state_dir, record_seen, save_ghost_snapshot)

Mixed handlers (pageview tracking, cache writes, index creation)
stay on get_db().
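
As an illustration only, the heuristic above could be approximated with a few regular expressions over each handler's source text. The function and constant names here (is_pure_read, WRITE_SQL, and so on) are hypothetical, and case-insensitive SQL matching is an assumption; the actual migration script may differ.

```python
import re

# SQL fragments that mark a handler as writing. Case-insensitive matching
# is an assumption; it errs on the side of keeping handlers on primary.
WRITE_SQL = re.compile(
    r"INSERT\s+INTO|\bUPDATE\b.{0,200}?\bSET\b|DELETE\s+FROM"
    r"|CREATE\s+(?:TEMP\s+)?TABLE|DROP\s+TABLE|ALTER\s+TABLE"
    r"|CREATE\s+INDEX|TRUNCATE|UPSERT|INSERT\s+OR\s+REPLACE"
    r"|INSERT\s+OR\s+IGNORE|ON\s+CONFLICT",
    re.IGNORECASE | re.DOTALL,
)

EXACT_WRITE_CALLS = {
    "record_edit", "log_pageview", "_write_pageview", "write_audit",
}
WRITE_PREFIXES = (
    "save_", "store_", "record_", "insert_", "upsert_", "update_", "delete_",
)
ALLOWED = {
    "commit_attribution", "__save__", "save_state_dir",
    "record_seen", "save_ghost_snapshot",
}

CALL = re.compile(r"\b(\w+)\s*\(")          # any name followed by "("
COMMIT = re.compile(r"\.commit\s*\(\s*\)")  # something.commit()


def is_write_call(name):
    if name in ALLOWED:
        return False
    return name in EXACT_WRITE_CALLS or name.startswith(WRITE_PREFIXES)


def is_pure_read(body):
    """Return True when a handler body passes every rule of the heuristic."""
    if "get_db(" not in body:
        return False
    if WRITE_SQL.search(body) or COMMIT.search(body):
        return False
    return not any(is_write_call(name) for name in CALL.findall(body))
```

Every rule in this sketch fails closed: anything that merely looks like a write keeps the handler on get_db(), which costs some offload but avoids sending a write to the replica.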

Counts:
- 297 pure-read GET handlers identified; 285 contain get_db( and
were migrated (the remaining 12 use a direct pool API or
already-async helper — not in scope).
- 9 mixed GET handlers left on primary.
- 130 GET handlers don't touch the DB through our helpers — no-op.

Acceptance

☐ Restart scidex-api — no startup errors.
☐ Hit migrated endpoints (/hypothesis/*, /entity/*,
  /wiki/* direct-read variants, /senate, /exchange,
  /showcase, /dashboard, etc.); responses identical to
  pre-migration (spot-check status codes via curl -I).
☐ /health?pool=1 shows both pool_* (primary) and ro_*
  (replica) gauges populated.
☐ pg_stat_activity on the primary shows reduced steady-state
  pool_size after 10 minutes of traffic.
☐ Replica pg_stat_activity shows matching rise in connections.
☐ No "cannot execute … in a read-only transaction" errors in
  scidex-api logs.
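
The last check can be automated with a small log scan. The sketch below is one possible approach; the helper name find_ro_violations and the sample log lines are made up for illustration, and only the error text itself comes from the checklist.

```python
import re

# Matches the error raised when a write statement reaches a read-only
# transaction; the captured group is the offending statement type.
RO_ERROR = re.compile(r"cannot execute (.+?) in a read-only transaction")


def find_ro_violations(lines):
    """Return (line_number, statement) for every read-only violation."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = RO_ERROR.search(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


# Made-up log lines, for illustration only.
sample = [
    "INFO GET /senate 200",
    "ERROR cannot execute INSERT in a read-only transaction",
    "INFO GET /exchange 200",
]
violations = find_ro_violations(sample)
```

Any hit points to a handler that was misclassified as pure-read and should move back to get_db().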

Rollback

Revert this commit. All handlers go back to get_db(); the RO pool
becomes unused but harmless.

File: ro_pool_migration_2026_04_21_spec.md
Modified: 2026-04-24 07:15
Size: 2.9 KB