[Atlas] Read replica infra + get_db_ro() hook

Goal

Stand up a local PostgreSQL streaming replica to deflect read load
from the primary, and give the app a `get_db_ro()hook so read-heavy endpoints can opt in incrementally without a big-bang refactor.

`What landed`

`Infra (already live)`

Replica on port 5433, data dir /data/postgres-replica,

streaming from primary with async

 replication. Replay lag
  <10ms steady-state (

SELECT replay_lag FROM pg_stat_replication

).

systemd unit scidex-pg-replica.service with Restart=on-failure.

pgBouncer gains a second database alias scidex_ro pointing

at host=127.0.0.1 port=5433. Pool mode stays session

; same
  SCRAM auth as the primary alias.
Code (this PR)

api_shared/db.py:

_get_ro_pool() — lazy-init second ConnectionPool at

SCIDEX_PG_RO_DSN

 (env). When unset, falls through to the
  primary pool so local dev + tests don't need a replica.

get_db_ro() — public helper that returns a PGConnection

  from the RO pool. Callers close() it in a try/finally.

pool_stats() reports ro_* keys alongside primary.


Environment

/etc/scidex-pg.env gains:

SCIDEX_PG_RO_DSN=dbname=scidex_ro user=scidex_app password=... host=127.0.0.1 port=6432 SCIDEX_PG_RO_POOL_MIN=2 SCIDEX_PG_RO_POOL_MAX=40

`When to use` `get_db_ro()`

Good fits (safe, high-value):

/wiki/<slug>, /entity/<id>, /hypothesis/<id> detail pages

/api/graph/, /api/kg/ read endpoints


Dashboard aggregations / stats
Any GET endpoint that does not care about read-your-own-write


Do NOT use:
POST/PUT/DELETE handlers (the RO connection rejects writes)
Any endpoint that writes then immediately re-reads
Hot-write paths that need transactional guarantees

Follow-up

This PR adds the plumbing. A separate PR will migrate the top 10 read-heavy endpoints (identified viapg_stat_statementson the primary) toget_db_ro()` — should deflect ~60% of read traffic
based on query-mix observation.

File: read_replica_2026_04_21_spec.md

Modified: 2026-04-28 03:24

Size: 2.1 KB

[Atlas] Read replica infra + get_db_ro() hook

Goal

What landed

Infra (already live)

Code (this PR)

Environment

When to use get_db_ro()

Follow-up

`What landed`

`Infra (already live)`

`When to use` `get_db_ro()`