SciDEX — Task: [Senate] N+1 query detector + auto-batch helpers f

Per-request query-template counter; senate dashboard of top offenders; batch_in/batch_lookup/prefetch helpers; pytest no_n_plus_one marker.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (1)

[Senate] N+1 query detector + auto-batch helpers (api.py, api_shared) [task:256e38e5-fc4f-4b92-91ec-0b1aa9ba180b] (#758)2026-04-27

Spec File

Effort: deep

Goal

api.py is full of code patterns like for hyp in hypotheses: row = db.execute("SELECT ... WHERE id=?", (hyp.id,)) —
the canonical N+1 anti-pattern. With 310 hypotheses headed to 10K
and 15K papers headed to 1M, the same code that returns in 80ms
today will return in 8s. Ship a runtime detector that flags routes
that issue >N queries against the same statement template within a
single request, plus a small helper library to convert those
patterns to WHERE id = ANY($1) batch calls.

Acceptance Criteria

☐ Detector api_shared/n_plus_one_detector.py:

hooks the request-scoped DB holder
(_request_db_holder in api_shared/db.py) and counts
executions per (query_template_hash, route_label) per
request. A query template is the SQL with all literals replaced
by $N. When the count exceeds N_PLUS_ONE_THRESHOLD=8 for
a single template, log a structured WARN with route, hash,
template snippet, and total time.

☐ Sampling. 100 % in dev, 1 % in prod (env-flag).

☐ Storage. New table n_plus_one_incidents (route,

query_template_hash, query_template, count, total_ms,
observed_at). Aggregated across requests so the senate page
shows stable hot offenders.

☐ Page GET /senate/n-plus-one lists top offenders by

count × frequency, with a "view sample request" expander
that uses the q-obs-trace-id-propagation lookup endpoint to
replay the full causal chain.

☐ Helper library api_shared/batch_query.py:

- batch_in(db, sql_template, ids, chunk=500) -> list[dict]
— runs WHERE id = ANY($1) chunked.
- batch_lookup(db, table, ids, columns) -> dict[id, row] —
common pattern for hydration.
- prefetch(rows, fk_field, table, columns) -> list[rows] —
N+1 → 1 join, returns rows with the FK target merged.

☐ Adopt on top 3 offenders found by the detector during

its first 24 hours. Document the before/after query counts in
the Work Log. Each rewrite is its own PR.

☐ Tests tests/test_n_plus_one_detector.py: synthetic

handler issues 50 lookups → detector warns once with
count=50. Helper round-trips batch_in against a 10K-row
fixture in <50 ms.

☐ CI ratchet. A new pytest marker

@pytest.mark.no_n_plus_one fails any test that triggers a
detector warning during execution. Add to 5 existing route
integration tests as a starter set.

Approach

Detector first as opt-in; verify zero perf overhead under

wrk load.

Helpers — keep stateless and explicit (no auto-rewriting of

existing code).

Pick the top 3 offenders the detector finds and submit

batch-rewrite PRs (one per route).

The CI ratchet ratchets up coverage as more tests adopt the

marker.

Dependencies

q-obs-trace-id-propagation — for replay-sample button.
q-perf-hot-path-optimizer — supplies route labels.

Dependents

q-perf-deferred-work-queue — many N+1 sites are heavy

hydration that should defer instead of batch.

Work Log

2026-04-27 — Implementation (task:256e38e5-fc4f-4b92-91ec-0b1aa9ba180b)

Files created / modified:

api_shared/n_plus_one_detector.py — ContextVar-based per-request query counter.

Normalizes SQL (strips all literals → $N), tracks count + total_ms per template hash per request.
At flush time (close_request_db_scope), emits one WARN per offending template with
route, hash, count, total_ms, and template snippet. Upserts to n_plus_one_incidents.
Sampling: 100% in dev (ENVIRONMENT != 'prod'), configurable via N_PLUS_ONE_SAMPLE_RATE (default 1% in prod).

api_shared/batch_query.py — Three stateless helpers:

- batch_in(db, sql_template, ids, chunk=500) — replaces {ids} token with ANY(%s), chunked.
- batch_lookup(db, table, ids, columns) — returns dict[id → row_dict] for FK hydration.
- prefetch(rows, fk_field, table, columns) — one-shot N+1 → batch pattern; merges FK data under {fk_field}_data.

api_shared/db.py — Three hook points added: open_request_db_scope() opens detector scope;

PGConnection.execute() calls record_query(); close_request_db_scope() calls flush_request().
All via lazy import + bare except so detector is never on the critical failure path.

migrations/20260427_n_plus_one_incidents.sql — Creates n_plus_one_incidents table with

UNIQUE(route, query_template_hash) for upsert aggregation. Applied to local DB.

tests/test_n_plus_one_detector.py — 19 unit tests (all passing). Includes: normalize_sql stability,

detector warns once with count=50, below-threshold no-warn, independent template tracking,
sampled-out no-op, batch_in/batch_lookup/prefetch mock tests, and one @pytest.mark.integration
test for batch_in against live 10K-row table.

tests/conftest.py — Created _no_n_plus_one_guard autouse fixture; installs log handler during

@pytest.mark.no_n_plus_one tests, fails on any detector WARNING.

pyproject.toml — Registered no_n_plus_one marker.
api.py — Added GET /senate/n-plus-one route after /senate/hot-paths. Shows top 50 offenders

by count DESC, summary stat cards (distinct offenders, total redundant queries, total wasted ms),
and a "How to Fix" code example using batch_lookup.

Acceptance criteria status:

☑ Detector + N_PLUS_ONE_THRESHOLD=8 + WARN logging

☑ 100% dev / 1% prod sampling

☑ n_plus_one_incidents table (migration applied)

☑ /senate/n-plus-one page (no trace-replay button — q-obs-trace-id-propagation still open)

☑ batch_query.py with batch_in, batch_lookup, prefetch

☐ Top-3 offender rewrites — deferred; detector needs 24h of prod traffic first

☑ Tests (19 unit + 1 integration marker)

☑ no_n_plus_one pytest marker + conftest enforcement