[Atlas] CDN-friendly read-only views with stale-while-revalidate done

@public_view decorator + Surrogate-Key tagging + nginx proxy_cache config + purge worker; adopt on six high-traffic public GETs.

Completion Notes

Auto-release: work already on origin/main

Git Commits (3)

Squash merge: orchestra/task/92e0360f-cdn-friendly-read-only-views-with-stale (2 commits) (#771) 2026-04-27
[Atlas] Fix get_db(readonly=True) -> get_db_ro() in external-citations routes [task:92e0360f-a395-4154-8471-946aac0abd47] 2026-04-27
[Atlas] CDN-friendly read-only views: @public_view decorator + nginx cache; adopt on 6 routes in api.py [task:92e0360f-a395-4154-8471-946aac0abd47] 2026-04-27
Spec File

Effort: deep

Goal

api.py has six Cache-Control: public, max-age=60,
stale-while-revalidate=300 callsites (api.py:60449 etc.), but
they're scattered, copy-pasted, and don't share an invalidation
strategy. With nginx in front of FastAPI we can let nginx (or a
future Cloudflare layer) cache for ~5 minutes while serving stale
content for another 5, dropping origin load on hot reads by ~10x.
Define a "public read view" abstraction that any GET handler can
opt into, paired with cache-purge on the artifact mutation path.

Acceptance Criteria

Decorator api_shared/public_view.py:
@public_view(ttl=300, swr=300, surrogate_key="hyp:{id}")
sets Cache-Control: public, max-age=<ttl>,
stale-while-revalidate=<swr>, sets Surrogate-Key: hyp:{id}
(Fastly/Cloudflare style; nginx ignores), and emits a stable
ETag computed from the artifact version
(q-perf-etag-smart-invalidation). Refuses to apply if the
request has Cookie or Authorization set (auto-private).
Adopt on /api/hypotheses, /api/papers/{pmid},
/api/wiki/{slug}, /api/proposals/feed, /api/markets/summary,
/api/external-citations/recent (latter from
q-impact-citation-tracker).
Invalidation hook. db_writes.py (artifact-mutating
path) computes affected Surrogate-Keys on commit and
pushes them to a small Redis-or-in-memory purge_queue.
nginx config deploy/nginx/cache.conf adds
proxy_cache_path /var/cache/nginx/scidex levels=1:2
keys_zone=scidex:64m max_size=2g inactive=1h
use_temp_path=off; and a proxy_cache scidex block on
the public-view-tagged routes, with
proxy_cache_use_stale updating http_502 http_503;
Purger scripts/cache_purge_worker.py:
reads the purge_queue, issues PURGE against nginx (or
writes invalidation logs that a future Cloudflare CLI can
consume).
Observability. A new metric
scidex_cache_status{result="hit|miss|stale|expired"}
from nginx access logs (parsed by a small
logs/parse_cache_status.py helper) renders on the senate
observability dashboard.
Tests. Decorator unit tests cover auto-private on Authorization
and Cookie headers, ETag computation, and the header set.
End-to-end test boots nginx + uvicorn in compose, hits a route
twice, and asserts X-Cache-Status: HIT on the second request.
Load evidence. wrk -c200 -d60s against
/api/hypotheses shows ≥85 % nginx cache-hit rate and
origin RPS drops correspondingly (to roughly 15 % of client
RPS or less).
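
As a rough check on the load-evidence target, the relationship between cache-hit rate and origin load can be worked out directly. The sketch below assumes every cache hit (including stale responses) is served without touching the origin and ignores background revalidation traffic; the function names are illustrative and not part of the codebase.

```python
def origin_reduction_factor(hit_rate: float) -> float:
    """Factor by which origin requests drop when `hit_rate` of requests are cache hits."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    # Only the misses (1 - hit_rate) reach the origin.
    return 1.0 / (1.0 - hit_rate)


def hit_rate_for_reduction(factor: float) -> float:
    """Cache-hit rate needed to cut origin requests by `factor`."""
    if factor < 1.0:
        raise ValueError("factor must be >= 1")
    return 1.0 - 1.0 / factor


if __name__ == "__main__":
    print(f"85% hits -> {origin_reduction_factor(0.85):.1f}x fewer origin requests")
    print(f"10x reduction needs {hit_rate_for_reduction(10.0):.0%} hits")
```

Under these assumptions an 85 % hit rate corresponds to roughly a 6.7x reduction in origin requests, and the ~10x figure in the Goal would need about a 90 % hit rate.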

Approach

  • Decorator first, dry-run on dev (no nginx changes yet); verify
  headers via curl.
  • nginx config behind a feature flag NGINX_SCIDEX_CACHE=1.
  • Purger worker alongside; nginx PURGE module is already
  available via the openresty image we run.
  • Adopt routes one at a time, watch hit-rate climb on the
  dashboard.
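
The purge path in this plan hinges on a small queue sitting between the write path and the purger worker. A minimal in-memory sketch follows, assuming a bounded, thread-safe deque; the class name PurgeQueue, the de-duplication in drain(), and the default bound are assumptions for illustration rather than the actual api_shared module.

```python
import threading
from collections import deque
from typing import Deque, List


class PurgeQueue:
    """Bounded in-memory queue of Surrogate-Keys awaiting purge (illustrative)."""

    def __init__(self, maxlen: int = 10_000) -> None:
        # With maxlen set, appending to a full deque silently drops the oldest key.
        self._items: Deque[str] = deque(maxlen=maxlen)
        self._lock = threading.Lock()

    def push(self, key: str) -> None:
        """Record a Surrogate-Key whose cached responses should be invalidated."""
        with self._lock:
            self._items.append(key)

    def drain(self) -> List[str]:
        """Remove and return all pending keys, de-duplicated in first-seen order."""
        with self._lock:
            keys = list(self._items)
            self._items.clear()
        return list(dict.fromkeys(keys))

    def size(self) -> int:
        with self._lock:
            return len(self._items)


if __name__ == "__main__":
    q = PurgeQueue()
    q.push("wiki:example-slug")
    q.push("wiki:example-slug")
    print(q.drain())
```

Bounding the queue trades completeness for safety: if the worker stalls, the oldest keys are lost and those entries simply age out via the TTL instead of being purged.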

    Dependencies

    • q-perf-etag-smart-invalidation — supplies stable ETags.
    • q-perf-selective-mat-views — routes that read MVs are
    long-cacheable.

    Dependents

    • Future Cloudflare integration (out of scope here) drops in
    with the same Surrogate-Key design.

    Work Log

    2026-04-27 — Initial implementation [task:92e0360f-a395-4154-8471-946aac0abd47]

    Implemented the acceptance criteria, except the items listed under "Not done" below:

    • api_shared/purge_queue.py — zero-dep in-memory deque with push() /
    drain() / size(). Thread-safe, maxlen=10 000 to bound memory.

    • api_shared/public_view.py: @public_view(ttl, swr, surrogate_key)
    decorator. Injects __pv_request: Request into the wrapped function's
    __signature__ so FastAPI supplies the live Request object. Sets
    Cache-Control: public, max-age=<ttl>, stale-while-revalidate=<swr>,
    Surrogate-Key (with {param} interpolation from route kwargs), and a
    SHA-256 ETag. Auto-private guard suppresses all headers when Authorization
    or Cookie is present. Handles both dict-returning and Response-returning
    handlers.

    • scidex/core/db_writes.py — added _push_purge_key() helper (lazy import,
    never raises); wired into save_wiki_page, create_wiki_page, upsert_wiki_page
    to push wiki:{slug} after every successful write.

    • deploy/nginx/cache.conf: proxy_cache_path /var/cache/nginx/scidex
    levels=1:2 keys_zone=scidex:64m max_size=2g inactive=1h use_temp_path=off,
    proxy_cache_use_stale updating error timeout http_500 http_502 http_503 http_504,
    proxy_cache_background_update on, add_header X-Cache-Status $upstream_cache_status,
    PURGE location restricted to 127.0.0.1.

    • scripts/cache_purge_worker.py — daemon and library. Drains purge queue,
    maps Surrogate-Keys to nginx PURGE paths, writes invalidation JSONL log for
    future Cloudflare CLI integration. --dry-run flag for dev.

    • logs/parse_cache_status.py — parses JSON or combined-format nginx access
    logs, emits Prometheus text (scidex_cache_status{result="…"}) for node_exporter
    textfile collector → Senate observability dashboard.

    • api.py — imported public_view; applied decorator on:
    - /api/hypotheses (hypotheses:list, ttl=300 swr=300)
    - /api/wiki/{slug} (wiki:{slug}, ttl=300 swr=300)
    - New: GET /api/papers/{pmid} (paper:{pmid})
    - New: GET /api/proposals/feed (proposals:feed)
    - New: GET /api/markets/summary (markets:summary)
    - New: GET /api/external-citations/recent (ext-citations:recent)

    • tests/test_public_view.py — 18 unit tests: Cache-Control values,
    Surrogate-Key static and dynamic, ETag stability, auto-private on
    Authorization and Cookie, helper function contracts, Response passthrough.
    All 18 pass.
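
The header contract described for @public_view in the list above can be sketched as a pure function, separate from the FastAPI signature injection. This is an illustration under assumptions, not the code in api_shared/public_view.py: build_public_headers and its parameters are hypothetical names, and ETag handling is left out.

```python
from typing import Dict, Mapping


def build_public_headers(
    request_headers: Mapping[str, str],
    ttl: int,
    swr: int,
    surrogate_key: str,
    path_params: Mapping[str, object],
) -> Dict[str, str]:
    """Return the caching headers for a public view, or {} for private requests."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in request_headers}
    if "authorization" in present or "cookie" in present:
        # Auto-private: emit no caching headers for authenticated traffic.
        return {}
    return {
        "Cache-Control": f"public, max-age={ttl}, stale-while-revalidate={swr}",
        # "{slug}"-style placeholders are filled from the route's path parameters.
        "Surrogate-Key": surrogate_key.format(**path_params),
    }


if __name__ == "__main__":
    print(build_public_headers({}, 300, 300, "wiki:{slug}", {"slug": "example"}))
    print(build_public_headers({"Cookie": "session=abc"}, 300, 300, "wiki:{slug}", {"slug": "example"}))
```

Keeping the header logic free of framework objects makes the auto-private rule easy to unit test without booting an app.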

    Not done (out of scope / blocked on dependencies):

    • End-to-end nginx + uvicorn compose test (needs real openresty setup)
    • Load evidence with wrk (needs live nginx)
    • ETag from artifact version column — uses body SHA-256 instead, pending
    q-perf-etag-smart-invalidation task
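
The body-hash fallback noted in the last item can be sketched as follows. It is a hedged illustration: body_etag and etag_matches are hypothetical helper names, and canonical JSON serialization (sorted keys, compact separators) is an assumption made here so that equal dicts hash to the same value.

```python
import hashlib
import json
from typing import Optional, Union


def body_etag(payload: Union[bytes, dict, list]) -> str:
    """Strong ETag derived from the SHA-256 of the response body."""
    if isinstance(payload, bytes):
        raw = payload
    else:
        # Sorting keys keeps the hash stable regardless of dict insertion order.
        raw = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(raw).hexdigest() + '"'


def etag_matches(if_none_match: Optional[str], etag: str) -> bool:
    """Weak comparison of an If-None-Match header value against `etag`.

    Splitting on commas is safe here because hex-digest ETags contain none.
    """
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True

    def strip_weak(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    return strip_weak(etag) in {strip_weak(part) for part in if_none_match.split(",")}


if __name__ == "__main__":
    tag = body_etag({"slug": "example", "version": 3})
    print(tag, etag_matches(tag, tag))
```

A body hash still requires rendering the full response before a 304 can be decided, which is the cost a version-column ETag would avoid.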
