> ## Continuous-process anchor
>
> This spec describes an instance of one of the retired-script themes
> documented in docs/design/retired_scripts_patterns.md. Before
> implementing, read:
>
> 1. The "Design principles for continuous processes" section of that
> atlas — every principle is load-bearing. In particular:
> - LLMs for semantic judgment; rules for syntactic validation.
> - Gap-predicate driven, not calendar-driven.
> - Idempotent + version-stamped + observable.
> - No hardcoded entity lists, keyword lists, or canonical-name tables.
> - Three surfaces: FastAPI + orchestra + MCP.
> - Progressive improvement via outcome-feedback loop.
> 2. The theme entry in the atlas matching this task's capability:
> A6, S4 (pick the closest from Atlas A1–A7, Agora AG1–AG5,
> Exchange EX1–EX4, Forge F1–F2, Senate S1–S8, Cross-cutting X1–X2).
> 3. If the theme is not yet rebuilt as a continuous process, follow
> docs/planning/specs/rebuild_theme_template_spec.md to scaffold it
> BEFORE doing the per-instance work.
>
> **Specific scripts named below in this spec are retired and must not
> be rebuilt as one-offs.** Implement (or extend) the corresponding
> continuous process instead.
Audit notebook links exposed from analysis and artifact-facing pages so SciDEX does not send users to empty, missing, or clearly stub notebook artifacts. The task should identify the common failure modes, repair the broken link generation or DB linkage paths that are still active, and document any residual cases that need a follow-on batch task.
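The audit described above can be sketched as a per-row classifier that maps each notebook record to a failure mode before deciding a repair action. Everything in this sketch (schema fields, stub markers, the size threshold) is illustrative, not SciDEX's actual code:

```python
import tempfile
from pathlib import Path

STUB_PHRASES = ("Notebook Stub", "No Rendered Output", "Notebook Not Rendered Yet")
STUB_SIZE_BYTES = 4096  # heuristic: very small HTML is almost certainly a stub

def classify_notebook_link(row: dict, site_dir: Path) -> str:
    """Classify one notebook row into a link failure mode (or 'ok')."""
    html = row.get("rendered_html_path")
    if not html:
        # Nothing registered in the DB; is there an ID-matched file on disk?
        return "disk_fallback" if (site_dir / f"{row['id']}.html").exists() else "missing_html"
    path = Path(html)
    if not path.exists():
        return "dangling_path"  # DB points at a file that is gone
    text = path.read_text(encoding="utf-8", errors="replace")
    if path.stat().st_size < STUB_SIZE_BYTES or any(p in text for p in STUB_PHRASES):
        return "stub"
    return "ok"

with tempfile.TemporaryDirectory() as d:
    site = Path(d)
    (site / "nb-1.html").write_text("<h1>Real analysis</h1>" + "x" * 5000)
    (site / "nb-2.html").write_text("Notebook Stub")
    rows = [
        {"id": "nb-1", "rendered_html_path": str(site / "nb-1.html")},
        {"id": "nb-2", "rendered_html_path": str(site / "nb-2.html")},
        {"id": "nb-3", "rendered_html_path": None},
    ]
    print([classify_notebook_link(r, site) for r in rows])  # ['ok', 'stub', 'missing_html']
```

Each mode implies a different repair: "stub" and "dangling_path" need a DB or link-generation fix, while "disk_fallback" means the page is servable but the DB row should be backfilled.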
- api.py, artifact helpers, and notebook coverage scripts.
- quest-artifacts — Provides the ongoing artifact quality and notebook coverage mission context.
- fc23fb55-8d8 — Registers generated notebook stubs into artifact tables and informs current notebook-link expectations.
- quest-artifacts — Uses the repaired notebook-link path for future artifact quality sweeps.
- … /analyses/ surface and the matching PostgreSQL query for the default top-20 completed analyses ordered by top hypothesis score.
- … .ipynb files had code cells with zero stored outputs, and two active notebook rows pointed file_path/rendered_html_path at tiny Notebook Stub files while richer ID-matched notebooks existed on disk.
- … notebook_detail path resolution so ID-matched site/notebooks/{notebook_id}.ipynb is preferred when the DB file_path is a tiny stub, and so rendered HTML lookup skips Notebook Stub / No Rendered Output / Notebook Not Rendered Yet candidates when a substantive fallback exists.
- python3 -m py_compile api.py passed; local api.notebook_detail(...) checks for the four repaired notebook IDs returned substantial pages (67KB, 73KB, 954KB, 1.06MB), no stub phrases, and no unexecuted banner.
- … AGENTS.md, CLAUDE.md, the task spec, artifact governance notes, and the retired-script continuous-process guidance.
- /analyses/ and /notebooks requests currently return HTTP 500 due to PostgreSQL pool timeout, so the top-20 audit used the same default /analyses/ query predicates directly against PostgreSQL: completed analyses with hypotheses/debates, ordered by top hypothesis score.
- … .ipynb code cells have zero stored outputs. Several duplicate lipid-rafts notebook rows point at the same underlying unexecuted file.
- … .ipynb notebooks from analysis, hypothesis, debate, and evidence rows for missing/stub records instead of inserting placeholders.

Found: api.py:13605 was querying notebooks WHERE analysis_id = ? but the notebooks table schema uses associated_analysis_id (not analysis_id).
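A column mismatch like the one found above can be caught mechanically before a query ships. A minimal sqlite3 sketch (the helper name is hypothetical) that inspects the table schema via PRAGMA table_info:

```python
import sqlite3

def table_columns(db: sqlite3.Connection, table: str) -> set:
    # PRAGMA table_info yields one row per column; the column name is field 1
    return {row[1] for row in db.execute(f"PRAGMA table_info({table})")}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE notebooks (id TEXT, associated_analysis_id TEXT, status TEXT)")

cols = table_columns(db, "notebooks")
assert "associated_analysis_id" in cols  # the real FK column
assert "analysis_id" not in cols         # the column the buggy query assumed
print(sorted(cols))  # ['associated_analysis_id', 'id', 'status']
```

A startup assertion like this turns a silent empty-result bug into a loud failure the first time the process boots against a mismatched schema.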
Fix: Changed line 13605 from:

```python
nb_count = db.execute("SELECT COUNT(*) as c FROM notebooks WHERE analysis_id = ?", (ana_id,)).fetchone()["c"]
```

to:

```python
nb_count = db.execute("SELECT COUNT(*) as c FROM notebooks WHERE associated_analysis_id = ?", (ana_id,)).fetchone()["c"]
```

Verification:
- … associated_analysis_id via .schema notebooks
- … notebooks WHERE analysis_id in api.py
- curl http://localhost:8000/api/status returns 200 with valid JSON
- curl http://localhost:8000/notebooks returns 200
- curl http://localhost:8000/notebook/{id} returns 200
- curl http://localhost:8000/analyses/{id} returns 200

Audit findings (top 20 analyses by date):
- … status=failed got notebook records created (on analysis start) but the analyses failed before generating any hypotheses/debates, so no content was ever produced
- … scripts/fix_active_stub_notebooks.py which:
  - … status='active' notebooks with empty rendered_html_path
  - … rendered_html_path, ipynb_path, file_path

Forge cache usage:
- nb-SDA-2026-04-11-...-112706-7f5a9480 (SEA-AD cell types) → seaad cache: 11 genes, 5 pubmed papers, STRING network
- nb-SDA-2026-04-12-...-112747-72269a36 (microglial states) → microglial_priming_ad cache: 6 genes, 1 paper, STRING network
- curl http://localhost:8000/notebook/{id} for 3 of 6 sampled pages
- … 51beaac2c)
- fix_active_stub_notebooks.py should be re-run if more analyses fail in the future

Audit findings (full notebook table):
Root cause: rendered_html_path was NULL for 138 active notebooks even though matching .html files existed under site/notebooks/. Two sub-causes:
- … file_path (.ipynb) set correctly but the companion .html path was never written back to rendered_html_path.
- Neither file_path nor rendered_html_path — corresponding HTML files existed under common name variants (top5-<id>.html, <id>.html, nb-<id>.html, etc.).

Fix:
- scripts/fix_rendered_html_paths.py — two-pass scan that updates rendered_html_path (and file_path where missing) for all active stubs whose HTML exists on disk.

Verification:
- curl http://localhost:8000/notebook/nb-sda-2026-04-01-gap-004 → 200, 97K content, no "No Rendered Output" message.

Root cause discovered:
78 draft notebook records (with IDs like nb-SDA-2026-04-01-gap-*, uppercase) were
registered as duplicates of existing active notebooks (nb-sda-*, lowercase) for the
same associated_analysis_id. Because none of the 4 notebook lookup queries in api.py
had a status='active' filter, these draft duplicates (with more recent created_at)
were returned instead of the real active notebooks, producing broken links.
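The shadowing failure described above can be reproduced with a minimal sketch (schema, data, and the helper are simplified and hypothetical): a later-registered draft row for the same analysis wins the lookup unless the query filters on status:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE notebooks (
    id TEXT PRIMARY KEY,
    associated_analysis_id TEXT,
    status TEXT,
    created_at TEXT
);
-- real notebook, plus a later-registered draft duplicate for the same analysis
INSERT INTO notebooks VALUES ('nb-sda-001', 'A-1', 'active', '2026-04-01');
INSERT INTO notebooks VALUES ('nb-SDA-001-gap', 'A-1', 'draft', '2026-04-05');
""")

def latest_notebook(analysis_id, active_only):
    status_filter = "AND status='active'" if active_only else ""
    row = db.execute(
        f"SELECT id FROM notebooks WHERE associated_analysis_id=? {status_filter} "
        "ORDER BY created_at DESC LIMIT 1", (analysis_id,)).fetchone()
    return row[0]

print(latest_notebook("A-1", active_only=False))  # nb-SDA-001-gap (draft shadows the real row)
print(latest_notebook("A-1", active_only=True))   # nb-sda-001 (filter restores the active row)
```

This is why the fix below adds AND status='active' to every lookup rather than deleting the drafts first: the filter makes the queries correct regardless of what duplicate rows exist.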
Queries fixed in api.py (added AND status='active'): …
- scripts/archive_duplicate_draft_notebooks.py archived 78 duplicate draft notebooks (b5e0a16c0)

Audit findings:
Root cause of regression: The prior fix (fix_rendered_html_paths.py) ran from within a bwrap sandbox that cannot see /home/ubuntu/scidex/site/notebooks/ (the sandbox restricts filesystem visibility). os.path.exists() returned False for all files in that directory even though they exist on the real filesystem and the live API can serve them. Additionally, 90 previously-archived notebooks appear to have been restored to active status, and 11 new notebooks added, both without rendered_html_path.

Fix: Ran an updated fix script using the worktree's site/notebooks/ (405 HTML files visible there, same git content) as the existence-check proxy, writing canonical site/notebooks/<fname>.html paths back to the DB:
- curl http://localhost:8000/notebook/nb-SDA-2026-04-03-26abc5e5f9f2 → <h1> (real content, was "No Rendered Output")
- curl http://localhost:8000/notebook/nb-sda-2026-04-01-gap-013 → <h1> (real content)
- curl http://localhost:8000/notebook/SEA-AD-gene-expression-analysis → <h1> (real content)
- … href="/notebook/..." links
- fix_rendered_html_paths.py script needs updating — it must use worktree site/notebooks/ as existence proxy (or skip existence checks) since the bwrap sandbox blocks OS-level file checks on the live site/ directory.
- … rendered_html_path, they become stubs again. This recurring task should catch them each 6h cycle.

Audit findings:
- … rendered_html_path=NULL
- … site/notebooks/ directory scan)
- notebook_detail at api.py:50716 only checked nb['rendered_html_path'] — it did NOT fall back to disk when that field was NULL
- file_path/ipynb_path columns were populated for 136 of 468 active notebooks, but even those with file_path didn't serve the disk HTML (the render path never looked at disk)
- … fix_rendered_html_paths.py) tried to update the DB, but failed because the bwrap sandbox couldn't see /home/ubuntu/scidex/site/notebooks/ — the DB columns were never actually updated

Fix — notebook_detail (api.py ~50725):

```python
elif ipynb_local_path and ipynb_local_path.exists():
    # Fallback: look for disk-based HTML alongside the .ipynb or in site/notebooks/
    html_candidates = [
        ipynb_local_path.with_suffix('.html'),
        BASE / 'site' / 'notebooks' / f'{nb["id"]}.html',
    ]
    for html_path in html_candidates:
        if html_path.exists():
            rendered_content = _darkify_notebook(html_path.read_text(encoding='utf-8'))
            break
```

Why this fix is correct (not just a workaround):
- site/notebooks/ directory is the canonical static asset location served by nginx
- When rendered_html_path is NULL but the HTML file exists at site/notebooks/<id>.html, that file is the rendered output
- … site/notebooks/ first, then tries to update DB) — the DB update can fail silently, but the file is always there
- … rendered_html_path — these work via disk fallback but should be cleaned up in DB

Audit findings:
- … rendered_html_path=NULL → all showing "No Rendered Output" or "Notebook Stub"
- … file_path column: site/notebooks/nb-SDA-2026-04-16-gap-*.ipynb → file does NOT exist on disk
- … SDA-2026-04-16-gap-.ipynb (analysis ID) not nb-SDA- (notebook ID with nb- prefix)
- notebook_detail HTML disk-fallback was checking site/notebooks/{nb["id"]}.html, but the on-disk name is site/notebooks/{associated_analysis_id}.html
- _resolve_notebook_paths (api.py): when file_path points to a non-existent nb-*.ipynb, it never tried the {associated_analysis_id}.ipynb fallback path
- notebook_detail (api.py): the HTML disk-fallback only tried nb["id"]-based names, never associated_analysis_id-based names

Fix — two changes in api.py:
- _resolve_notebook_paths: added associated_analysis_id-based fallback after the id_path size check. When file_path doesn't exist, tries {associated_analysis_id}.ipynb then {associated_analysis_id}.html.
- notebook_detail HTML candidates: added elif assoc_id := nb.get('associated_analysis_id') branch to add site/notebooks/{assoc_id}.html to candidates when no ipynb file exists. Also deduplicated candidates via seen_html_candidates set.

Why this is correct:
- … SDA-2026-04-16-gap-*), never with the nb- prefix
- file_path column reflects how the notebook was originally registered, not where it actually lives
- curl http://localhost:8000/notebook/nb-SDA-2026-04-16-gap-pubmed-20260410-095709-4e97c09e → HTTP 200, content served from disk-based HTML fallback (no longer "No Rendered Output")
- import api; print('OK') → module loads without error
- … SDA-2026-04-16-gap-pubmed-20260410-095709-4e97c09e.html exists (1,874 bytes, stub content)
- … rendered_html_path — work via disk fallback but DB should be updated in a follow-up batch task

Issue: Prior commit 526a8c380 bundled legitimate notebook path fixes with catastrophic API-breaking changes (removed api_routes.agora router, removed SecurityHeadersMiddleware, removed close_thread_local_dbs() from _market_consumer_loop). Merge gate rejected with "catastrophic API contract break."
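The stub-skipping idea that recurs in these fixes — prefer substantive rendered HTML, and fall back to a stub only when nothing better exists — can be sketched as follows (the helper name and marker strings are illustrative, not the api.py implementation):

```python
import tempfile
from pathlib import Path

STUB_MARKERS = ("CI-generated notebook stub", "Notebook Stub", "No Rendered Output")

def pick_rendered_html(candidates):
    """Return the first substantive HTML candidate; fall back to a stub
    only when no better candidate exists."""
    first_stub = None
    for path in candidates:
        if not path.exists():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if any(m in text for m in STUB_MARKERS):
            first_stub = first_stub or path  # remember the stub as a last resort
            continue
        return path          # substantive rendered output wins
    return first_stub        # better a stub than a 404

with tempfile.TemporaryDirectory() as d:
    site = Path(d)
    stub = site / "nb-1.html"; stub.write_text("CI-generated notebook stub")
    real = site / "A-1.html"; real.write_text("<h1>Results</h1>")
    print(pick_rendered_html([stub, real]).name)  # A-1.html
```

Ordering candidates id-based first and associated_analysis_id-based second, then letting content quality break ties, is what lets the DB's stale file_path stay wrong without breaking the served page.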
Fix applied: Isolated the 88-line notebook path fix in api.py and reverted the catastrophic removals so those regions match origin/main (left unmodified). The resulting diff is ONLY the two-function notebook fix:
- _resolve_notebook_paths (api.py): when source_path is empty, tries {nb['id']}.ipynb. When local_path doesn't exist, tries associated_analysis_id-based paths.
- notebook_detail (api.py): extended html_candidates to include associated_analysis_id-based HTML when no ipynb exists. Added stub detection to skip "CI-generated notebook stub" HTML in favor of better alternatives.

Verification:
- git diff origin/main -- api.py: 0 occurrences of agora/SecurityHeaders/close_thread_local_db removals.
- curl http://localhost:8000/notebook/52c194a9... (empty-path notebook) → 33031 bytes, 2 imports, no "No Rendered Output" banner.

{
"requirements": {
"analysis": 6,
"reasoning": 6,
"safety": 9
},
"completion_shas": [
"1f09a4461075fcc7ee1d482a24e6ce6941755317",
"4bde3cc30a850a997224ceb6a44e0e1aa54276a2",
"281e68478265280a6150cf58e66cc737e12d8576",
"51beaac2c7099ce87d015c16d5608b2f8d54e5b0",
"301dc7d80d5bc97bb13a11f6ed337d065866f8c8",
"75e45b23e817673ebfc3558ef2aa4ba3a8d4cc6f",
"9c92d5fe657b5010e976496ebdff2ed270ab3f3b",
"c67f106a3641d534cc9bdcdaa3b032ce071d11a2"
],
"completion_shas_checked_at": "2026-04-13T05:56:20.452449+00:00",
"completion_shas_missing": [
"5e74199190afcd99c6ecd47e38b9b3a29c6a11ee",
"f3aa31837f1f4d6533f9faed51f365b8e15e8300",
"094853ec54fcaae83aaadc6924c131042c018462",
"00569539cbeab3789863e5a19f51e21ae4642367",
"a9af5a683bc2c9705bf3864fb04db1c75308a809",
"c1e874dbf8c27624a7f25d99e33c235e1baa27b8",
"8172eb0256a5ebfb1f457d31a3ac9e0e30952d81",
"34954b59f4811985fba4e64bc6783b90c1c0e045",
"a6d4e0d4459cb483a72397f4b465b20ed8e6627e",
"bdbbb26e46335cce9d25c1d153f7d2b92a289a76",
"2e0fdba86844e0149c37ecf9871d24362950f4ce"
]
}