[Atlas] Cardio + infectious + metabolic + immuno gap importers (parallel to cancer) open

Four vertical-specific gap importers sharing a base class so non-cancer verticals reach gap-pool parity simultaneously.

Completion Notes

Rebase the branch onto current main (post-ae4a48e49 and post-92f2e60e1) so the watchlist + waste-detector features are preserved. After rebase, re-run the diff and confirm only the four gap-importer modules, _vertical_gap_base.py, deploy/scidex-*-gaps.{service,timer}, scripts/seed_nonland_gaps.py, tests/test_nonland_gap_importers.py, and the spec file are added, with no deletions of unrelated files. Re-run the test suite (including test_waste_detector.py and test_watchlist.py) to verify nothing in those features regressed.

Changed files:
- .orchestra-slot.json
- agent.py
- api.py
- api_routes/forge.py
- api_routes/senate.py
- api_routes/watchlist_routes.py
- api_shared/nav.py
- deploy/scidex-cardio-gaps.service
- deploy/scidex-cardio-gaps.timer
- deploy/scidex-immuno-gaps.service
- deploy/scidex-immuno-gaps.timer
- deploy/scidex-infectious-gaps.service
- deploy/scidex-infectious-gaps.timer
- deploy/scidex-metabolic-gaps.service
- deploy/scidex-metabolic-gaps.timer
- docs/planning/specs/q-crowd-replication-bounties_spec.md
- docs/planning/specs/q-notif-watchlist-engine_spec.md
- docs/planning/specs/q-ri-waste-detector_spec.md
- docs/planning/specs/q-tool-crispr-screen-mageck-pipeline_spec.md
- docs/planning/specs/q-tool-structural-biology-pipeline_spec.md
- docs/planning/specs/q-vert-infectious-cardio-metab-immuno-importers_spec.md
- migrations/123_add_replication_contest_tables.py
- migrations/20260428_watchlists.sql
- migrations/add_crispr_screen_run.py
- migrations/add_target_dossier_table.py
- scidex/agora/skill_evidence.py
- scidex/atlas/_vertical_gap_base.py
- scidex/atlas/cardio_gap_importer.py
- scidex/atlas/immuno_gap_importer.py
- scidex/atlas/infectious_gap_importer.py
- scidex/atlas/material_change.py
- scidex/atlas/metabolic_gap_importer.py
- scidex/atlas/watchlist_match.py
- scidex/exchange/replication_contests.py
- scidex/forge/crispr_libraries/brunello.tsv
- scidex/forge/crispr_libraries/geckov2.tsv
- scidex/forge/crispr_screen.py
- scidex/forge/structural_biology.py
- scidex/forge/tools.py
- scidex/senate/scheduled_tasks.py

Diff stat:
  .orchestra-slot.json | 2 +-
  agent.py | 21 +-
  api.py | 1215 ++------------------
  api_routes/forge.py | 260 -----
  api_routes/senate.py | 59 +-
  api_routes/watchlist_routes.py | 283 -----
  api_shared/nav.py | 112 --
  deploy/scidex-cardio-gaps.service | 16 +
  deploy/scidex-cardio-gaps.timer | 10 +
  deploy/scidex-immuno-gaps.service | 16 +
  deploy/scidex-immuno-gaps.timer | 10 +
  deploy/scidex-infectious-gaps.service | 16 +
  deploy/scidex-infectious-gaps.timer | 10 +
  deploy/scidex-metabolic-gaps.service | 16 +
  deploy/scidex-metabolic-g

Last Error

Review gate REJECT attempt 1/10: Branch is based off stale main; merging would delete the recently-shipped watchlist engine (PR #788) and waste detector (PR #786) — including api_routes/watchlist_routes.py, api_shared/nav.py, scidex/atlas/material_change.py, scidex/atlas/watchlist_match.py, scidex/senate/waste_detector.py, the 20260428_watchlists.sql migration, and the /watchlist page in api.py — causing breaking imports and route 404s.
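
The post-rebase check requested by the gate (only the expected files changed, no unrelated deletions) could be automated along the following lines. This is a minimal sketch, not part of the task's tooling: the function name `unexpected_changes` and the allowlist are assumptions derived from the gate instructions, and it assumes the spec file may appear as either added or modified. It takes the text output of `git diff --name-status` and reports every entry that falls outside the allowlist or is not an addition/modification.

```python
import fnmatch

# Paths the gate instructions expect this branch to touch (hypothetical allowlist).
ALLOWED_PATTERNS = [
    "scidex/atlas/*_gap_importer.py",
    "scidex/atlas/_vertical_gap_base.py",
    "deploy/scidex-*-gaps.service",
    "deploy/scidex-*-gaps.timer",
    "scripts/seed_nonland_gaps.py",
    "tests/test_nonland_gap_importers.py",
    "docs/planning/specs/q-vert-infectious-cardio-metab-immuno-importers_spec.md",
]


def unexpected_changes(name_status: str) -> list:
    """Return (status, path) pairs that the gate would likely object to.

    `name_status` is the text printed by `git diff --name-status`,
    one "<status>\t<path>" entry per line.
    """
    problems = []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        status, path = line.split("\t", 1)
        allowed = any(fnmatch.fnmatch(path, pat) for pat in ALLOWED_PATTERNS)
        # Anything outside the allowlist, or any deletion/rename, is flagged.
        if not allowed or status not in ("A", "M"):
            problems.append((status, path))
    return problems
```

One way to use it would be to capture `git diff --name-status origin/main...HEAD` (the remote name is an assumption) and fail the pre-submit step whenever the returned list is non-empty.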

Git Commits (1)

[Atlas] Cardio + infectious + metabolic + immuno gap importers (parallel to cancer) [task:97a80abb-655b-4d83-a842-560f33d391e1] 2026-04-27

Spec File

Effort: thorough

Goal

Mirror q-vert-cancer-gap-importer for the four other verticals — one
importer module per vertical, each pulling 2-4 high-quality sources
specific to that field. Cardio mines GWAS cardio traits + UK Biobank
loci + AHA scientific statements. Infectious mines WHO outbreak reports +
GenBank pathogen submissions + ProMED feeds + the AMR literature.
Metabolic mines GWAS metabolic traits + DepMap metabolic dependencies
+ HMDB orphans. Immunology mines IEDB epitopes + ImmuneSpace flow
datasets + recent vaccine-development literature. Each emits gaps tagged
to the right MONDO ids and feeds the analogy engine and OPENQ ranker.

Why this matters

Without importers, the four non-cancer verticals stay mostly empty,
the analogy engine has nothing to match against, and the per-vertical
landing pages render empty-state placeholders forever. Implementing
all four in one task (sharing the gap_pipeline plumbing) is much
cheaper than four separate specs and ensures the verticals reach
parity simultaneously.

Acceptance Criteria

☐ Four new modules (≤300 LoC each):
- scidex/atlas/cardio_gap_importer.py — GWAS cardiac trait
REST + AHA Scientific Statement RSS + a curated PubMed query
for cardio mechanism uncertainty.
- scidex/atlas/infectious_gap_importer.py — WHO Disease
Outbreak News scrape + GenBank pathogen submissions
published in last 90 days (Entrez E-utilities) + ProMED-mail
digest parser + AMR literature query.
- scidex/atlas/metabolic_gap_importer.py — GWAS metabolic-trait
hits + DepMap metabolic-pathway dependency outliers
(KEGG metabolism module) + HMDB metabolites with
unresolved disease association.
- scidex/atlas/immuno_gap_importer.py — IEDB epitope database
recent-additions + ImmuneSpace HIPC trial digest +
recent-vaccine-failure literature.
☐ Each importer writes via gap_pipeline.create_gap with
vertical=<name>, mondo_id resolved, source_provenance JSON,
and the importer's deterministic dedup fingerprint.
☐ Single seed script scripts/seed_nonland_gaps.py runs all four
sequentially with --vertical filter; targets ≥300 gaps per
vertical after first run (≥1200 total).
☐ One systemd timer per vertical so failures in one don't block
the others (scidex-cardio-gaps.timer, etc.), each weekly on
a different day to spread load.
☐ /atlas/landscape adds four vertical-tile cards next to the
cancer card from q-vert-cancer-gap-importer, each showing
open-gap count + last-import timestamp.
☐ Tests: per-vertical mock test that asserts gap rows are
MONDO-tagged correctly and de-duplicated against existing gaps.
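
To illustrate what a "deterministic dedup fingerprint" might look like, here is a minimal sketch. The spec does not say which fields feed the fingerprint, so the function name `gap_fingerprint` and the choice of vertical, MONDO id, source, and normalized title are all assumptions made for illustration. The idea is only that the same logical gap hashes to the same value across runs, regardless of whitespace or letter case in the title.

```python
import hashlib
import json
import re


def gap_fingerprint(vertical: str, mondo_id: str, source: str, title: str) -> str:
    """Hash the normalized identifying fields of a gap into a stable hex digest."""
    # Collapse runs of whitespace and lowercase so cosmetic differences do not matter.
    normalized_title = re.sub(r"\s+", " ", title).strip().lower()
    # A JSON array keeps field boundaries unambiguous before hashing.
    payload = json.dumps(
        [vertical.lower(), mondo_id.upper(), source.lower(), normalized_title],
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A per-vertical mock test could then assert that importing the same record twice yields one fingerprint and therefore one gap row.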

Approach

  • The four importers share a base class in
    scidex/atlas/_vertical_gap_base.py that handles MONDO resolution,
    dedup, provenance, and write; each subclass only implements the
    provider-specific fetch + extraction.
  • WHO outbreak page is HTML: use httpx + selectolax; cache
    under data/who_outbreaks/<date>.html.
  • GenBank pathogen submissions via Entrez E-utils (existing pattern
    in scidex/forge/tools.py:pubmed_search).
  • Each provider has a 3 req/s rate limit; central rate-limiter
    handle from q-sand-rate-limit-aware-tools.
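
The base-class split described in the first bullet could be sketched as follows. This is not the contents of _vertical_gap_base.py: the class and method names (`VerticalGapImporter`, `GapCandidate`, `fetch`, `extract`, `run`) are hypothetical, the MONDO resolver and the gap writer are injected as plain callables because their real signatures are not given in the spec, and dedup is reduced to an in-memory set rather than a check against existing gaps.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Optional, Set, Tuple


@dataclass
class GapCandidate:
    """Provider-neutral record produced by a subclass (hypothetical shape)."""
    title: str
    disease_label: str
    source: str
    raw: Dict[str, Any]


class VerticalGapImporter(ABC):
    """Shared flow: fetch -> extract -> resolve MONDO -> dedup -> write."""

    vertical: str = ""  # set by each subclass, e.g. "cardio"

    def __init__(
        self,
        resolve_mondo: Callable[[str], Optional[str]],
        create_gap: Callable[..., Any],
    ) -> None:
        self.resolve_mondo = resolve_mondo  # stand-in for the MONDO resolver
        self.create_gap = create_gap        # stand-in for gap_pipeline.create_gap
        self._seen: Set[Tuple[str, str, str, str]] = set()  # in-memory dedup only

    @abstractmethod
    def fetch(self) -> Iterable[Dict[str, Any]]:
        """Yield raw provider records."""

    @abstractmethod
    def extract(self, record: Dict[str, Any]) -> Optional[GapCandidate]:
        """Turn one raw record into a candidate, or None to skip it."""

    def run(self) -> int:
        """Import once and return the number of gaps written."""
        written = 0
        for record in self.fetch():
            cand = self.extract(record)
            if cand is None:
                continue
            mondo_id = self.resolve_mondo(cand.disease_label)
            if mondo_id is None:
                continue  # unresolved disease labels are skipped in this sketch
            key = (self.vertical, mondo_id, cand.source, cand.title.strip().lower())
            if key in self._seen:
                continue  # duplicate within this run
            self._seen.add(key)
            self.create_gap(
                title=cand.title,
                vertical=self.vertical,
                mondo_id=mondo_id,
                source_provenance=json.dumps({"source": cand.source, "raw": cand.raw}),
            )
            written += 1
        return written
```

With this shape, a vertical-specific importer only overrides `vertical`, `fetch`, and `extract`, which is also what makes a per-vertical mock test cheap: the test passes a fake resolver and a list-appending writer, then inspects the recorded calls.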

Dependencies

  • q-vert-disease-ontology-catalog: MONDO resolver.
  • q-vert-cancer-gap-importer: pattern to mirror.
  • q-sand-rate-limit-aware-tools: provider rate-limiting.
  • gap_pipeline.py, gap_quality.py.
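
For readers unfamiliar with the rate-limiting dependency, the 3 req/s per-provider limit mentioned in the Approach can be pictured with a simple interval limiter. This is only an illustrative stand-in, not the shared limiter from q-sand-rate-limit-aware-tools, whose interface is not described here; the class name `IntervalRateLimiter` and its `acquire` method are assumptions. The clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class IntervalRateLimiter:
    """Space calls at least 1/rate seconds apart (illustrative stand-in)."""

    def __init__(self, rate_per_sec: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may proceed

    def acquire(self) -> float:
        """Block until a request is allowed; return the seconds waited."""
        now = self._clock()
        wait = max(0.0, self._next_allowed - now)
        if wait > 0:
            self._sleep(wait)
        # Schedule the following slot relative to when this call was allowed.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return wait
```

An importer would create one limiter per provider, for example `IntervalRateLimiter(3.0)`, and call `acquire()` before each outbound request.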

Work Log

Payload JSON
    {
      "_gate_retry_count": 1,
      "_gate_last_decision": "REJECT",
      "_gate_last_reason": "Branch is based off stale main; merging would delete the recently-shipped watchlist engine (PR #788) and waste detector (PR #786) \u2014 including api_routes/watchlist_routes.py, api_shared/nav.py, scidex/atlas/material_change.py, scidex/atlas/watchlist_match.py, scidex/senate/waste_detector.py, the 20260428_watchlists.sql migration, and the /watchlist page in api.py \u2014 causing breaking imports and route 404s.",
      "_gate_judge_used": "max_outlook1:claude-auto",
      "_gate_last_instructions": "Rebase the branch onto current main (post-ae4a48e49 and post-92f2e60e1) so the watchlist + waste-detector features are preserved.\nAfter rebase, re-run the diff and confirm only the four gap-importer modules, _vertical_gap_base.py, deploy/scidex-*-gaps.{service,timer}, scripts/seed_nonland_gaps.py, tests/test_nonland_gap_importers.py, and the spec file are added \u2014 with no deletions of unrelated files.\nRe-run the test suite (including test_waste_detector.py and test_watchlist.py) to verify nothing in those features regressed.",
      "_gate_branch": "orchestra/task/97a80abb-cardio-infectious-metabolic-immuno-gap-i",
      "_gate_changed_files": [
        ".orchestra-slot.json",
        "agent.py",
        "api.py",
        "api_routes/forge.py",
        "api_routes/senate.py",
        "api_routes/watchlist_routes.py",
        "api_shared/nav.py",
        "deploy/scidex-cardio-gaps.service",
        "deploy/scidex-cardio-gaps.timer",
        "deploy/scidex-immuno-gaps.service",
        "deploy/scidex-immuno-gaps.timer",
        "deploy/scidex-infectious-gaps.service",
        "deploy/scidex-infectious-gaps.timer",
        "deploy/scidex-metabolic-gaps.service",
        "deploy/scidex-metabolic-gaps.timer",
        "docs/planning/specs/q-crowd-replication-bounties_spec.md",
        "docs/planning/specs/q-notif-watchlist-engine_spec.md",
        "docs/planning/specs/q-ri-waste-detector_spec.md",
        "docs/planning/specs/q-tool-crispr-screen-mageck-pipeline_spec.md",
        "docs/planning/specs/q-tool-structural-biology-pipeline_spec.md",
        "docs/planning/specs/q-vert-infectious-cardio-metab-immuno-importers_spec.md",
        "migrations/123_add_replication_contest_tables.py",
        "migrations/20260428_watchlists.sql",
        "migrations/add_crispr_screen_run.py",
        "migrations/add_target_dossier_table.py",
        "scidex/agora/skill_evidence.py",
        "scidex/atlas/_vertical_gap_base.py",
        "scidex/atlas/cardio_gap_importer.py",
        "scidex/atlas/immuno_gap_importer.py",
        "scidex/atlas/infectious_gap_importer.py",
        "scidex/atlas/material_change.py",
        "scidex/atlas/metabolic_gap_importer.py",
        "scidex/atlas/watchlist_match.py",
        "scidex/exchange/replication_contests.py",
        "scidex/forge/crispr_libraries/brunello.tsv",
        "scidex/forge/crispr_libraries/geckov2.tsv",
        "scidex/forge/crispr_screen.py",
        "scidex/forge/structural_biology.py",
        "scidex/forge/tools.py",
        "scidex/senate/scheduled_tasks.py",
        "scidex/senate/waste_detector.py",
        "scripts/seed_nonland_gaps.py",
        "scripts/seed_replication_contests.py",
        "tests/test_crispr_screen.py",
        "tests/test_exchange_replication_contests.py",
        "tests/test_nonland_gap_importers.py",
        "tests/test_target_dossier.py",
        "tests/test_waste_detector.py",
        "tests/test_watchlist.py"
      ],
      "_gate_diff_stat": ".orchestra-slot.json                               |    2 +-\n agent.py                                           |   21 +-\n api.py                                             | 1215 ++------------------\n api_routes/forge.py                                |  260 -----\n api_routes/senate.py                               |   59 +-\n api_routes/watchlist_routes.py                     |  283 -----\n api_shared/nav.py                                  |  112 --\n deploy/scidex-cardio-gaps.service                  |   16 +\n deploy/scidex-cardio-gaps.timer                    |   10 +\n deploy/scidex-immuno-gaps.service                  |   16 +\n deploy/scidex-immuno-gaps.timer                    |   10 +\n deploy/scidex-infectious-gaps.service              |   16 +\n deploy/scidex-infectious-gaps.timer                |   10 +\n deploy/scidex-metabolic-gaps.service               |   16 +\n deploy/scidex-metabolic-gaps.timer                 |   10 +\n .../specs/q-crowd-replication-bounties_spec.md     |   28 +-\n .../specs/q-notif-watchlist-engine_spec.md         |   40 +-\n docs/planning/specs/q-ri-waste-detector_spec.md    |   25 -\n .../q-tool-crispr-screen-mageck-pipeline_spec.md   |   34 -\n .../q-tool-structural-biology-pipeline_spec.md     |   11 -\n ...nfectious-cardio-metab-immuno-importers_spec.md |   35 +\n migrations/123_add_replication_contest_tables.py   |  141 ---\n migrations/20260428_watchlists.sql                 |   37 -\n migrations/add_crispr_screen_run.py                |   56 -\n migrations/add_target_dossier_table.py             |   57 -\n scidex/agora/skill_evidence.py                     |   34 -\n scidex/atlas/_vertical_gap_base.py                 |  188 +++\n scidex/atlas/cardio_gap_importer.py                |  324 ++++++\n scidex/atlas/immuno_gap_importer.py                |  299 +++++\n scidex/atlas/infectious_gap_importer.py            |  316 +++++\n scidex/atlas/material_change.py                    |   91 --\n 
scidex/atlas/metabolic_gap_importer.py             |  359 ",
      "_gate_history": [
        {
          "ts": "2026-04-27 17:24:33",
          "decision": "REJECT",
          "reason": "Branch is based off stale main; merging would delete the recently-shipped watchlist engine (PR #788) and waste detector (PR #786) \u2014 including api_routes/watchlist_routes.py, api_shared/nav.py, scidex/atlas/material_change.py, scidex/atlas/watchlist_match.py, scidex/senate/waste_detector.py, the 20260428_watchlists.sql migration, and the /watchlist page in api.py \u2014 causing breaking imports and route 404s.",
          "instructions": "Rebase the branch onto current main (post-ae4a48e49 and post-92f2e60e1) so the watchlist + waste-detector features are preserved.\nAfter rebase, re-run the diff and confirm only the four gap-importer modules, _vertical_gap_base.py, deploy/scidex-*-gaps.{service,timer}, scripts/seed_nonland_gaps.py, tests/test_nonland_gap_importers.py, and the spec file are added \u2014 with no deletions of unrelated files.\nRe-run the test suite (including test_waste_detector.py and test_watchlist.py) to verify nothing in those features regressed.",
          "judge_used": "max_outlook1:claude-auto",
          "actor": "claude-auto:43",
          "retry_count": 1
        }
      ]
    }
