Goal
Close the remaining structured-evidence gap for active hypotheses so every live hypothesis has both evidence_for and evidence_against populated with PubMed-cited items.
Approach
Verify the current coverage in PostgreSQL to determine whether the task is still needed.
Add a PostgreSQL-safe backfill script that searches PubMed E-utilities for only the hypotheses still missing evidence.
Run the script, then verify coverage counts after the update.
Acceptance Criteria
- A spec exists for this task and records the work.
- The repo contains a runnable backfill script for missing hypothesis evidence.
- All non-archived, non-rejected hypotheses have both evidence_for and evidence_against after the backfill run.
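The coverage criterion can be checked with a small helper along these lines. This is a minimal sketch, not the project's actual verification code: the table name `hypotheses`, the `id` and `status` columns, and the status values used in the filter are assumptions, as is the idea that each evidence column holds either a JSON array or JSON text. The helper names `has_evidence`, `coverage`, and `fetch_rows` are hypothetical.

```python
import json

# Assumed schema: hypotheses(id, status, evidence_for, evidence_against).
# Adjust the table, column, and status names to match the real database.
COVERAGE_SQL = """
    SELECT id, evidence_for, evidence_against
    FROM hypotheses
    WHERE status NOT IN ('archived', 'rejected')
"""


def has_evidence(value):
    """Return True when an evidence column holds a non-empty list."""
    if value is None:
        return False
    if isinstance(value, str):
        # Column stored as JSON text rather than JSONB.
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return False
    return isinstance(value, list) and len(value) > 0


def coverage(rows):
    """Summarise rows of (id, evidence_for, evidence_against)."""
    rows = list(rows)
    return {
        "active": len(rows),
        "missing_for": [r[0] for r in rows if not has_evidence(r[1])],
        "missing_against": [r[0] for r in rows if not has_evidence(r[2])],
    }


def fetch_rows(dsn):
    """Read the active hypotheses with psycopg 3 (imported lazily)."""
    import psycopg

    with psycopg.connect(dsn) as conn:
        return conn.execute(COVERAGE_SQL).fetchall()
```

Running `coverage(fetch_rows(dsn))` before and after the backfill gives the two counts to compare; the criterion is met when both `missing_for` and `missing_against` come back empty.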
Implementation
Add a targeted backfill utility under backfill/ rather than mutating older SQLite-era scripts in place. The new script should:
- connect directly to PostgreSQL with psycopg
- preserve existing evidence arrays
- fetch the PubMed title, journal, and year for each newly added PMID
- fill whichever side (for or against) is still missing
- update citations_count
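The requirements above could be met with helpers roughly like the following. This is a sketch under stated assumptions, not the script itself: the evidence item shape (`pmid`, `title`, `journal`, `year`) is hypothetical, and so is the rule that citations_count equals the number of distinct PMIDs across both sides. The E-utilities endpoints and the `db`, `term`, `id`, `retmode`, and `retmax` parameters are the public NCBI ones; the ESummary JSON fields read here (`uids`, `title`, `fulljournalname`, `source`, `pubdate`) should be checked against a live response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=5):
    """URL for a PubMed search that returns matching PMIDs as JSON."""
    query = urlencode(
        {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    )
    return f"{EUTILS}/esearch.fcgi?{query}"


def esummary_url(pmids):
    """URL for fetching summary records for a list of PMIDs."""
    query = urlencode({"db": "pubmed", "id": ",".join(pmids), "retmode": "json"})
    return f"{EUTILS}/esummary.fcgi?{query}"


def fetch_json(url, timeout=30):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def parse_esummary(payload):
    """Turn an ESummary JSON payload into evidence items (assumed shape)."""
    result = payload.get("result", {})
    items = []
    for uid in result.get("uids", []):
        doc = result.get(uid, {})
        items.append({
            "pmid": uid,
            "title": doc.get("title", ""),
            "journal": doc.get("fulljournalname") or doc.get("source", ""),
            "year": doc.get("pubdate", "")[:4],  # pubdate starts with the year
        })
    return items


def merge_evidence(existing, new_items):
    """Keep existing entries untouched and append only unseen PMIDs."""
    merged = list(existing or [])
    seen = {str(i.get("pmid")) for i in merged if isinstance(i, dict)}
    for item in new_items:
        pmid = str(item["pmid"])
        if pmid not in seen:
            merged.append(item)
            seen.add(pmid)
    return merged


def citations_count(evidence_for, evidence_against):
    """Assumed definition: distinct PMIDs across both evidence arrays."""
    both = [*(evidence_for or []), *(evidence_against or [])]
    return len({str(i["pmid"]) for i in both if isinstance(i, dict) and i.get("pmid")})
```

A backfill loop would call `fetch_json(esearch_url(...))`, read the PMIDs from `esearchresult.idlist`, pass them through `esummary_url` and `parse_esummary`, and merge the result into whichever side is empty before writing the row back. If the evidence columns are JSONB, psycopg 3 can send the merged lists as parameters wrapped in `psycopg.types.json.Jsonb`.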
Work Log
- 2026-04-24: Created spec after confirming the task had no spec_path. Verified the live gap is smaller than the original task text: active hypotheses are complete on evidence_for, with 5 active hypotheses still missing evidence_against.