[Senate] Orphan coverage check


Goal

Recurring Senate governance task that runs the orphan checker every 30 minutes to ensure system-wide data integrity. It scans for unlinked analyses and for hypotheses without HTML reports, and auto-fixes missing report_urls. The target is to keep SciDEX at ≥95% coverage across all content layers.

Acceptance Criteria

☐ orphan_checker.py runs and produces a fresh report
☐ All orphaned analyses detected and auto-fixed where possible
☐ All missing report_urls detected and auto-fixed
☐ Overall coverage ≥95% (critical) or 100% (ideal)
☐ Report written to logs/orphan-check-latest.json
☐ Non-critical issues (orphan_papers, hyp_sourced_edges) logged but not blocking
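
The coverage thresholds above can be sketched as a small classification step. This is an illustration only, not the checker's actual logic: the `overall_coverage` key and the status labels are assumptions, since the report schema is not described in this spec. Only the 95% and 100% thresholds come from the criteria.

```python
CRITICAL_THRESHOLD = 95.0  # minimum acceptable coverage, from the criteria above
IDEAL_THRESHOLD = 100.0    # ideal coverage, from the criteria above


def classify_coverage(report: dict) -> str:
    """Classify a report by its overall coverage percentage.

    Assumes a hypothetical top-level key ``overall_coverage`` holding a
    percentage between 0 and 100.
    """
    coverage = float(report["overall_coverage"])
    if coverage >= IDEAL_THRESHOLD:
        return "ideal"
    if coverage >= CRITICAL_THRESHOLD:
        return "acceptable"
    return "critical"
```

Non-critical items such as orphan_papers would be logged separately and would not affect this classification.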

Approach

  • Run python3 orphan_checker.py from the worktree (reads live PostgreSQL)
  • Review coverage metrics from the output/report JSON
  • If critical issues are found (orphaned analyses, missing report_urls), investigate and fix them
  • Log results in Work Log with key metrics
  • Complete task via orchestra CLI
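
The first two steps above could be wrapped in a helper like the following. It is a sketch under stated assumptions: the function name and its parameters are hypothetical, and it assumes the checker exits non-zero on failure and writes its report to logs/orphan-check-latest.json relative to the worktree. The command and report path themselves come from this spec.

```python
import json
import subprocess
from pathlib import Path


def run_checker_and_load_report(
    command=("python3", "orphan_checker.py"),
    report_path="logs/orphan-check-latest.json",
    cwd=".",
):
    """Run the checker in ``cwd``, then load the report it wrote.

    Raises subprocess.CalledProcessError if the command exits non-zero
    (an assumption about how the checker signals failure).
    """
    subprocess.run(list(command), cwd=cwd, check=True)
    with open(Path(cwd) / report_path, encoding="utf-8") as handle:
        return json.load(handle)
```

The returned dictionary can then be inspected for the coverage metrics that go into the Work Log.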

Dependencies

  • 6d500cfd-b2c2-4b43-a0b7-8d0fb67d5a60: orphan_checker.py implementation (done)

Dependents

  • Senate dashboard at /senate: reads orphan-check-latest.json
  • /api/coverage endpoint: serves this data
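
Because these consumers read a file that is refreshed on a 30-minute cadence, they may want to guard against a stale report. The sketch below shows one way to do that. The `timestamp` key name is an assumption (the Work Log only shows that the report carries an ISO 8601 UTC timestamp), and `is_report_fresh` is a hypothetical helper, not part of the dashboard or the API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AGE = timedelta(minutes=30)  # matches the 30-minute cadence stated in the Goal


def is_report_fresh(report: dict, now: Optional[datetime] = None) -> bool:
    """Return True if the report is no older than MAX_AGE.

    Assumes a hypothetical ``timestamp`` key holding an ISO 8601 UTC
    string such as "2026-04-12T23:03:33Z".
    """
    now = now or datetime.now(timezone.utc)
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python < 3.11.
    written = datetime.fromisoformat(report["timestamp"].replace("Z", "+00:00"))
    return now - written <= MAX_AGE
```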

Work Log

    2026-04-12 23:44 UTC — Task e1cf8f9a (Slot 43)

    • Ran python3 orphan_checker.py from worktree
    • Results: Overall coverage 100.0%
    - Analyses: 0/267 orphaned
    - Hypotheses: 0/373 orphaned
    - KG Edges: 0/700954 orphaned
    - Missing report_url: 0, auto-fixed: 0
    - Missing artifact_path: 0, auto-fixed: 0
    - Orphan papers: 520 (non-critical)
    - KG edges with hypothesis-sourced analysis_id: 192 (provenance mismatch, non-critical)
    - Notebooks with hypothesis ID in analysis column: 5 (non-critical)
    - Notebook stale paths cleared: 7 HTML, 0 ipynb (auto-fixed)
    • Status: HEALTHY — all critical coverage metrics at 100%
    • Report written to logs/orphan-check-latest.json

    2026-04-12 23:03 UTC — Task e1cf8f9a (Slot 44)

    • Ran python3 orphan_checker.py from worktree
    • Results: Overall coverage 100.0%
    - Analyses: 0/266 orphaned
    - Hypotheses: 0/373 orphaned
    - KG Edges: 0/700954 orphaned
    - Missing report_url: 0, auto-fixed: 0
    - Missing artifact_path: 0, auto-fixed: 0
    - Orphan papers: 520 (non-critical — PMIDs cited in evidence but not in papers table)
    - KG edges with hypothesis-sourced analysis_id: 192 (provenance mismatch, non-critical)
    - Notebooks with hypothesis ID in analysis column: 5 (non-critical)
    • Status: HEALTHY — all coverage metrics at 100%, no auto-fixes needed
    • Report written to logs/orphan-check-latest.json (timestamp 2026-04-12T23:03:33Z)

    File: senate_orphan_coverage_check_spec.md
    Modified: 2026-04-25 22:00
    Size: 2.7 KB