[Atlas] Per-hypothesis explainability tear-sheet - one-page visual done

One-page printable hypothesis dashboard: radar, score sparkline, top supporting/contradicting cites, decision tree, prereg outcomes, 30d delta.

Completion Notes

Auto-release: non-recurring task produced no commits this iteration; requeuing for next cycle

Git Commits (1)

[Atlas] Per-hypothesis explainability tear-sheet [task:7ead4f85-bf99-4a7c-9498-15fa742ebddd] (2026-04-27)
Spec File

Goal

A hypothesis page today is a wall of text: title, claim, predictions,
PMIDs, scores. There's no single page that shows why a hypothesis
has its current Elo and composite score in visual form. Build a
one-page "tear-sheet" (printable, shareable) that crunches the
hypothesis's complete evidence + debate + score history into a
dashboard: small multiples for evidence quality, a score-evolution
sparkline, top-3 supporting and top-3 contradicting citations, the
Synthesizer's decision-tree summary, the per-dimension score radar,
and a "what changed in last 30d" diff. This is the artifact a
researcher prints to make a go/no-go decision.
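Since the charts must be server-rendered with no client JS, the sparkline can be emitted as inline SVG. A minimal sketch, with hypothetical sizing defaults (the real template and helper names are not specified here):

```python
def sparkline_svg(values, width=120, height=24):
    """Render a series of composite scores as a minimal inline-SVG sparkline.

    Hypothetical helper: illustrates the no-client-JS, print-friendly
    approach only; dimensions and styling are illustrative assumptions.
    """
    if len(values) < 2:
        return "<svg/>"  # fallback for hypotheses with a single data point
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat series
    step = width / (len(values) - 1)
    # Map each score to an (x, y) point; SVG y grows downward, so invert.
    points = " ".join(
        f"{i * step:.1f},{height - (v - lo) / span * height:.1f}"
        for i, v in enumerate(values)
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline points="{points}" fill="none" stroke="currentColor"/></svg>'
    )
```

The same pattern (pure-string SVG, no charting library) extends to the 10-dim radar, keeping the page fully self-contained for sharing and printing.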

Effort: thorough

Acceptance Criteria

GET /hypothesis/{id}/tearsheet renders a self-contained HTML page (no SPA shell) with these panels, all server-rendered:
- Header: title, slug, current composite_score, current elo_rating (from elo_ratings arena='global'), strength badge (from q-qual-claim-strength-normalizer), "last reviewed" timestamp.
- 10-dim radar chart (uses existing per-dim scores; SVG, no JS dependency).
- Score-evolution sparkline: composite_score over time from hypothesis_score_history (create the table if it doesn't exist; backfill from audit_log rows).
- Top-3 supporting citations (rank by citation_validity.support_verdict='supports' × evidence_quality_tier); each row shows PMID, journal, year, support quote, fact-check verdict from q-qual-auto-fact-check-pipeline.
- Top-3 contradicting citations (same shape, verdict='contradicts').
- Decision-tree summary panel from q-viz-decision-tree-synthesis (top 5 branches by weight).
- Per-prediction outcome table (from preregistration_outcomes): predicted_probability, observed_outcome, calibration delta.
- "Last 30 days" delta block: net change in composite, Elo, evidence count, market price (if a market exists).
- Footer: "generated at <ts>", static URL for sharing, print-stylesheet.
☐ Optional ?format=pdf query param uses weasyprint (already in requirements.txt per check) to emit a PDF.
☐ Migration migrations/20260428_hypothesis_score_history.sql if needed: hypothesis_score_history(hypothesis_id, day, composite_score, elo_rating, dim_scores JSONB); backfill driver economics_drivers/ci_hypothesis_score_history_backfill.py.
☐ "Tearsheet" link added to top of every existing /hypothesis/{id} page.
☐ Cache: full HTML rendered server-side and cached for 1 h per hypothesis (key = (id, composite_score, last_evidence_change)); skip cache if ?force_refresh=1.
☐ Tests tests/test_tearsheet.py: well-formed hypothesis (mid-tier composite, both supporting + contradicting cites, 1 prediction) renders all 8 panels; minimal hypothesis (no decision tree, no preregs) renders fallback messages instead of erroring; PDF export emits ≥ 1 page.
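The citation-ranking criterion above (verdict filter weighted by evidence quality tier) could be sketched as follows. Field names follow the spec; the tier names and weights are illustrative assumptions, not the real `evidence_quality_tier` values:

```python
# Illustrative tier weights; the real tiers/weights live in the DB schema.
TIER_WEIGHT = {"meta_analysis": 3.0, "rct": 2.0, "observational": 1.0, "preprint": 0.5}

def top_citations(citations, verdict, n=3):
    """Return the n highest-weighted citations matching a support verdict.

    Each citation is a dict with at least 'support_verdict' and
    'evidence_quality_tier', mirroring citation_validity rows.
    """
    matching = [c for c in citations if c["support_verdict"] == verdict]
    matching.sort(
        key=lambda c: TIER_WEIGHT.get(c["evidence_quality_tier"], 0.0),
        reverse=True,
    )
    return matching[:n]
```

Calling `top_citations(rows, "supports")` and `top_citations(rows, "contradicts")` yields the two panels from one query result.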

Approach

  • Decide on chart strategy upfront: SVG-only for radar + sparkline (no client JS) so tearsheets are share-friendly + printable.
  • Build a single _assemble_tearsheet(db, hypothesis_id) -> dict that runs the 8 queries in parallel via asyncio.gather (≤ 600 ms target).
  • Render via Jinja templates/atlas/hypothesis_tearsheet.html + a print-stylesheet site/tearsheet.css.
  • The PDF path: pass the rendered HTML through weasyprint; ship a one-page output by default.
  • Smoke-render the top 10 hypotheses by Elo and visually inspect.
Dependencies

  • q-viz-decision-tree-synthesis — supplies decision-tree panel.
  • q-qual-auto-fact-check-pipeline — fact-check verdicts.
  • q-qual-claim-strength-normalizer — strength badge.
  • q-er-preregistration (shipped) — prediction outcomes.

Dependents

  • q-impact-hypothesis-uptake — embeds tearsheet as the canonical share URL.

Work Log

Payload JSON
{
  "completion_shas": [
    "139a984ae"
  ],
  "completion_shas_checked_at": ""
}

Sibling Tasks in Quest (Live Dashboard Artifact Framework)