[Atlas] Per-hypothesis explainability tear-sheet - one-page visual done

One-page printable hypothesis dashboard: radar, score sparkline, top supporting/contradicting cites, decision tree, prereg outcomes, 30d delta.

Completion Notes

Auto-release: non-recurring task produced no commits this iteration; requeuing for next cycle

Git Commits (1)

[Atlas] Per-hypothesis explainability tear-sheet [task:7ead4f85-bf99-4a7c-9498-15fa742ebddd] (2026-04-27)
Spec File

Goal

A hypothesis page today is a wall of text: title, claim, predictions,
PMIDs, scores. There's no single page that shows why a hypothesis
has its current Elo and composite score in visual form. Build a
one-page "tear-sheet" (printable, shareable) that crunches the
hypothesis's complete evidence + debate + score history into a
dashboard: small multiples for evidence quality, a score-evolution
sparkline, top-3 supporting and top-3 contradicting citations, the
Synthesizer's decision-tree summary, the per-dimension score radar,
and a "what changed in last 30d" diff. This is the artifact a
researcher prints to make a go/no-go decision.
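Since the charts must be server-rendered with no client JS, the sparkline can be emitted as inline SVG. A minimal sketch, with hypothetical sizing defaults (the real template and helper names are not specified here):

```python
def sparkline_svg(values, width=120, height=24):
    """Render a series of composite scores as a minimal inline-SVG sparkline.

    Hypothetical helper: illustrates the no-client-JS, print-friendly
    approach only; dimensions and styling are illustrative assumptions.
    """
    if len(values) < 2:
        return "<svg/>"  # fallback for hypotheses with a single data point
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat series
    step = width / (len(values) - 1)
    # Map each score to an (x, y) point; SVG y grows downward, so invert.
    points = " ".join(
        f"{i * step:.1f},{height - (v - lo) / span * height:.1f}"
        for i, v in enumerate(values)
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline points="{points}" fill="none" stroke="currentColor"/></svg>'
    )
```

The same pattern (pure-string SVG, no charting library) extends to the 10-dim radar, keeping the page fully self-contained for sharing and printing.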

Effort: thorough

Acceptance Criteria

GET /hypothesis/{id}/tearsheet renders a self-contained HTML page (no SPA shell) with these panels, all server-rendered:
- Header: title, slug, current composite_score, current elo_rating (from elo_ratings arena='global'), strength badge (from q-qual-claim-strength-normalizer), "last reviewed" timestamp.
- 10-dim radar chart (uses existing per-dim scores; SVG, no JS dependency).
- Score-evolution sparkline: composite_score over time from hypothesis_score_history (create the table if it doesn't exist; backfill from audit_log rows).
- Top-3 supporting citations (rank by citation_validity.support_verdict='supports' × evidence_quality_tier); each row shows PMID, journal, year, support quote, fact-check verdict from q-qual-auto-fact-check-pipeline.
- Top-3 contradicting citations (same shape, verdict='contradicts').
- Decision-tree summary panel from q-viz-decision-tree-synthesis (top 5 branches by weight).
- Per-prediction outcome table (from preregistration_outcomes): predicted_probability, observed_outcome, calibration delta.
- "Last 30 days" delta block: net change in composite, Elo, evidence count, market price (if a market exists).
- Footer: "generated at <ts>", static URL for sharing, print-stylesheet.
☐ Optional ?format=pdf query param uses weasyprint (already in requirements.txt per check) to emit a PDF.
☐ Migration migrations/20260428_hypothesis_score_history.sql if needed: hypothesis_score_history(hypothesis_id, day, composite_score, elo_rating, dim_scores JSONB); backfill driver economics_drivers/ci_hypothesis_score_history_backfill.py.
☐ "Tearsheet" link added to top of every existing /hypothesis/{id} page.
☐ Cache: full HTML rendered server-side and cached for 1 h per hypothesis (key = (id, composite_score, last_evidence_change)); skip cache if ?force_refresh=1.
☐ Tests tests/test_tearsheet.py: well-formed hypothesis (mid-tier composite, both supporting + contradicting cites, 1 prediction) renders all 8 panels; minimal hypothesis (no decision tree, no preregs) renders fallback messages instead of erroring; PDF export emits ≥ 1 page.
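The citation-ranking criterion above (verdict filter weighted by evidence quality tier) could be sketched as follows. Field names follow the spec; the tier names and weights are illustrative assumptions, not the real `evidence_quality_tier` values:

```python
# Illustrative tier weights; the real tiers/weights live in the DB schema.
TIER_WEIGHT = {"meta_analysis": 3.0, "rct": 2.0, "observational": 1.0, "preprint": 0.5}

def top_citations(citations, verdict, n=3):
    """Return the n highest-weighted citations matching a support verdict.

    Each citation is a dict with at least 'support_verdict' and
    'evidence_quality_tier', mirroring citation_validity rows.
    """
    matching = [c for c in citations if c["support_verdict"] == verdict]
    matching.sort(
        key=lambda c: TIER_WEIGHT.get(c["evidence_quality_tier"], 0.0),
        reverse=True,
    )
    return matching[:n]
```

Calling `top_citations(rows, "supports")` and `top_citations(rows, "contradicts")` yields the two panels from one query result.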

Approach

  • Decide on chart strategy upfront: SVG-only for radar + sparkline (no client JS) so tearsheets are share-friendly + printable.
  • Build a single _assemble_tearsheet(db, hypothesis_id) -> dict that runs the 8 queries in parallel via asyncio.gather (≤ 600 ms target).
  • Render via Jinja templates/atlas/hypothesis_tearsheet.html + a print-stylesheet site/tearsheet.css.
  • The PDF path: pass the rendered HTML through weasyprint; ship a one-page output by default.
  • Smoke-render the top 10 hypotheses by Elo and visually inspect.
Dependencies

  • q-viz-decision-tree-synthesis — supplies decision-tree panel.
  • q-qual-auto-fact-check-pipeline — fact-check verdicts.
  • q-qual-claim-strength-normalizer — strength badge.
  • q-er-preregistration (shipped) — prediction outcomes.

Dependents

  • q-impact-hypothesis-uptake — embeds tearsheet as the canonical share URL.

Work Log

Payload JSON
{
  "completion_shas": [
    "139a984ae"
  ],
  "completion_shas_checked_at": ""
}

Sibling Tasks in Quest (Live Dashboard Artifact Framework)