[Senate] Registry of papers that USED a SciDEX-generated hypothesis (status: open)

hypothesis_uptake_events table tracking external papers that support/refute SciDEX hypotheses; LLM verdict classifier.
Spec File

Effort: thorough

Goal

The strongest signal of platform impact is when an external paper
tests, refines, or refutes a hypothesis SciDEX generated. This is one
notch deeper than a bare citation: the paper does not just acknowledge
us, it produces evidence about a SciDEX hypothesis, becoming a
real-world test of the platform's epistemic output.

Build a Senate-backed registry that tracks these uptake events
explicitly. Each entry links a hypothesis artifact to a published
paper plus a verdict (supports / refutes / partially_supports / refines) and a Senate-reviewed confidence label.

Acceptance Criteria

☐ New table hypothesis_uptake_events (Postgres):

CREATE TABLE hypothesis_uptake_events (
  id UUID PRIMARY KEY,
  hypothesis_id TEXT NOT NULL REFERENCES hypotheses(id),
  paper_id TEXT REFERENCES papers(paper_id),
  paper_doi TEXT,
  paper_pmid TEXT,
  -- 'cites_only' marks a bare citation with no evidence about the hypothesis
  verdict TEXT NOT NULL CHECK (verdict IN
    ('supports','refutes','partially_supports','refines',
     'cites_only')),
  verdict_confidence FLOAT CHECK
    (verdict_confidence BETWEEN 0 AND 1),
  evidence_quote TEXT,
  detected_by TEXT NOT NULL CHECK (detected_by IN
    ('automated_nlp','citation_tracker','attribution_detector',
     'human_submitted','peer_review_agent')),
  senate_review_status TEXT NOT NULL DEFAULT 'pending'
    CHECK (senate_review_status IN
      ('pending','approved','contested','rejected')),
  senate_reviewer_agent_id TEXT,
  first_detected_at TIMESTAMP DEFAULT NOW(),
  -- Note: Postgres treats NULLs as distinct here, so rows with a
  -- NULL paper_doi are not deduplicated by this constraint.
  UNIQUE(hypothesis_id, paper_doi)
);

☐ Detector pipeline at scidex/senate/hypothesis_uptake_detector.py:
- Cross-references external_citations (where scidex_artifact_id
  resolves to a hypothesis artifact type) and runs an LLM
  verdict classification on the citation_context to fill
  verdict + verdict_confidence.
- Pulls preprint attribution rows from scidex_attributions
  flagged as methods_acknowledgment for hypotheses.
- Accepts manual submissions via POST /api/hypothesis-uptake/submit.
☐ Senate review endpoint POST /api/hypothesis-uptake/{id}/review
flips senate_review_status. Requires Senate role.
- On approved verdict='refutes': the hypothesis's composite_score
  automatically drops by 1 point and a
  lifecycle_state='refuted_by_external_paper' row is added
  (do NOT silently delete the hypothesis; it stays as
  evidence of platform calibration).
- On approved verdict='supports': composite_score gains
  0.5 points and an external_validation_count field increments.
☐ Per-hypothesis page (/hypothesis/{id}) gets a new
"External validations" section listing approved uptake
events with verdict color-coding.
☐ New route /dashboard/hypothesis-uptake showing aggregate
stats: how many hypotheses have ≥1 uptake event, by-verdict
breakdown, top-cited hypotheses, and refute-vs-support ratio
(platform calibration KPI).
☐ Pytest fixtures: 3 hypotheses with mixed-verdict uptake
events; the tests assert the score-update math and the
dashboard counts.
☐ Acceptance run: dashboard /dashboard/hypothesis-uptake
shows ≥1 row when seeded against production data.
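
The score-update rules and dashboard counts above can be sketched in memory, in the shape the pytest fixtures might take. This is a minimal sketch under stated assumptions, not the implementation: UptakeEvent, apply_approved_events, and dashboard_counts are hypothetical names, scores are plain floats, only approved events are counted, and verdicts other than supports/refutes are assumed to leave composite_score unchanged (the spec does not say).

```python
from collections import Counter
from dataclasses import dataclass

# Deltas from the acceptance criteria. Other verdicts are ASSUMED to
# leave composite_score unchanged; the spec does not define them.
SCORE_DELTA = {"supports": 0.5, "refutes": -1.0}


@dataclass(frozen=True)
class UptakeEvent:  # hypothetical stand-in for a hypothesis_uptake_events row
    hypothesis_id: str
    verdict: str
    senate_review_status: str = "pending"


def apply_approved_events(scores, events):
    """Return (new_scores, external_validation_count, lifecycle_states)."""
    new_scores = dict(scores)
    validation_count = Counter()
    lifecycle = {}
    for ev in events:
        if ev.senate_review_status != "approved":
            continue  # pending/contested/rejected rows never touch scores
        new_scores[ev.hypothesis_id] += SCORE_DELTA.get(ev.verdict, 0.0)
        if ev.verdict == "supports":
            validation_count[ev.hypothesis_id] += 1
        elif ev.verdict == "refutes":
            # The hypothesis is flagged, never deleted.
            lifecycle[ev.hypothesis_id] = "refuted_by_external_paper"
    return new_scores, validation_count, lifecycle


def dashboard_counts(events):
    """Aggregate approved events (assumption: unapproved rows are excluded)."""
    approved = [e for e in events if e.senate_review_status == "approved"]
    by_verdict = Counter(e.verdict for e in approved)
    supports, refutes = by_verdict["supports"], by_verdict["refutes"]
    return {
        "hypotheses_with_uptake": len({e.hypothesis_id for e in approved}),
        "by_verdict": dict(by_verdict),
        # Ratio is undefined when there are no supporting papers yet.
        "refute_support_ratio": refutes / supports if supports else None,
    }


# Three hypotheses with mixed verdicts, mirroring the fixture description.
events = [
    UptakeEvent("h1", "supports", "approved"),
    UptakeEvent("h1", "supports", "approved"),
    UptakeEvent("h2", "refutes", "approved"),
    UptakeEvent("h3", "refines", "approved"),
    UptakeEvent("h3", "supports", "pending"),  # ignored until approved
]
new_scores, validation_count, lifecycle = apply_approved_events(
    {"h1": 5.0, "h2": 5.0, "h3": 5.0}, events
)
stats = dashboard_counts(events)
```

With these inputs h1 ends at 6.0, h2 at 4.0 with the refuted lifecycle state, and h3 stays at 5.0 because its only approved event is a refinement. Whether the dashboard should count pending rows is a design choice the spec leaves open; this sketch excludes them.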

Approach

  • Schema migration in migrations/<date>_hypothesis_uptake.sql.
  • Verdict classifier: small LLM call (use
    scidex.routing.llm_router with mode='cheap') over the
    citation context; emit JSON {verdict, confidence,
    evidence_quote}. Cap context at 1000 tokens so cost stays low.
  • Senate review UI: extend the
    q-impact-preprint-attribution triage page with an extra tab
    "Hypothesis-uptake" that lists pending rows with the same
    Confirm/Reject pattern.
  • Score-update hook: do this in a single
    scidex/agora/hypothesis_score.py helper so there is no race
    with concurrent rescore jobs.
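
The classifier step can be illustrated by the handling around the LLM call: capping the context and validating the returned JSON against the table's CHECK constraints before insert. This is a rough sketch only. cap_context and parse_verdict_response are hypothetical helpers, the call into scidex.routing.llm_router is omitted because its signature is not given here, and tokens are approximated as whitespace-separated words, which a real tokenizer would count differently.

```python
import json

# Mirrors the CHECK constraint on hypothesis_uptake_events.verdict.
ALLOWED_VERDICTS = {
    "supports", "refutes", "partially_supports", "refines", "cites_only",
}
MAX_CONTEXT_TOKENS = 1000


def cap_context(text, max_tokens=MAX_CONTEXT_TOKENS):
    """Truncate text, ASSUMING one token per whitespace-separated word."""
    words = text.split()
    return " ".join(words[:max_tokens])


def parse_verdict_response(raw):
    """Validate classifier JSON and map it onto table column names."""
    data = json.loads(raw)
    verdict = data["verdict"]
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return {
        "verdict": verdict,
        "verdict_confidence": confidence,
        "evidence_quote": data.get("evidence_quote"),
    }


# Illustrative payload only; not real classifier output.
row = parse_verdict_response(
    '{"verdict": "refines", "confidence": 0.82, "evidence_quote": "q"}'
)
```

Rejecting malformed output at this point means a bad LLM response surfaces as a Python error in the detector rather than as a constraint violation at insert time.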

Dependencies

  • q-impact-citation-tracker (sibling): supplies citation rows.
  • q-impact-preprint-attribution (sibling): supplies
    attribution rows.
  • scidex.routing.llm_router: for verdict classification.

Work Log

Sibling Tasks in Quest (Open Questions as Ranked Artifacts)