SciDEX — Task: [Senate] Registry of papers that USED a SciDEX-gen

hypothesis_uptake_events table tracking external papers that support/refute SciDEX hypotheses; LLM verdict classifier.

Spec File

Effort: thorough

Goal

The strongest signal of platform impact is when an external paper
tests, refines, or refutes a hypothesis SciDEX generated. This is one
notch deeper than a bare citation: the paper does not just acknowledge
us, it produces evidence about a SciDEX hypothesis, becoming a
real-world test of the platform's epistemic output.

Build a Senate-backed registry that tracks these uptake events
explicitly. Each entry links a hypothesis artifact to a published
paper plus a verdict (supports / refutes / partially_supports /
refines) and a Senate-reviewed confidence label.

Acceptance Criteria

☐ New table hypothesis_uptake_events (Postgres):

CREATE TABLE hypothesis_uptake_events (
    id UUID PRIMARY KEY,
    hypothesis_id TEXT NOT NULL REFERENCES hypotheses(id),
    paper_id TEXT REFERENCES papers(paper_id),
    paper_doi TEXT,
    paper_pmid TEXT,
    verdict TEXT NOT NULL CHECK (verdict IN
      ('supports','refutes','partially_supports','refines',
       'cites_only')),
    verdict_confidence FLOAT CHECK
      (verdict_confidence BETWEEN 0 AND 1),
    evidence_quote TEXT,
    detected_by TEXT NOT NULL CHECK (detected_by IN
      ('automated_nlp','citation_tracker','attribution_detector',
       'human_submitted','peer_review_agent')),
    senate_review_status TEXT NOT NULL DEFAULT 'pending'
      CHECK (senate_review_status IN
        ('pending','approved','contested','rejected')),
    senate_reviewer_agent_id TEXT,
    first_detected_at TIMESTAMP DEFAULT NOW(),
    UNIQUE(hypothesis_id, paper_doi)
);
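One detail of the constraint above is worth keeping in mind: by default, Postgres treats NULL values as distinct in a UNIQUE constraint, so rows that carry only a paper_pmid (NULL paper_doi) are not deduplicated by UNIQUE(hypothesis_id, paper_doi). A minimal sketch of an application-side dedup key the detector could check before inserting follows. The function name uptake_dedup_key and the PMID fallback are assumptions made for illustration, not part of the spec.

```python
from typing import Optional, Tuple

# Common DOI prefixes stripped before comparison (illustrative list).
_DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def uptake_dedup_key(
    hypothesis_id: str,
    paper_doi: Optional[str],
    paper_pmid: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Return a (hypothesis_id, paper key) pair, or None if no identifier exists.

    Hypothetical helper: prefers the DOI, falls back to the PMID.
    """
    if paper_doi:
        # DOI names are case-insensitive, so compare them lowercased.
        doi = paper_doi.strip().lower()
        for prefix in _DOI_PREFIXES:
            if doi.startswith(prefix):
                doi = doi[len(prefix):]
                break
        return (hypothesis_id, f"doi:{doi}")
    if paper_pmid:
        return (hypothesis_id, f"pmid:{paper_pmid.strip()}")
    return None
```

Two candidate events with the same key would then be treated as the same uptake event, whichever detector found them.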

☐ Detector pipeline at scidex/senate/hypothesis_uptake_detector.py:
- Cross-references external_citations (where scidex_artifact_id
  resolves to a hypothesis artifact type) and runs an LLM verdict
  classification on the citation_context to fill verdict and
  verdict_confidence.
- Pulls preprint attribution rows from scidex_attributions flagged
  as methods_acknowledgment for hypotheses.
- Accepts manual submissions via POST /api/hypothesis-uptake/submit.

☐ Senate review endpoint POST /api/hypothesis-uptake/{id}/review
flips senate_review_status. Requires the Senate role.
- On approved verdict='refutes': the hypothesis's composite_score
  automatically drops by 1 point and a
  lifecycle_state='refuted_by_external_paper' row is added (do NOT
  silently delete the hypothesis; it stays as evidence of platform
  calibration).
- On approved verdict='supports': composite_score gains 0.5 points
  and the external_validation_count field increments.
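The two score rules above can be sketched as a pure function, which keeps the arithmetic easy to assert on in tests. This is a minimal sketch under stated assumptions, not the actual helper: HypothesisScore and apply_uptake_verdict are hypothetical names, lifecycle rows are modeled as a tuple of state strings, other verdicts are assumed to leave the score unchanged, and no clamping is applied because the spec does not state a score range.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class HypothesisScore:
    """Hypothetical in-memory view of the fields the review hook touches."""
    composite_score: float
    external_validation_count: int = 0
    lifecycle_states: Tuple[str, ...] = ()


def apply_uptake_verdict(
    h: HypothesisScore, verdict: str, review_status: str
) -> HypothesisScore:
    """Return the record after one reviewed uptake event (input is not mutated)."""
    if review_status != "approved":
        # Only Senate-approved events affect the score.
        return h
    if verdict == "refutes":
        # Drop 1 point and record the lifecycle state; the hypothesis is kept.
        return replace(
            h,
            composite_score=h.composite_score - 1.0,
            lifecycle_states=h.lifecycle_states + ("refuted_by_external_paper",),
        )
    if verdict == "supports":
        # Gain 0.5 points and count one more external validation.
        return replace(
            h,
            composite_score=h.composite_score + 0.5,
            external_validation_count=h.external_validation_count + 1,
        )
    # Assumption: partially_supports / refines / cites_only do not move the score.
    return h
```

Returning a new record instead of mutating the input makes it straightforward to compare before and after values in the pytest fixtures.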

☐ Per-hypothesis page (/hypothesis/{id}) gets a new "External
validations" section listing approved uptake events with verdict
color-coding.

☐ New route /dashboard/hypothesis-uptake showing aggregate stats:
how many hypotheses have ≥1 uptake event, by-verdict breakdown,
top-cited hypotheses, refute-vs-support ratio (platform calibration
KPI).
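The aggregate stats listed above can be sketched over plain event dictionaries. This is an illustrative sketch only: the function name uptake_dashboard_stats, the choice to count only approved events, and the decision to report the ratio as refutes divided by supports (None when there are no supports) are assumptions the spec does not settle.

```python
from collections import Counter
from typing import Iterable, Mapping


def uptake_dashboard_stats(
    events: Iterable[Mapping[str, str]], top_n: int = 5
) -> dict:
    """Compute dashboard aggregates from uptake-event rows (hypothetical helper)."""
    # Assumption: only Senate-approved events count toward the dashboard.
    approved = [e for e in events if e["senate_review_status"] == "approved"]

    by_verdict = Counter(e["verdict"] for e in approved)
    per_hypothesis = Counter(e["hypothesis_id"] for e in approved)

    supports = by_verdict.get("supports", 0)
    refutes = by_verdict.get("refutes", 0)

    return {
        # Number of distinct hypotheses with at least one approved event.
        "hypotheses_with_uptake": len(per_hypothesis),
        "by_verdict": dict(by_verdict),
        # (hypothesis_id, event_count) pairs, most-referenced first.
        "top_cited": per_hypothesis.most_common(top_n),
        # Calibration KPI; undefined when nothing has been supported yet.
        "refute_support_ratio": refutes / supports if supports else None,
    }
```

In production the same aggregates would more likely come from GROUP BY queries on hypothesis_uptake_events; the in-memory version is mainly useful for checking expected counts in tests.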

☐ Pytest fixtures: 3 hypotheses with mixed-verdict uptake events,
with assertions on the score-update math and the dashboard counts.

☐ Acceptance run: dashboard /dashboard/hypothesis-uptake shows ≥1
row when seeded against production data.

Approach

Schema migration in migrations/<date>_hypothesis_uptake.sql.

Verdict classifier: a small LLM call (use scidex.routing.llm_router
with mode='cheap') over the citation context that emits JSON
{verdict, confidence, evidence_quote}. Cap context at 1000 tokens so
cost stays low.
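Because the classifier output feeds columns with CHECK constraints, it helps to validate the JSON before writing a row. The sketch below covers only the steps around the LLM call, not the call itself, since the llm_router interface is not described here. The names truncate_context and parse_classifier_output are hypothetical, and whitespace splitting is a rough stand-in for the model's real tokenizer.

```python
import json

# Verdict values allowed by the table's CHECK constraint.
ALLOWED_VERDICTS = {
    "supports", "refutes", "partially_supports", "refines", "cites_only",
}


def truncate_context(text: str, max_tokens: int = 1000) -> str:
    """Cap the citation context; whitespace tokens approximate real tokens."""
    return " ".join(text.split()[:max_tokens])


def parse_classifier_output(raw: str) -> dict:
    """Validate the classifier's JSON and map it onto table columns."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("classifier output must be a JSON object")

    verdict = data.get("verdict")
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")

    confidence = float(data.get("confidence", -1))
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")

    return {
        "verdict": verdict,
        "verdict_confidence": confidence,
        "evidence_quote": data.get("evidence_quote"),
    }
```

Rejecting malformed output at this point means a bad model response surfaces as a handled error in the detector instead of a constraint violation at insert time.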

Senate review UI: extend the q-impact-preprint-attribution triage
page with an extra tab, "Hypothesis-uptake", that lists pending rows
with the same Confirm/Reject pattern.

Score-update hook: do this in a single
scidex/agora/hypothesis_score.py helper so there is no race with
concurrent rescore jobs.
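One way to avoid a read-modify-write race in such a helper is to let the database do the arithmetic in a single UPDATE statement inside a transaction. The sketch below illustrates the idea with the standard-library sqlite3 module so it runs anywhere. The real table lives in Postgres, and the function name apply_score_delta and the two-column hypotheses table are assumptions made for the example.

```python
import sqlite3


def apply_score_delta(conn: sqlite3.Connection, hypothesis_id: str, delta: float) -> float:
    """Atomically add delta to composite_score and return the new value."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            # The increment happens inside the database, so two concurrent
            # callers cannot both read the same stale score.
            "UPDATE hypotheses SET composite_score = composite_score + ? "
            "WHERE id = ?",
            (delta, hypothesis_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"unknown hypothesis: {hypothesis_id}")
        row = conn.execute(
            "SELECT composite_score FROM hypotheses WHERE id = ?",
            (hypothesis_id,),
        ).fetchone()
    return row[0]


# Demo setup with an in-memory database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypotheses (id TEXT PRIMARY KEY, composite_score REAL NOT NULL)"
)
conn.execute("INSERT INTO hypotheses VALUES ('h1', 7.0)")
conn.commit()
```

If rescore jobs overwrite composite_score wholesale instead of incrementing it, a row lock (SELECT ... FOR UPDATE in Postgres) or routing both paths through the same helper would still be needed; the single-statement update only covers the increment itself.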

Dependencies

q-impact-citation-tracker (sibling): supplies citation rows.

q-impact-preprint-attribution (sibling): supplies attribution rows.

scidex.routing.llm_router: for verdict classification.

Work Log
