[Agora] Add data-support scores to active hypotheses

Goal

Populate data_support_score for active hypotheses so quality gates can distinguish computationally grounded hypotheses from literature-only or speculative claims. Scores should reflect actual linked analyses, datasets, KG edges, citations, or an explicit lack of supporting data.

Acceptance Criteria

☐ The selected active hypotheses have data_support_score values in the inclusive range [0, 1]
☐ Each score carries a concise rationale that cites linked data, citations, KG edges, or caveats
☐ No support is fabricated where the evidence is absent
☐ The before/after missing-score count is recorded
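
The first two criteria can be checked mechanically per record. The sketch below is illustrative only: the dictionary keys `data_support_score` and `rationale`, and the function name `check_score_record`, are assumptions and may not match the real schema.

```python
import math


def check_score_record(record):
    """Return a list of problems for one scored hypothesis; empty means it passes."""
    problems = []

    score = record.get("data_support_score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)) or math.isnan(score):
        problems.append("score is missing or not numeric")
    elif not 0.0 <= score <= 1.0:
        problems.append(f"score {score} is outside [0, 1]")

    # ASSUMPTION: the rationale is stored as free text under a 'rationale' key.
    rationale = record.get("rationale") or ""
    if not rationale.strip():
        problems.append("rationale is empty")

    return problems
```

A record that honestly reports absent evidence still passes, provided the caveat is written down: `check_score_record({"data_support_score": 0.6, "rationale": "no KG edges"})` returns an empty list.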

Approach

  • Query active hypotheses where data_support_score IS NULL, ordered by impact, market, or composite score.
  • Inspect linked analyses, papers, datasets, KG edges, and existing evidence fields.
  • Assign calibrated scores and rationales using existing database write patterns.
  • Verify score ranges and count reduction.
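
The query and count steps above can be sketched against an in-memory SQLite database. This is a stand-in, not the project's code: the table name `hypotheses`, the columns `status` and `composite_score`, the demo ids, and all function names are assumptions. The real schema and write path may differ.

```python
import sqlite3


def select_unscored(conn, limit=20):
    """Return ids of active hypotheses that lack a data_support_score."""
    rows = conn.execute(
        """
        SELECT id FROM hypotheses
        WHERE status = 'active' AND data_support_score IS NULL
        ORDER BY composite_score DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


def count_missing(conn):
    """Count active hypotheses that still have no data_support_score."""
    return conn.execute(
        "SELECT COUNT(*) FROM hypotheses "
        "WHERE status = 'active' AND data_support_score IS NULL"
    ).fetchone()[0]


def build_demo_db():
    """Create a tiny hypothetical table so the queries can be exercised."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hypotheses ("
        "id TEXT PRIMARY KEY, status TEXT, "
        "composite_score REAL, data_support_score REAL)"
    )
    conn.executemany(
        "INSERT INTO hypotheses VALUES (?, ?, ?, ?)",
        [
            ("demo-1", "active", 0.9, None),
            ("demo-2", "active", 0.7, 0.85),    # already scored
            ("demo-3", "archived", 0.95, None),  # not active
            ("demo-4", "active", 0.8, None),
        ],
    )
    return conn
```

Calling `count_missing` once before and once after the writes gives the before/after figure that the acceptance criteria ask to record.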

Dependencies

  • quest-engine-ci - Generates this task when queue depth is low and data-support gaps exist.

Dependents

  • Hypothesis ranking, quality gates, and Exchange allocation depend on data-support scores.

Work Log

    2026-04-21 21:55 PT — Slot 76 (MiniMax)

    • Started: Task spec + AGENTS.md read
    • Approach: Calibrated scoring from 5 dimensions:
      - KG edge count (0-0.3): strong grounding gets 0.3, none gets 0
      - Evidence citations (0-0.4): count + quality (high/medium/year ≥ 2020)
      - Debate count (0-0.15): more scrutiny = higher score
      - Analysis linkage (0-0.1): formal derivation from debate
      - Artifact links (0-0.05): notebooks/datasets = computational grounding
    • Output: scripts/score_data_support.py — reusable, well-documented
    • Results:
      - Before: 203 promoted/debated missing, 747 total missing
      - 20 hypotheses scored: range 0.600–0.950
      - After: 183 promoted/debated missing, 727 total missing
      - All 20 scores verified in [0, 1] range
    • Scores: h-var-95b0f9a6bc (0.950, MAPT), h-var-261452bfb4 (0.950, ACSL4), h-var-22c38d11cd (0.950, ACSL4), h-var-ce41f0efd7 (0.950, TREM2), h-var-97b18b880d (0.950, ALOX15), h-0e675a41 (0.950, HDAC3), h-42f50a4a (0.950, APOE), h-var-f96e38ec20 (0.950, ACSL4), h-11795af0 (0.900, APOE), h-856feb98 (0.900, BDNF), h-var-9e8fc8fd3d (0.900, PVALB), h-d0a564e8 (0.900, APOE), h-807d7a82 (0.900, AQP4), h-48858e2a (0.900, TREM2), h-43f72e21 (0.900, PRKAA1), h-3f02f222 (0.900, BCL2L1), h-var-e4cae9d286 (0.850, LPCAT3), h-var-c56b26facf (0.850, LPCAT3), h-9e9fee95 (0.700, HCRTR1/HCRTR2), hyp-SDA-2026-04-12-20260411-082446-2c1c9e2d-2 (0.600, CHRNA7/BACE1 — no KG edges)
    • Committed: scripts/score_data_support.py [task:f7f4133c-4b99-48ef-888b-fa1b08387a24]
    • Result: Done — 20 hypotheses scored with validated evidence-based rationales
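
The five-dimension scheme in this entry can be sketched as a capped additive score. Only the per-dimension maxima come from the log. The saturation counts, the linear scaling, the `Evidence` fields, and the function names are assumptions, and citation quality is ignored, so this sketch is not expected to reproduce the scores listed above.

```python
from dataclasses import dataclass

# Maximum contribution of each dimension, as listed in the work log (sums to 1.0).
MAX_POINTS = {
    "kg_edges": 0.30,
    "citations": 0.40,
    "debates": 0.15,
    "analysis": 0.10,
    "artifacts": 0.05,
}

# ASSUMPTION: counts at which a dimension earns full credit (hypothetical values).
SATURATION = {"kg_edges": 10, "citations": 8, "debates": 3}


@dataclass
class Evidence:
    kg_edges: int = 0
    citations: int = 0
    debates: int = 0
    has_analysis: bool = False
    has_artifacts: bool = False


def _fraction(count, cap):
    """Linear credit from 0 to 1 that saturates at `cap` (an assumed shape)."""
    return min(max(count, 0), cap) / cap


def data_support_score(ev):
    """Return (score in [0, 1], short rationale string)."""
    parts = {
        "kg_edges": MAX_POINTS["kg_edges"] * _fraction(ev.kg_edges, SATURATION["kg_edges"]),
        "citations": MAX_POINTS["citations"] * _fraction(ev.citations, SATURATION["citations"]),
        "debates": MAX_POINTS["debates"] * _fraction(ev.debates, SATURATION["debates"]),
        "analysis": MAX_POINTS["analysis"] if ev.has_analysis else 0.0,
        "artifacts": MAX_POINTS["artifacts"] if ev.has_artifacts else 0.0,
    }
    score = min(1.0, round(sum(parts.values()), 3))

    # Name the dimensions that contributed nothing instead of implying support.
    absent = [name for name, pts in parts.items() if pts == 0.0]
    rationale = ", ".join(f"{name}={pts:.2f}" for name, pts in parts.items())
    if absent:
        rationale += "; no support from: " + ", ".join(absent)
    return score, rationale
```

Under these assumed settings a hypothesis with no KG edges cannot exceed 0.70, and its rationale lists `kg_edges` as absent.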

    2026-04-26 12:30 PT — Slot 73 (MiniMax)

    • Started: AGENTS.md read, DB connection established via scidex.core.database.get_db()
    • Approach: Same 5-dimension calibrated scoring (KG edges 0-0.3, citations 0-0.4, debates 0-0.15, analysis linkage 0-0.1, artifacts 0-0.05). Rebased against latest main (ea2945ab9) before running.
    • Results:
      - Before: 1294 hypotheses missing data_support_score
      - 20 hypotheses scored: range 0.600–0.950
      - After: 1274 missing (reduction of 20)
      - Score distribution: low=16, mid=29, good=73, high=10 (of 128 total scored)
      - All scores verified in [0, 1] range
    • 20 scored hypotheses: h-var-70a95f9d57 (0.850, LPCAT3), hyp-SDA-2026-04-12-20260411-082446-2c1c9e2d-3 (0.600, CHRNA7/CHRM1), h-44195347 (0.900, APOE), h-b7898b79 (0.900, CLOCK), h-a20e0cbb (0.900, APOE), h-1a34778f (0.700, CGAS/STING1/DNASE2), h-fd1562a3 (0.900, COX4I1), h-d2722680 (0.900, TET2), h-seaad-v4-5a7a4079 (0.950, SIRT3), h-c8ccbee8 (0.900, AQP4), h-bb518928 (0.700, PLA2G6/PLA2G4A), h-7957bb2a (0.900, GPX4/SLC7A11), h-ec731b7a (0.900, G3BP1), h-84808267 (0.700, TFR1/LRP1/CAV1/ABCB1), h-a1b56d74 (0.900, HK2), h-98b431ba (0.900, TFAM), h-8fe389e8 (0.950, HDAC), h-99b4e2d2 (0.900, APOE), h-5706bbd7 (0.900, BMAL1), h-8d270062 (0.900, CACNA1G)
    • Verification:
      - All 20 scores between 0 and 1
      - Missing count reduced from 1294 to 1274 (-20)
      - Prior scored hypotheses (h-var-95b0f9a6bc, h-0e675a41, etc.) retained their scores
    • Result: Done — 20 more hypotheses scored; no commit needed (DB-only change)
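
The three verification checks in this entry can be expressed as one run-level report. The function name `verify_run` and its argument shapes (plain dictionaries mapping hypothesis id to score) are assumptions for illustration. How the snapshots are actually read from the database is not shown here.

```python
def verify_run(before_missing, after_missing, scored, prior_before, prior_after):
    """Summarise a scoring run.

    scored:       {hypothesis_id: score} written in this run
    prior_before: {hypothesis_id: score} for earlier-scored ids, read before the run
    prior_after:  the same ids, read again after the run
    """
    reduction = before_missing - after_missing
    return {
        # Every newly written score lies in the inclusive range [0, 1].
        "all_in_range": all(0.0 <= s <= 1.0 for s in scored.values()),
        # The missing count dropped by exactly the number of hypotheses scored.
        "reduction": reduction,
        "reduction_matches": reduction == len(scored),
        # Earlier scores that changed or disappeared; should be empty.
        "changed_prior": sorted(
            hid for hid, old in prior_before.items() if prior_after.get(hid) != old
        ),
    }
```

With the counts from this entry (1294 before, 1274 after, 20 scored) and unchanged snapshots for h-var-95b0f9a6bc and h-0e675a41, the report shows a reduction of 20 that matches the number scored and an empty `changed_prior` list.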

Tasks using this spec (7)

  • [Agora] Add data-support scores to 20 active hypotheses (Agora, done, P86)
  • [Agora] Add data-support scores to 20 active hypotheses (Agora, done, P86)
  • [Agora] Add data-support scores to 20 active hypotheses (Agora, done, P86)
  • [Agora] Score 20 unscored hypotheses with composite scoring (Agora, done, P88)
  • [Agora] Add data-support scores to 20 active hypotheses (Agora, done, P86)
  • [Agora] Add data-support scores to 20 active hypotheses (Agora, open, P87)
  • [Agora] Compute data-support scores for 20 active hypotheses (Agora, done, P86)

File: quest_engine_hypothesis_data_support_scoring_spec.md
Modified: 2026-04-26 05:25
Size: 4.4 KB