[Agora] Add gene-expression context to 20 hypotheses missing expression grounding open analysis:7 reasoning:6

967 active hypotheses lack substantive gene_expression_context, limiting biological interpretability.

## Acceptance criteria

- 20 hypotheses gain gene_expression_context grounded in cited datasets, papers, or KG annotations
- Context distinguishes cell type, brain region, disease stage, or uncertainty where available
- The remaining count of hypotheses missing expression context is <= 947

## Approach

1. Select hypotheses with target_gene values and high debate/market relevance.
2. Use existing SciDEX papers/KG links and public expression evidence where available.
3. Persist concise expression context and verify hypothesis pages render.

Generated by the quest-engine low-queue cycle after live DB gap verification. Re-check for duplicate recent work before editing, and document any stronger framing you find.

Last Error

cli-get-next: phantom running task — linked run 9fcb190f-a9b already terminal (abandoned); requeued immediately
Spec File

Goal

Populate data_support_score for active hypotheses so quality gates can distinguish computationally grounded hypotheses from literature-only or speculative claims. Scores should reflect actual linked analyses, datasets, KG edges, citations, or an explicit lack of supporting data.

Acceptance Criteria

☐ The selected active hypotheses have data_support_score values between 0 and 1
☐ Each score has a concise rationale from linked data, citations, KG edges, or caveats
☐ No support is fabricated where the evidence is absent
☐ The before/after missing-score count is recorded

Approach

  • Query active hypotheses where data_support_score IS NULL, ordered by impact, market, or composite score (see the query sketch after this list).
  • Inspect linked analyses, papers, datasets, KG edges, and existing evidence fields.
  • Assign calibrated scores and rationale using existing database write patterns.
  • Verify score ranges and count reduction.
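
A minimal sketch of the selection step, using stdlib sqlite3 for concreteness; the work log connects through scidex.core.database.get_db(), and the hypotheses table and column names here are assumptions taken from this spec, not a verified schema.

```python
import sqlite3

def fetch_unscored(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return unscored active hypotheses, highest composite score first.

    Table and column names are assumptions inferred from this spec.
    """
    return conn.execute(
        """
        SELECT id, target_gene, composite_score
        FROM hypotheses
        WHERE status = 'active' AND data_support_score IS NULL
        ORDER BY composite_score DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
```
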
Dependencies

  • quest-engine-ci - Generates this task when queue depth is low and data-support gaps exist.

Dependents

  • Hypothesis ranking, quality gates, and Exchange allocation depend on data-support scores.

Work Log

2026-04-21 21:55 PT — Slot 76 (MiniMax)

    • Started: Task spec + AGENTS.md read
    • Approach: Calibrated scoring across 5 dimensions (sketched after this entry):
    - KG edge count (0–0.3): strong KG grounding earns 0.3; none earns 0
    - Evidence citations (0–0.4): citation count plus quality (high/medium tier, or year ≥ 2020)
    - Debate count (0–0.15): more scrutiny yields a higher score
    - Analysis linkage (0–0.1): formal derivation from a debate
    - Artifact links (0–0.05): linked notebooks/datasets indicate computational grounding
    • Output: scripts/score_data_support.py — reusable, well-documented
    • Results:
    - Before: 203 promoted/debated missing, 747 total missing
    - 20 hypotheses scored: range 0.600–0.950
    - After: 183 promoted/debated missing, 727 total missing
    - All 20 scores verified in [0, 1] range
    • Scores: h-var-95b0f9a6bc (0.950, MAPT), h-var-261452bfb4 (0.950, ACSL4), h-var-22c38d11cd (0.950, ACSL4), h-var-ce41f0efd7 (0.950, TREM2), h-var-97b18b880d (0.950, ALOX15), h-0e675a41 (0.950, HDAC3), h-42f50a4a (0.950, APOE), h-var-f96e38ec20 (0.950, ACSL4), h-11795af0 (0.900, APOE), h-856feb98 (0.900, BDNF), h-var-9e8fc8fd3d (0.900, PVALB), h-d0a564e8 (0.900, APOE), h-807d7a82 (0.900, AQP4), h-48858e2a (0.900, TREM2), h-43f72e21 (0.900, PRKAA1), h-3f02f222 (0.900, BCL2L1), h-var-e4cae9d286 (0.850, LPCAT3), h-var-c56b26facf (0.850, LPCAT3), h-9e9fee95 (0.700, HCRTR1/HCRTR2), hyp-SDA-2026-04-12-20260411-082446-2c1c9e2d-2 (0.600, CHRNA7/BACE1 — no KG edges)
    • Committed: scripts/score_data_support.py [task:f7f4133c-4b99-48ef-888b-fa1b08387a24]
    • Result: Done — 20 hypotheses scored with validated evidence-based rationales
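
A minimal sketch of the 5-dimension rubric above. The per-hypothesis field names (kg_edge_count, citations, debate_count, analysis_id, artifact_links) and the saturation points are assumptions; only the dimension caps come from the log.

```python
def score_data_support(h: dict) -> float:
    """Combine five capped dimensions into one data-support score in [0, 1].

    Field names and saturation points are illustrative assumptions; the
    caps (0.3 / 0.4 / 0.15 / 0.1 / 0.05) are the ones listed in the log.
    """
    score = 0.0
    # KG edges: up to 0.3, saturating at 3 edges (assumed saturation point).
    score += min(h.get("kg_edge_count", 0), 3) / 3 * 0.30
    # Citations: up to 0.4; a high/medium-quality or recent (>= 2020)
    # citation counts fully, anything else counts half.
    cited = 0.0
    for c in h.get("citations", []):
        full = c.get("quality") in ("high", "medium") or c.get("year", 0) >= 2020
        cited += 0.10 if full else 0.05
    score += min(cited, 0.40)
    # Debates: up to 0.15, saturating at 3 debates; more scrutiny scores higher.
    score += min(h.get("debate_count", 0), 3) / 3 * 0.15
    # Formal analysis linkage (derivation from a debate): flat 0.10.
    if h.get("analysis_id"):
        score += 0.10
    # Linked notebooks/datasets signal computational grounding: flat 0.05.
    if h.get("artifact_links"):
        score += 0.05
    return round(min(score, 1.0), 3)
```

Because the five caps sum to exactly 1.0, the score stays in [0, 1] by construction, matching the acceptance criterion.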

2026-04-26 19:40 UTC — Slot 42 (Claude Sonnet 4.6) [task:4a7ec4f5-b5f5-443c-89db-1a22b67165e9]

    • Started: AGENTS.md read, DB connection established via scidex.core.database.get_db()
    • Before: 1008 hypotheses missing data_support_score (all statuses: proposed 722, archived 134, debated 97, promoted 41, active 9, open 5)
    • Approach: Same 5-dimension rubric. Fixed a bug in score_hypothesis() where string items in evidence_for (PMID strings or claim text) caused an AttributeError on .get(); mixed lists are now handled safely (see the sketch after this entry).
    • Processed: All 1008 hypotheses in 11 batches of up to 100, ordered by composite_score DESC.
    • After: 0 hypotheses missing data_support_score (1402 total; all have valid 0–1 scores)
    • Score distribution: 0.0–0.1: 128, 0.1–0.5: 174, 0.5–0.6: 582 (bulk with citations+debates), 0.6–0.9: 306, 0.9–1.0: 212
    • Script fix: Updated scripts/score_data_support.py to handle string items in evidence_for, fall back to citations_count column, and accept batch_size as CLI arg
    • Result: Done — all 1008 hypotheses now scored; DB updated directly
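
A hedged sketch of the fix described above: evidence_for may mix dicts with bare strings (PMIDs or claim text), so items are normalized before any .get() call, and the citation count falls back to the citations_count column when the list is empty. The real score_hypothesis() signature is not shown in the log; these helpers are one plausible shape.

```python
def normalize_evidence(evidence_for) -> list:
    """Wrap bare-string evidence items so downstream .get() calls are safe."""
    normalized = []
    for item in evidence_for or []:
        if isinstance(item, dict):
            normalized.append(item)
        elif isinstance(item, str):
            # PMID string or claim text: quality/year are unknown, so omitted.
            normalized.append({"text": item})
    return normalized

def citation_count(h: dict) -> int:
    """Prefer the evidence_for list; fall back to the citations_count column."""
    items = normalize_evidence(h.get("evidence_for"))
    return len(items) if items else int(h.get("citations_count") or 0)
```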

2026-04-26 12:30 PT — Slot 73 (MiniMax)

    • Started: AGENTS.md read, DB connection established via scidex.core.database.get_db()
    • Approach: Same 5-dimension calibrated scoring (KG edges 0-0.3, citations 0-0.4, debates 0-0.15, analysis linkage 0-0.1, artifacts 0-0.05). Rebased against latest main (ea2945ab9) before running.
    • Results:
    - Before: 1294 hypotheses missing data_support_score
    - 20 hypotheses scored: range 0.600–0.950
    - After: 1274 missing (reduction of 20)
    - Score distribution: low=16, mid=29, good=73, high=10 (of 128 total scored)
    - All scores verified in [0, 1] range
    • 20 scored hypotheses: h-var-70a95f9d57 (0.850, LPCAT3), hyp-SDA-2026-04-12-20260411-082446-2c1c9e2d-3 (0.600, CHRNA7/CHRM1), h-44195347 (0.900, APOE), h-b7898b79 (0.900, CLOCK), h-a20e0cbb (0.900, APOE), h-1a34778f (0.700, CGAS/STING1/DNASE2), h-fd1562a3 (0.900, COX4I1), h-d2722680 (0.900, TET2), h-seaad-v4-5a7a4079 (0.950, SIRT3), h-c8ccbee8 (0.900, AQP4), h-bb518928 (0.700, PLA2G6/PLA2G4A), h-7957bb2a (0.900, GPX4/SLC7A11), h-ec731b7a (0.900, G3BP1), h-84808267 (0.700, TFR1/LRP1/CAV1/ABCB1), h-a1b56d74 (0.900, HK2), h-98b431ba (0.900, TFAM), h-8fe389e8 (0.950, HDAC), h-99b4e2d2 (0.900, APOE), h-5706bbd7 (0.900, BMAL1), h-8d270062 (0.900, CACNA1G)
    • Verification (a check sketch follows this entry):
    - All 20 scores between 0 and 1
    - Missing count reduced from 1294 to 1274 (-20)
    - Prior scored hypotheses (h-var-95b0f9a6bc, h-0e675a41, etc.) retained their scores
    • Result: Done — 20 more hypotheses scored; no commit needed (DB-only change)
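
A minimal sketch of the verification step above, with the same sqlite3 and schema assumptions as the earlier selection sketch.

```python
import sqlite3

def verify_scores(conn: sqlite3.Connection, before_missing: int, batch: int = 20) -> int:
    """Check score range validity and that the missing count dropped by the batch size."""
    out_of_range = conn.execute(
        "SELECT COUNT(*) FROM hypotheses"
        " WHERE data_support_score NOT BETWEEN 0 AND 1"
    ).fetchone()[0]
    missing = conn.execute(
        "SELECT COUNT(*) FROM hypotheses WHERE data_support_score IS NULL"
    ).fetchone()[0]
    assert out_of_range == 0, "all scores must lie in [0, 1]"
    assert missing == before_missing - batch, "missing count must drop by the batch size"
    return missing
```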

2026-04-26 23:33 UTC — Slot 52 (Codex) [task:0f26ef01-1934-4bce-b280-aa954e4a884a]

    • Started: AGENTS.md and shared Orchestra instructions read; staleness review found prior commit 5344be64d scored the then-missing NULL cohort, but the live DB still had 76 hypotheses with data_support_score IS NULL OR data_support_score = 0 (60 null, 16 zero).
    • Approach: Added scripts/score_missing_data_support_rubric.py, a reproducible one-shot scorer for the task's 4-point rubric. Because the current hypotheses schema has no literal evidence_strength, data_source, or notes columns, the rubric inputs were mapped to existing fields (see the mapping sketch after this entry):
    - evidence count: from evidence_for
    - strength: from evidence_quality_score / evidence_validation_score
    - source: from origin_type / analysis_id / source_collider_session_id
    - reasoning: from confidence_rationale / evidence_validation_details / score_breakdown / substantive description
    • Results:
    - Before: 76 hypotheses with data_support_score IS NULL OR data_support_score = 0
    - Updated: 76 hypotheses scored by the rubric; scores ranged 0.50-1.00
    - Rationale: populated confidence_rationale for 60 rows that lacked one; preserved existing rationale for 16 rows
    - After: 0 hypotheses with data_support_score IS NULL OR data_support_score = 0
    • Verification:
    - python3 scripts/score_missing_data_support_rubric.py --commit
    - Independent DB check: total hypotheses 1489; null-or-zero data support 0; invalid range rows 0
    • Result: Done — current missing/zero data-support cohort scored with explicit rubric rationale where absent
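
A sketch of the field mapping above, assuming dict-style row access; the coalescing order follows the log, and the final description fallback column name is an assumption.

```python
def rubric_inputs(row: dict) -> dict:
    """Map the rubric's conceptual inputs onto the columns that actually exist."""
    def first(*keys):
        # Return the first non-empty candidate column, if any.
        for key in keys:
            value = row.get(key)
            if value not in (None, "", [], {}):
                return value
        return None

    return {
        "evidence_count": len(row.get("evidence_for") or []),
        "strength": first("evidence_quality_score", "evidence_validation_score"),
        "source": first("origin_type", "analysis_id", "source_collider_session_id"),
        "reasoning": first(
            "confidence_rationale",
            "evidence_validation_details",
            "score_breakdown",
            "description",  # the log's "substantive description"; column name assumed
        ),
    }
```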

Payload JSON

{
  "requirements": {
    "analysis": 7,
    "reasoning": 6
  }
}
