123 active hypotheses have empty evidence_against fields. Counter-evidence review is needed so confidence and market prices reflect uncertainty rather than one-sided support.
## Acceptance criteria (recommended — see 'Broader latitude' below)
- 10 active hypotheses gain evidence_against entries or documented no-counterevidence rationale
- Each counter-evidence entry includes PMID, DOI, dataset, debate, or explicit search provenance
- Count of active hypotheses still missing evidence_against is <= 113
## Before starting
1. Read this task's spec file and check for duplicate recent work.
2. Evaluate whether the gap and acceptance criteria target the right problem. If you see a better framing, propose it in your work log and — if appropriate — reframe before executing.
3. Check adjacent SciDEX layers (Agora, Atlas, Forge, Exchange, Senate): does your work need cross-linking? Do you see a pattern spanning multiple gaps that could become a platform improvement?
## Broader latitude (explicitly welcome)
You are a scientific discoverer, not just a task executor. Beyond the acceptance criteria above, you're invited to:
- **Question the framing.** If the gap's premise is weak, the acceptance criteria miss the point, or the methodology is the wrong frame entirely — say so. Propose a reframe with justification.
- **Propose structural improvements.** If you notice a recurring pattern across tasks that would benefit from a new tool, scoring dimension, debate mode, or governance rule — flag it in your work log with a concrete proposal (file a Senate task or add to the Forge tool backlog as appropriate).
- **Propose algorithmic improvements.** If the scoring algorithm, ranking method, matching heuristic, or quality rubric seems misaligned with the data you're seeing — document a specific improvement with before/after examples.
- **Strengthen artifacts beyond the minimum.** Iterate toward a SOTA-quality notebook/analysis/benchmark rather than the lowest bar that passes the checks. Fewer high-quality artifacts beat many shallow ones.
Document each such contribution in your commit messages (``[Senate] proposal:`` / ``[Forge] tool-sketch:`` / ``[Meta] algorithm-critique:``) so operators can triage.
## Completion Notes
Released by supervisor slot 40 because credential acquisition failed after pre-claim. Reason: worktree_creation_failed:branch_held_by_other:held_by=/tmp/task-2f7e1600-w5
Add real counter-evidence review to active hypotheses whose evidence_against field is empty. This keeps confidence scores, debates, and markets from becoming one-sided summaries of support only.
## Acceptance Criteria
☐ A concrete batch of active hypotheses gains evidence_against entries or documented no-counterevidence rationale
☐ Each entry includes PMID, DOI, dataset, debate, or explicit search provenance
☐ No placeholder or fabricated counter-evidence is added
☐ Before/after missing evidence_against counts are recorded
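The provenance and no-placeholder criteria above can be checked mechanically before anything is persisted. The following is a minimal sketch, not the project's validator: the field names (`claim`, `pmid`, `doi`, `dataset`, `debate`, `search_query`) and the placeholder markers are assumptions, since this spec does not show the real evidence_against schema.

```python
# Hypothetical field names; the real evidence_against schema is not shown in this spec.
PROVENANCE_KEYS = ("pmid", "doi", "dataset", "debate", "search_query")
# Assumed markers for obviously unfinished text; a heuristic, not an exhaustive list.
PLACEHOLDER_MARKERS = ("todo", "tbd", "placeholder", "lorem ipsum")


def has_provenance(entry: dict) -> bool:
    """True if at least one provenance field is present and non-blank."""
    return any(str(entry.get(key) or "").strip() for key in PROVENANCE_KEYS)


def looks_like_placeholder(entry: dict) -> bool:
    """True if the claim text is missing or contains a placeholder marker."""
    claim = str(entry.get("claim") or "").strip().lower()
    return not claim or any(marker in claim for marker in PLACEHOLDER_MARKERS)


def validate_entries(entries: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the batch passes."""
    problems = []
    for index, entry in enumerate(entries):
        if not has_provenance(entry):
            problems.append(
                f"entry {index}: no PMID, DOI, dataset, debate, or search provenance"
            )
        if looks_like_placeholder(entry):
            problems.append(f"entry {index}: empty or placeholder claim")
    return problems
```

A check like this only catches missing fields and obvious filler. Whether a cited paper actually contradicts the hypothesis still needs manual review.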
## Approach
1. Select active hypotheses with empty evidence_against, prioritizing high-confidence or market-relevant rows.
2. Search linked papers, PubMed, debates, datasets, and KG contradiction edges for real counter-evidence.
3. Persist concise counter-evidence or an auditable no-counterevidence rationale through the standard PostgreSQL path.
4. Verify updated rows and inspect a sample for provenance quality.
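The selection step of this approach can be sketched as a pure function over already-fetched rows. This is an illustration under assumptions: the keys `title`, `status`, `evidence_against`, and `composite_score` are hypothetical stand-ins for the real columns, and the filter on titles starting with "Test" mirrors the work-log note about skipping generic test rows rather than any documented rule.

```python
import json


def is_empty_evidence(value) -> bool:
    """Treat NULL, blank text, JSON null, and empty JSON containers as empty."""
    if value is None:
        return True
    if isinstance(value, str):
        text = value.strip()
        if not text:
            return True
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            return False  # non-JSON free text still counts as content
    return value is None or value == [] or value == {}


def select_batch(rows, batch_size=10):
    """Pick the highest-scoring non-archived, non-test rows lacking counter-evidence."""
    candidates = [
        row
        for row in rows
        if (row.get("status") or "") != "archived"
        and is_empty_evidence(row.get("evidence_against"))
        and not (row.get("title") or "").startswith("Test")
    ]
    # composite_score is an assumed ranking column; missing scores sort last.
    candidates.sort(key=lambda row: row.get("composite_score") or 0.0, reverse=True)
    return candidates[:batch_size]
```

Keeping the emptiness test in one helper means the selection query and the later verification query cannot disagree about what counts as "empty".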
## Dependencies
c488a683-47f - Agora quest
## Dependents
Debate quality, confidence calibration, and Exchange pricing
Started staleness review: prior batches recorded in the current main/spec reduced the count of non-archived hypotheses missing evidence_against from 143 to 123, and the live PostgreSQL count is still 123, so the task remains relevant.
Scope for this iteration: skip generic "Test hypothesis 0" rows and curate the next 10 highest-scoring real non-archived hypotheses with empty evidence_against.
Searched PubMed-backed metadata and web/PubMed pages for targeted counter-evidence. Paperclip MCP returned 401, so provenance was taken from PubMed IDs, DOI metadata, and explicit search queries rather than Paperclip.
Updated all 10 target hypotheses in PostgreSQL with non-empty evidence_against JSON arrays, evidence_validation_score values of 0.40-0.50, and refreshed last_evidence_update / last_mutated_at.
Key PMIDs/DOIs used:
- 33347683 / 10.1111/tra.12779 and 38607248 / 10.1002/pro.4980: VPS35 retromer mechanism cautions
- 40934925 / 10.1016/j.molcel.2025.08.022 and 39937915 / 10.1126/sciadv.adp3672: SNCA-ESCRT alternative mechanisms
- 31130847 / 10.3389/fncel.2019.00181 and 35031484 / 10.1016/j.celrep.2021.110220: APOE4 substrate-specific microglial phagocytosis
- 32349774 / 10.1186/s13041-020-00607-3 and 40750607 / 10.1038/s41467-025-62351-3: CCR2 localization and missing MMP9-specific step
- 37679434 / 10.1038/s41591-023-02543-w and 10.1038/s41580-024-00738-8: D+Q AD feasibility and senescence-marker caution
- 41125938 / 10.1038/s42255-025-01386-8: AMPK activation can inhibit OPC proliferation/differentiation
- 32694648 / 10.1038/s42255-019-0085-0 and 32694650 / 10.1038/s42255-019-0086-z: SLC12A8 NMN transporter dispute
- 22920254 / 10.1016/j.neuron.2012.06.026 and 35326100 / 10.3390/antiox11030450: DRP1 mislocalization and Mdivi-1 specificity caution
- 29677514 / 10.1016/j.cell.2018.03.004 and 39052476 / 10.1016/j.celrep.2024.114537: FUS Transportin/NLS and methylation context
- 41091915 / 10.1126/scitranslmed.adq3401 and 38402606 / 10.1016/j.neuron.2024.01.024: KDM6A neuroinflammation and multifactorial AD sex differences
Verification: before count was 123 non-archived hypotheses with empty evidence_against; after count is 113. A sample query confirmed every target row now has 1-2 evidence items with PMID or DOI provenance and a non-null evidence validation score.
Updated all 10 hypotheses with evidence_against JSON arrays and evidence_validation_score (0.40–0.55) reflecting balance of for/against evidence. Updated last_evidence_update timestamp.
After count: 0 of the 10 target hypotheses have empty evidence_against.
2026-04-21 - Quest engine template
Created reusable spec for quest-engine generated hypothesis counter-evidence tasks.
2026-04-21 18:57 UTC — task cd74e4ec-f04e-49c5-9cc5-0c6cb1fd4e39
Started obsolescence check and read the quest predicate in quest_engine.py.
Initial strict status='active' query returned 0 rows because current hypothesis statuses are proposed, promoted, debated, and archived; the quest predicate uses COALESCE(status, '') <> 'archived'.
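The difference between the strict filter and the quest predicate can be reproduced on a toy table. This sketch uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table layout and the definition of "empty" (NULL, blank, or `[]`) are assumptions; only the `COALESCE(status, '') <> 'archived'` expression comes from the log entry above.

```python
import sqlite3

# In-memory SQLite stand-in; the real system uses PostgreSQL and a richer schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypotheses (id TEXT PRIMARY KEY, status TEXT, evidence_against TEXT)"
)
conn.executemany(
    "INSERT INTO hypotheses VALUES (?, ?, ?)",
    [
        ("h1", "proposed", None),
        ("h2", "promoted", "[]"),
        ("h3", "debated", '[{"pmid": "29073081"}]'),
        ("h4", "archived", None),
        ("h5", None, ""),  # NULL status: only COALESCE keeps this row visible
    ],
)

# Assumed definition of an empty evidence_against value.
EMPTY = "(evidence_against IS NULL OR TRIM(evidence_against) IN ('', '[]'))"


def count_where(status_filter: str) -> int:
    """Count rows with empty evidence_against under the given status filter."""
    sql = f"SELECT COUNT(*) FROM hypotheses WHERE {status_filter} AND {EMPTY}"
    return conn.execute(sql).fetchone()[0]


strict_count = count_where("status = 'active'")
naive_count = count_where("status <> 'archived'")
quest_count = count_where("COALESCE(status, '') <> 'archived'")

print(strict_count, naive_count, quest_count)  # prints "0 2 3"
```

The strict filter finds nothing because no row uses the literal status `active`. The naive `<>` comparison silently drops the row whose status is NULL, which is the case `COALESCE` guards against.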
Before count using the quest predicate: 10 non-archived hypotheses had empty evidence_against; all 10 were duplicate rows with status proposed, titled "Test: TREM2 enhances amyloid clearance".
Plan: add one reviewed TREM2 counter-evidence bundle to those 10 rows through scidex.core.database.get_db(), using PubMed-backed citations and explicit search provenance.
Counter-evidence bundle cites PMID 29073081 / DOI 10.1073/pnas.1710311114, PMID 37371067 / DOI 10.3390/cells12121597, and PMID 29859094 / DOI 10.1186/s13024-018-0262-8, with PubMed/web search query provenance on each item.
After count using the quest predicate: 0 non-archived hypotheses have empty evidence_against; the verification query reported 722/722 non-archived hypotheses with non-empty evidence_against arrays.
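The before/after bookkeeping for this batch can be illustrated with a small sketch, again using in-memory SQLite in place of the real `scidex.core.database.get_db()` path. The table layout and JSON entry shape are assumptions. The PMIDs and DOIs are the ones cited in this log; the claim text and search-query provenance are omitted because the log does not quote them.

```python
import json
import sqlite3

# Citations taken from the log; the entry shape itself is hypothetical.
BUNDLE = [
    {"pmid": "29073081", "doi": "10.1073/pnas.1710311114"},
    {"pmid": "37371067", "doi": "10.3390/cells12121597"},
    {"pmid": "29859094", "doi": "10.1186/s13024-018-0262-8"},
]
TITLE = "Test: TREM2 enhances amyloid clearance"
# Quest predicate plus an assumed definition of "empty".
MISSING = (
    "COALESCE(status, '') <> 'archived' "
    "AND (evidence_against IS NULL OR TRIM(evidence_against) IN ('', '[]'))"
)


def count_missing(conn: sqlite3.Connection) -> int:
    """Count non-archived rows whose evidence_against is still empty."""
    return conn.execute(f"SELECT COUNT(*) FROM hypotheses WHERE {MISSING}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypotheses "
    "(id INTEGER PRIMARY KEY, title TEXT, status TEXT, evidence_against TEXT)"
)
# Ten duplicate proposed rows, mirroring the situation described in the log.
conn.executemany(
    "INSERT INTO hypotheses (title, status, evidence_against) VALUES (?, 'proposed', NULL)",
    [(TITLE,)] * 10,
)

before = count_missing(conn)
with conn:  # commit on success, roll back on error
    cur = conn.execute(
        f"UPDATE hypotheses SET evidence_against = ? WHERE title = ? AND {MISSING}",
        (json.dumps(BUNDLE), TITLE),
    )
after = count_missing(conn)

print(before, cur.rowcount, after)  # prints "10 10 0"
```

Repeating the missing-evidence predicate inside the `UPDATE` keeps the write from overwriting rows that another batch has already curated.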
System check: scidex status shows PostgreSQL reachable, API/nginx/linkcheck/Neo4j active; agent and bridge were activating, forge-improve inactive.