SciDEX — Task: [Atlas] Spawn open questions from falsified hypoth

When a hypothesis is falsified or a market resolves against consensus, generate successor open_questions seeded at the parent hypothesis's Elo + 50.

Completion Notes

Auto-release: work already on origin/main

Git Commits (2)

Squash merge: orchestra/task/a4c450f7-biomni-analysis-parity-port-15-use-cases (87 commits) (#717) 2026-04-27

[Atlas] Spawn open questions from falsified hypotheses + market misses [task:bbe35802-07b1-4bcc-8b0f-bd0c33a2cf41] (#672) 2026-04-27

Spec File

Goal

When a hypothesis prediction is falsified or a prediction market resolves
against the consensus, the most valuable downstream artifact is the new
question that the falsification opens. ("If APOE4 → tau spread is wrong, what
DOES drive tau spread in the absence of APOE4?") Today, falsifications close
the hypothesis loop without spawning any new search frontier. This task wires
a falsification → open_question generator so every refuted prediction becomes
a ranked question for the field.

Acceptance Criteria

☐ New module scidex/agora/open_question_from_falsified.py (≤400 LoC).

☐ Triggers on two event sources:

- hypothesis_predictions.outcome_status flipping to
falsified / unsupported (poll table; no event_bus dependency).
- markets.resolution_status='resolved' with the resolution disagreeing
with the pre-resolution consensus probability by ≥0.4.
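
The market trigger's divergence gate can be sketched as below. This is a minimal illustration, not the module's actual code: the function name and the assumption that both values are probabilities in [0, 1] are hypothetical, and only the 0.4 threshold comes from the criterion above.

```python
def resolves_against_consensus(
    consensus_probability: float,
    resolution_price: float,
    threshold: float = 0.4,
) -> bool:
    """Return True when the resolution is at least `threshold` away from
    the pre-resolution consensus probability.

    Assumption: both inputs are probabilities in [0, 1].
    """
    for name, value in (
        ("consensus_probability", consensus_probability),
        ("resolution_price", resolution_price),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Small tolerance so a gap of exactly 0.4 is not lost to float rounding.
    return abs(resolution_price - consensus_probability) >= threshold - 1e-9
```

A market that sat at 0.8 and resolved to 0.0 passes the gate; one that sat at 0.7 and resolved to 1.0 does not.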

☐ For each trigger, calls scidex.core.llm.complete to ask a domain-expert
persona (selected from DOMAIN_JUDGES in
scidex/agora/open_question_tournament.py) to generate 1-3 successor
questions that the falsification specifically motivates. Each must
include a verbatim "what we now know" + "what we still don't know"
contrast.
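
One way to enforce the 1-3 limit and the required contrast is to validate the model's reply before emitting anything. The sketch below assumes the persona is prompted to answer with a JSON list of objects. The key names (question, what_we_now_know, what_we_still_dont_know) and the function name are hypothetical, and the call to scidex.core.llm.complete is deliberately left out because its signature is not given in this spec.

```python
import json

# Hypothetical key names; the spec only requires that both halves of the
# contrast are present for each question.
REQUIRED_KEYS = ("question", "what_we_now_know", "what_we_still_dont_know")
MAX_QUESTIONS = 3


def parse_successor_questions(raw_response: str) -> list[dict]:
    """Parse an LLM reply and keep at most three well-formed questions."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, list):
        return []

    valid = []
    for item in payload:
        if not isinstance(item, dict):
            continue
        fields = {}
        for key in REQUIRED_KEYS:
            value = item.get(key)
            fields[key] = value.strip() if isinstance(value, str) else ""
        # Drop any question that is missing either half of the contrast.
        if all(fields.values()):
            valid.append(fields)
    return valid[:MAX_QUESTIONS]
```

Returning an empty list on malformed output lets the caller skip the trigger rather than emit a partial artifact.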

☐ Emits open_question artifacts with:

- metadata.source_kind='falsified_prediction' or 'market_resolution'
- metadata.parent_hypothesis_id set
- metadata.field_tag inherited from the parent hypothesis
- metadata.importance_elo seeded at the parent hypothesis's prior
Elo + 50 (falsifications point at the most-bet-on questions, so they
deserve a head start in the per-field tournament).
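
The metadata fields above, including the Elo seed, might be assembled as follows. The helper name and argument names are illustrative assumptions. The field names, the two source_kind values, and the +50 boost are taken from the criterion.

```python
ELO_BOOST = 50  # head start stated in the acceptance criterion
ALLOWED_SOURCE_KINDS = {"falsified_prediction", "market_resolution"}


def build_question_metadata(
    parent_hypothesis_id: str,
    parent_elo: float,
    field_tag: str,
    source_kind: str,
) -> dict:
    """Build the metadata dict for a spawned open_question artifact."""
    if source_kind not in ALLOWED_SOURCE_KINDS:
        raise ValueError(f"unknown source_kind: {source_kind!r}")
    return {
        "source_kind": source_kind,
        "parent_hypothesis_id": parent_hypothesis_id,
        "field_tag": field_tag,  # inherited unchanged from the parent
        "importance_elo": parent_elo + ELO_BOOST,
    }
```

For example, a parent hypothesis with a prior Elo of 1500 yields a question seeded at 1550.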

☐ Cross-links via artifact_links: link_type='answered_by' (falsified
hypothesis → new question), and the new question gets a
link_type='succeeds' edge back.
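
The two-edge link shape can be illustrated with a small helper. The column names source_id and target_id are assumptions about the artifact_links table, which this spec does not describe. Only the two link_type values and their directions come from the criterion.

```python
def build_link_rows(hypothesis_id: str, question_id: str) -> list[dict]:
    """Return the pair of artifact_links rows for one spawned question."""
    return [
        # Falsified hypothesis -> new question.
        {
            "source_id": hypothesis_id,
            "target_id": question_id,
            "link_type": "answered_by",
        },
        # New question -> falsified hypothesis (the edge back).
        {
            "source_id": question_id,
            "target_id": hypothesis_id,
            "link_type": "succeeds",
        },
    ]
```

Building both rows in one place makes the link-graph shape easy to assert in a test: every spawned question should contribute exactly one edge of each type.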

☐ Acceptance backfill: process the existing falsified hypotheses
(SELECT id FROM hypotheses WHERE status='falsified') and emit ≥30
new open_question artifacts; assert in test.

☐ Idempotent: rerunning over the same falsification set creates 0 new
questions (dedup via question_hash).
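
A rough sketch of hash-based dedup is shown below. It uses an exact hash over normalized text as a simpler stand-in. The work log records that the implementation uses a SimHash question_hash from open_question_miner_wiki, which this sketch does not reproduce. Both function names and the in-memory seen_hashes set are illustrative assumptions.

```python
import hashlib
import re


def question_hash(text: str) -> str:
    """Hash a question after lowercasing and collapsing punctuation and
    whitespace, so trivial rewordings map to the same digest."""
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def filter_new_questions(candidates: list[str], seen_hashes: set[str]) -> list[str]:
    """Return only candidates whose hash has not been seen, recording
    each new hash in `seen_hashes` as a side effect."""
    fresh = []
    for text in candidates:
        digest = question_hash(text)
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        fresh.append(text)
    return fresh
```

Running the filter twice over the same candidates returns nothing the second time, which is the behavior the idempotency criterion asks for. In a real run the seen set would have to be loaded from previously stored artifacts.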

☐ Pytest harness mocks the LLM and verifies link-graph shape, Elo seeding,
and dedup behavior.

Approach

Inspect hypothesis_predictions schema in scidex/core/database.py and
markets table in exchange.py for resolution columns.

Reuse persona selection logic from
scidex/agora/open_question_tournament.py DOMAIN_JUDGES.

Run as a daily systemd timer (scidex-openq-falsified.timer); single-shot
for backfill.

Write resulting question count + sample to
data/scidex-artifacts/reports/openq_falsified_<utc>.json.

Dependencies

q-openq-mine-from-wiki-pages — shared dedup util
b2d85e76-51f3 — open_question schema
47ee9103-ccc0 — Elo tournament reads seeded importance_elo

Work Log

2026-04-27 — Implementation (task:bbe35802)

Schema findings:

hypothesis_predictions.status (not outcome_status) holds 'falsified'/'unsupported'. 5 falsified rows exist (test data).
hypotheses.status='falsified' — 0 rows currently; backfill handles both paths.
markets uses current_price + resolution_price; consensus fallback via metadata['consensus_probability'].

Deliverables:

scidex/agora/open_question_from_falsified.py (378 LoC):

- process_falsified_prediction / process_market_resolution per-source processors
- run_poll(since_hours) / run_backfill(limit) batch runners
- Idempotency: source_id + source_kind guard prevents re-processing
- Dedup: SimHash question_hash from open_question_miner_wiki
- Elo seeding: parent_hypothesis_elo + 50; answered_by + succeeds links
- DOMAIN_JUDGES persona selection from open_question_tournament

tests/agora/test_open_question_from_falsified.py (19 tests, all passing):

field-tag inference, heuristic stub, dedup, link-graph shape, idempotency,
market divergence gate, backfill ≥30 artifacts, Elo boost assertion

deploy/scidex-openq-falsified.{service,timer} — daily at 03:00 UTC

Payload JSON

{
  "completion_shas": [
    "15fa5de4c5dee93403dfb7df8cb37b372880e41f"
  ],
  "completion_shas_checked_at": ""
}

Sibling Tasks in Quest (Open Questions as Ranked Artifacts)

○ [Senate] Registry of papers that USED a SciDEX-generated hypothesis (P84)

○ [Atlas] Add pathway diagrams to 20 hypotheses missing mechanism maps (P83)

○ [Atlas] Link 25 wiki pages missing KG node mappings (P80)

○ [Atlas] Rank 25 analyses missing world-model impact scores (P80)

✓ [Atlas/UI] Per-field landing pages: aggregate landscape + gaps + open questions + proposals + experts at /science/ (P95)

✓ [Atlas/feat] open_question artifact_type: schema, populate, per-field views (P94)

✓ [Atlas/UI] open_question + proposal detail pages: render artifact_type=open_question and four proposal kinds with discussion + provenance tabs (P93)

[Atlas] Spawn open questions from falsified hypotheses + market-resolved misses done