Today a researcher who finds a new paper contradicting hypothesis X
has no way to ask "if I attach this contradicting paper to X, how
much would X's Elo rating and composite score drop?" They have to
wait for the debate and score recalculation to run. Build an
interactive what-if explorer that lets users (humans and agents)
toggle evidence on or off, add hypothetical new citations, change
persona stances, and see the predicted Elo and composite score in
real time, without committing the change.
Effort: thorough
- scidex/exchange/what_if.py::simulate(hypothesis_id, mutations: list[Mutation]) -> SimResult, where Mutation ∈ {AddEvidence(pmid, stance), RemoveEvidence(pmid), FlipPersonaStance(persona, new_stance), OverrideDimensionScore(dim, new_score)}.
- SimResult returns {new_composite, delta_composite, new_elo_predicted, delta_elo, contributing_factors: [{factor, contribution_pct}], confidence_band: [low, high], simulated_at}.
- Composite recompute: the recalibrate_scores.py formula, applied to the mutated evidence list. Pure function, no DB writes.
- Elo prediction: the scidex/exchange/elo_ratings.py Bradley-Terry probability against a synthetic opponent at the cohort median rating; returns delta = predicted_log_odds_after - log_odds_before, mapped back to Elo points.
- what_if_simulations(id, hypothesis_id, mutations_json, result_json, simulated_by, simulated_at): audit log table for usage analytics and abuse detection.
- /hypothesis/{id}/what-if shows the current evidence list with a toggle next to each PMID, an "add hypothetical PMID" form, and a real-time score panel that updates on every change (debounce 250 ms).
- POST /api/exchange/what-if/simulate accepts {hypothesis_id, mutations: [...]} and returns SimResult. Rate limit: 60 req/min per user (defense against scraping).
- A simulation can be submitted as an evidence_change_proposal for community vote, attaching the simulated delta as motivation.
- tests/test_what_if.py: removing the top supporting evidence drops composite by ≥ 5 %; adding 3 strong contradicting cites flips strength to disputed; flipping the Skeptic stance from oppose→support raises composite; mutations are pure (DB unchanged).
- If delta_elo is within 1 σ of zero, badge it as "no significant predicted change".
- Expose the composite formula as compute_composite(hypothesis_dict, evidence_list, dim_scores) -> float so simulation can call it on a mutated copy without touching the DB.
- scidex/senate/governance.py::create_proposal().
- scidex/exchange/elo_ratings.py: Bradley-Terry math.
- recalibrate_scores.py: composite formula.
- q-impact-claim-attribution: uses what-if simulations to attribute score deltas to contributors.
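
The pure-mutation and Elo-delta steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the helper names (apply_mutations, toy_composite, win_probability, elo_delta), the evidence dict shape, and the stance strings are all hypothetical. toy_composite is only a stand-in for the real formula in recalibrate_scores.py, and the 400-point logistic scale is the conventional Elo scale, which scidex/exchange/elo_ratings.py may or may not use. Only two of the four mutation types are shown.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Union


# Two of the mutation types named in the spec; field types are assumed.
@dataclass(frozen=True)
class AddEvidence:
    pmid: str
    stance: str  # assumed values: "support" or "contradict"


@dataclass(frozen=True)
class RemoveEvidence:
    pmid: str


Mutation = Union[AddEvidence, RemoveEvidence]


def apply_mutations(evidence: list[dict], mutations: list[Mutation]) -> list[dict]:
    """Return a mutated copy of the evidence list; the input is never modified."""
    result = [dict(e) for e in evidence]  # copy each entry so callers' data stays intact
    for m in mutations:
        if isinstance(m, AddEvidence):
            result.append({"pmid": m.pmid, "stance": m.stance})
        elif isinstance(m, RemoveEvidence):
            result = [e for e in result if e["pmid"] != m.pmid]
    return result


def toy_composite(evidence: list[dict]) -> float:
    """Placeholder score: share of supporting evidence. NOT the real composite formula."""
    if not evidence:
        return 0.5
    support = sum(1 for e in evidence if e["stance"] == "support")
    return support / len(evidence)


def win_probability(rating: float, opponent: float) -> float:
    """Bradley-Terry / Elo expected score on the conventional 400-point scale (assumed)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_delta(p_before: float, p_after: float) -> float:
    """Map a change in natural log-odds of winning back to Elo points."""
    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))

    # On the 400-point scale, one unit of natural log-odds equals 400 / ln(10) Elo points.
    return (logit(p_after) - logit(p_before)) * 400.0 / math.log(10)
```

For example, apply_mutations(evidence, [AddEvidence("3", "contradict")]) returns a new list and leaves evidence unchanged, which is the property the "mutations pure (DB unchanged)" test checks for. How the post-mutation win probability is predicted from the new composite is not specified in the text above, so elo_delta takes both probabilities as inputs.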