Multi-persona debates produce structured Skeptic critiques and Synthesizer
summaries that are dense with phrases like "we cannot resolve X without Y" or
"open question Z would discriminate between H1 and H2". Today these die in
the debate transcript with no first-class artifact representation. Mine debate
sessions for residual open questions so they land in the Q-OPENQ leaderboards
and become discussable, rankable artifacts cross-linked to the debate that
spawned them.
scidex/agora/open_question_miner_debates.py (≤500 LoC).
debate_sessions and analysis_sessions PostgreSQL tables; synthesizer_output IS NOT NULL and mined_open_questions_at IS NULL (new column added by this task).
migrations/q_openq_debate_mined_marker.sql adds mined_open_questions_at TIMESTAMPTZ NULL to both debate_sessions and analysis_sessions plus a partial index WHERE mined_open_questions_at IS NULL.
{question_text, field_tag,
scidex.core.llm.complete (cheap tier, JSON-mode).
open_question artifact:
metadata.source_kind='debate_session', metadata.source_id=<session_id>;
artifact_links row with link_type='derived_from' pointing
link_type='counter_evidence_for' toward the hypothesis the debate
field_tag from the parent hypothesis when LLM is unsure
mined_open_questions_at = NOW() is set so
question_hash produced by q-openq-mine-from-wiki-pages.
scidex/agora/synthesis_engine.py and the debate-session schema in scidex/core/database.py to understand session structure and where synthesizer_output lives (jsonb).
scidex/agora/extraction_quality.py for the
scidex.agora.open_question_miner_wiki (shared util in scidex/agora/_question_dedup.py).
scripts/backfill_openq_from_debates.py
data/scidex-artifacts/reports/.
q-openq-mine-from-wiki-pages: provides the _question_dedup.py shared util
b2d85e76-51f3: open_question artifact schema

Files created:
migrations/q_openq_debate_mined_marker.sql: adds mined_open_questions_at TIMESTAMPTZ NULL to debate_sessions and analyses (spec called it analysis_sessions; actual table is analyses); partial indexes on both; migration applied live.
scidex/agora/open_question_miner_debates.py: 300 LoC miner. Imports question_hash, _load_existing_hashes, and _is_near_duplicate from open_question_miner_wiki; extracts the synthesizer turn and top-3 skeptic turns from debate_sessions.transcript_json; passes them to the LLM (scidex.core.llm.complete) in JSON-mode; registers open_question artifacts via artifact_registry.register_artifact; creates a derived_from link to the debate artifact and a counter_evidence_for link to the hypothesis; stamps mined_open_questions_at = NOW() for idempotency. CLI: python -m scidex.agora.open_question_miner_debates --batch 500.
tests/test_open_question_miner_debates.py: 23 tests, all passing. Covers: empty synthesizer (no-op + marks mined), idempotent rerun (already-mined skipped), dedup via exact question_hash, cross-link insertion, transcript parsing helpers, heuristic fallback.

analysis_sessions table does not exist; the actual table is analyses. Migration targets analyses instead.
synthesizer_output is not a standalone column; it lives inside transcript_json as the round where persona == 'synthesizer'. The miner extracts it with _extract_synthesizer_output(transcript).
_question_dedup.py shared util was not created (wiki miner's helpers imported directly to avoid churn on existing code).
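
The transcript parsing and exact-hash dedup described above might look roughly like the following. This is a minimal sketch, not the miner's actual code: it assumes transcript_json decodes to a list of dicts with "persona" and "content" keys, it ranks skeptic turns by length because the notes do not say how the top 3 are chosen, and every function name ending in _sketch is hypothetical. The real question_hash in open_question_miner_wiki may normalize text differently.

```python
import hashlib
import re


def extract_synthesizer_output_sketch(transcript):
    """Return the content of the first round whose persona is 'synthesizer'.

    Assumes each round is a dict with 'persona' and 'content' keys
    (an assumption; the real transcript_json shape may differ).
    """
    for turn in transcript or []:
        if str(turn.get("persona", "")).lower() == "synthesizer":
            return turn.get("content") or None
    return None


def top_skeptic_turns_sketch(transcript, limit=3):
    """Return up to `limit` skeptic turns, longest first.

    Length is only a stand-in ranking; the source does not specify one.
    """
    skeptic = [
        turn["content"]
        for turn in transcript or []
        if str(turn.get("persona", "")).lower() == "skeptic" and turn.get("content")
    ]
    return sorted(skeptic, key=len, reverse=True)[:limit]


def question_hash_sketch(text):
    """Hash a question after lowercasing and collapsing non-alphanumerics."""
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def filter_new_questions_sketch(candidates, existing_hashes):
    """Drop candidates whose hash is already known or repeats within the batch."""
    seen = set(existing_hashes)
    fresh = []
    for question in candidates:
        digest = question_hash_sketch(question)
        if digest not in seen:
            seen.add(digest)
            fresh.append(question)
    return fresh
```

A session with no synthesizer round yields None from the extractor, which corresponds to the "empty synthesizer" case in the tests: nothing is registered, but the session would still be stamped as mined so it is not revisited.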