[Senate] Question emitter - question comments spawn open_question artifacts
Status: done

Question-classified comments mint open_question artifacts that enter the per-field Elo ranking, with title-hash dedup and a derives_from link back to the host artifact.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (2)

Squash merge: orchestra/task/a4c450f7-biomni-analysis-parity-port-15-use-cases (87 commits) (#717) 2026-04-27
[Senate] Question emitter: question comments spawn open_question artifacts [task:4320d55a-47a6-492f-9ad6-096bf1580ac4] (#664) 2026-04-27

Spec File

Goal

Close the question arm of percolation: when a comment classified as question is posted on any artifact, the system mints a new open_question artifact (the artifact_type already used by the open-question
quest b307ad54-a95), seeded with the comment text, attributed to the
comment author, and linked back to the host artifact via a typed artifact_link. The new open_question then enters the per-field Elo ranking
that quest already owns. Today every question buried in a discussion
disappears; this emitter promotes them to first-class artifacts that compete
for attention.

Acceptance Criteria

☑ New module scidex/senate/question_emitter.py with
scan_candidates, extract_question_text, emit_open_question,
run_once mirroring the action / refutation emitter shape.
☑ Selection: artifact_comments with
comment_type_labels::jsonb @> '["question"]' AND
spawned_open_question_id IS NULL. No consensus required (questions
need not be agreed-upon to be tracked).
☑ Migration migrations/20260428_question_emitter.sql:
- ALTER TABLE artifact_comments ADD COLUMN IF NOT EXISTS
spawned_open_question_id text, plus a partial index.
- CREATE TABLE comment_question_emitter_runs ... mirroring
existing *_emitter_runs tables.
☑ extract_question_text(comment_content) -> str | None: returns the
first sentence ending in ? if the comment text contains one;
otherwise returns the first 280 chars. Pure helper; unit-tested.
☑ emit_open_question(...): creates an artifact via
scidex.atlas.artifact_registry.register_artifact with
artifact_type='open_question',
title=extracted_question_text,
created_by=comment.author_id,
metadata={"source_comment_id":..., "host_artifact_id":...,
"field": host_artifact.metadata.get("field")}.
Inherits the host's field so the per-field Elo from
scidex/agora/open_question_tournament.py (ENTITY_TYPE='open_question')
picks it up automatically.
☑ Provenance: writes an artifact_provenance row with
action_kind='spawn_proposal', the closest existing kind already
allowed by the check constraint. Reusing it avoids another schema
migration; document the choice in the Work Log.
☑ Link: artifact_link with link_type='derives_from',
source_artifact_id=open_question_id,
target_artifact_id=host_artifact_id, lifecycle='confirmed'.
☑ De-dup: unique partial index on spawned_open_question_id.
Bonus: also de-dup by question text similarity within the same field
using a simple normalized-title hash (lowercase + strip + sha1) before
minting; if a match exists, link to the existing open_question
instead of creating a duplicate.
☑ API: POST /api/senate/question_emitter/run and
GET /api/senate/question_emitter/status.
☑ Tests: extract first-sentence vs no-question-mark, dedup-by-hash,
dry-run no-op, end-to-end emit creates artifact + provenance + link.
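
The extract_question_text criterion above can be illustrated with a minimal sketch. It is not the shipped helper: the sentence-splitting regex and the choice to return None for empty input are assumptions, since the spec only fixes the "first sentence ending in ?" rule and the 280-char fallback.

```python
from __future__ import annotations

import re

MAX_FALLBACK_CHARS = 280  # fallback length stated in the spec


def extract_question_text(comment_content: str | None) -> str | None:
    """Return the first sentence ending in '?', else the first 280 chars."""
    text = (comment_content or "").strip()
    if not text:
        # Assumption: empty or whitespace-only input yields None.
        return None
    # Assumption: sentences end at '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if sentence.endswith("?"):
            return sentence
    # No question mark found: fall back to a truncated prefix.
    return text[:MAX_FALLBACK_CHARS]
```

Because the function touches no database, it can be unit-tested directly with plain strings.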

Approach

  • Read the open_question wiki backfill task (already done in this quest;
    the [Atlas/feat] Wiki TODO/Open-Question section parser is the pattern
    for creating open_question artifacts) to copy the field-inheritance and
    create-artifact call shape.
  • Build the title-hash dedup helper and unit-test it against synthetic
    near-duplicates ("Why does X?" vs "why does x?").
  • Implement migration + emitter + routes + tests.
  • Smoke-test by classifying a synthetic question comment and verifying the
    new open_question lands on the per-field Elo leaderboard.
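
The title-hash helper from the second step could look roughly like the sketch below. It follows the "lowercase + strip + sha1" recipe from the acceptance criteria and the 12-char hex truncation noted in the Work Log; the exact order of normalization steps in the real module is an assumption.

```python
import hashlib


def title_hash(title: str) -> str:
    """Hash a normalized title so near-duplicate questions collide."""
    # Normalize: trim surrounding whitespace and lowercase.
    normalized = title.strip().lower()
    # SHA-1 hex digest, truncated to 12 characters.
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]
```

With this normalization, "Why does X?" and "  why does x?  " hash to the same value, while questions that differ in punctuation or internal spacing do not.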

Dependencies

  • q-perc-comment-classifier-v1: supplies comment_type_labels.
  • Open-question quest schema (already shipped: b2d85e76 per the wiki
    parser task description).

Dependents

  • q-perc-comment-trace-ui: surfaces "your question is now tracked open
    question X, current Elo Y".

Work Log

2026-04-27 11:00 UTC - Slot 79 (minimax)

  • Staleness review: Task branch orchestra/task/4320d55a-question-emitter-question-comments-spawn
    created 2026-04-27T10:53. Worktree was clean; rebased against origin/main (e9ab5b9aa).
    Task title + acceptance criteria remain fully valid; no sibling has addressed this yet.
  • Spec read: understood goal (question-classified comments mint open_question artifacts entering per-field Elo).
  • Existing patterns studied: read action_emitter.py, refutation_emitter.py, and
    open_question_miner_wiki.py. Used register_artifact + create_link from
    scidex.atlas.artifact_registry (same pattern as wiki miner). Reused
    spawn_proposal for action_kind (already in CHECK constraint, avoids migration).
    Reused derives_from for link_type (already allowed per chk_link_type).

  • Migration migrations/20260428_question_emitter.sql:
    - Applied ALTER TABLE artifact_comments ADD COLUMN IF NOT EXISTS spawned_open_question_id text
      + partial index idx_ac_spawned_open_question_id.
    - Applied CREATE TABLE comment_question_emitter_runs (...) + index.

  • Module scidex/senate/question_emitter.py: implemented all four functions
    (scan_candidates, extract_question_text, emit_open_question, run_once) plus
    get_audit_stats and CLI. Key design decisions:
    - extract_question_text: returns first ?-ending sentence, else first 280 chars.
      Pure function, no DB access.
    - title_hash: SHA1(normalized_lower) → 12-char hex, used for field-scoped dedup
      before minting.
    - emit_open_question: checks dedup via _find_existing_by_hash before creating.
      Field inherited from host_artifact_metadata.field_tag or metadata.field or
      defaults to "neurodegeneration".
    - No consensus gate (questions don't need agreement to be tracked).
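
The field fallback and the dedup-before-mint decision described above can be sketched in memory as follows. This is an illustration only: resolve_field and mint_or_reuse are hypothetical names, the dict index stands in for the database lookup that _find_existing_by_hash performs, and the lookup order (field_tag, then field, then the default) is one reading of the Work Log entry.

```python
from __future__ import annotations

DEFAULT_FIELD = "neurodegeneration"  # default named in the Work Log


def resolve_field(host_metadata: dict | None) -> str:
    """Pick the field for a new open_question from the host's metadata."""
    meta = host_metadata or {}
    # Assumed order: field_tag first, then field, then the default.
    return meta.get("field_tag") or meta.get("field") or DEFAULT_FIELD


def mint_or_reuse(
    index: dict[tuple[str, str], str],
    field: str,
    hash_value: str,
    new_id: str,
) -> tuple[str, bool]:
    """Return (open_question_id, created) for a field-scoped title hash.

    `index` is a hypothetical in-memory stand-in for the DB lookup.
    """
    key = (field, hash_value)
    existing = index.get(key)
    if existing is not None:
        # Same hash in the same field: link to the existing question.
        return existing, False
    index[key] = new_id
    return new_id, True
```

Scoping the key by field means the same question text can exist once per field, which matches the per-field Elo ranking the new artifacts feed into.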

  • API routes in api_routes/senate.py:
    - POST /api/senate/question_emitter/run: delegates to _qe.run_once
    - GET /api/senate/question_emitter/status: delegates to _qe.get_audit_stats

  • Tests tests/test_question_emitter.py: 12 tests covering:
    - extract_question_text: first ? sentence, no ?, long text, empty input
    - title_hash: case insensitivity, whitespace, 12-char hex format
    - emit_open_question: dry_run flag, empty content error
    - scan_candidates: column shape via mock DB

  • All 12 tests pass. Smoke-tested run_once(dry_run=True) against live DB: 1 candidate found.
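
To make the dry-run behaviour concrete, here is a small in-memory sketch of the selection rule and a dry-run pass. The function names, the row dicts, and the returned summary keys are all hypothetical; the real scan_candidates and run_once query artifact_comments in the database, and their signatures are not shown in this spec.

```python
from __future__ import annotations


def scan_candidates_in_memory(rows: list[dict]) -> list[dict]:
    """Keep comments labelled 'question' that have not spawned anything yet."""
    return [
        row
        for row in rows
        if "question" in (row.get("comment_type_labels") or [])
        and row.get("spawned_open_question_id") is None
    ]


def run_once_in_memory(rows: list[dict], dry_run: bool = True) -> dict:
    """Count candidates; only mark them as emitted when dry_run is False."""
    candidates = scan_candidates_in_memory(rows)
    emitted = 0
    if not dry_run:
        for n, row in enumerate(candidates, start=1):
            # Placeholder id; a real run would store the new artifact's id.
            row["spawned_open_question_id"] = f"oq-demo-{n}"
            emitted += 1
    return {"candidates": len(candidates), "emitted": emitted, "dry_run": dry_run}
```

A dry run reports how many comments would be promoted without writing anything, and a repeated live run finds no candidates because spawned_open_question_id is already set.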

    Sibling Tasks in Quest (Percolation Engine) ↗