[Atlas] Cross-reference open questions to bearing hypotheses, papers, experiments done

pgvector kNN + LLM-judged typed edges populate an evidence panel on the question detail page.

Completion Notes

Auto-release: work already on origin/main

Git Commits (6)

Squash merge: orchestra/task/a4c450f7-biomni-analysis-parity-port-15-use-cases (87 commits) (#717) — 2026-04-27
Squash merge: orchestra/task/08801859-cross-reference-open-questions-to-bearin (4 commits) (#707) — 2026-04-27
[Atlas] Work log: rebase onto d8719e12a (main #703) [task:08801859-64d9-4b86-b2d4-d5acb7c090cf] — 2026-04-27
[Atlas] Work log: rebase onto 60003486c, push confirmed [task:08801859-64d9-4b86-b2d4-d5acb7c090cf] — 2026-04-27
[Atlas] Cross-reference spec work log update [task:08801859-64d9-4b86-b2d4-d5acb7c090cf] — 2026-04-27
[Atlas] Cross-reference open questions to bearing hypotheses, papers, experiments [task:08801859-64d9-4b86-b2d4-d5acb7c090cf] — 2026-04-27
Spec File

Goal

A ranked open question is only useful if a researcher landing on it can see what evidence currently bears on it — which hypotheses propose answers,
which papers provide partial evidence, which experiments would discriminate
between candidate answers. Right now open_question artifacts have no
incoming/outgoing typed edges to the rest of the knowledge graph. Build a
matcher that populates artifact_links and knowledge_edges for every
open question so the detail page (api.py:_render_open_question_detail,
line ~26912) can render an evidence panel.

Acceptance Criteria

☐ New module scidex/agora/open_question_evidence_matcher.py (≤700 LoC).
☐ For each open_question artifact with metadata.evidence_summary:
- Embed the question_text + evidence_summary using
scidex.atlas.vector_search (existing pgvector util on
artifact_embeddings).
- kNN over hypothesis embeddings → top 10; LLM judge filters to
{relates_to, supports_answer_a, supports_answer_b, refutes} with
a confidence score.
- kNN over paper embeddings (where paper_embeddings exists) → top 20;
LLM judge filters with the same labels.
- kNN over experiment artifacts (artifact_type IN
('experiment','experiment_proposal')) → top 10.
☐ Emit edges:
- artifact_links(source_artifact_id=<question>, target_artifact_id=<other>,
link_type IN ('bears_on_question','partial_answer_for','candidate_answer','discriminating_experiment'),
metadata={confidence, judge_persona, model}).
- One row per match, idempotent on (source, target, link_type).
☐ New endpoint GET /api/open_question/{id}/evidence returns a tree:
{question, candidate_answers:[{hypothesis, support_score, papers:[]}],
partial_evidence:[], discriminating_experiments:[]}.
☐ Update _render_open_question_detail to include an "Evidence bearing
on this question" section using the new endpoint.
☐ Backfill: process all open_question artifacts, target ≥3 evidence
links per question on average. Cost ceiling $20.
☐ Pytest: vector match stub, judge stub, edge dedup, endpoint shape.
☐ Report data/scidex-artifacts/reports/openq_evidence_xref_<utc>.json
with average edges/question, low-coverage questions, and per-field
coverage breakdown.
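The matching flow in the criteria above can be sketched as a single per-question pass with injected embed/kNN/judge/emit callables. This mirrors the test-stub criterion rather than the real module; every name here (Match, match_question, the dict keys) is illustrative:

```python
from dataclasses import dataclass

@dataclass
class Match:
    target_id: str
    label: str         # e.g. 'relates_to', 'supports_answer_a', 'refutes'
    confidence: float

def match_question(question, embed, knn, judge, emit_link,
                   k_hypotheses=10, k_papers=20, k_experiments=10):
    """Embed question_text + evidence_summary, run kNN per corpus,
    LLM-judge filter the candidates, emit one link per accepted match."""
    vec = embed(question["question_text"] + "\n" + question["evidence_summary"])
    emitted = []
    for corpus, k in (("hypothesis", k_hypotheses),
                      ("paper", k_papers),
                      ("experiment", k_experiments)):
        candidates = knn(vec, corpus, k)                   # [(target_id, sim), ...]
        for match in judge(question, corpus, candidates):  # -> [Match, ...]
            emit_link(question["id"], match.target_id, match.label,
                      {"confidence": match.confidence})
            emitted.append(match)
    return emitted
```

With stubbed callables this gives the pytest criterion its shape: one judged pass per corpus, links emitted only for judge-accepted candidates.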

Approach

  • Read scidex/atlas/vector_search.py for the existing pgvector helpers
    and the artifact_embeddings / paper_embeddings table shapes.
  • Reuse DOMAIN_JUDGES persona selection from
    scidex/agora/open_question_tournament.py for the LLM filter step.
  • Run as a daily systemd timer scoped to questions whose evidence_xref_at
    metadata field is older than 14 days or null (incremental refresh).
  • Surface low-coverage questions (<2 evidence links) to the gap pipeline
    so they can request paper enrichment.
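The staleness scoping can be expressed as a query like the following. This is a minimal SQLite sketch with a flat evidence_xref_at column; the real store keeps that field in artifact metadata, so the table shape and function name here are illustrative:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def stale_question_ids(conn, now=None, window_days=14, limit=300):
    """Questions never cross-referenced (NULL), or refreshed more than
    window_days ago, never-processed ones first."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=window_days)).isoformat()
    rows = conn.execute(
        """SELECT id FROM artifacts
           WHERE artifact_type = 'open_question'
             AND (evidence_xref_at IS NULL OR evidence_xref_at < ?)
           ORDER BY evidence_xref_at IS NOT NULL, evidence_xref_at
           LIMIT ?""",
        (cutoff, limit),
    ).fetchall()
    return [r[0] for r in rows]
```

ISO-8601 UTC timestamps compare correctly as strings, which is what makes the `< ?` cutoff comparison valid here.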

    Dependencies

    • b2d85e76-51f3 — open_question schema
    • q-openq-mine-from-wiki-pages and siblings — populate the corpus
    • scidex/atlas/vector_search.py — must have artifact_embeddings populated

    Work Log

    2026-04-27 — Rebase onto current main (d8719e12a)

    • Branch was at 72b3d434c (rebased to 60003486c); main has moved to d8719e12a (#703).
    • Resolved .orchestra-slot.json conflict (ours), dropped "restore files" commit as upstream.
    • After rebase: 3 commits, 5 files, 1226 lines added, 2 deleted vs main.
    • Tests: 23/23 passing after rebase.

    2026-04-27 12:55 PT — Rebase onto current main (60003486c)

    • Branch was based on 4f99df497; main has moved to 60003486c.
    • Resolved the stash/unstash .orchestra-slot.json conflict as ours (slot_id 79, not stale slots 73/44).
    • After rebase: 5 files, 1219 lines added, 2 deleted.
    • Tests: 23/23 passing after rebase.
    • Committed and pushed (45381fba2); everything up-to-date with origin.

    2026-04-27 12:08 PT — Rebase onto current main; force-push

    • After rebase onto origin/main (4f99df497), conflict resolved in .orchestra-slot.json
    (stale slot_id 44 → 42, ours is correct). Force-pushed ceda8384f to origin.
    • All acceptance criteria unchanged; tests still 23/23 passing.
    • 6 files touched, 1214 lines added, 3 deleted.

    2026-04-27 — Implementation (task:08801859-64d9-4b86-b2d4-d5acb7c090cf)

    Staleness review: task is valid — no existing cross-reference matcher, no
    evidence links in artifact_links for these types, 7838 open questions awaiting
    processing.

    Infrastructure gap: the pgvector extension is not installed and the
    artifact_embeddings / paper_embeddings tables don't exist. Used
    sentence-transformers (all-MiniLM-L6-v2, already available) with in-memory
    numpy cosine similarity as the kNN backend — equivalent functionality, no
    new dependencies.
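The in-memory fallback amounts to a cosine-similarity top-k over pre-encoded vectors. A minimal sketch, assuming the rows of corpus_matrix are the cached all-MiniLM-L6-v2 embeddings (function name and the 0.3 floor are illustrative, not taken from the module):

```python
import numpy as np

def knn_cosine(query_vec, corpus_matrix, k=10, floor=0.3):
    """Top-k corpus rows by cosine similarity to query_vec,
    dropping anything under the similarity floor."""
    q = query_vec / np.linalg.norm(query_vec)
    m = corpus_matrix / np.linalg.norm(corpus_matrix, axis=1, keepdims=True)
    sims = m @ q                       # cosine similarity per corpus row
    order = np.argsort(-sims)[:k]      # best-first indices
    return [(int(i), float(sims[i])) for i in order if sims[i] >= floor]
```

At 7838 questions times a few thousand corpus rows this stays comfortably in memory, which is presumably why no pgvector install was needed.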

    What was built:

  • migrations/add_open_question_link_types.py — adds four new link types to
    the chk_link_type CHECK constraint on artifact_links:
    bears_on_question, candidate_answer, partial_answer_for,
    discriminating_experiment. Also creates a partial unique index
    (idx_artifact_links_evidence_dedup) for idempotent inserts.
    Migration applied successfully.

  • scidex/agora/open_question_evidence_matcher.py (≤700 LoC):
    - load_corpus() — loads hypothesis + experiment artifacts from the DB into memory
    - build_embeddings() — encodes the corpus with the sentence-transformer (cached)
    - knn() — cosine-similarity ranking with a similarity floor
    - judge_candidates() — single LLM call per question batching all candidates;
      uses DOMAIN_JUDGES personas; parses JSON with code-fence + truncation fallback
    - emit_links() — idempotent INSERT via ON CONFLICT DO NOTHING on the
      partial unique index; stamps evidence_xref_at in metadata
    - run_batch() — incremental refresh respecting the 14-day staleness window
      and the USD cost ceiling
    - get_evidence_for_question() — retrieves the evidence panel from the DB
    - generate_report() — writes a JSON report to data/scidex-artifacts/reports/

  • api.py changes:
    - New GET /api/open_question/{id}/evidence endpoint (line ~80549)
    - Updated _render_open_question_detail to accept an evidence_rows parameter
      and render an "Evidence Bearing On This Question" panel with colour-coded
      link types
    - Updated the call site in artifact_detail to query and pass evidence_rows
  • tests/test_open_question_evidence_matcher.py — 23 passing tests covering
    kNN, judge output parsing, emit_links edge dedup, and endpoint shape.

  • Partial backfill run: 25 questions processed, 14 evidence links emitted
    across 4 link types. Report at
    data/scidex-artifacts/reports/openq_evidence_xref_20260427T113408Z.json.
    Larger backfill running in the background (limit=300, cost_ceiling=$10).

    Sibling Tasks in Quest (Open Questions as Ranked Artifacts)