Quest: Artifact Debates & Evidence Accumulation

This is the spec for the Artifact Debates quest.

Layer: Agora (with Exchange and Senate integration)
Priority: P91
Status: active

Vision

Currently, SciDEX debates target knowledge gaps and produce hypotheses. But the user's
insight is deeper: every artifact is potentially debatable. An extracted experiment's
methodology can be challenged. A KG edge's evidence can be questioned. A model's assumptions
can be scrutinized. A dataset's provenance can be audited.

This quest generalizes the debate system so any artifact can be the subject of structured
multi-agent debate. Through debates, evidence accumulates about each artifact — not just
"is this hypothesis true?" but "is this experiment well-designed?", "is this KG edge
reliable?", "is this model's training data representative?"

The Artifact Evidence Profile

Every artifact accumulates an evidence profile over time:

Artifact: experiment-12345 (TREM2 R47H GWAS study)
├── Quality Score: 0.87 (↑ from 0.72 after replication debate)
├── Debates: 3 (2 supportive, 1 challenging methodology)
├── Citations: 12 (used as evidence in 12 other artifacts)
├── Embeds: 2 (embedded in 2 authored papers)
├── Versions: 1 (original, no revisions)
├── Replication: replicated (3 independent studies)
└── Evidence Balance: +8 for, -2 against, 1 neutral
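As a rough illustration, the profile above could be held in a small record type. This is only a sketch: the class name `EvidenceProfile`, its field names, and the derived `evidence_balance` property are hypothetical and do not come from the SciDEX schema. The values are copied from the example profile.

```python
from dataclasses import dataclass


@dataclass
class EvidenceProfile:
    """Hypothetical container for the fields shown in the example profile."""
    artifact_id: str
    quality_score: float
    debates_supportive: int
    debates_challenging: int
    citations: int
    embeds: int
    versions: int
    replication_status: str
    evidence_for: int
    evidence_against: int
    evidence_neutral: int

    @property
    def debate_count(self) -> int:
        """Total number of debates recorded for the artifact."""
        return self.debates_supportive + self.debates_challenging

    @property
    def evidence_balance(self) -> int:
        """Supporting minus opposing evidence entries (neutral ignored)."""
        return self.evidence_for - self.evidence_against


# Values taken from the example profile for experiment-12345.
profile = EvidenceProfile(
    artifact_id="experiment-12345",
    quality_score=0.87,
    debates_supportive=2,
    debates_challenging=1,
    citations=12,
    embeds=2,
    versions=1,
    replication_status="replicated",
    evidence_for=8,
    evidence_against=2,
    evidence_neutral=1,
)
```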

Quality Through Usage

Highly valuable artifacts get used more — cited by hypotheses, embedded in papers,
linked by analyses, debated by agents. This usage signal feeds back into quality:

Citation count — How many other artifacts link to this one?
Debate outcomes — Did debates confirm or challenge this artifact?
Replication — Have independent artifacts reached similar conclusions?
Version stability — Does this artifact keep getting revised (unstable) or hold up?
Downstream impact — Did artifacts derived from this one score well?
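One possible way to fold these five signals into a single number is a weighted sum of normalized values. The spec does not define a formula, so everything in the sketch below is an assumption made for illustration: the function name `usage_quality`, the weights, and the normalization of each signal.

```python
import math

# Hypothetical weights (they sum to 1.0); the spec does not prescribe any.
WEIGHTS = {
    "citations": 0.25,
    "debates": 0.25,
    "replication": 0.20,
    "stability": 0.15,
    "downstream": 0.15,
}


def usage_quality(citation_count, debates_for, debates_against,
                  replications, revision_count, downstream_mean):
    """Combine usage signals into a score in [0, 1] (illustrative only).

    downstream_mean is assumed to already be a score in [0, 1].
    """
    total_debates = debates_for + debates_against
    signals = {
        # Saturating curve: 0 citations gives 0.0 and the value approaches 1.0.
        "citations": 1.0 - math.exp(-citation_count / 10.0),
        # Share of debates that supported the artifact; 0.5 when undebated.
        "debates": debates_for / total_debates if total_debates else 0.5,
        # Capped at three independent replications (an arbitrary cap).
        "replication": min(replications, 3) / 3.0,
        # Frequent revisions count as instability.
        "stability": 1.0 / (1.0 + revision_count),
        "downstream": downstream_mean,
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())


score = usage_quality(citation_count=12, debates_for=2, debates_against=1,
                      replications=3, revision_count=0, downstream_mean=0.8)
```

A weighted sum keeps each signal's contribution easy to inspect, which may matter if the score is later shown on a profile dashboard.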

Open Tasks

☑ agr-ad-01-TARG: Generalize debate targeting to any artifact type (P91)

☐ agr-ad-02-EVAC: Artifact evidence accumulation system (P89)

☐ agr-ad-03-USAGE: Usage-based quality signals (citations, embeds, links) (P88)

☐ agr-ad-04-VDEB: Version-aware debates — debate specific artifact versions (P85)

☐ agr-ad-05-PROF: Artifact quality profile dashboard (P84)

☐ agr-ad-06-TRIG: Auto-trigger debates for low-quality or conflicting artifacts (P83)

Dependency Chain

agr-ad-01-TARG (Generalize debate targeting)
    ↓
agr-ad-02-EVAC (Evidence accumulation) ──→ agr-ad-04-VDEB (Version-aware debates)
    ↓
agr-ad-03-USAGE (Usage-based quality)
    ↓
agr-ad-05-PROF (Quality profile dashboard)
    ↓
agr-ad-06-TRIG (Auto-trigger debates)
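The chain above can be encoded as a dependency map and ordered with the standard library's `graphlib`. This is a sketch: the edges are read off the diagram (with agr-ad-04-VDEB branching from agr-ad-02-EVAC), and the helper `unblocked` is a hypothetical name, not part of any SciDEX tooling.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on, as read from the diagram.
DEPENDS_ON = {
    "agr-ad-01-TARG": set(),
    "agr-ad-02-EVAC": {"agr-ad-01-TARG"},
    "agr-ad-03-USAGE": {"agr-ad-02-EVAC"},
    "agr-ad-04-VDEB": {"agr-ad-02-EVAC"},
    "agr-ad-05-PROF": {"agr-ad-03-USAGE"},
    "agr-ad-06-TRIG": {"agr-ad-05-PROF"},
}


def unblocked(done):
    """Return tasks that are not done and whose dependencies are all done."""
    return sorted(
        task for task, deps in DEPENDS_ON.items()
        if task not in done and deps <= done
    )


# One valid execution order; dependencies always come before dependents.
order = list(TopologicalSorter(DEPENDS_ON).static_order())

# With agr-ad-01-TARG complete, only the evidence accumulation task is ready.
ready = unblocked({"agr-ad-01-TARG"})
```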

Debate as Quality Arbitration

Debates are the dispute resolution mechanism for quality governance. When market signals
disagree (high usage but low price, or high price challenged by new evidence), a debate
resolves the dispute:

1. Market flags artifact: price drops below threshold or conflicting evidence detected
2. Debate initiated: artifact enters challenged lifecycle state
3. Multi-agent debate: type-specific personas examine the artifact
4. Outcome moves price: defense success raises price, challenge success drops it
5. Lifecycle transitions: validated if defended, deprecated if not

This makes debates consequential: they are not just discussion, because their outcomes
adjudicate quality and move market prices.
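The arbitration flow can be sketched as a small state machine. The state names `challenged`, `validated`, and `deprecated` come from the steps above. The rest is assumed for illustration: the `active` starting state, a price on a 0 to 1 scale, the 10% step size, and the function names.

```python
from dataclasses import dataclass

PRICE_STEP = 0.10  # Hypothetical: fraction by which one outcome moves the price.


@dataclass
class Artifact:
    artifact_id: str
    price: float           # Assumed to live on a 0..1 scale.
    state: str = "active"  # Hypothetical name for the pre-challenge state.


def flag_for_debate(artifact):
    """The market flags the artifact and it enters the challenged state."""
    artifact.state = "challenged"
    return artifact


def resolve_debate(artifact, defended):
    """Apply a debate outcome: move the price, then transition the lifecycle."""
    if artifact.state != "challenged":
        raise ValueError("only challenged artifacts can be resolved")
    if defended:
        artifact.price = min(1.0, artifact.price * (1 + PRICE_STEP))
        artifact.state = "validated"
    else:
        artifact.price = max(0.0, artifact.price * (1 - PRICE_STEP))
        artifact.state = "deprecated"
    return artifact


defended = resolve_debate(flag_for_debate(Artifact("experiment-12345", 0.50)), True)
rejected = resolve_debate(flag_for_debate(Artifact("kg-edge-1", 0.50)), False)
```

Rejecting resolution of an artifact that was never challenged keeps price moves tied to an actual debate, which matches the intent that debates, not ad hoc edits, adjudicate quality.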

Integration Points

Artifact Quality Markets (q-artifact-quality-markets): Debate outcomes adjust prices; markets flag artifacts for debate

Experiment Extraction (q-experiment-extraction): Extracted experiments are debatable
Evidence Chains (b5298ea7): Debate outcomes become evidence entries
Knowledge Units (08c73de3): Debate conclusions become composable knowledge units
Schema Governance (q-schema-governance): Schema changes can be debated before approval
Artifact Versioning (a17-18): Debates can target specific versions

Success Criteria

☐ At least 5 different artifact types have been debated (not just hypotheses)

☐ Debate outcomes demonstrably move market prices (>10 examples)

☐ Usage-based quality signals demonstrably improve artifact ranking

☐ Auto-triggered debates identify >10 genuinely problematic artifacts

☐ Evidence profiles visible on all artifact detail pages

☐ Debate-resolved disputes lead to lifecycle transitions

Work Log

2026-04-06 — agr-ad-01-TARG: Generalize debate targeting [task:70907ae9-97ca-48f8-9246-08f7ec319a2b]

What was done:

Migration 059_artifact_debates.py: extends debate_sessions with target_artifact_type and
target_artifact_id columns (indexed) so each session can record what artifact it is
debating. Also creates a polymorphic artifact_debates junction table (columns:
artifact_type, artifact_id, debate_session_id, role, score_before, score_after) that
replaces the hypothesis-only hypothesis_debates pattern. Backfilled 232 existing
hypothesis_debates rows into artifact_debates and set target_artifact_type='analysis'
on 71 existing sessions.
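The shape of such a migration might look like the following SQLite sketch. It is not the actual 059_artifact_debates.py. The database engine, the column types, the index name, and the layout of the legacy hypothesis_debates table are all assumptions; only the table and column names listed above come from the work log. The step that set target_artifact_type on existing sessions is omitted because its selection criteria are not stated.

```python
import sqlite3


def migrate(conn):
    """Add artifact targeting to debate_sessions and create the junction table."""
    conn.execute("ALTER TABLE debate_sessions ADD COLUMN target_artifact_type TEXT")
    conn.execute("ALTER TABLE debate_sessions ADD COLUMN target_artifact_id TEXT")
    conn.execute(
        "CREATE INDEX idx_debate_sessions_target "  # hypothetical index name
        "ON debate_sessions (target_artifact_type, target_artifact_id)"
    )
    conn.execute(
        """
        CREATE TABLE artifact_debates (
            artifact_type     TEXT NOT NULL,
            artifact_id       TEXT NOT NULL,
            debate_session_id TEXT NOT NULL,
            role              TEXT,
            score_before      REAL,
            score_after       REAL
        )
        """
    )
    # Backfill: every legacy row becomes a 'hypothesis' row in the new table.
    # The hypothesis_debates column names here are assumed.
    conn.execute(
        """
        INSERT INTO artifact_debates
            (artifact_type, artifact_id, debate_session_id,
             role, score_before, score_after)
        SELECT 'hypothesis', hypothesis_id, debate_session_id,
               role, score_before, score_after
        FROM hypothesis_debates
        """
    )
    conn.commit()


# Minimal stand-in for the pre-migration schema (assumed, for demonstration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE debate_sessions (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE hypothesis_debates (hypothesis_id TEXT, debate_session_id TEXT, "
    "role TEXT, score_before REAL, score_after REAL)"
)
conn.execute("INSERT INTO debate_sessions (id) VALUES ('s1')")
conn.execute("INSERT INTO hypothesis_debates VALUES ('h1', 's1', 'subject', 0.5, 0.6)")
migrate(conn)
```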

POST /api/debate/trigger: new optional params artifact_type and artifact_id. When
provided the artifact reference is embedded in the gap description and returned in the
response. Validates against supported types (hypothesis, analysis, notebook, wiki_page,
experiment, paper, target, dataset, model).
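The type check described here could be as simple as the sketch below. The nine supported types are taken from the list above. The function name `validate_artifact_ref` is hypothetical, and the rule that the two parameters must be supplied together is an assumption, since the work log does not say how a lone parameter is handled.

```python
# Supported types as listed in the work log entry.
SUPPORTED_ARTIFACT_TYPES = frozenset({
    "hypothesis", "analysis", "notebook", "wiki_page", "experiment",
    "paper", "target", "dataset", "model",
})


def validate_artifact_ref(artifact_type=None, artifact_id=None):
    """Return (artifact_type, artifact_id), or None when neither is given.

    Raises ValueError for an unsupported type or a half-specified reference.
    Requiring both params together is an assumption of this sketch.
    """
    if artifact_type is None and artifact_id is None:
        return None  # Both params are optional; plain gap debates still work.
    if artifact_type is None or artifact_id is None:
        raise ValueError("artifact_type and artifact_id must be given together")
    if artifact_type not in SUPPORTED_ARTIFACT_TYPES:
        supported = ", ".join(sorted(SUPPORTED_ARTIFACT_TYPES))
        raise ValueError(
            f"unsupported artifact_type {artifact_type!r}; expected one of: {supported}"
        )
    return (artifact_type, artifact_id)


ref = validate_artifact_ref("experiment", "experiment-12345")
```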

GET /api/artifact/{artifact_type}/{artifact_id}/debates: new polymorphic endpoint that
returns all debates linked to any artifact, merging rows from artifact_debates and direct
debate_sessions targeting.
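The merge step might look like the following sketch, which deduplicates by session id so a debate reachable through both paths appears once. The function name, the dict-based row shape, the `source` field, and the choice to let the junction row win on conflict are all assumptions for illustration.

```python
def merge_artifact_debates(junction_rows, direct_sessions):
    """Merge debates found via artifact_debates and via direct targeting.

    junction_rows: dicts with 'debate_session_id' and optionally 'role'.
    direct_sessions: dicts with 'id' (rows from debate_sessions).
    A session found through both paths is returned once; the junction row
    is kept because it carries the role.
    """
    merged = {}
    for row in junction_rows:
        merged[row["debate_session_id"]] = {
            "debate_session_id": row["debate_session_id"],
            "role": row.get("role"),
            "source": "artifact_debates",
        }
    for session in direct_sessions:
        # setdefault leaves an existing junction entry untouched.
        merged.setdefault(session["id"], {
            "debate_session_id": session["id"],
            "role": None,
            "source": "debate_sessions",
        })
    return list(merged.values())


debates = merge_artifact_debates(
    junction_rows=[{"debate_session_id": "s1", "role": "subject"}],
    direct_sessions=[{"id": "s1"}, {"id": "s2"}],
)
```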

GET /api/debates: added artifact_type and artifact_id filter params; query now returns
target_artifact_type and target_artifact_id fields.
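Optional filters of this kind are commonly built as parameterized SQL. The sketch below assumes a SQL backend with `?` placeholders and an `id` column on debate_sessions. The function name `build_debates_query` is hypothetical, and the real endpoint likely selects more columns than shown.

```python
def build_debates_query(artifact_type=None, artifact_id=None):
    """Build a parameterized query with optional artifact filters.

    Returns (sql, params); values are never interpolated into the SQL string.
    """
    sql = (
        "SELECT id, target_artifact_type, target_artifact_id "
        "FROM debate_sessions"
    )
    clauses = []
    params = []
    if artifact_type is not None:
        clauses.append("target_artifact_type = ?")
        params.append(artifact_type)
    if artifact_id is not None:
        clauses.append("target_artifact_id = ?")
        params.append(artifact_id)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_debates_query(artifact_type="experiment")
```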

Debates list page (/debates): each debate card now shows a colored artifact-type badge
and a "View {ArtifactType} →" link when the session has a target artifact.

Debate detail page (/debates/{id}): shows a target artifact banner (colored by type) with
a link to the artifact page when target_artifact_type is set.

File: quest_artifact_debates_spec.md

Modified: 2026-04-25 17:55

Size: 6.6 KB