Quest: Artifact Debates & Evidence Accumulation

Layer: Agora (with Exchange and Senate integration)
Priority: P91
Status: active

Vision

Currently, SciDEX debates target knowledge gaps and produce hypotheses. The underlying
insight goes further: every artifact is potentially debatable. An extracted experiment's
methodology can be challenged. A KG edge's evidence can be questioned. A model's assumptions
can be scrutinized. A dataset's provenance can be audited.

This quest generalizes the debate system so any artifact can be the subject of structured
multi-agent debate. Through debates, evidence accumulates about each artifact — not just
"is this hypothesis true?" but "is this experiment well-designed?", "is this KG edge
reliable?", "is this model's training data representative?"

The Artifact Evidence Profile

Every artifact accumulates an evidence profile over time:

Artifact: experiment-12345 (TREM2 R47H GWAS study)
├── Quality Score: 0.87 (↑ from 0.72 after replication debate)
├── Debates: 3 (2 supportive, 1 challenging methodology)
├── Citations: 12 (used as evidence in 12 other artifacts)
├── Embeds: 2 (embedded in 2 authored papers)
├── Versions: 1 (original, no revisions)
├── Replication: replicated (3 independent studies)
└── Evidence Balance: +8 for, -2 against, 1 neutral
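
As a rough illustration, the profile above could be held in a small record type. This is a minimal sketch, not the SciDEX data model: the class name `EvidenceProfile` and every field name are hypothetical, chosen only to mirror the rows of the example, and the net-balance rule (for minus against, neutral ignored) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EvidenceProfile:
    """Hypothetical container mirroring the rows of the example profile."""

    artifact_id: str
    quality_score: float
    debates_supportive: int = 0
    debates_challenging: int = 0
    citations: int = 0
    embeds: int = 0
    versions: int = 1
    replication_status: str = "unreplicated"
    evidence_for: int = 0
    evidence_against: int = 0
    evidence_neutral: int = 0

    @property
    def debate_count(self) -> int:
        # Total debates is the sum of supportive and challenging ones.
        return self.debates_supportive + self.debates_challenging

    @property
    def evidence_balance(self) -> int:
        # Assumed rule: supporting entries minus opposing ones; neutral ignored.
        return self.evidence_for - self.evidence_against


# Values copied from the experiment-12345 example.
profile = EvidenceProfile(
    artifact_id="experiment-12345",
    quality_score=0.87,
    debates_supportive=2,
    debates_challenging=1,
    citations=12,
    embeds=2,
    versions=1,
    replication_status="replicated",
    evidence_for=8,
    evidence_against=2,
    evidence_neutral=1,
)
print(profile.debate_count, profile.evidence_balance)
```

Keeping the counts as raw integers and deriving totals on demand avoids storing values that can drift out of sync as new debates arrive.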

Quality Through Usage

Highly valuable artifacts get used more — cited by hypotheses, embedded in papers,
linked by analyses, debated by agents. This usage signal feeds back into quality:

  • Citation count — How many other artifacts link to this one?
  • Debate outcomes — Did debates confirm or challenge this artifact?
  • Replication — Have independent artifacts reached similar conclusions?
  • Version stability — Does this artifact keep getting revised (unstable) or hold up?
  • Downstream impact — Did artifacts derived from this one score well?
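
One way the five signals might be folded into a single score is a weighted sum of normalized components. The spec does not define a formula, so everything below is an assumption: the function name `usage_quality`, the weights, and each normalization are placeholders meant only to show the shape of such a calculation.

```python
import math

# Hypothetical weights (they sum to 1.0); the spec does not specify any.
WEIGHTS = {
    "citations": 0.25,
    "debates": 0.25,
    "replication": 0.20,
    "stability": 0.15,
    "downstream": 0.15,
}


def usage_quality(citations, debates_for, debates_against,
                  replications, revisions, downstream_mean):
    """Combine usage signals into a score in [0, 1] (illustrative only)."""
    total_debates = debates_for + debates_against
    signals = {
        # Saturating curve: each extra citation adds less than the previous one.
        "citations": 1.0 - math.exp(-citations / 10.0),
        # Share of debates that confirmed the artifact; 0.5 when undebated.
        "debates": debates_for / total_debates if total_debates else 0.5,
        # Assumed cap: three independent replications count as full credit.
        "replication": min(replications, 3) / 3.0,
        # Frequent revisions lower the stability signal.
        "stability": 1.0 / (1.0 + revisions),
        # Mean quality of derived artifacts, clamped to [0, 1].
        "downstream": max(0.0, min(1.0, downstream_mean)),
    }
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


score = usage_quality(citations=12, debates_for=2, debates_against=1,
                      replications=3, revisions=0, downstream_mean=0.8)
print(round(score, 3))
```

Because every component is normalized to [0, 1] before weighting, the weights can be tuned independently without the total leaving that range.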

Open Tasks

☑ agr-ad-01-TARG: Generalize debate targeting to any artifact type (P91)
☐ agr-ad-02-EVAC: Artifact evidence accumulation system (P89)
☐ agr-ad-03-USAGE: Usage-based quality signals (citations, embeds, links) (P88)
☐ agr-ad-04-VDEB: Version-aware debates — debate specific artifact versions (P85)
☐ agr-ad-05-PROF: Artifact quality profile dashboard (P84)
☐ agr-ad-06-TRIG: Auto-trigger debates for low-quality or conflicting artifacts (P83)

Dependency Chain

agr-ad-01-TARG (Generalize debate targeting)
    ↓
agr-ad-02-EVAC (Evidence accumulation) ──→ agr-ad-04-VDEB (Version-aware debates)
    ↓
agr-ad-03-USAGE (Usage-based quality)
    ↓
agr-ad-05-PROF (Quality profile dashboard)
    ↓
agr-ad-06-TRIG (Auto-trigger debates)
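
The chain above can also be written as a dependency map and ordered mechanically. This is only a sketch: the `DEPENDS_ON` mapping is one reading of the diagram (with agr-ad-04-VDEB branching off agr-ad-02-EVAC), and `ready_tasks` is a hypothetical helper, not part of SciDEX.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the tasks it depends on, as read from the diagram.
DEPENDS_ON = {
    "agr-ad-01-TARG": set(),
    "agr-ad-02-EVAC": {"agr-ad-01-TARG"},
    "agr-ad-03-USAGE": {"agr-ad-02-EVAC"},
    "agr-ad-04-VDEB": {"agr-ad-02-EVAC"},
    "agr-ad-05-PROF": {"agr-ad-03-USAGE"},
    "agr-ad-06-TRIG": {"agr-ad-05-PROF"},
}


def ready_tasks(done):
    """Return unfinished tasks whose dependencies are all in `done`."""
    return sorted(
        task for task, deps in DEPENDS_ON.items()
        if task not in done and deps <= done
    )


# One valid execution order that respects every dependency.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)

# With only agr-ad-01-TARG complete, one task is unblocked.
print(ready_tasks({"agr-ad-01-TARG"}))
```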

Debate as Quality Arbitration

Debates are the dispute resolution mechanism for quality governance. When market signals
disagree (high usage but low price, or high price challenged by new evidence), a debate
resolves the dispute:

  • Market flags artifact: price drops below threshold or conflicting evidence detected
  • Debate initiated: artifact enters challenged lifecycle state
  • Multi-agent debate: type-specific personas examine the artifact
  • Outcome moves price: defense success raises price, challenge success drops it
  • Lifecycle transitions: validated if defended, deprecated if not

This makes debates consequential: they are not just discussion; they adjudicate
quality and move market prices.
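
The arbitration flow above can be sketched as a small state machine. The states challenged, validated, and deprecated come from the steps listed; the starting state `ACTIVE`, the names `MarketArtifact`, `flag_if_needed`, and `resolve_debate`, and the threshold and step constants are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Lifecycle(Enum):
    ACTIVE = "active"          # assumed name for the pre-challenge state
    CHALLENGED = "challenged"
    VALIDATED = "validated"
    DEPRECATED = "deprecated"


@dataclass
class MarketArtifact:
    artifact_id: str
    price: float
    state: Lifecycle = Lifecycle.ACTIVE


PRICE_THRESHOLD = 0.30  # hypothetical flagging threshold
PRICE_STEP = 0.10       # hypothetical price move per debate outcome


def flag_if_needed(artifact, conflicting_evidence=False):
    """Move an active artifact to CHALLENGED when the market flags it."""
    flagged = artifact.price < PRICE_THRESHOLD or conflicting_evidence
    if artifact.state is Lifecycle.ACTIVE and flagged:
        artifact.state = Lifecycle.CHALLENGED
        return True
    return False


def resolve_debate(artifact, defended):
    """Apply a debate outcome: adjust the price, then transition lifecycle."""
    if artifact.state is not Lifecycle.CHALLENGED:
        raise ValueError("only challenged artifacts can be resolved")
    if defended:
        artifact.price = min(1.0, artifact.price + PRICE_STEP)
        artifact.state = Lifecycle.VALIDATED
    else:
        artifact.price = max(0.0, artifact.price - PRICE_STEP)
        artifact.state = Lifecycle.DEPRECATED
    return artifact


artifact = MarketArtifact("experiment-12345", price=0.25)
flag_if_needed(artifact)                 # price is below the threshold
resolve_debate(artifact, defended=True)  # defense succeeds
print(artifact.state.value, round(artifact.price, 2))
```

Rejecting resolution of artifacts that were never challenged keeps price moves tied to an actual debate rather than to arbitrary calls.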

Integration Points

  • Artifact Quality Markets (q-artifact-quality-markets): Debate outcomes adjust prices; markets flag artifacts for debate
  • Experiment Extraction (q-experiment-extraction): Extracted experiments are debatable
  • Evidence Chains (b5298ea7): Debate outcomes become evidence entries
  • Knowledge Units (08c73de3): Debate conclusions become composable knowledge units
  • Schema Governance (q-schema-governance): Schema changes can be debated before approval
  • Artifact Versioning (a17-18): Debates can target specific versions

Success Criteria

☐ At least 5 different artifact types have been debated (not just hypotheses)
☐ Debate outcomes demonstrably move market prices (>10 examples)
☐ Usage-based quality signals demonstrably improve artifact ranking
☐ Auto-triggered debates identify >10 genuinely problematic artifacts
☐ Evidence profiles visible on all artifact detail pages
☐ Debate-resolved disputes lead to lifecycle transitions

Work Log

2026-04-06 — agr-ad-01-TARG: Generalize debate targeting [task:70907ae9-97ca-48f8-9246-08f7ec319a2b]

What was done:

  • Migration 059_artifact_debates.py: extends debate_sessions with
    target_artifact_type and target_artifact_id columns (indexed) so each
    session can record what artifact it is debating. Also creates a polymorphic
    artifact_debates junction table (columns: artifact_type, artifact_id,
    debate_session_id, role, score_before, score_after) that replaces the
    hypothesis-only hypothesis_debates pattern. Backfilled 232 existing
    hypothesis_debates rows into artifact_debates and set
    target_artifact_type='analysis' on 71 existing sessions.

  • POST /api/debate/trigger: new optional params artifact_type and
    artifact_id. When provided, the artifact reference is embedded in the gap
    description and returned in the response. Validates against supported types
    (hypothesis, analysis, notebook, wiki_page, experiment, paper, target,
    dataset, model).

  • GET /api/artifact/{artifact_type}/{artifact_id}/debates: new
    polymorphic endpoint that returns all debates linked to any artifact, merging
    rows from artifact_debates and direct debate_sessions targeting.

  • GET /api/debates: added artifact_type and artifact_id filter
    params; query now returns target_artifact_type and target_artifact_id
    fields.

  • Debates list page (/debates): each debate card now shows a colored
    artifact-type badge and a "View {ArtifactType} →" link when the session has a
    target artifact.

  • Debate detail page (/debates/{id}): shows a target artifact banner
    (colored by type) with a link to the artifact page when
    target_artifact_type is set.
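
To make the polymorphic lookup concrete, the sketch below rebuilds the two tables in an in-memory SQLite database and merges junction rows with direct session targeting. It is not the actual migration or endpoint code: the column types, the index name, the `role` value 'subject', the function name `debates_for_artifact`, and the choice of SQLite are assumptions. Only the table names, column names, and the list of supported types come from the work log.

```python
import sqlite3

# Supported types as listed for POST /api/debate/trigger.
SUPPORTED_TYPES = {
    "hypothesis", "analysis", "notebook", "wiki_page", "experiment",
    "paper", "target", "dataset", "model",
}

# Column types and the index name are assumptions.
SCHEMA = """
CREATE TABLE debate_sessions (
    id TEXT PRIMARY KEY,
    target_artifact_type TEXT,
    target_artifact_id TEXT
);
CREATE INDEX idx_sessions_target
    ON debate_sessions (target_artifact_type, target_artifact_id);
CREATE TABLE artifact_debates (
    artifact_type TEXT NOT NULL,
    artifact_id TEXT NOT NULL,
    debate_session_id TEXT NOT NULL REFERENCES debate_sessions (id),
    role TEXT,
    score_before REAL,
    score_after REAL
);
"""


def debates_for_artifact(conn, artifact_type, artifact_id):
    """Return session ids linked to an artifact by either mechanism."""
    if artifact_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported artifact type: {artifact_type}")
    # UNION removes duplicates when a session is linked both ways.
    rows = conn.execute(
        """
        SELECT debate_session_id FROM artifact_debates
         WHERE artifact_type = ? AND artifact_id = ?
        UNION
        SELECT id FROM debate_sessions
         WHERE target_artifact_type = ? AND target_artifact_id = ?
        ORDER BY 1
        """,
        (artifact_type, artifact_id, artifact_type, artifact_id),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO debate_sessions VALUES (?, ?, ?)",
    [("s1", "experiment", "exp-1"), ("s2", None, None)],
)
conn.executemany(
    "INSERT INTO artifact_debates VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("experiment", "exp-1", "s1", "subject", 0.72, 0.87),  # linked both ways
        ("experiment", "exp-1", "s2", "subject", None, None),  # junction only
    ],
)
result = debates_for_artifact(conn, "experiment", "exp-1")
print(result)
```

Session s1 is reachable through both the junction table and its own target columns, yet appears once in the result, which is the behavior a merged endpoint would need.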

File: quest_artifact_debates_spec.md
Modified: 2026-04-25 17:55
Size: 6.6 KB