[Agora] Auto-trigger debates for low-quality or conflicting artifacts (status: done; analysis: 5)

Trigger rules for auto-debate: low quality + high usage, evidence imbalance, conflicting replication

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (18)

[Exchange] Add usage-based quality signals to propagate_quality [task:agr-ad-03-USAGE] (2026-04-25)
Squash merge: orchestra/task/agr-ad-0-artifact-quality-profile-dashboard (2 commits) (2026-04-25)
[Senate] Work log: artifact quality dashboard spec [task:agr-ad-05-PROF] (2026-04-25)
[Senate] Artifact quality profile dashboard [task:agr-ad-05-PROF] (2026-04-25)
[Verify] auto-trigger debates implementation verified [task:agr-ad-06-TRIG] (2026-04-25)
[Agora] Work log: auto-trigger debates implementation complete [task:agr-ad-06-TRIG] (2026-04-25)
[Agora] Auto-trigger debates for low-quality or conflicting artifacts (2026-04-25)
Squash merge: orchestra/task/agr-ad-0-artifact-evidence-accumulation-system (1 commits) (2026-04-25)
[Agora] Artifact evidence accumulation system [task:agr-ad-02-EVAC] (2026-04-25)
[Docs] Work log: merge gate placeholder fix [task:agr-ad-04-VDEB] (2026-04-18)
[Agora] Fix debate_sessions INSERT: 12 columns, 12 placeholders [task:agr-ad-04-VDEB] (2026-04-18)
[Agora] Version-aware debates — target version population + reduced-weight propagation (2026-04-18)
[Agora] Sync slot file for agr-ad-01-TARG [task:agr-ad-01-TARG] (2026-04-15)
[Agora] Sync slot file for agr-ad-01-TARG [task:agr-ad-01-TARG] (2026-04-15)
[Agora] Update spec work log for generalized debate targeting [task:agr-ad-01-TARG] (2026-04-15)
Squash merge: orchestra/task/agr-ad-0-generalize-debate-targeting-to-any-artif (1 commits) (2026-04-15)
[Senate] Holistic prioritization run 2: quest fixes + 3 new CI tasks [task:b4c60959-0fe9-4cba-8893-c88013e85104] (2026-04-06)
[Senate] Holistic prioritization: 6 tasks created for uncovered P88-P95 quests [task:b4c60959-0fe9-4cba-8893-c88013e85104] (2026-04-06)
Spec File

Goal

Automatically identify artifacts that would benefit from debate and trigger artifact-review
debates. Targets include: artifacts with low quality scores, conflicting evidence,
high usage but no debate history, and newly-extracted experiments with methodology concerns.

Acceptance Criteria

☐ Trigger rules:
  - Quality score < 0.4 AND usage count > 3 (widely used but low quality)
  - Evidence balance < -0.3 (more contradicting than supporting evidence)
  - High usage (top 10%) AND zero debates (never scrutinized)
  - Conflicting replication status (experiments)
  - Quality score dropped > 0.2 in last 30 days
☐ find_debate_candidates() function returns a ranked list of artifacts needing debate
☐ Integration with agent loop: agent checks for debate candidates during idle time
☐ Rate limit: max 3 auto-triggered artifact debates per day (don't flood)
☐ Debate outcomes update artifact quality and evidence profile
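
The five trigger rules above can be read as independent predicates over per-artifact signals. The following is a minimal sketch of that reading, not the code in debate_trigger.py: the ArtifactSignals class, its field names, and the matching_rules() helper are hypothetical, and only the thresholds and rule names come from this spec and its work log.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds copied from the acceptance criteria.
LOW_QUALITY = 0.4
MIN_USAGE = 3
EVIDENCE_IMBALANCE = -0.3
QUALITY_DROP = 0.2


@dataclass
class ArtifactSignals:
    """Hypothetical per-artifact snapshot; every field name is an assumption."""
    artifact_id: str
    quality_score: float
    usage_count: int
    evidence_balance: float
    debate_count: int
    in_top_usage_decile: bool
    quality_30_days_ago: Optional[float] = None
    conflicting_replication: bool = False


def matching_rules(a: ArtifactSignals) -> List[str]:
    """Return the name of every trigger rule the artifact satisfies."""
    rules = []
    if a.quality_score < LOW_QUALITY and a.usage_count > MIN_USAGE:
        rules.append("low_quality_high_usage")
    if a.evidence_balance < EVIDENCE_IMBALANCE:
        rules.append("evidence_imbalance")
    if a.in_top_usage_decile and a.debate_count == 0:
        rules.append("high_usage_no_debate")
    if a.conflicting_replication:
        rules.append("conflicting_replication")
    if (a.quality_30_days_ago is not None
            and a.quality_30_days_ago - a.quality_score > QUALITY_DROP):
        rules.append("quality_drop")
    return rules
```

An artifact can match several rules at once, which is one reason a later deduplication step by artifact is needed.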

Dependencies

  • agr-ad-05-PROF — Quality profile signals inform trigger decisions

Dependents

  • None (leaf task)

Work Log

2026-04-25 — Implementation

What was done:

  • scidex/agora/debate_trigger.py (new file, 455 lines):
    - find_debate_candidates(limit=20): returns a ranked list of DebateTrigger objects
    - 5 trigger rules: low_quality_high_usage, evidence_imbalance, high_usage_no_debate, conflicting_replication, quality_drop
    - Deduplicates by artifact_id, sorts by trigger_score descending
    - Rate-limited to MAX_AUTO_DEBATES_PER_DAY=3 via knowledge_gaps count with [AUTO_TRIGGER] tag
    - trigger_artifact_debate(): creates gap with TARGET_ARTIFACT + AUTO_TRIGGER tags
    - _get_debate_count(): counts debates via artifact_debates + debate_sessions

  • scidex/agora/scidex_orchestrator.py (orchestrator integration):
    - Parses [TARGET_ARTIFACT type=X id=Y] from gap description at start of run_debate
    - Sets target_artifact_type/target_artifact_id on debate_sessions INSERT
    - After debate completes: _update_artifact_debate_outcome() records the outcome in artifact_debates, updates support/contradiction counts, and creates an artifacts_history entry for quality drops

  • api.py — added GET /api/debate/candidates endpoint returning ranked DebateTrigger list
  • Status: Committed and pushed to orchestra/task/agr-ad-0-auto-trigger-debates-for-low-quality-or (commit 1855f6871)

    Note on data: Rule 2 (evidence_balance < -0.3) returns 0 candidates (no artifacts have 3+ debate entries with negative balance yet). Rule 4 (conflicting replication) returns 0 (no conflicting experiments currently). Rule 1 returns 0 (no artifacts have quality_score < 0.4 AND usage signals). Active candidates: high_usage_no_debate (top-10% usage, 0 debates) for 20 wiki pages.
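
The deduplication, ranking, and daily cap described for find_debate_candidates() could look roughly like the sketch below. It is illustrative only: this DebateTrigger is a simplified stand-in for the real class, keeping the highest-scoring trigger per artifact is an assumed tie-breaking policy, and rank_candidates() and remaining_daily_budget() are hypothetical helper names. Only artifact_id, trigger_score, and MAX_AUTO_DEBATES_PER_DAY are named in the work log.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

MAX_AUTO_DEBATES_PER_DAY = 3  # constant named in the work log


@dataclass
class DebateTrigger:
    """Simplified stand-in; the real class may carry more fields."""
    artifact_id: str
    rule: str
    trigger_score: float


def rank_candidates(triggers: Iterable[DebateTrigger],
                    limit: int = 20) -> List[DebateTrigger]:
    """Keep one trigger per artifact (assumed: the highest score), sort descending."""
    best: Dict[str, DebateTrigger] = {}
    for t in triggers:
        current = best.get(t.artifact_id)
        if current is None or t.trigger_score > current.trigger_score:
            best[t.artifact_id] = t
    ranked = sorted(best.values(), key=lambda t: t.trigger_score, reverse=True)
    return ranked[:limit]


def remaining_daily_budget(auto_triggered_today: int) -> int:
    """Number of auto-debates that may still start today (never negative)."""
    return max(0, MAX_AUTO_DEBATES_PER_DAY - auto_triggered_today)
```

In this sketch the caller supplies the count of gaps already tagged [AUTO_TRIGGER] today; how that count is queried from knowledge_gaps is left out because the schema is not shown here.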

    Verification — 2026-04-25 19:50 UTC

    What was verified:

  • Code compiles: python3 -m py_compile scidex/agora/debate_trigger.py and scidex_orchestrator.py — both OK
  • Function works: find_debate_candidates(limit=5) returns 5 candidates (high_usage_no_debate rule, wiki pages with top-10% usage and 0 debates)
  • API route exists in api.py at line 18294: @app.get("/api/debate/candidates") with correct docstring
  • Orchestrator integration: parses [TARGET_ARTIFACT type=X id=Y] and _update_artifact_debate_outcome() after debate completes
  • Rebase: confirmed the branch is up to date with origin/main; the only 2 commits by which it differs are the implementation itself
  • All acceptance criteria addressed by the implementation
  • Commit: 1855f6871 (auto-trigger debates for low-quality or conflicting artifacts)
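
For reference, parsing the [TARGET_ARTIFACT type=X id=Y] tag out of a gap description can be sketched with a regular expression. This is not the orchestrator's code: the function name parse_target_artifact() is hypothetical, and the tag grammar (values are runs of characters other than whitespace and "]") is an assumption, since the work log only shows the tag's general shape.

```python
import re
from typing import Optional, Tuple

# Assumed grammar: type and id values contain no whitespace and no "]".
TARGET_TAG = re.compile(
    r"\[TARGET_ARTIFACT\s+type=(?P<type>[^\s\]]+)\s+id=(?P<id>[^\s\]]+)\]"
)


def parse_target_artifact(description: str) -> Optional[Tuple[str, str]]:
    """Return (artifact_type, artifact_id) if the tag is present, else None."""
    match = TARGET_TAG.search(description)
    if match is None:
        return None
    return match.group("type"), match.group("id")


if __name__ == "__main__":
    # Example description; the artifact type and id are made up.
    gap = "[TARGET_ARTIFACT type=wiki_page id=wp-42] [AUTO_TRIGGER] Review this page"
    print(parse_target_artifact(gap))
```

Returning None when the tag is absent lets ordinary, non-targeted gaps pass through the same code path unchanged.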

    Payload JSON
    {
      "requirements": {
        "analysis": 5
      }
    }
