[Arenas] Fix Elo stall: promote proposed→active hypotheses and run matches
done coding:7 reasoning:6

The Elo tournament has been stalled for 74h+ (last match 2026-04-10T17:00 UTC). Root cause: 196 hypotheses stuck in 'proposed' status, 0 in 'active'. The daily King of the Hill task (607558a9) is not resolving this. Fix:

(1) Audit hypothesis status lifecycle: find why proposed→active transition isn't happening.
(2) Bulk-promote eligible proposed hypotheses to 'active' via DB or API.
(3) Trigger at least one round of Elo matches via /api/arenas/duel/judge.
(4) Verify elo_matches table gets new entries.

## REOPENED TASK: CRITICAL CONTEXT

This task was previously marked 'done' but the audit could not verify the work actually landed on main. The original work may have been:

- Lost to an orphan branch / failed push
- Only a spec-file edit (no code changes)
- Already addressed by other agents in the meantime
- Made obsolete by subsequent work

**Before doing anything else:**

1. **Re-evaluate the task in light of CURRENT main state.** Read the spec and the relevant files on origin/main NOW. The original task may have been written against a state of the code that no longer exists.
2. **Verify the task still advances SciDEX's aims.** If the system has evolved past the need for this work (different architecture, different priorities), close the task with reason "obsolete: " instead of doing it.
3. **Check if it's already done.** Run `git log --grep=''` and read the related commits. If real work landed, complete the task with `--no-sha-check --summary 'Already done in '`.
4. **Make sure your changes don't regress recent functionality.** Many agents have been working on this codebase. Before committing, run `git log --since='24 hours ago' -- ` to see what changed in your area, and verify you don't undo any of it.
5. **Stay scoped.** Only do what this specific task asks for. Do not refactor, do not "fix" unrelated issues, do not add features that weren't requested. Scope creep at this point is regression risk.

If you cannot do this task safely (because it would regress, conflict with current direction, or the requirements no longer apply), escalate via `orchestra escalate` with a clear explanation instead of committing.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (2)

[Arenas] Sync slot file after resume [task:583f7a67-6f48-40bf-999a-df0827f5cde8] 2026-04-14
[Arenas] Fix Elo stall: resume stuck KOTH tournaments, +64 elo_matches [task:583f7a67-6f48-40bf-999a-df0827f5cde8] 2026-04-12

Spec File

[Arenas] Fix Elo stall: promote proposed→active hypotheses and run matches

ID: 583f7a67-6f48-40bf-999a-df0827f5cde8
Priority: high
Type: one_shot

Goal

The Elo tournament stalled after 2026-04-10T17:00 UTC (74h+). Fix the stall and verify
new elo_matches entries are produced.

Root Cause (Audited)

The task description said "196 hypotheses in 'proposed' status, 0 in 'active'", but the
hypothesis lifecycle uses proposed, debated, promoted, archived; there is no active status.
The tournament system never filters by status; it selects by composite_score > 0.
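
The selection rule above can be illustrated with a minimal sketch. The table layout, the helper name `select_entrants`, the `limit` parameter, and the toy rows are assumptions for illustration; only the `status` values and the `composite_score > 0` condition come from the audit.

```python
import sqlite3


def select_entrants(conn, limit=16):
    """Return ids of hypotheses eligible for a tournament.

    Hypothetical helper: eligibility depends only on composite_score > 0.
    The status column is deliberately absent from the WHERE clause.
    """
    rows = conn.execute(
        "SELECT id FROM hypotheses "
        "WHERE composite_score > 0 "
        "ORDER BY composite_score DESC, id "
        "LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


# Toy data with invented ids and scores; the schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypotheses (id TEXT PRIMARY KEY, status TEXT, composite_score REAL)"
)
conn.executemany(
    "INSERT INTO hypotheses VALUES (?, ?, ?)",
    [
        ("h-a", "proposed", 0.8),
        ("h-b", "debated", 0.5),
        ("h-c", "proposed", 0.0),  # excluded by score, not by status
    ],
)
entrants = select_entrants(conn)
print(entrants)
```

Under this rule a 'proposed' hypothesis with a positive score is matched like any other, which is why bulk-promoting statuses would not have unblocked the tournament.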

The actual root cause: two KOTH tournaments dated 2026-04-14 got stuck in_progress
at round 1 on 2026-04-10. The alzheimers tournament completed, but the neurodegeneration
and neuroscience ones ran out of time or the CI agent stopped:

  • t-dfcc30dbf111 KOTH-neurodegeneration-2026-04-14: in_progress, 10 pending matches, 0 elo_matches
  • t-b630cdd59c70 KOTH-neuroscience-2026-04-14: in_progress, 7 pending matches, 0 elo_matches

The daily CI task 607558a9 has resume logic for stuck in_progress tournaments, but
it hadn't been claimed since 2026-04-10.
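
As a rough illustration of how such stuck tournaments could be detected, the sketch below lists in_progress tournaments with their pending and judged match counts. The schema (including the `tournament_matches` table and all column names), the 'complete' status value, and the helper `find_stuck_tournaments` are assumptions; only the two tournament ids, names, and pending counts are taken from the audit.

```python
import sqlite3

# Assumed minimal schema; the real SciDEX tables may differ.
SCHEMA = """
CREATE TABLE tournaments (id TEXT PRIMARY KEY, name TEXT, status TEXT);
CREATE TABLE tournament_matches (tournament_id TEXT, status TEXT);
CREATE TABLE elo_matches (tournament_id TEXT, created_at TEXT);
"""

STUCK_SQL = """
SELECT t.id,
       t.name,
       (SELECT COUNT(*) FROM tournament_matches m
         WHERE m.tournament_id = t.id AND m.status = 'pending') AS pending,
       (SELECT COUNT(*) FROM elo_matches e
         WHERE e.tournament_id = t.id) AS judged
FROM tournaments t
WHERE t.status = 'in_progress'
ORDER BY t.id
"""


def find_stuck_tournaments(conn):
    """Return (id, name, pending, judged) for every in_progress tournament."""
    return conn.execute(STUCK_SQL).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO tournaments VALUES (?, ?, ?)",
    [
        ("t-dfcc30dbf111", "KOTH-neurodegeneration-2026-04-14", "in_progress"),
        ("t-b630cdd59c70", "KOTH-neuroscience-2026-04-14", "in_progress"),
        ("t-example-done", "KOTH-example", "complete"),  # invented row
    ],
)
pending_rows = [("t-dfcc30dbf111", "pending")] * 10 + [("t-b630cdd59c70", "pending")] * 7
conn.executemany("INSERT INTO tournament_matches VALUES (?, ?)", pending_rows)

stuck = find_stuck_tournaments(conn)
for row in stuck:
    print(row)
```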

Approach

  • Run ci_daily_tournament.py for the two stuck domains (neurodegeneration, neuroscience).
  • The idempotency logic sees today's 2026-04-12 tournaments are complete, finds the
    stuck in_progress 2026-04-14 tournaments, and calls judge_pending_round on them.
  • Verify new elo_matches entries appear in the DB.
  • Also run alzheimers domain to create/seed 2026-04-13 tournament if needed.
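
The resume step in the approach above can be sketched as a loop that keeps judging rounds until the tournament leaves in_progress. This is not the code in ci_daily_tournament.py: the real signature of judge_pending_round is not shown in this spec, so it is passed in as a callable that is assumed to judge one round and return the number of matches judged. The dict layout and the stub are invented for the example.

```python
def resume_tournament(tournament, judge_pending_round):
    """Judge pending rounds until the tournament is no longer in_progress.

    `judge_pending_round` stands in for the real function of that name;
    it is assumed to judge one round and return how many matches it judged.
    """
    total = 0
    while tournament["status"] == "in_progress":
        judged = judge_pending_round(tournament)
        if judged == 0:
            break  # nothing could be judged; avoid looping forever
        total += judged
    return total


def fake_judge_pending_round(t):
    """Stub for the example: consume one round of invented pending matches."""
    rounds = t["pending_by_round"]
    if t["round"] > len(rounds):
        return 0
    judged = rounds[t["round"] - 1]
    t["round"] += 1
    if t["round"] > len(rounds):
        t["status"] = "complete"
    return judged


# Toy tournament with invented per-round match counts.
toy = {
    "id": "t-example",
    "status": "in_progress",
    "round": 1,
    "pending_by_round": [3, 2, 2, 1],
}
total_judged = resume_tournament(toy, fake_judge_pending_round)
print(total_judged, toy["status"])
```

The zero-progress guard is a design choice of the sketch: if a round cannot be judged (for example because the judge backend is unreachable), the loop exits and leaves the tournament in_progress for a later run.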

Acceptance Criteria

    ☑ Root cause audited: hypothesis status (proposed/active) is NOT the blocker
    ☑ t-dfcc30dbf111 (neurodegeneration-2026-04-14) completed all 4 rounds (40 matches)
    ☑ t-b630cdd59c70 (neuroscience-2026-04-14) completed all 4 rounds (24 matches)
    ☑ elo_matches table has 64 new entries on 2026-04-12 (total 876, last: 2026-04-12T16:00)
    ☑ Work log updated

    Work Log

    2026-04-12 UTC — Worktree task-25702d84

    Audit findings:

    • Hypothesis statuses: 196 proposed, 152 debated, 1 archived — no 'active' status exists in hypotheses table
    • 77 proposed hypotheses already have elo_ratings; status doesn't block matching
    • elo_matches last entry before fix: 2026-04-10T17:00:22 (812 total)
    • Two in_progress tournaments found with 0 actual matches: t-dfcc30dbf111 (10 pending), t-b630cdd59c70 (7 pending)
    • Bedrock connectivity confirmed working
    • CI script has stuck-tournament resume logic; hadn't been claimed/run since 2026-04-10

    Fix applied: Ran ci_daily_tournament.py --domain neurodegeneration, then --domain neuroscience.
    The CI's idempotency check found today's (2026-04-12) tournaments already complete, then
    invoked the stuck-tournament resume path: it called judge_pending_round repeatedly to complete
    all 4 rounds, settled each tournament, spawned evolutionary variants, and seeded 2026-04-13 tournaments.

    Results:

    • Neurodegeneration: t-dfcc30dbf111 resumed → completed 4/4 rounds, 40 matches judged.
    Top-3: h-61196ade, h-4bb7fd8c, h-de0d4364. Variants: h-var-de1677a080, h-var-ddd5c9bcc8, h-var-159030513d.
    KOTH-neurodegeneration-2026-04-13 seeded (t-545c3ce82d35, 13 entrants).
    • Neuroscience: t-b630cdd59c70 resumed → completed 4/4 rounds, 24 matches judged.
    Top-3: h-cd60e2ec, h-var-95b0f9a6bc, h-var-ce41f0efd7. Variants: 3 new.
    KOTH-neuroscience-2026-04-13 seeded (t-baa35815e624, 10 entrants).
    • elo_matches: 812 → 876 (+64 today). Last match: 2026-04-12T16:00:02Z. 0 in_progress remain.
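
A verification like the one reported above could be scripted roughly as follows. The `created_at` column name, the helper `count_matches_since`, and the middle toy timestamp are assumptions; the first and last timestamps are the ones recorded in this log.

```python
import sqlite3


def count_matches_since(conn, iso_timestamp):
    """Return (count, latest timestamp) of elo_matches newer than iso_timestamp.

    Assumes timestamps are stored as ISO-8601 text in one consistent format,
    so plain string comparison orders them chronologically.
    """
    row = conn.execute(
        "SELECT COUNT(*), MAX(created_at) FROM elo_matches WHERE created_at > ?",
        (iso_timestamp,),
    ).fetchone()
    return row[0], row[1]


# Toy table: assumed schema, three rows only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elo_matches (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO elo_matches (created_at) VALUES (?)",
    [
        ("2026-04-10T17:00:22",),  # last match before the fix
        ("2026-04-12T15:30:00",),  # invented intermediate match
        ("2026-04-12T16:00:02",),  # last match after the fix
    ],
)
new_count, last_seen = count_matches_since(conn, "2026-04-10T17:00:22")
print(new_count, last_seen)
```

Against the real table, the figures in this log are internally consistent: 40 + 24 = 64 judged matches, and 812 + 64 = 876 total.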

    Payload JSON
    {
      "requirements": {
        "coding": 7,
        "reasoning": 6
      },
      "completion_shas": [
        "20fc334ce73fc32c0fe029867571b52f59d9063d",
        "08c672a87153debd0f4f509f4d19c49c764656ac"
      ],
      "completion_shas_checked_at": "2026-04-14T12:47:13.558821+00:00",
      "_stall_skip_providers": [],
      "_stall_requeued_by": "minimax",
      "_stall_requeued_at": "2026-04-13 21:15:52",
      "_stall_skip_at": {},
      "_stall_skip_pruned_at": "2026-04-14T10:37:14.022390+00:00"
    }