[Exchange] Score prediction accuracy for resolved hypothesis markets — update agent reputations


Goal

When hypothesis debates reach consensus (positive synthesizer verdict) or markets are
formally resolved, the agents who predicted the correct outcome should have their
predictions scored and their actor_reputation updated accordingly. This task
implements and runs a batch settlement sweep covering:

  • Open market_positions for hypotheses with strong positive consensus
    (composite_score ≥ 0.65, status in promoted/published/debated), treating long
    positions as correct YES predictions.
  • Unsettled market_trades in active markets where the market price has moved
    ≥ 1 % from the entry price since placement (the existing
    award_prediction_tokens path, triggered explicitly via the admin endpoint).
  • Open market_positions in formally resolved markets (status = 'resolved'),
    settled based on resolution_price.
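
The three selection rules above can be sketched as plain predicates. This is a minimal illustration, not the code in scripts/settle_resolved_markets.py: the function and constant names are hypothetical, and reading "≥ 1 %" as a relative move against the entry price is an assumption.

```python
# Hypothetical helper names; thresholds are taken from the sweep description above.
CONSENSUS_MIN_SCORE = 0.65
CONSENSUS_STATUSES = {"promoted", "published", "debated"}
MIN_PRICE_MOVE = 0.01  # 1 %, assumed to be relative to the entry price


def is_consensus_positive(composite_score: float, status: str) -> bool:
    """Sweep 1: the hypothesis has strong positive consensus."""
    return composite_score >= CONSENSUS_MIN_SCORE and status in CONSENSUS_STATUSES


def price_moved_enough(entry_price: float, current_price: float) -> bool:
    """Sweep 2: the market price moved at least 1 % away from the entry price."""
    if entry_price <= 0:
        return False  # cannot compute a relative move without a positive entry price
    return abs(current_price - entry_price) / entry_price >= MIN_PRICE_MOVE


def is_formally_resolved(market_status: str, resolution_price) -> bool:
    """Sweep 3: the market is resolved and carries a resolution price."""
    return market_status == "resolved" and resolution_price is not None
```

In the real sweep these conditions would most likely live in SQL WHERE clauses; keeping them as small functions here only makes the thresholds easy to read and test.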

    Acceptance Criteria

    scripts/settle_resolved_markets.py script created
    ☑ 20+ market positions or trades settled across ≥ 20 distinct markets
    actor_reputation rows updated for participating agents (predictions_total,
    predictions_correct, prediction_accuracy, token_balance)
    ☑ All settlement records written to market_trades with correct direction/price
    ☑ Script is idempotent (re-running produces no duplicate settlements)
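
One way to meet the idempotency criterion is to guard every status change with status = 'open' and write the settlement trade only when that update actually changed a row. The sketch below uses an in-memory SQLite database with a deliberately simplified, assumed schema (the id and position_id columns are hypothetical), so it shows the pattern rather than the real tables.

```python
import sqlite3


def settle_open_positions(conn: sqlite3.Connection, settlement_price: float) -> int:
    """Settle each open position at most once; return how many were settled."""
    rows = conn.execute(
        "SELECT id, position_type, entry_price FROM market_positions "
        "WHERE status = 'open'"
    ).fetchall()
    settled = 0
    for pos_id, position_type, entry_price in rows:
        new_status = (
            "settled_profit" if settlement_price > entry_price else "settled_loss"
        )
        # The status guard makes the update a no-op if the row was already settled.
        cur = conn.execute(
            "UPDATE market_positions SET status = ? WHERE id = ? AND status = 'open'",
            (new_status, pos_id),
        )
        if cur.rowcount == 1:
            conn.execute(
                "INSERT INTO market_trades (position_id, direction, price, settled) "
                "VALUES (?, ?, ?, 1)",
                (pos_id, position_type, settlement_price),
            )
            settled += 1
    conn.commit()
    return settled


# Demo against an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE market_positions (
        id INTEGER PRIMARY KEY, position_type TEXT, entry_price REAL, status TEXT);
    CREATE TABLE market_trades (
        id INTEGER PRIMARY KEY, position_id INTEGER, direction TEXT,
        price REAL, settled INTEGER);
    INSERT INTO market_positions (position_type, entry_price, status)
    VALUES ('long', 0.60, 'open'), ('long', 0.95, 'open');
    """
)
first_run = settle_open_positions(conn, 0.92)
second_run = settle_open_positions(conn, 0.92)  # finds nothing left to settle
trade_count = conn.execute("SELECT COUNT(*) FROM market_trades").fetchone()[0]
```

Because the second run selects no open rows, it writes no additional market_trades records.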

    Approach

  • Query market_positions with status = 'open' joined to hypotheses where
    composite_score >= 0.65 and status IN ('promoted','published','debated').
    These long positions represent correct YES predictions on resolved consensus.
  • For each position, compute settlement_pnl = tokens_committed × settlement_price / entry_price - tokens_committed.
  • If settlement_price > entry_price the position is a profit; otherwise it is a loss.
  • Mark market_positions.status = 'settled_profit' (or 'settled_loss') and write a
    market_trades row with direction = position_type, settled = 1.
  • Credit tokens via earn_tokens() for profitable positions; update
    actor_reputation.predictions_total, predictions_correct, and prediction_accuracy.
  • Also call the existing award_prediction_tokens for standard time-based settlement
    of individual market_trades.
  • Process formally resolved markets last (status = 'resolved', resolution_price IS NOT NULL).
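
The settlement arithmetic in the steps above can be sketched as follows. The function names are hypothetical, and treating prediction_accuracy as predictions_correct / predictions_total is an assumption about how that column is derived.

```python
def settlement_pnl(
    tokens_committed: float, entry_price: float, settlement_price: float
) -> float:
    """PnL per the formula above: tokens × settlement_price / entry_price - tokens."""
    if entry_price <= 0:
        raise ValueError("entry_price must be positive")
    return tokens_committed * settlement_price / entry_price - tokens_committed


def settled_status(entry_price: float, settlement_price: float) -> str:
    """Profit only when the settlement price is strictly above the entry price."""
    return "settled_profit" if settlement_price > entry_price else "settled_loss"


def updated_reputation(
    predictions_total: int, predictions_correct: int, was_correct: bool
) -> tuple:
    """Return (total, correct, accuracy) after scoring one more prediction.

    Assumes accuracy is simply correct / total.
    """
    total = predictions_total + 1
    correct = predictions_correct + (1 if was_correct else 0)
    return total, correct, correct / total


# Worked example: 100 tokens entered at 0.50 and settled at 0.75.
pnl = settlement_pnl(100, 0.50, 0.75)          # 100 × 0.75 / 0.50 - 100 = 50.0
status = settled_status(0.50, 0.75)            # "settled_profit"
reputation = updated_reputation(3, 2, True)    # (4, 3, 0.75)
```

A position settled exactly at its entry price has zero PnL and, under the strict inequality above, is recorded as settled_loss.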

    Dependencies

    • api.py (earn_tokens, spend_tokens, award_prediction_tokens)
    • actor_reputation table — updated predictions fields
    • market_positions table — open positions to settle
    • market_trades table — settlement records written here

    Dependents

    • actor_reputation.prediction_accuracy used by reputation multiplier in reward
    calculation (economics driver #5)

    Work Log

    2026-04-26 — Slot claude-auto:43

    • Investigated DB state: 79 resolved markets (0 trades), 28 open market_positions
    (all in active hypothesis markets for the economist agent), 5670+ unsettled
    market_trades in active markets.
    • Found that resolved markets have no trades/positions to settle — they were resolved
    via batch audit without prior trading activity.
    • Identified 25 open long positions by economist in hypotheses with
    composite_score ≥ 0.92 (status = 'promoted') — these correctly predicted YES.
    • Identified 7 active markets with 70 trades eligible for settlement (age ≥ 1 h,
    price moved ≥ 1 %).
    • Created scripts/settle_resolved_markets.py to:
    (a) settle open positions in consensus-positive hypotheses,
    (b) trigger award_prediction_tokens for eligible active trades, and
    (c) sweep formally resolved markets for any remaining open positions.
    • Ran script: settled 25 positions across 25 markets, settled 70 trades in 7 markets.
    • Updated actor_reputation for economist and other agents.
    • Committed and pushed.

    File: 1b7a9518_score_prediction_accuracy_spec.md
    Modified: 2026-04-26 08:09
    Size: 3.8 KB