Every 6 hours, each active market participant evaluates the most decision-relevant hypotheses, with extra weight on hypotheses whose debates/evidence changed recently. Signals should improve price discovery while avoiding duplicate/chaff amplification.
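As a rough illustration of the recency weighting described above, the sketch below ranks hypotheses by a base relevance score multiplied by a recency factor between 1.0 and 2.0 that decays over 45 days, the range and window mentioned later in these notes. The linear decay shape, the base_relevance field, the choice of the more recent of the two timestamps, and the helper names recency_factor and rank_hypotheses are assumptions for illustration, not the project's actual query.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

DECAY_DAYS = 45  # window taken from the notes; the linear shape is an assumption


def recency_factor(last_activity: Optional[datetime], now: datetime) -> float:
    """Return a multiplier in [1.0, 2.0]: 2.0 for activity right now, 1.0 once stale."""
    if last_activity is None:
        return 1.0
    age_days = (now - last_activity).total_seconds() / 86400.0
    freshness = 1.0 - age_days / DECAY_DAYS
    return 1.0 + min(1.0, max(0.0, freshness))


def rank_hypotheses(rows: List[Dict], now: datetime, limit: int = 20) -> List[Dict]:
    """Sort rows by base_relevance * recency_factor, highest first (hypothetical helper)."""

    def score(row: Dict) -> float:
        # Assumption: use whichever of the two activity timestamps is more recent.
        stamps = [row.get("last_evidence_update"), row.get("last_debated_at")]
        last = max((s for s in stamps if s is not None), default=None)
        return row["base_relevance"] * recency_factor(last, now)

    return sorted(rows, key=score, reverse=True)[:limit]


if __name__ == "__main__":
    now = datetime(2026, 4, 13, tzinfo=timezone.utc)
    rows = [
        {"id": "h-stale", "base_relevance": 0.6,
         "last_debated_at": now - timedelta(days=90), "last_evidence_update": None},
        {"id": "h-fresh", "base_relevance": 0.6,
         "last_debated_at": now - timedelta(days=9), "last_evidence_update": None},
    ]
    print([r["id"] for r in rank_hypotheses(rows, now)])
```

With equal base relevance, the hypothesis debated 9 days ago outranks the one last debated 90 days ago, which is the "prefers activity over raw volume" behaviour the notes aim for.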
python3 market_participants.py --evaluate-hypotheses [limit]

evaluate_hypotheses_batch()

Add to market_participants.py:
    def evaluate_hypotheses_batch(
        db: sqlite3.Connection,
        limit: int = 20,
    ) -> Dict[str, Any]:
        """Evaluate top hypotheses by market volume using all active participants.

        Each participant evaluates each hypothesis and emits a buy/sell/hold signal.
        Signals are aggregated via believability-weighted averaging, then applied
        as price adjustments.

        Returns dict with:
        - hypotheses_evaluated: count
        - price_changes: list of {hypothesis_id, old_price, new_price}
        - participant_signals: per-participant signal summary
        """

Implementation steps:
1. Select the top hypotheses (by market_price, debate/evidence recency, unresolved weak debates)
2. Call each participant's evaluate() method with artifact_type='hypothesis'
3. Aggregate the signals (as in aggregate_participant_signals but adapted for hypotheses)
4. Apply price adjustments as in apply_participant_signals_to_price but for hypotheses, while suppressing duplicate/chaff boosts
5. Record the evaluations in the participant_evaluations table

apply_participant_signals_to_hypothesis_price()

Variant of apply_participant_signals_to_price() for hypotheses:
- hypotheses table (not artifacts)
- price_history with item_type='hypothesis'

_market_consumer_loop()

Add to the background loop in api.py:
- evaluate_hypotheses_batch(db, limit=20)
- update_participant_believability(db, lookback_hours=24)

Add to market_participants.py CLI:
python3 market_participants.py --evaluate-hypotheses [limit]
python3 market_participants.py --update-believability [hours]
python3 market_participants.py --leaderboard

- participant_believability: participant_id (PK), believability, hit_rate, assessments_count, last_updated
- participant_evaluations: id, participant_id, artifact_id, artifact_type, signal, magnitude, reason, created_at

- market_participants.py with 5 participant classes
- participant_believability and participant_evaluations tables
- api.py

- .orchestra/config.yaml keeps the three Claude session-env writable binds and api_backprop_status populates existing_cols from information_schema.columns rows with row[0].
- [...] origin/main: .orchestra/config.yaml still preserves all three Claude session-env writable binds, and api_backprop_status now builds existing_cols from row[0] over information_schema.columns.
- [...] %s URL query delimiters in api.py generated links/fetches so the API/UI no longer emits malformed URLs while preserving SQL placeholders and display fallbacks.
- [...] evaluate_hypotheses_batch(db, limit=20) every 360 cycles and believability updates every 540 cycles.
- [...] origin/main to drop unrelated branch changes from other tasks.
- .orchestra/config.yaml retains the Claude session-env extra_rw_paths and api_backprop_status reads information_schema.columns rows with row[0].
- [...] api.py from literal %s to ?.
- market_participants.py compatibility shim did not dispatch the packaged module CLI when invoked as python3 market_participants.py ....
- [...] --evaluate-hypotheses commit, and PostgreSQL datetime leaderboard formatting while rebasing this recurring run onto current main.
- +00:00 suffix caused TypeError when subtracting from naive datetime.utcnow(). Added .replace("+00:00", "") + tzinfo=None normalization.
- _detect_near_duplicate_clusters() function: identifies hypotheses sharing same target_gene+hypothesis_type (cluster rank 1) or same target_gene + Jaccard title similarity > 0.5 (cluster rank 2 with price > 0.55).
- apply_participant_signals_to_hypothesis_price: line 1516 returned bare float (return old_price) instead of Tuple[Optional[float], List[Dict]].
Callers do new_price, evals = ..., so this would raise TypeError: cannot unpack non-iterable float object whenever a clamped price change was < 0.001. Fixed to return (None, evaluations).
- --periodic-cycle [limit] CLI mode: runs evaluate_hypotheses_batch then update_participant_believability in sequence, printing a formatted summary. Suitable for cron or direct orchestration invocation.
- evaluate_hypotheses_batch(): the CTE now computes recency_factor (1.0–2.0) from last_evidence_update or last_debated_at using 45-day decay. The final ORDER BY multiplies base relevance by this factor, so hypotheses debated/updated recently rank higher than equally-priced but stale ones. This satisfies acceptance criterion 1 (prefers activity over raw volume).
- experiments table does not exist in the current DB schema, causing all participant evaluate() calls to raise sqlite3.OperationalError: no such table: experiments. This silently swallowed all evaluation results (0 price changes, empty errors list) instead of gracefully degrading.
- Methodologist._get_study_design_signals(), ReplicationScout._score_experiment_replication(), ReplicationScout._score_hypothesis_replication(), and FreshnessMonitor._score_experiment_freshness() all query the experiments table without try/except.
- [...] experiments table query locations in try/except sqlite3.OperationalError so participants degrade to neutral/0 signals when the table is absent.
- evaluate_hypotheses_batch now returns 20/20 price changes, 0 errors, and believability updates run correctly on live DB.
- api.py path calls evaluate_hypotheses_batch() without committing, while api_shared.db rolls back open background-thread transactions before release/reuse.
This can make the periodic evaluation report success while losing participant evaluation and price writes.
- [...] evaluate_artifacts_batch() by committing inside evaluate_hypotheses_batch() and returning/logging commit failures explicitly.
- [...] market_participants.py CLI entrypoint after package modularization, and made CLI timestamp rendering compatible with native PostgreSQL datetime values.
- python3 market_participants.py --periodic-cycle 20: evaluated 20 hypotheses, applied 17 price changes, downranked 188 duplicates, and updated believability for 6 participants.
- python3 -m py_compile scidex/exchange/market_participants.py market_participants.py, python3 market_participants.py --leaderboard, and python3 market_participants.py --evaluate-hypotheses 1 after replacing deprecated UTC timestamp calls; all passed without deprecation warnings.

{
  "requirements": {
    "coding": 7,
    "analysis": 6,
    "safety": 9
  },
  "completion_shas": [
    "67c8fa936c329d885d467a305137f81adb62b2dc",
    "86e354b16bdc7bf5b9563cd595f3fb8cc3f469ee",
    "029d8c9f770c513362f7b25793d55060a02ce8fa",
    "50d1ab055b181cee6505aa1dfafd6689f9d7504d"
  ],
  "completion_shas_checked_at": "2026-04-13T00:41:27.109902+00:00",
  "completion_shas_missing": [
    "9f13dece50d48f21440399d1c58ff94f4d5b002a",
    "979f496b1a9afd2ca0e8574b94e4ae1544709336",
    "079bae9fd8973cde4d7036a2f34cc769a19091fc",
    "af4157e883caf12b9c38ec6a13740ca4d2b03fa3",
    "88f5c14a3fa159e3a8b171b9fc6ddac11d65f420",
    "fa1a745d2fc2087edf98fc1c713190360450686a",
    "1a000c44fbf11ef31a2d5fc84df2e8179d45ecc7",
    "78ba2b7b94eeb40fdbf4c8beeaa10312173f2c94",
    "aff58c892eaa365c2cd0a24181fdf7ba364fa99c",
    "e3535236b74e170c90eb8d8e233924aac65ab0a2",
    "fba7d62be56ecd35b5416ca7ac6b9924a948fdc6",
    "928a836aba9ba683024e700e45819f70dc3a1d12",
    "062e628bad2c7d421448ec5d568e8c49c59dca4d",
    "a341b5f27a885f4bd63fe41a06d7679b2d813b5b",
    "4d6cc1d77132805a6fbfa03b6a36344f92318c8e",
    "167835a4d5f88195a0513a9960449c7ab711593b",
    "76baab518370b68366543b1b103e43c965afaef7",
    "6881fb12f060cebc892976ba191ca8728b158d13",
    "748af5bb3c3719e444fbedf9f5e75e8ca1b70b56",
    "fd5fc1484187f89a9dc802a40b47a4659349dff9"
  ]
}
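
To make the aggregation and return contract from the notes above more concrete, here is a minimal sketch of believability-weighted averaging followed by a clamped price update that always returns a (new_price, evaluations) tuple, with None as the price when the change is below 0.001. Only the weighted-average idea, the 0.001 threshold, and the tuple shape come from the text. The signal encoding (buy = +1, sell = -1, hold = 0), magnitudes in [0, 1], the MAX_STEP cap, the 0.01 to 0.99 price bounds, and the function names aggregate_signals and apply_signals_to_price are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

SIGNAL_DIRECTION = {"buy": 1.0, "sell": -1.0, "hold": 0.0}  # assumed encoding
MAX_STEP = 0.05     # assumed cap on a single price move
MIN_CHANGE = 0.001  # threshold mentioned in the notes


def aggregate_signals(evaluations: List[Dict]) -> float:
    """Believability-weighted mean of signed magnitudes.

    Stays within [-1, 1] as long as magnitudes are in [0, 1] and
    believability weights are non-negative (both assumed here).
    """
    total_weight = sum(e["believability"] for e in evaluations)
    if total_weight <= 0:
        return 0.0
    weighted = sum(
        e["believability"] * SIGNAL_DIRECTION[e["signal"]] * e["magnitude"]
        for e in evaluations
    )
    return weighted / total_weight


def apply_signals_to_price(
    old_price: float, evaluations: List[Dict]
) -> Tuple[Optional[float], List[Dict]]:
    """Return (new_price, evaluations); new_price is None when the move is negligible."""
    consensus = aggregate_signals(evaluations)
    new_price = min(0.99, max(0.01, old_price + MAX_STEP * consensus))
    if abs(new_price - old_price) < MIN_CHANGE:
        # Always return a tuple so callers can unpack it safely.
        return None, evaluations
    return round(new_price, 4), evaluations


if __name__ == "__main__":
    evals = [
        {"participant_id": "p1", "believability": 0.8, "signal": "buy", "magnitude": 0.5},
        {"participant_id": "p2", "believability": 0.2, "signal": "sell", "magnitude": 0.5},
    ]
    new_price, recorded = apply_signals_to_price(0.5, evals)
    print(new_price, len(recorded))
```

Returning the same two-element shape on every path is what keeps a caller's new_price, evals = ... unpacking from failing when the price barely moves.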