The platform now hosts 310+ hypotheses and 18K wiki pages, and many of
them make claims that directly contradict each other (e.g. one
hypothesis says TREM2 inhibits microglial phagocytosis; another says it
enhances it). Today nothing surfaces these contradictions —
gflownet_sampler rewards diversity, but diversity is not the same as
consistency. Build an engine that, for each pair of normalized atomic
claims (across hypotheses, wiki, debates), detects logical
contradiction (A says X, B says ¬X) via embedding-similarity
shortlisting + LLM contradiction judge, and emits a structured
claim_contradiction artifact that auto-spawns a debate to adjudicate.
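
The shortlisting half of that pipeline can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the Claim dataclass, the integer polarity encoding, and the toy 3-dimensional embeddings are all assumptions made for the example. The 0.65 cosine floor and the max_pairs cap are the values this spec names for the shortlist and for scan().

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Claim:
    """Hypothetical in-memory claim record (not the real claim_artifacts row)."""
    claim_id: str
    text: str
    polarity: int                  # assumed encoding: +1 enhances, -1 inhibits
    embedding: tuple[float, ...]   # toy vector; real embeddings are much longer


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def shortlist(claims, min_cosine=0.65, max_pairs=500):
    """Return (similarity, id_a, id_b) for pairs that are similar but opposite in polarity."""
    candidates = []
    for a, b in combinations(claims, 2):
        if a.polarity == b.polarity:
            continue  # same direction: cannot be an X vs. not-X pair
        sim = cosine(a.embedding, b.embedding)
        if sim >= min_cosine:
            candidates.append((sim, a.claim_id, b.claim_id))
    # Most similar pairs first, so the LLM judge budget goes to the likeliest conflicts.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:max_pairs]


claims = [
    Claim("h1", "TREM2 enhances microglial phagocytosis", +1, (0.9, 0.1, 0.2)),
    Claim("h2", "TREM2 inhibits microglial phagocytosis", -1, (0.8, 0.2, 0.3)),
    Claim("w1", "TREM2 promotes microglial phagocytosis", +1, (0.85, 0.15, 0.2)),
    Claim("w2", "Tau aggregation reduces synaptic density", -1, (0.1, 0.9, 0.1)),
]
pairs = shortlist(claims)
for sim, id_a, id_b in pairs:
    print(f"{id_a} vs {id_b}: cosine={sim:.3f}")
```

Only the two TREM2 pairs with opposite polarity survive the filter; the tau claim is dropped on similarity, and the two same-direction TREM2 claims are dropped on polarity. The all-pairs loop is quadratic, which is why a vector index query is the more practical form at 18K pages.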
Effort: thorough
1. scidex/senate/claim_consistency.py::scan(window_days=7, max_pairs=500) -> dict iterates atomic claims (from the claim_artifacts table populated by wiki_claim_extractor + new hypothesis claims; if claim_artifacts doesn't exist, create a unified view).
2. Stage 1, embedding-similarity shortlist (embeddings via scidex/atlas/canonical_entity_links.py or scidex/agents/embeddings.py); only consider pairs where cosine ≥ 0.65 AND polarities differ.
3. Stage 2, LLM contradiction judge: prompt "Do these two claims directly contradict? <a>...<b>..." returning {contradicts: bool, confidence: float, common_subject: str, divergent_predicate: str}. Triple redundancy via 3 personas (skeptic, methodologist, domain-expert); verdict = mode.
4. migrations/20260428_claim_contradictions.sql: claim_contradictions(id, claim_a_id, claim_b_id, common_subject, divergent_predicate, confidence, status ENUM('open','adjudicated','reconciled','withdrawn'), spawned_debate_id TEXT, detected_at); UNIQUE(LEAST(a,b), GREATEST(a,b)).
5. When confidence ≥ 0.8, spawn a debate via scidex.agora.debate.create_debate(topic=common_subject, position_a=claim_a, position_b=claim_b, seed_evidence=union(claim_a.evidence, claim_b.evidence)); write spawned_debate_id back. Reuse the pattern from refutation_emitter.py.
6. When the debate resolves, set status='adjudicated'; the losing claim's host artifact gets a disputed quality tag and a 5% composite_score dock.
7. GET /api/senate/contradictions?status=open&limit=50 returns the open contradictions for a triage UI; the GET /senate/contradictions HTML page lists them with a side-by-side diff.
8. q-live-tau-biology-pipeline-status etc. embed a "Contradictions in this domain" badge on landing pages, sourced from this table.
9. tests/test_claim_consistency.py: synthetic pair "TREM2 enhances microglial activation" vs "TREM2 inhibits microglial activation" → contradiction detected, verdict confidence ≥ 0.8; near-paraphrase pair → not flagged; trivially-different-subject pair → not flagged (common_subject mismatch).

- Create the claim_artifacts_v Postgres view.
- Add an embedding column on the view's source tables if missing; backfill via the existing embedding pipeline.
Stage 1 then becomes a single pgvector <=> query.
- Adjudication runs as a synthesis_engine.synthesize_debate_session post-step.

- q-qual-auto-fact-check-pipeline: provides claim verdicts that influence polarity.
- q-qual-claim-strength-normalizer: produces normalized claim subject/predicate.
- scidex/agents/embeddings.py: pgvector embedding pipeline.
- q-mem-persona-reputation-log: debate outcomes feed persona track records.

2026-04-28: Implementation complete
- migrations/20260428_claim_contradictions.sql: claim_contradictions table + history + audit trigger. Uses a functional unique index (CASE-based) for unordered pair deduplication, since a PostgreSQL UNIQUE table constraint accepts only column names, not expressions such as LEAST/GREATEST. contradiction_status enum: open/adjudicated/reconciled/withdrawn.
- scidex/senate/claim_consistency.py: polarity inference (keyword-based), cosine similarity shortlist (cosine ≥ 0.65, opposite polarity, top-K=10), triple-redundancy LLM judge (skeptic/methodologist/domain-expert personas), mode voting + mean confidence, auto-debate spawning via SciDEXOrchestrator.run_debate(), adjudication with 5% loser penalty, scan() and get_contradictions() APIs.
- api_routes/senate.py: POST /api/senate/contradictions/scan, GET /api/senate/contradictions, GET /api/senate/contradictions/stats.
- site/contradictions.html: triage UI with stats grid, scan button, filter tabs, side-by-side claim diff cards, live search.
- GET /senate/contradictions route in api.py.
- tests/test_claim_consistency.py: 32 tests (all passing). Covers polarity inference, opposite polarity detection, cosine similarity, judge response parsing (including markdown fence stripping), mode voting, mean confidence, prompt format.
- scan() computes embeddings via SentenceTransformer directly. Badge embedding on domain pages (AC-8) not yet implemented.
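
The judge-side steps listed above (fence stripping, mode voting, mean confidence, the 0.8 spawn threshold) can be illustrated with a small sketch. The function names, the sample persona replies, and the choice to average confidence over only the personas that agree with the majority are assumptions for illustration; the log says only "mode voting + mean confidence", and the real module may average differently.

```python
import json
from collections import Counter
from statistics import mean

FENCE = "`" * 3          # markdown code-fence marker
SPAWN_THRESHOLD = 0.8    # confidence floor for auto-spawning a debate (from the spec)


def parse_judge_response(raw: str) -> dict:
    """Parse one persona's JSON reply, tolerating a surrounding markdown fence."""
    text = raw.strip()
    if text.startswith(FENCE) and text.endswith(FENCE) and len(text) >= 2 * len(FENCE):
        body = text[len(FENCE):-len(FENCE)]
        # Drop the optional language tag ("json") on the fence's first line.
        _, newline, rest = body.partition("\n")
        text = rest if newline else body
    data = json.loads(text)
    return {
        "contradicts": bool(data["contradicts"]),
        "confidence": float(data["confidence"]),
    }


def aggregate(verdicts: list) -> dict:
    """Majority (mode) vote on `contradicts`; mean confidence of the majority voters."""
    votes = Counter(v["contradicts"] for v in verdicts)
    winner = votes.most_common(1)[0][0]
    # Assumption: dissenting personas do not contribute to the confidence score.
    confidence = mean(v["confidence"] for v in verdicts if v["contradicts"] == winner)
    return {
        "contradicts": winner,
        "confidence": confidence,
        "spawn_debate": winner and confidence >= SPAWN_THRESHOLD,
    }


# Hypothetical raw replies from the three personas.
replies = {
    "skeptic": FENCE + 'json\n{"contradicts": true, "confidence": 0.9}\n' + FENCE,
    "methodologist": '{"contradicts": true, "confidence": 0.8}',
    "domain-expert": '{"contradicts": false, "confidence": 0.6}',
}
verdicts = [parse_judge_response(raw) for raw in replies.values()]
result = aggregate(verdicts)
print(result)
```

With three boolean votes there is always a strict majority, so the mode is never ambiguous. An even number of personas would need an explicit tie-break rule.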