[Senate] Epistemic health dashboard and continuous improvement
Status: done | coding:7 | reasoning:6

Build /epistemic dashboard showing: falsifiability coverage (% hypotheses with predictions), evidence provenance coverage (% claims traced to source), trust score distribution across KG, dependency graph health, evidence freshness. Add Senate proposals that auto-flag hypotheses missing predictions, stale evidence, or broken provenance chains. This is the self-improvement loop for epistemic rigor.

## REOPENED TASK: CRITICAL CONTEXT

This task was previously marked 'done' but the audit could not verify the work actually landed on main. The original work may have been:

- Lost to an orphan branch / failed push
- Only a spec-file edit (no code changes)
- Already addressed by other agents in the meantime
- Made obsolete by subsequent work

**Before doing anything else:**

1. **Re-evaluate the task in light of CURRENT main state.** Read the spec and the relevant files on origin/main NOW. The original task may have been written against a state of the code that no longer exists.
2. **Verify the task still advances SciDEX's aims.** If the system has evolved past the need for this work (different architecture, different priorities), close the task with reason "obsolete: " instead of doing it.
3. **Check if it's already done.** Run `git log --grep=''` and read the related commits. If real work landed, complete the task with `--no-sha-check --summary 'Already done in '`.
4. **Make sure your changes don't regress recent functionality.** Many agents have been working on this codebase. Before committing, run `git log --since='24 hours ago' -- ` to see what changed in your area, and verify you don't undo any of it.
5. **Stay scoped.** Only do what this specific task asks for. Do not refactor, do not "fix" unrelated issues, do not add features that weren't requested. Scope creep at this point is regression risk.

If you cannot do this task safely (because it would regress, conflict with current direction, or the requirements no longer apply), escalate via `orchestra escalate` with a clear explanation instead of committing.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (5)

[Senate] Work log: bug fix for epistemic_health.py DB schema mismatches [task:4bb367b9-9d69-4807-a215-01f4c3323007] (2026-04-18)
[SciDEX] Fix epistemic_health.py: column name and table schema bugs (2026-04-18)
[Senate] Work log: push fix and shim creation [task:4bb367b9-9d69-4807-a215-01f4c3323007] (2026-04-16)
[Senate] Add epistemic_health backward-compat shim [task:4bb367b9-9d69-4807-a215-01f4c3323007] (2026-04-16)
[Senate] Epistemic health dashboard: 6 metrics, trend charts, drill-down, auto-proposals (api.py) [task:4bb367b9-9d69-4807-a215-01f4c3323007] (2026-04-16)

Spec File

Goal

Build a /epistemic dashboard showing the overall epistemic health of SciDEX —
falsifiability coverage, evidence provenance coverage, trust score distribution,
dependency graph health, audit completeness, and evidence freshness. Add Senate
auto-proposals that flag hypotheses missing predictions, stale evidence, or
broken provenance chains.

Current State

  • /senate/epistemic-health page exists showing tier distribution, replication
status, falsification results (via epistemic_tiers.py)
  • epistemic_tiers.py provides get_epistemic_health() (tier dist, repl status)
  • hypothesis_predictions table exists (988 rows) with falsification criteria
  • hypothesis_falsifications table exists with falsification scores
  • evidence_chains table exists for provenance tracking
  • confidence_justifications table exists for audit trail
  • senate_proposals table exists with CHECK constraint:
proposal_type IN ('schema_change', 'governance_rule', 'quality_gate')
  • No epistemic_snapshots table for historical tracking
  • No 6-metric dashboard with trend charts or drill-down
  • No Senate auto-proposal generation for epistemic gaps

Acceptance Criteria

/epistemic page with 6 health metrics (falsifiability, provenance, trust,
dependency, audit completeness, evidence freshness)
☐ API: GET /api/epistemic/health returns all 6 metrics as JSON
☐ Senate auto-proposals created when metrics drop below thresholds
☐ Trend charts showing epistemic health over time
☐ Drill-down: click any metric to see specific hypotheses/edges affected
☐ Weekly epistemic health snapshot stored in epistemic_snapshots table

Six Health Metrics

  • Falsifiability Coverage — % hypotheses with ≥1 testable prediction (target >80%)
  • Evidence Provenance Coverage — % evidence claims traced to source (target >90%)
  • Trust Score Distribution — median trust across KG edges (target median >0.5)
  • Dependency Graph Health — % experiments linked to predictions (target >60%)
  • Audit Completeness — % score changes with structured justification (target 100%)
  • Evidence Freshness — % evidence updated within 30 days (target >50%)
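
The metric definitions above reduce to percentage ratios plus one median. A minimal sketch of that arithmetic follows; it assumes the counts and trust scores have already been fetched from the database. The helper names `pct`, `trust_median`, and `meets_target` are hypothetical and not taken from epistemic_health.py; the dictionary keys reuse the snapshot column names listed in the Approach section.

```python
from statistics import median

# Targets from the metric list above, stored as (threshold, strict).
# Audit completeness is the only target stated as an exact value (100%),
# so it is checked with ">=" instead of a strict ">".
TARGETS = {
    "falsifiability_pct": (80.0, True),
    "provenance_pct": (90.0, True),
    "trust_median": (0.5, True),
    "dependency_coverage_pct": (60.0, True),
    "audit_completeness_pct": (100.0, False),
    "evidence_freshness_pct": (50.0, True),
}


def pct(part: int, total: int) -> float:
    """Percentage of `part` in `total`; an empty population counts as 0.0."""
    if total == 0:
        return 0.0
    return round(100.0 * part / total, 1)


def trust_median(scores: list[float]) -> float:
    """Median trust score across KG edges; 0.0 when there are no edges."""
    return median(scores) if scores else 0.0


def meets_target(metric: str, value: float) -> bool:
    """True when `value` satisfies the target for `metric`."""
    threshold, strict = TARGETS[metric]
    return value > threshold if strict else value >= threshold


# Illustrative numbers only, not real SciDEX data.
example = pct(45, 60)
print(example, meets_target("falsifiability_pct", example))
```

Returning 0.0 for an empty population is one possible convention; an implementation could equally report the metric as undefined so that an empty table does not look like a failing one.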

Senate Auto-Proposal Triggers

    Trigger                  | Proposal Type | Action
    Falsifiability < 60%     | quality_gate  | "Extract predictions for N unfalsifiable hypotheses"
    Provenance < 70%         | quality_gate  | "Trace provenance for N unlinked evidence claims"
    Trust median < 0.4       | quality_gate  | "Review and validate N low-trust KG edges"
    Audit completeness < 90% | quality_gate  | "Backfill justifications for N unjustified score changes"
    Evidence freshness < 40% | quality_gate  | "Update evidence for N hypotheses with stale citations"
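
The trigger rows above can be read as a small rule table: each rule names a metric, a lower threshold, and a proposal title with N filled in. The sketch below shows one way to evaluate such a table. It is an assumption about structure, not the actual generate_improvement_proposals() code; `proposals_for` and the `affected_counts` argument are hypothetical, and the metric keys reuse the snapshot column names from the Approach section. Dependency graph health has no row in the trigger list, so it has no rule here.

```python
# (metric key, trigger threshold, proposal title template), one per trigger row.
TRIGGERS = [
    ("falsifiability_pct", 60.0,
     "Extract predictions for {n} unfalsifiable hypotheses"),
    ("provenance_pct", 70.0,
     "Trace provenance for {n} unlinked evidence claims"),
    ("trust_median", 0.4,
     "Review and validate {n} low-trust KG edges"),
    ("audit_completeness_pct", 90.0,
     "Backfill justifications for {n} unjustified score changes"),
    ("evidence_freshness_pct", 40.0,
     "Update evidence for {n} hypotheses with stale citations"),
]


def proposals_for(metrics: dict, affected_counts: dict) -> list[dict]:
    """Return one quality_gate proposal per metric below its trigger threshold.

    `affected_counts` maps a metric key to the number of affected items (N);
    a missing key is rendered as 0.
    """
    proposals = []
    for key, threshold, template in TRIGGERS:
        if metrics[key] < threshold:
            proposals.append({
                "proposal_type": "quality_gate",
                "title": template.format(n=affected_counts.get(key, 0)),
            })
    return proposals


# Illustrative values only: falsifiability is below 60%, the rest are healthy.
healthy = {key: threshold + 10 for key, threshold, _ in TRIGGERS}
sample = {**healthy, "falsifiability_pct": 55.0}
for proposal in proposals_for(sample, {"falsifiability_pct": 12}):
    print(proposal["title"])
```

Only `quality_gate` is emitted because that is the proposal type every trigger row uses, and it is already allowed by the CHECK constraint on senate_proposals.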

Approach

  • Create epistemic_snapshots table via migration (id, falsifiability_pct,
    provenance_pct, trust_median, dependency_coverage_pct, audit_completeness_pct,
    evidence_freshness_pct, snapshot_date)
  • Build scidex/senate/epistemic_health.py module:
    - compute_all_metrics(db) → dict with all 6 metrics + drill-down lists
    - snapshot_health(db) → store weekly snapshot in epistemic_snapshots
    - generate_improvement_proposals(db) → create Senate proposals for gaps
    - check_and_propose() → called periodically, creates proposals when metrics drop
  • Add GET /api/epistemic/health endpoint returning all 6 metrics JSON
  • Enhance /senate/epistemic-health page with 6-metric grid, trend sparklines
    from snapshots, drill-down links per metric
  • Add drill-down API endpoints: /api/epistemic/missing-predictions,
    /api/epistemic/stale-evidence, /api/epistemic/low-trust-edges
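
The snapshot table and the trend data it feeds can be sketched as follows. This is an illustration under stated assumptions, not the project's migration: the column names come from the list above, but the column types are guesses, an in-memory sqlite3 database stands in for the real backend, and `store_snapshot` and `trend` are hypothetical helpers (the planned snapshot_health(db) computes the metrics itself instead of receiving them).

```python
import sqlite3
from datetime import date

# Column names from the Approach list; the types are assumptions.
SNAPSHOT_DDL = """
CREATE TABLE IF NOT EXISTS epistemic_snapshots (
    id INTEGER PRIMARY KEY,
    falsifiability_pct REAL,
    provenance_pct REAL,
    trust_median REAL,
    dependency_coverage_pct REAL,
    audit_completeness_pct REAL,
    evidence_freshness_pct REAL,
    snapshot_date TEXT
)
"""

METRIC_COLUMNS = (
    "falsifiability_pct",
    "provenance_pct",
    "trust_median",
    "dependency_coverage_pct",
    "audit_completeness_pct",
    "evidence_freshness_pct",
)


def store_snapshot(db, metrics, snapshot_date=None):
    """Insert one row of metric values and return the new row id."""
    db.execute(SNAPSHOT_DDL)
    day = (snapshot_date or date.today()).isoformat()
    columns = ", ".join(METRIC_COLUMNS)
    placeholders = ", ".join("?" for _ in METRIC_COLUMNS)
    cursor = db.execute(
        f"INSERT INTO epistemic_snapshots ({columns}, snapshot_date) "
        f"VALUES ({placeholders}, ?)",
        [metrics[name] for name in METRIC_COLUMNS] + [day],
    )
    db.commit()
    return cursor.lastrowid


def trend(db, column):
    """Return (snapshot_date, value) pairs for one metric, oldest first."""
    if column not in METRIC_COLUMNS:
        # Column names cannot be bound as parameters, so allow-list them.
        raise ValueError(f"unknown metric column: {column}")
    rows = db.execute(
        f"SELECT snapshot_date, {column} FROM epistemic_snapshots "
        "ORDER BY snapshot_date"
    ).fetchall()
    return [(row[0], row[1]) for row in rows]
```

A sparkline on the dashboard would then only need the list returned by `trend(db, "falsifiability_pct")`. Storing dates as ISO strings keeps `ORDER BY snapshot_date` chronological in this sketch.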

Dependencies

  • scidex.senate.epistemic_tiers (already implemented)
  • hypothesis_predictions table (already exists)
  • hypothesis_falsifications table (already exists)
  • evidence_chains table (already exists)
  • confidence_justifications table (already exists)
  • senate_proposals table (already exists)
  • Senate proposal type quality_gate (already in CHECK constraint)

Dependents

  • Recurring quest task: run check_and_propose() weekly
  • Senate dashboard links to drill-down

Work Log

    2026-04-18 22:15 PT — Slot 0 — Bug fix: column names and table names don't match DB schema

    • Task was marked "done" by prior agent but audit couldn't verify landing on main
    • Re-evaluated: epistemic_health.py in origin/main has 3 bugs that cause SQLite OperationalError:
    1. evidence_entries.claim → should be claim_text (column doesn't exist)
    2. FROM experiments → table doesn't exist (correct table is experiment_results)
    3. evidence_entries.updated_at → column doesn't exist (only created_at)
    • Verified each bug by querying live PostgreSQL schema
    • Fixed all 3: e.claim → e.claim_text, experiments → experiment_results, updated_at → created_at
    • Tested: compute_all_metrics() now runs without error, returns correct metrics
    • Also verified: epistemic_snapshots table exists with 1 row (created 2026-04-16)
    • Also verified: api_routes/epistemic.py exists with all 9 epistemic API routes registered
    • Also verified: senate_proposals table has 217 quality_gate rows including auto-generated proposal
    • Committed fix: 2f6c22570 — "[SciDEX] Fix epistemic_health.py: column name and table schema bugs"
    • Force-pushed to branch orchestra/task/4bb367b9-epistemic-health-dashboard-and-continuou
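
All three bugs in this entry are references to a table or column that the schema does not contain. A check of the following kind can surface such mismatches before a query fails at runtime. It is only a sketch: `missing_columns` is a hypothetical helper, and it uses sqlite3's `PRAGMA table_info` so that it runs standalone, whereas the log states the live check was made against PostgreSQL, where `information_schema.columns` would be the place to look. The example table reuses the names from the log; its `id` column is an assumption.

```python
import sqlite3


def missing_columns(db, expected):
    """Return (table, column) pairs that the schema lacks.

    `expected` maps table name -> list of column names a query relies on.
    A missing table is reported once as (table, None).
    Table names must come from trusted code, since they are interpolated.
    """
    missing = []
    for table, columns in expected.items():
        # PRAGMA table_info yields no rows when the table does not exist;
        # index 1 of each row is the column name.
        rows = db.execute(f"PRAGMA table_info({table})").fetchall()
        if not rows:
            missing.append((table, None))
            continue
        present = {row[1] for row in rows}
        missing.extend((table, col) for col in columns if col not in present)
    return missing


# Stand-in schema using the corrected names from the work log.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE evidence_entries (id INTEGER, claim_text TEXT, created_at TEXT)"
)

# The names the buggy queries referenced.
print(missing_columns(db, {
    "evidence_entries": ["claim", "updated_at"],
    "experiments": ["id"],
}))
```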

    2026-04-16 16:15 PT — Slot 0 — Push fix and shim creation

    • Original work committed in eb84e4f00 but push blocked by pre-push hook
    • Issue: commit touches api.py (critical file) but message didn't mention it
    • Amended commit to include "(api.py)" → d9a8850ca
    • Force-pushed to remote feature branch
    • Discovered missing epistemic_health.py backward-compat shim at repo root
    • Created shim (mirrors epistemic_tiers.py pattern) → e5b09553a
    • Both commits pushed to origin/orchestra/task/4bb367b9-epistemic-health-dashboard-and-continuou
    • Work ready for merge to main

    2026-04-16 — Slot 0 — Initial investigation

    • Read AGENTS.md, quest spec, task spec, existing epistemic_tiers.py
    • Confirmed: hypothesis_predictions (988 rows), evidence_chains,
    hypothesis_falsifications, confidence_justifications, senate_proposals all exist
    • Confirmed: epistemic_snapshots table does NOT exist — needs to be created
    • Confirmed: /senate/epistemic-health page exists but shows only 3 metrics (tier dist,
    replication status, falsifications), missing the 6 metrics from task spec
    • Confirmed: no auto-proposal generation for epistemic gaps
    • Current state: partial implementation via epistemic_tiers.py but missing
    snapshots, 6-metric health dashboard, trend charts, drill-down, auto-proposals

Payload JSON
    {
      "requirements": {
        "coding": 7,
        "reasoning": 6
      },
      "_reset_note": "This task was reset after a database incident on 2026-04-17.\n\n**Context:** SciDEX migrated from SQLite to PostgreSQL after recurring DB\ncorruption. Some work done during Apr 16-17 may have been lost.\n\n**Before starting work:**\n1. Check if the task's goal is ALREADY satisfied (run the relevant checks)\n2. Check `git log --all --grep=task:YOUR_TASK_ID` for prior commits\n3. If complete, verify and mark done. If partial, continue. If not done, proceed.\n\n**DB change:** SciDEX now uses PostgreSQL. `get_db()` auto-detects via\nSCIDEX_DB_BACKEND=postgres env var.",
      "_reset_at": "2026-04-18T06:29:22.046013+00:00",
      "_reset_from_status": "done"
    }
