[Senate] Knowledge growth metrics snapshot


Goal

Collect and save hourly snapshots of SciDEX knowledge growth metrics (analyses, hypotheses, KG edges, papers, wiki pages, causal edges). Detect catastrophic loss (regressions) and stalled growth. This is a recurring task that runs every hour to track system growth over time.

Knowledge Growth Metrics

Metric | Table/Source | Description
analyses_count | analyses | Total completed analyses
hypotheses_count | hypotheses | Hypotheses with evidence citations
kg_edges_count | knowledge_edges | Valid KG edges (non-empty source/target)
papers_count | papers | Total indexed papers
wiki_pages_count | wiki_pages | Total wiki pages
causal_edges_count | knowledge_edges | Edges with causal relations

Detection Rules

Catastrophic Loss (Regression)

  • If any metric decreases from one snapshot to the next → CRITICAL alert
  • This indicates data loss, corruption, or ingestion bugs
  • Log as ERROR level with metric name, previous value, current value
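
The regression rule above can be sketched as a comparison of two consecutive snapshots. This is a minimal illustration, not the actual detect_regressions() in metrics.py: the function name find_regressions, the dict-based snapshot shape, and the logger name are assumptions made for the example.

```python
import logging

logger = logging.getLogger("metrics_sketch")  # hypothetical logger name


def find_regressions(previous, current):
    """Return (metric, previous_value, current_value) for each decreased metric.

    Both arguments are assumed to be dicts mapping metric name to an integer count.
    """
    regressions = []
    for name, prev_value in previous.items():
        curr_value = current.get(name)
        # A metric missing from the current snapshot is skipped, not flagged.
        if curr_value is not None and curr_value < prev_value:
            regressions.append((name, prev_value, curr_value))
            logger.error(
                "Regression in %s: previous=%d current=%d",
                name, prev_value, curr_value,
            )
    return regressions
```

Any decrease, however small, is reported, which matches the rule that a single drop between snapshots is treated as critical.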

Stalled Growth

  • If any tracked metric shows zero growth for 2+ hours → WARNING alert
  • Log as WARNING level
  • Tracked metrics: analyses_count, hypotheses_count, kg_edges_count
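
A hedged sketch of the stall rule follows. It assumes snapshots arrive exactly once per hour, so "zero growth for 2+ hours" is read as three consecutive snapshots with the same value. The name find_stalls and the list-of-dicts history format are hypothetical and may differ from detect_stalls() in metrics.py.

```python
import logging

logger = logging.getLogger("metrics_sketch")  # hypothetical logger name

# Tracked metrics as listed in the spec.
STALL_TRACKED = ("analyses_count", "hypotheses_count", "kg_edges_count")


def find_stalls(history, tracked=STALL_TRACKED, hours=2):
    """Return tracked metrics whose value was unchanged over the last `hours` hours.

    `history` is assumed to be a list of hourly snapshot dicts, oldest first.
    """
    window = history[-(hours + 1):]
    if len(window) < hours + 1:
        return []  # not enough snapshots to cover the stall window
    stalled = []
    for name in tracked:
        values = [snapshot.get(name) for snapshot in window]
        if None not in values and len(set(values)) == 1:
            stalled.append(name)
            logger.warning("Stalled metric %s: no growth for %d+ hours", name, hours)
    return stalled
```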

Implementation

The metrics.py module already provides:

  • collect_snapshot() — collect current metric values
  • save_snapshot() — store to knowledge_snapshots table
  • detect_regressions() — find decreased metrics
  • detect_stalls() — find stalled metrics
  • run_snapshot_cycle() — main entry point

Acceptance Criteria

☑ Spec file created for this task
☑ Hourly snapshot collected via metrics.run_snapshot_cycle()
knowledge_snapshots table populated with metric values
☑ Regression detection logs ERROR for any decreased metric
☑ Stall detection logs WARNING for zero-growth metrics
☑ Growth rates calculated and logged

Dependencies

  • metrics.py module (already implemented)

Dependents

  • Demo page live growth metrics (demo_live_growth_metrics_spec.md)
  • Knowledge growth dashboard

Work Log

2026-04-11 15:10 PT — Slot minimax:60

  • Read AGENTS.md, understood worktree discipline
  • Found metrics.py already implements snapshot collection and regression detection
  • Created spec file at docs/planning/specs/05b6876b_61a_spec.md
  • Will run snapshot cycle now

2026-04-11 15:15 PT — Slot minimax:60

  • Ran python3 metrics.py — snapshot collected successfully
  • No regressions detected (catastrophic loss: none)
  • Stalled metrics detected: hypotheses_count, kg_edges_count (zero growth 2+ hours)
  • Fixed datetime deprecation warnings: datetime.utcnow() → datetime.now(timezone.utc)
  • Verified fix: script runs cleanly with no warnings
  • Committed and pushed
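
For reference, the replacement noted in this entry looks roughly like the following. The helper name utc_now is an assumption for illustration; the entry only records that datetime.utcnow() calls were replaced with datetime.now(timezone.utc).

```python
from datetime import datetime, timezone


def utc_now():
    """Return the current time as a timezone-aware UTC datetime.

    datetime.utcnow() returns a naive datetime and is deprecated since
    Python 3.12; datetime.now(timezone.utc) carries explicit UTC tzinfo.
    """
    return datetime.now(timezone.utc)


stamp = utc_now()
print(stamp.isoformat())  # ISO 8601 string ending in +00:00
```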

2026-04-12 05:49 UTC — Slot sonnet-4.6:72

  • Ran python3 metrics.py — snapshot collected successfully
  • No regressions detected
  • Stalled metrics: kg_edges_count (zero growth 2+ hours) — WARNING logged
  • Fixed remaining 4 datetime.utcnow() deprecation warnings in get_history() and project_milestones()
  • All acceptance criteria remain satisfied

2026-04-12 22:23 UTC — Slot sonnet-4.6:44

  • Ran python3 metrics.py — snapshot collected successfully
  • No regressions detected (catastrophic loss: none)
  • Stalled metrics: analyses_count, hypotheses_count, kg_edges_count (zero growth 2+ hours) — WARNING logged
  • All acceptance criteria remain satisfied

2026-04-12 23:42 UTC — Slot sonnet-4.6:46

  • Ran python3 metrics.py — snapshot collected successfully
  • No regressions detected (catastrophic loss: none)
  • Stalled metrics: hypotheses_count, kg_edges_count (zero growth 2+ hours) — WARNING logged
  • All acceptance criteria remain satisfied

2026-04-20 20:22 UTC — Slot minimax:65 (Watchdog repair #2)

  • Root cause: metrics.py inline fallback had if os.environ.get('SCIDEX_DB_BACKEND', '').lower() == 'postgres' — defaulting to SQLite when env var is unset (empty string != 'postgres')
  • Fix: inverted logic — default to PostgreSQL, only use SQLite when SCIDEX_DB_BACKEND == 'sqlite'
  • The canonical module scidex/senate/metrics.py already uses PG via get_db() from scidex.core.database — the inline fallback was the broken piece
  • Verified: python3 metrics.py with worktree-first sys.path runs clean (collects, saves, detects stalls)
  • The inline fallback only activates when scidex.senate.metrics is unavailable; in normal operation it delegates to the canonical module which uses PG
  • Committed fix to metrics.py get_db() function; scidex/senate/metrics.py unchanged (already correct)
  • Reset original task: orchestra reset 05b6876b-61a9-4a49-8881-17e8db81746c
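
The inverted default described in this entry can be illustrated as below. Only the SCIDEX_DB_BACKEND variable and the two conditions come from the entry; the function names and the string return values are hypothetical and do not reproduce the real get_db() in metrics.py.

```python
import os


def select_backend_old(environ):
    """Pre-fix logic: an unset variable silently falls through to SQLite."""
    if environ.get("SCIDEX_DB_BACKEND", "").lower() == "postgres":
        return "postgres"
    return "sqlite"


def select_backend(environ=None):
    """Post-fix logic: PostgreSQL by default, SQLite only on explicit request."""
    env = os.environ if environ is None else environ
    if env.get("SCIDEX_DB_BACKEND", "").strip().lower() == "sqlite":
        return "sqlite"
    return "postgres"
```

With an empty environment the old function picks SQLite while the new one picks PostgreSQL, which is the behavior change the entry describes.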

2026-04-22 20:45 UTC — Slot minimax:72

  • S8 rebuild incremental: made collect_snapshot() read KPIs from theme_config registry instead of hardcoded SQL dict
  • Added get_enabled_kpis() helper: queries theme_config WHERE theme_id='S8' AND key LIKE 'kpi.%' AND enabled=true, falls back to _LEGACY_KPIS
  • KPI registry already bootstrapped in PG with 6 KPIs (analyses, hypotheses, kg_edges, papers, wiki_pages, causal_edges)
  • detect_stalls() now reads stall-tracked metrics from theme_config key s8.stall_tracked; falls back to _STALL_TRACKED
  • detect_regressions() dynamically discovers metrics from knowledge_snapshots table (adapts to registry changes)
  • calculate_growth_rates() and get_growth_history() now use registry-derived KPI list
  • Added idempotency to save_snapshot(): uses PostgreSQL date_trunc('hour', timestamp) to detect duplicate hour-bucket inserts; skips insert if bucket already exists (timezone-aware)
  • Fixed timezone bug in prior idempotency attempt: naive hour-bucket string vs timezone-aware DB timestamps
  • Cleaned up 18 duplicate snapshot rows from earlier test runs (3 duplicate hour-buckets × 6 metrics)
  • Verified: snapshot cycle runs clean, idempotency enforced, no regressions, stalls correctly detected
  • All acceptance criteria satisfied; full S8 rebuild (LLM narratives, replication clustering) remains future work
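
The hour-bucket idempotency idea can be sketched in memory as follows. This is an analogue only: the real save_snapshot() is described as using PostgreSQL date_trunc('hour', timestamp), whereas this example uses a plain dict as the store, and the names hour_bucket and save_once_per_hour are assumptions. Rejecting naive timestamps reflects the timezone bug noted in the entry.

```python
from datetime import datetime, timedelta, timezone


def hour_bucket(ts):
    """Truncate a timezone-aware timestamp to the start of its UTC hour."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamps are ambiguous; pass an aware datetime")
    return ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


def save_once_per_hour(store, ts, metrics):
    """Insert `metrics` unless the hour bucket already exists. Return True if saved."""
    bucket = hour_bucket(ts)
    if bucket in store:
        return False  # duplicate hour bucket, skip the insert
    store[bucket] = dict(metrics)
    return True


store = {}
first = datetime(2026, 4, 22, 20, 45, tzinfo=timezone.utc)
# Same instant range expressed in a +02:00 offset: 22:50+02:00 is 20:50 UTC.
second = datetime(2026, 4, 22, 22, 50, tzinfo=timezone(timedelta(hours=2)))
saved_first = save_once_per_hour(store, first, {"papers_count": 1})
saved_second = save_once_per_hour(store, second, {"papers_count": 2})
```

Normalizing to UTC before truncating means two timestamps in different offsets that fall in the same UTC hour map to the same bucket.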

2026-04-20 20:22 UTC — Slot minimax:65 (Watchdog repair #2)

  • Root cause: DB corruption in idx_edges_source index causes "database disk image malformed" on queries using source_id != '' or relation IN (5 values)
  • Also: wiki_pages table corrupted (COUNT(*) fails), but wiki_pages_fts FTS virtual table is intact
  • Fixes applied to both metrics.py and scidex/senate/metrics.py:
      1. kg_edges_count: removed WHERE source_id != '' AND target_id != '' filter → uses full table count (707K)
      2. wiki_pages_count: use wiki_pages_fts as proxy (17574 pages, FTS intact)
      3. causal_edges_count: replaced IN (...) with OR chain to avoid corrupted index
  • Verified: python3 metrics.py now runs clean (no DB errors)
  • Reset original task to open: orchestra update --id 05b6876b-61a9-4a49-8881-17e8db81746c --status open
  • Committed and pushed fix

File: 05b6876b_61a_spec.md
Modified: 2026-04-25 23:40
Size: 7.0 KB