[Senate] Rebuild theme S3: search-index coverage verification as a continuous process

Rebuild spec — follow docs/planning/specs/rebuild_theme_template_spec.md first.

Theme anchor

Theme: S3 — FTS / vector index coverage verification
Layer: Senate
Full description: docs/design/retired_scripts_patterns.md → S3

Why this matters now

The three deleted scripts under this theme (ci_verify_search_indexes.py,
migrate_fts5.py, rebuild_fts.py) were SQLite-FTS-specific. SciDEX now
runs on PostgreSQL (PG), so the recurring task [Search] CI: Verify
search indexes are current (daily, 3a897f8a-0712-4701-a675-07f0670d8f87)
currently has no implementation to call.

This is a narrow, well-scoped theme — a good second-rebuild after AG1
to exercise the pattern on a non-LLM-heavy process. Most of the work
here is observability + self-healing rebuild, not semantic judgment.

Template fills

{{THEME_ID}} = S3
{{THEME_NAME}} = search-index coverage verification
{{LAYER}} = Senate
{{LAYER_SLUG}} = senate
{{THEME_SLUG}} = search_index_coverage
{{CADENCE}} = hourly
{{CORE_JUDGMENT}} = "is any search index materially stale or out-of-sync with its source table?"
{{GAP_PREDICATE}} = (source_row_count - indexed_row_count) / source_row_count > drift_threshold
  OR last_rebuilt_at < now() - stale_threshold
  (both thresholds from theme_config)
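
A minimal sketch of the gap predicate, to make the two conditions concrete. The IndexStatus type and the has_gap function are hypothetical names, not part of the template; the handling of an empty source table (treated as zero drift) is an assumption the spec does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class IndexStatus:
    """Hypothetical snapshot of one (source table, index) pair."""
    source_row_count: int
    indexed_row_count: int
    last_rebuilt_at: datetime


def has_gap(
    status: IndexStatus,
    drift_threshold: float,      # fraction of rows, read from theme_config
    stale_threshold: timedelta,  # maximum index age, read from theme_config
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the index is drifted OR stale."""
    now = now or datetime.now(timezone.utc)
    if status.source_row_count > 0:
        drift = (
            status.source_row_count - status.indexed_row_count
        ) / status.source_row_count
    else:
        # Assumption: an empty source table cannot have drifted.
        drift = 0.0
    is_drifted = drift > drift_threshold
    is_stale = status.last_rebuilt_at < now - stale_threshold
    return is_drifted or is_stale
```

Passing now explicitly keeps the predicate deterministic in tests; in production it falls back to the current UTC time.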

Where LLMs fit here (narrow)

Most of this theme is deterministic (count rows, compare, rebuild).
The LLM touch point: when a rebuild fails unexpectedly, an LLM
judges the failure message and proposes a remediation (missing
extension, permission error, tsvector config mismatch, etc.) before
escalating to a human. This keeps the process self-healing for the
common failure modes.
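
The remediation path described above could be sketched as follows. All four callables (rebuild, suggest_fix, apply_fix, escalate) are hypothetical injection points, not existing SciDEX functions; suggest_fix stands in for the LLM call and returns None when it has no proposal. The sketch allows exactly one retry, matching the "one bounded retry" acceptance criterion.

```python
from typing import Callable, Optional


def rebuild_with_remediation(
    rebuild: Callable[[], None],
    suggest_fix: Callable[[str], Optional[str]],  # stand-in for the LLM call
    apply_fix: Callable[[str], None],
    escalate: Callable[[str], None],              # hands off to a human
) -> bool:
    """Try a rebuild; on failure, apply one suggested fix and retry once."""
    try:
        rebuild()
        return True
    except Exception as first_error:
        fix = suggest_fix(str(first_error))
        if fix is None:
            # No remediation proposed: escalate immediately.
            escalate(str(first_error))
            return False
        try:
            apply_fix(fix)
            rebuild()  # the single bounded retry
            return True
        except Exception as second_error:
            escalate(
                f"{first_error}; retry after fix {fix!r} failed: {second_error}"
            )
            return False
```

Injecting the callables keeps the retry logic testable without a database or a model endpoint.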

Self-describing registry

No hardcoded list of (source_table, index_name) pairs in code.
Instead, introspect PG:

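pg_indexes filtered to indexname LIKE '%_fts' OR indexname LIKE '%_embed' OR indexname LIKE '%_tsvector' (note that `_` is a single-character wildcard in LIKE; escape it as '%\_fts' to match a literal underscore).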

Pair each with its backing source table by naming convention (discovered, not hardcoded).

Operators add new indexes by creating them in PG with the naming convention; the coverage checker picks them up automatically.

This is principle #2 (discover schema) in pure form.
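
A sketch of the discovery step, under two assumptions. First, the suffix filter is applied in Python with a literal endswith match rather than in SQL. Second, the source table is taken from the tablename column that the pg_indexes view exposes; if the project's naming convention pairs indexes with tables differently, that mapping would replace it. IndexPair and discover_pairs are hypothetical names, and the rows argument is whatever a database driver returns for PG_INDEXES_QUERY.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Suffixes named in this spec; an index qualifies if its name ends with one.
INDEX_SUFFIXES = ("_fts", "_embed", "_tsvector")

# pg_indexes is a standard PostgreSQL system view.
PG_INDEXES_QUERY = "SELECT schemaname, tablename, indexname FROM pg_indexes"


class IndexPair(NamedTuple):
    schema: str
    source_table: str
    index_name: str
    kind: str  # "fts", "embed", or "tsvector"


def discover_pairs(rows: Iterable[Tuple[str, str, str]]) -> List[IndexPair]:
    """Keep only search indexes and pair each with the table it is built on."""
    pairs = []
    for schema, table, index in rows:
        for suffix in INDEX_SUFFIXES:
            if index.endswith(suffix):
                pairs.append(IndexPair(schema, table, index, suffix[1:]))
                break
    return pairs
```

Because nothing here names a specific table or index, a new index created with one of the suffixes is picked up on the next run without a code change.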

Outcome feedback

Search-usage metrics: which queries return results after rebuild, which still return empty. If a query-category consistently returns empty even after rebuild, it's a ranking/embedding-model issue, not a coverage issue; flag for a different theme.

Drift rate over time: if drift accumulates faster than hourly cycles can absorb, cadence should adapt upward (self-calibration).
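
One possible reading of the self-calibration rule, as a sketch only. It interprets "adapt upward" as "run more often" and scales the interval so that one cycle is expected to accumulate no more drift than drift_threshold. The function name, the proportional scaling rule, and the five-minute floor are all assumptions, not part of the spec.

```python
from datetime import timedelta


def adapt_interval(
    current: timedelta,
    drift_per_cycle: float,   # observed drift fraction accumulated in one cycle
    drift_threshold: float,   # same threshold the gap predicate uses
    min_interval: timedelta = timedelta(minutes=5),  # assumed floor
) -> timedelta:
    """Shorten the interval when one cycle accumulates too much drift."""
    if drift_per_cycle <= drift_threshold:
        return current  # the current cadence absorbs the drift
    # Scale proportionally so expected drift per cycle meets the threshold.
    scaled = current * (drift_threshold / drift_per_cycle)
    return max(scaled, min_interval)
```

For example, an hourly cycle that observes 10% drift against a 5% threshold would move to a 30-minute interval under this rule.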

Acceptance

All template criteria, plus:

☐ No hardcoded list of indexes. Uses pg_indexes introspection.

☐ Drift threshold + stale threshold in theme_config.

☐ Recurring task 3a897f8a-0712-4701-a675-07f0670d8f87 reassigned to this process.

☐ Self-healing: a rebuild that fails triggers the LLM-remediation path (one bounded retry with LLM-suggested fix) before escalating.

File: rebuild_theme_S3_search_index_coverage_spec.md

Modified: 2026-04-25 22:00

Size: 3.0 KB