[Senate] Continuous architecture review — holistic codebase audit and API consistency

Task ID: 7ef11ece-a3e0-4a0f-9bb7-5636bf55b991 · Layer: Senate · Priority: 85 · Type: one-time (recurring pattern)

Goal

Establish a recurring architecture review process for SciDEX by conducting a comprehensive audit of the codebase. Identify dead code, undefined variables, duplicate routes, inconsistent patterns, and gaps in API coverage. Plan consolidation of fragmented scripts and separation of concerns. This creates a living architecture health report that guides continuous improvement.

Acceptance Criteria

☐ Complete audit of api.py (4000+ lines) documenting: dead code, undefined variables, duplicate routes, inconsistent patterns

☐ API coverage report: identify all data entities and their REST endpoints (GET/POST/PUT/DELETE), flag missing JSON APIs

☐ Consolidation plan: document fragmented scripts (import_*.py, populate_*.py, migrations/*.py) and propose unified framework

☐ Separation of concerns analysis: plan for template extraction, identify business logic in route handlers

☐ Data safety verification: confirm no running worktrees depend on refactoring targets, verify DB column preservation

☐ Review QUESTS.md, AGENTS.md, self-improve.md, pitch.html to understand system goals

☐ Detailed refactoring plan with rollback strategy and incremental change approach

☐ Architecture health report document created

☐ Task re-created for next review cycle

Approach

Read system documentation:

- QUESTS.md - understand layer goals
- AGENTS.md - understand development standards
- self-improve.md - understand agent workflow
- pitch.html - understand SciDEX vision

Audit api.py:

- Scan for undefined variables, dead code
- Identify duplicate routes
- Map route patterns and inconsistencies
- Check error handling coverage

API coverage analysis:

- List all data entities from DB schema
- Map existing JSON endpoints
- Identify entities without proper APIs
- Flag HTML-only endpoints needing JSON equivalents

Script consolidation analysis:

- Inventory import_*.py, populate_*.py, migrations/*.py scripts
- Document purpose and overlap
- Propose unified migration/import framework

Separation of concerns:

- Analyze inline HTML/CSS in api.py
- Identify business logic in route handlers
- Plan template extraction approach
- Map service layer opportunities

Data safety check:

- Query for active worktrees
- Review DB schema for preservation
- Plan rollback strategies

Create architecture health report:

- Document all findings
- Prioritize issues
- Propose incremental refactoring plan

Re-create task for next cycle

Work Log

2026-04-02 00:20 PT — Slot 4

Started task: Architecture review
Created spec file following AGENTS.md format
Read system documentation: QUESTS.md, AGENTS.md, self-improve.md, pitch.html
Understood five-layer vision (Agora/Exchange/Forge/Atlas/Senate) and development standards

2026-04-02 00:30 PT — Slot 4 (Comprehensive Audit)

Launched Explore agents to run three analyses in parallel:

1. Code quality audit of api.py (4,737 lines, 45 routes, 55 functions)
2. API coverage analysis across 16 database entities
3. Script consolidation analysis (9 primary scripts, 15 migrations)

Code Quality Findings:

1 CRITICAL: Missing logging import; logging is referenced at lines 2116, 2127, 2134, 2140, 2144, 2343, 2598
1 CRITICAL: Duplicate /api/coverage route (lines 383, 818) - second overrides first
11 HIGH: Routes without error handling (will crash on DB failures)
8+ HIGH: Inline imports violating PEP 8
15+ MEDIUM: Bare except: clauses (should use specific exception types)
MEDIUM: Response format inconsistencies (stats field missing in graph/{analysis_id})
MEDIUM: Pagination inconsistencies across endpoints
MEDIUM: No DB connection cleanup anywhere
611 instances of inline styles (should use templates)
394-line route handler (analysis_detail) needs splitting
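
The bare `except:` count above can be reproduced mechanically. The following is a minimal sketch using the standard-library `ast` module; the function name `find_bare_excepts` and the sample source are illustrative assumptions, not part of the audit tooling.

```python
import ast

def find_bare_excepts(source: str) -> list[int]:
    """Return line numbers of `except:` handlers that name no exception type."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )

# Hypothetical sample; the names inside are never executed, only parsed.
SAMPLE = '''\
try:
    risky()
except:
    pass

try:
    risky()
except ValueError:
    pass
'''

print(find_bare_excepts(SAMPLE))
```

Running it against the contents of api.py (for example `find_bare_excepts(open("api.py").read())`) would list the handlers that need a specific exception type.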

API Coverage Findings:

✅ 4 entities with complete coverage: Events, Feedback, Notebooks, Forge Tool Chains
⚠️ 6 entities with partial coverage: Papers, Hypotheses, Analyses, Gaps, Edges, Skills
❌ 6 entities with zero JSON API: Debate Sessions/Rounds, Agent Performance, Market Transactions, Price History, Wiki Pages/Entities
Critical gaps: Individual detail endpoints for hypotheses and analyses (HTML-only)
Missing CRUD: Most entities only support GET, no POST/PUT/DELETE

Script Consolidation Findings:

9 primary scripts identified (2 imports, 2 population, 5 backfills)
15 migration files with no versioning framework
Duplicate functionality: populate_papers.py vs backfill_papers.py
Destructive script: populate_tool_registry.py deletes all skills on re-run
Fragmentation: Each script has own DB connection, error handling, logging
No CLI interface: Hard-coded paths, no progress reporting
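
One way to make a destructive populate script safe to re-run is to upsert rows instead of deleting the table first. The sketch below assumes a `skills` table keyed by `name`; the column names and the helper `upsert_skills` are hypothetical, and the real schema may differ.

```python
import sqlite3

def upsert_skills(conn: sqlite3.Connection, skills: list[tuple[str, str]]) -> None:
    """Insert new skills or update descriptions, leaving other columns untouched."""
    conn.executemany(
        """
        INSERT INTO skills (name, description)
        VALUES (?, ?)
        ON CONFLICT(name) DO UPDATE SET description = excluded.description
        """,
        skills,
    )
    conn.commit()

# Demo on an in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE skills ("
    "name TEXT PRIMARY KEY, description TEXT, usage_count INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO skills (name, description, usage_count) VALUES ('pubmed_search', 'old', 7)"
)

upsert_skills(conn, [("pubmed_search", "new"), ("gene_lookup", "added")])
rows = conn.execute(
    "SELECT name, description, usage_count FROM skills ORDER BY name"
).fetchall()
print(rows)
```

The existing row keeps its `usage_count`, which a delete-and-reinsert approach would reset.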

2026-04-02 00:45 PT — Slot 4 (Report Generation)

Compiled comprehensive architecture health report: docs/planning/architecture_health_report_2026-04-02.md
Report includes:

- Executive summary with prioritized findings
- Detailed code quality audit with line numbers
- Complete API coverage analysis by entity
- Script consolidation recommendations
- 5-phase refactoring roadmap (stability → quality → architecture)
- Data safety verification checklist
- Rollback strategy

Total report length: 500+ lines covering all findings

Refactoring Roadmap:

Phase 1 (Week 1): Critical stability fixes (logging import, duplicate route, error handling)
Phase 2 (Week 2): Stability improvements (DB cleanup, imports, exception handling)
Phase 3 (Week 3-4): API consistency (missing endpoints, standardization)
Phase 4 (Week 5-6): Code quality (templates, handler splitting, migration framework)
Phase 5 (Month 2+): Architecture evolution (Tier 2 APIs, caching, Alembic)

Result (Cycle 1 — 2026-04-02)

✓ Complete - Comprehensive architecture health report generated with actionable findings across code quality, API coverage, and script consolidation. Report provides prioritized 5-phase refactoring roadmap emphasizing stability first, then quality, then architectural improvements. All findings documented with line numbers and specific recommendations. Ready for incremental implementation.

---

Work Log — Cycle 2 (2026-04-25)

Context

api.py has grown from ~4,737 lines (Cycle 1) to 77,961 lines — a 16x increase. Re-audited all dimensions to capture current state.

1. Route Audit — 598 total routes

Method breakdown: 448 GET · 140 POST · 7 DELETE · 2 PATCH · 1 PUT

Prior findings resolved:

✅ Duplicate /api/coverage route — fixed (no longer present)
✅ Missing logging import — fixed (6 occurrences now)
✅ No exact duplicate routes remain

New findings:

#	Severity	Finding	Detail
F1	CRITICAL	`/openapi.json` returns HTTP 500	`AssertionError: A response class is needed to generate OpenAPI` — 38 routes return `Response()`, `RedirectResponse`, `FileResponse`, or `StreamingResponse` without declaring `response_class`. This breaks all auto-generated API documentation.
F1b	CRITICAL	7 shadowed routes	Parameterized routes registered before static routes capture traffic, making static routes unreachable. See shadowed routes table below.
F1c	HIGH	Dead route: `/paper/{paper_id}` (L54982)	`:path` converter version at L25024 captures all traffic. The `authored_paper_detail` handler is unreachable dead code.
F2	HIGH	28 bare `except:` clauses	Up from 15 in Cycle 1. These catch `KeyboardInterrupt` and `SystemExit` silently.
F3	HIGH	Singular/plural route inconsistency	11 entity types have both `/api/{noun}` and `/api/{noun}s` routes. Also different param names (`hyp_id` vs `hypothesis_id`).
F4	MEDIUM	38 routes with undeclared return types	Routes returning `Response()`, `RedirectResponse`, `FileResponse`, `StreamingResponse` without `response_class`. Root cause of F1.
F5	MEDIUM	408 routes lack `response_class` and return type annotations	FastAPI infers JSON for most, but the absence makes OpenAPI generation fragile.
F6	LOW	3 trailing-slash duplicate registrations	`/figures`, `/figures/{analysis_id}`, `/hypotheses` each registered with and without trailing slash — redundant.
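
Findings F3 and F6 lend themselves to a simple path-list check. This is a rough sketch, assuming the route paths have already been extracted into a list; both helper names are hypothetical, and the plural test is deliberately naive (it only appends "s", so irregular plurals such as hypothesis/hypotheses are missed).

```python
from collections import defaultdict

def trailing_slash_duplicates(paths: list[str]) -> list[str]:
    """Return normalized paths registered both with and without a trailing slash."""
    variants = defaultdict(set)
    for path in paths:
        variants[path.rstrip("/") or "/"].add(path)
    return sorted(norm for norm, seen in variants.items() if len(seen) > 1)

def singular_plural_pairs(paths: list[str]) -> list[tuple[str, str]]:
    """Return (noun, noun + 's') pairs where both appear as the first /api/ segment."""
    nouns = set()
    for path in paths:
        parts = path.split("/")
        if path.startswith("/api/") and len(parts) > 2:
            nouns.add(parts[2])
    return sorted((n, n + "s") for n in nouns if n and n + "s" in nouns)

# Hypothetical sample paths for illustration.
PATHS = [
    "/figures",
    "/figures/",
    "/hypotheses",
    "/api/paper/{paper_id}",
    "/api/papers",
    "/api/wiki/{slug}",
]

print(trailing_slash_duplicates(PATHS))
print(singular_plural_pairs(PATHS))
```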

Shadowed routes (F1b) — static routes unreachable due to parameterized route ordering:

Shadowing Route (captures traffic)	Shadowed Route (UNREACHABLE)
`GET /api/wiki/{slug}` (L11370)	`GET /api/wiki/quality-scores` (L76423)
`GET /api/markets/{market_id}` (L21120)	`GET /api/markets/correlations` (L21665)
`GET /api/markets/{market_id}` (L21120)	`GET /api/markets/activity` (L22018)
`GET /api/artifacts/{artifact_id}` (L23174)	`GET /api/artifacts/search` (L23725)
`GET /api/exchange/bids/{agent_id}` (L31952)	`GET /api/exchange/bids/open` (L31964)
`GET /api/exchange/bids/{agent_id}` (L31952)	`GET /api/exchange/bids/top` (L31980)
`GET /portfolio/{agent_id}` (L32142)	`GET /portfolio/leaderboard` (L32293)

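Fix: Register static routes before their parameterized counterparts, or restructure the paths so static and parameterized segments cannot collide. Note that include_in_schema=False only hides a route from the OpenAPI schema; it does not change matching order.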
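
Shadowing of this kind can be detected from the ordered route list alone. The sketch below is an approximation, assuming routes are matched in registration order and that `{x}` matches one path segment while `{x:path}` matches the remainder; the helper names are hypothetical and converters other than `int` and `path` are treated as single-segment matches.

```python
import re

PARAM = re.compile(r"\{(\w+)(?::(\w+))?\}")
CONVERTERS = {"path": ".*", "int": "[0-9]+"}  # anything else: one segment

def template_to_regex(path: str) -> re.Pattern:
    """Translate a route template into an anchored regular expression."""
    pattern, pos = "", 0
    for match in PARAM.finditer(path):
        pattern += re.escape(path[pos:match.start()])
        pattern += CONVERTERS.get(match.group(2), "[^/]+")
        pos = match.end()
    pattern += re.escape(path[pos:])
    return re.compile(f"^{pattern}$")

def find_shadowed(routes: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Given (method, path) in registration order, return (shadowing, shadowed) pairs."""
    shadowed = []
    for i, (method, path) in enumerate(routes):
        if "{" in path:
            continue  # only static paths are checked in this sketch
        for earlier_method, earlier_path in routes[:i]:
            if (
                earlier_method == method
                and "{" in earlier_path
                and template_to_regex(earlier_path).match(path)
            ):
                shadowed.append((earlier_path, path))
                break
    return shadowed

ROUTES = [
    ("GET", "/api/markets/{market_id}"),
    ("GET", "/api/markets/correlations"),  # registered too late: shadowed
    ("GET", "/api/wiki/quality-scores"),   # registered first: reachable
    ("GET", "/api/wiki/{slug}"),
]

print(find_shadowed(ROUTES))
```

A check like this could run in CI so that a newly added static route cannot silently land behind a parameterized one.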

Affected routes for F1/F4 (top offenders):

L1141: GET /metrics — returns Response()
L4931: GET /api/blobs/{digest:path} — returns Response()
L5430: GET /api/quality/validate — returns Response()
L5941: GET /paper_figure/{pmcid}/{fname:path} — returns RedirectResponse
L14007: GET /api/debates/{debate_id}/stream — returns StreamingResponse
L59907: GET /docs/code_health/{filename:path} — returns FileResponse
14 exchange/bid routes returning Response() without annotation
~15 redirect routes (/how, /agents, /walkthrough, etc.)

2. API Coverage — CRUD Gaps

436 /api/ routes (287 GET, 139 POST, 1 PUT, 2 PATCH, 7 DELETE). The API is read-heavy, and nearly all mutations go through POST rather than PUT/PATCH/DELETE.

CRUD completeness:

Entity	List	Get	Create	Update	Delete	Status
Annotations	✓	✓	✓	—	✓	Best
Blobs	✓	✓	✓	—	✓	Good
Forge chains	✓	✓	✓	—	✓	Good
API keys	✓	—	✓	—	✓	Good
Hypotheses	✓	✓	—	PATCH(desc)	—	Partial
Papers	✓	✓	POST(review)	—	—	Read-dominant
Analyses	✓	✓	POST(verify)	—	—	Read-dominant
Wiki	✓	✓	—	—	—	Read-only
Graph	✓(12)	✓	POST(trust)	—	—	Read-dominant
Debates	✓(12)	✓	POST(contribute)	—	—	Read-dominant
Artifacts	✓(17)	✓	POST(tag)	—	—	No update/delete
Notebooks	✓	✓	POST(cells)	—	—	No update/delete
Experiments	✓	✓	POST(results)	—	—	No update/delete

Priority gaps (ranked):

Wiki POST/PUT/DELETE — most-viewed content type with zero write API

Hypotheses DELETE — no way to retract a hypothesis via API

Artifacts PUT/DELETE — 23 endpoints but no update or archival

Papers/Analyses PUT — no way to amend existing records

Notebooks PUT/DELETE — cells can be added but not edited/removed

Missing list endpoints for /api/benchmarks, /api/challenges, /api/walkthroughs, /api/contributors

3. Script Fragmentation

156 Python files in the repository root, 179 in scripts/, and 164 files (141 .py + 23 .sql) in migrations/

Category	Count	Status
`fix_*` scripts (root)	9	Deprecated — 8/9 SQLite-only
`expand_stubs*`	8	Deprecated — all SQLite
`dedup_*`	5	Deprecated — all SQLite, 5 near-duplicate scripts
`backfill/` dir	37	Mixed — many SQLite-era
`backfill_*` (root)	6	PG-based, newer
`enrich_*`	3	Active, PG-based
`migrations/`	141 .py + 23 .sql	Mixed SQLite/PG, no tracking table

Consolidation needed:

Archive ~30 deprecated SQLite scripts to scripts/archive/
Deduplicate backfill scripts (root vs backfill/ dir)
Add migration tracking mechanism
migration_runner.py now delegates to scidex.core.migration_runner (modern path)
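
A migration tracking mechanism can be as small as one table that records applied versions. The sketch below illustrates the idea with `sqlite3`; the table name `schema_migrations`, the function `apply_migrations`, and the sample migrations are assumptions, and this does not describe how scidex.core.migration_runner works.

```python
import sqlite3

def apply_migrations(conn: sqlite3.Connection, migrations: list[tuple[str, str]]) -> list[str]:
    """Apply (version, sql) pairs not yet recorded; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "version TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    applied = []
    for version, sql in sorted(migrations):
        if version in done:
            continue  # already applied on a previous run
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied

# Hypothetical migrations for illustration.
MIGRATIONS = [
    ("001_create_papers", "CREATE TABLE papers (id INTEGER PRIMARY KEY)"),
    ("002_add_title", "ALTER TABLE papers ADD COLUMN title TEXT"),
]

conn = sqlite3.connect(":memory:")
first_run = apply_migrations(conn, MIGRATIONS)
second_run = apply_migrations(conn, MIGRATIONS)
print(first_run, second_run)
```

The second call applies nothing, which is the property the current untracked migrations lack.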

4. Inline HTML/CSS/JS — ~20-25% of file

~16,000-20,000 lines of inline HTML/CSS/JS within api.py
653 triple-quoted f-strings, 77 <style> blocks, 173 <script> blocks, 1,873 class="..." usages
Individual page builders: _build_hypothesis_reviews_html (~3,858 lines), _build_senate_page (~2,826 lines)
No template engine — zero Jinja2 usage, no templates/ directory
All HTML generated via inline f-strings returned as HTMLResponse

Extraction priority (top 5 builders by size):

_build_hypothesis_reviews_html — ~3,858 lines

_build_senate_page — ~2,826 lines

_render_agents_body — large

_build_concept_entity_page — large

_build_report_card_html — large
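
The extraction pattern is the same for every builder: keep data shaping in a function and move markup into a template. The sketch below uses the standard-library `string.Template` purely as a stand-in for a template engine such as Jinja2; the template, field names, and functions are hypothetical.

```python
from html import escape
from string import Template

# Hypothetical template; after extraction it would live in its own file.
HYPOTHESIS_CARD = Template(
    '<div class="card"><h2>$title</h2><p>Score: $score</p></div>'
)

def hypothesis_view_model(row: dict) -> dict:
    """Business logic: escape and format values, no markup."""
    return {"title": escape(row["title"]), "score": f"{row['score']:.2f}"}

def render_hypothesis_card(row: dict) -> str:
    """Presentation: substitute the view model into the template."""
    return HYPOTHESIS_CARD.substitute(hypothesis_view_model(row))

html = render_hypothesis_card({"title": "A <b>bold</b> claim", "score": 0.5})
print(html)
```

With a split like this, the view-model function can be unit tested without parsing HTML, and the markup can change without touching route handlers.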

5. Data Safety — No Active Conflicts

Server running from /home/ubuntu/scidex (main checkout), not from worktrees
Coverage API reports: 397 analyses, 1,166 hypotheses, 714,201 edges, 0 orphans (analysis/hypothesis)
1,547 orphan papers, 1 broken debate FK — pre-existing, not introduced by this audit
api.py is a CRITICAL file — any refactoring must go through Orchestra sync push with review gate

Priority Actions (This Cycle)

Priority	Action	Impact	Effort
P0	Fix `/openapi.json` 500 — add `response_class` or `include_in_schema=False` to 38 non-JSON routes	Unblocks API docs, Swagger UI	Low
P0	Fix 7 shadowed routes — move static routes before parameterized ones	Unblocks `/api/markets/correlations`, `/api/artifacts/search`, etc.	Low
P0	Remove dead `/paper/{paper_id}` route at L54982	Eliminates unreachable code	Trivial
P1	Add `response_class=JSONResponse` to JSON API routes that lack it	Robustness, prevents future OpenAPI breaks	Medium
P2	Replace 28 bare `except:` with specific exceptions	Prevents silent `KeyboardInterrupt` catching	Low
P3	Archive ~30 deprecated SQLite scripts	Reduces confusion, cleans root directory	Low
P4	Document singular/plural route convention	Prevents future inconsistency	Low
P5	Remove 3 trailing-slash duplicate registrations	Reduces redundancy	Trivial
P6	Extract top 5 HTML builders to Jinja2 templates	Reduces api.py by ~11,500 lines	High

Result (Cycle 2 — 2026-04-25)

Architecture re-audit complete. api.py grew 16x since Cycle 1 (4,737 → 77,961 lines). Key new finding: the /openapi.json endpoint is broken (HTTP 500) because 38 routes do not declare response_class. Prior critical findings (duplicate route, missing logging import) are confirmed resolved. CRUD gaps persist: wiki is read-only, and most entities lack PUT/DELETE. ~30 deprecated SQLite scripts should be archived. Inline HTML/CSS/JS accounts for ~20-25% of the file, with no template engine.

File: 7ef11ece-a3e0-4a0f-9bb7-5636bf55b991_spec.md

Modified: 2026-04-25 23:40

Size: 15.1 KB