Goal
Bring thin neurodegeneration-relevant experiment records up to a usable Exchange quality bar by expanding protocol text, expected outcomes, and success criteria, and by attaching missing parent hypothesis links where the scientific connection is defensible. The task should improve what users see on /experiments and experiment detail pages without changing the page code itself. The current repository state differs from the original task framing, so the work will target the current PostgreSQL-backed experiment corpus rather than the old 188-row snapshot.
Acceptance Criteria
☑ Current state audited for stale assumptions and remaining thin experiment records
☑ PostgreSQL-safe enrichment script added under scripts/
☑ A targeted batch of neurodegeneration-relevant experiments updated with richer protocols, outcomes, and criteria
☑ Missing hypothesis links added where the parent relationship is supported by target, disease, or title context
☑ Updated records verified via SQL and at least one page/API rendering check
Approach
Audit current experiment counts and identify the highest-value incomplete neurodegeneration records.
Add a current enrichment script that works against scidex.core.database.get_db() and the repo-root llm.py abstraction.
Run a small batch, inspect generated content quality, then continue with a larger targeted pass.
Verify the updated records in PostgreSQL and confirm that experiment pages still render successfully.
Dependencies
Dependents
Work Log
2026-04-25 23:29 PDT — Codex
- Performed staleness review against current main (28f900c8f) and current DB state.
- Confirmed the task is still necessary: the experiment corpus is now 632 rows, with 315 thin protocols and 165 missing hypothesis links; prior task commit a8007ddc1 only enriched the top 5 records.
- Found the archived helper scripts/archive/oneoff_scripts/enrich_experiment_descriptions.py is not usable on current HEAD: it assumes SQLite-era import/layout and fails immediately with ModuleNotFoundError: No module named 'llm'.
- Decided to create a new PostgreSQL-safe enrichment script under scripts/ and target neurodegeneration-relevant experiments where the work is most aligned with SciDEX's mission and visible on Exchange pages.
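The audit counts above can be reproduced in spirit with a small helper. The following is a minimal sketch, not the audit code actually used: the 200-character cutoff, the function names, and the row keys protocol and hypothesis_ids are all assumptions made for illustration, since this log does not state the real threshold or column names.

```python
from typing import Iterable, Mapping, Optional

# Assumption: a field counts as "thin" below this many characters.
# The cutoff actually used for the audit is not stated in this log.
THIN_CHARS = 200


def is_thin(text: Optional[str], min_chars: int = THIN_CHARS) -> bool:
    """True when a text field is missing or shorter than min_chars."""
    return text is None or len(text.strip()) < min_chars


def audit(rows: Iterable[Mapping]) -> dict:
    """Count thin protocols and records without any hypothesis link.

    Each row is a mapping with hypothetical keys 'protocol' and
    'hypothesis_ids'; real column names may differ.
    """
    counts = {"total": 0, "thin_protocol": 0, "missing_links": 0}
    for row in rows:
        counts["total"] += 1
        if is_thin(row.get("protocol")):
            counts["thin_protocol"] += 1
        if not row.get("hypothesis_ids"):
            counts["missing_links"] += 1
    return counts
```

Keeping the threshold as a parameter makes it easy to re-run the same audit with a stricter or looser definition of "thin" and compare the counts.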
2026-04-26 00:07 PDT — Codex
- Added scripts/enrich_experiment_descriptions.py, a PostgreSQL-safe enrichment utility with neurodegeneration filtering, current DB access, JSON parsing guards, and deterministic fallback text generation.
- Applied a targeted live batch update to 12 top incomplete neurodegeneration experiments using the deterministic fallback path plus heuristic parent-hypothesis linking.
- Post-update counts for neurodegeneration experiments improved from 62 → 44 thin protocols, 55 → 39 thin expected-outcomes fields, 59 → 38 thin success-criteria fields, and 10 → 7 missing hypothesis-link sets.
- Verified sample updated records in PostgreSQL and confirmed /experiment/exp-e9c371ae-4aea-46c9-b21f-946ea6c42bd7 and /experiment/exp-6815b60b-6325-4c88-b293-ef6936222780 both return HTTP 200 and render Protocol / Expected Outcomes / Success Criteria sections.
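The combination of JSON parsing guards and a deterministic fallback mentioned in this entry can be illustrated as follows. This is a hedged sketch of the general pattern only: the function names, the three JSON keys, and the template wording are hypothetical and are not taken from scripts/enrich_experiment_descriptions.py.

```python
import json
import re
from typing import Optional

# Hypothetical key names for the three enriched fields.
REQUIRED_KEYS = ("protocol", "expected_outcomes", "success_criteria")


def parse_enrichment(raw: Optional[str]) -> Optional[dict]:
    """Return the three required non-empty string fields, or None."""
    if not raw:
        return None
    # Model output is often wrapped in prose; take the outermost braces.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key in REQUIRED_KEYS:
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            return None
    return {key: data[key].strip() for key in REQUIRED_KEYS}


def fallback_enrichment(title: str, target: str) -> dict:
    """Template text: the same inputs always produce the same output."""
    return {
        "protocol": f"Protocol for '{title}': perturb {target} in a "
                    "disease-relevant model and compare the primary "
                    "readout against matched controls.",
        "expected_outcomes": f"Perturbing {target} shifts the primary "
                             "readout in the predicted direction.",
        "success_criteria": "The effect is statistically significant "
                            "and replicates in an independent run.",
    }


def enrich(raw: Optional[str], title: str, target: str) -> dict:
    """Prefer parsed model output; otherwise use the template."""
    return parse_enrichment(raw) or fallback_enrichment(title, target)
```

Because the fallback depends only on its inputs, a batch that never reaches an LLM provider still produces reproducible text that can be reviewed and re-run.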
2026-04-26 08:12 PDT — Codex
- Re-verified the current live neurodegeneration subset with python3 scripts/enrich_experiment_descriptions.py --stats: 286 neurodegeneration-relevant experiments remain, with 44 thin protocols, 39 thin expected-outcomes fields, 38 thin success-criteria fields, and 7 missing hypothesis-link sets after the applied batch.
- Re-checked representative updated records directly in PostgreSQL: exp-e9c371ae-4aea-46c9-b21f-946ea6c42bd7 now has protocol/outcome/criteria lengths 1026/450/447 and five linked hypotheses; exp-6815b60b-6325-4c88-b293-ef6936222780 now has lengths 414/527/466 and three linked hypotheses.
- Re-verified Exchange rendering with HTTP 200 responses for both experiment detail pages and confirmed the rendered HTML still includes Protocol, Expected Outcomes, and Success Criteria sections.
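The rendering check described above reduces to two conditions: an HTTP 200 status and the presence of the three section labels in the returned HTML. The sketch below shows one way to express that check as pure functions; the function names are hypothetical, and fetching the page is left out because the host the pages were served from is not stated in this log.

```python
from typing import Iterable, List

REQUIRED_SECTIONS = ("Protocol", "Expected Outcomes", "Success Criteria")


def missing_sections(html: str,
                     required: Iterable[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return the required section labels that do not appear in the HTML.

    This is a case-insensitive substring match, so it confirms that the
    label text is present, not that it is rendered as a heading.
    """
    lowered = html.lower()
    return [name for name in required if name.lower() not in lowered]


def page_ok(status: int, html: str) -> bool:
    """True when the response is HTTP 200 and no section label is missing."""
    return status == 200 and not missing_sections(html)
```

A substring match is deliberately loose: it survives markup changes on the page, at the cost of not proving where the label appears.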
2026-04-26 09:03 PDT — Codex
- Patched scripts/enrich_experiment_descriptions.py to support --skip-llm deterministic execution and heuristic parent-hypothesis scoring so the batch can complete even when LLM providers are rate-limited.
- Applied an additional live batch update to 20 neurodegeneration-relevant experiments, including several rich-but-unlinked records such as the TRIM21 stress-granule and autophagy-receptor experiments.
- Post-update neurodegeneration stats improved from 44/39/38/7 to 27/25/23/3 (thin protocols / thin expected-outcomes fields / thin success-criteria fields / missing hypothesis-link sets).
- Re-verified rendered Exchange pages for exp-a3090cf0-854f-45dc-8ef7-06cf9a6bb754 and exp-b6c4a13e-3b7b-41fa-9e4d-eacfb11bb61f; both return HTTP 200 and include Protocol, Expected Outcomes, and Success Criteria sections.
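Heuristic parent-hypothesis scoring of the kind mentioned in this entry could look roughly like the sketch below, which scores a candidate hypothesis by matching target, matching disease, and shared title words, in line with the acceptance criterion that links be supported by target, disease, or title context. The weights, the minimum score, the limit of five links, and all field and function names are assumptions for illustration; the script's actual scoring is not described in this log.

```python
import re
from typing import Iterable, List, Mapping, Optional, Set

# Hypothetical weights: an exact target match counts most.
FIELD_WEIGHTS = {"target": 3.0, "disease": 2.0}
TITLE_TOKEN_WEIGHT = 0.5


def _tokens(text: Optional[str]) -> Set[str]:
    """Lowercased alphanumeric words longer than three characters."""
    words = re.findall(r"[a-z0-9]+", (text or "").lower())
    return {w for w in words if len(w) > 3}


def score_link(exp: Mapping, hyp: Mapping) -> float:
    """Score how plausibly hyp is a parent hypothesis of exp."""
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        a = (exp.get(field) or "").strip().lower()
        b = (hyp.get(field) or "").strip().lower()
        if a and a == b:
            score += weight
    shared = _tokens(exp.get("title")) & _tokens(hyp.get("title"))
    return score + TITLE_TOKEN_WEIGHT * len(shared)


def pick_parents(exp: Mapping, hyps: Iterable[Mapping],
                 min_score: float = 3.0, limit: int = 5) -> List[str]:
    """Return ids of the best-scoring hypotheses at or above min_score."""
    scored = [(score_link(exp, h), h["id"]) for h in hyps]
    kept = [(s, i) for s, i in scored if s >= min_score]
    # Highest score first; ties broken by id so results are deterministic.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in kept[:limit]]
```

With these assumed weights, title overlap alone needs six shared words to reach the default cutoff, so in practice a link requires a target or disease match, which keeps weakly related hypotheses from being attached.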