> ## Continuous-process anchor
>
> This spec describes an instance of one of the retired-script themes
> documented in docs/design/retired_scripts_patterns.md. Before
> implementing, read:
>
> 1. The "Design principles for continuous processes" section of that
> atlas — every principle is load-bearing. In particular:
> - LLMs for semantic judgment; rules for syntactic validation.
> - Gap-predicate driven, not calendar-driven.
> - Idempotent + version-stamped + observable.
> - No hardcoded entity lists, keyword lists, or canonical-name tables.
> - Three surfaces: FastAPI + orchestra + MCP.
> - Progressive improvement via outcome-feedback loop.
> 2. The theme entry in the atlas matching this task's capability:
> A4 (pick the closest from Atlas A1–A7, Agora AG1–AG5,
> Exchange EX1–EX4, Forge F1–F2, Senate S1–S8, Cross-cutting X1–X2).
> 3. If the theme is not yet rebuilt as a continuous process, follow
> docs/planning/specs/rebuild_theme_template_spec.md to scaffold it
> BEFORE doing the per-instance work.
>
> **Specific scripts named below in this spec are retired and must not
> be rebuilt as one-offs.** Implement (or extend) the corresponding
> continuous process instead.
- Task ID: d20e0e93-fdbc-4487-b99e-0132b3e31684
- Priority: 50
- Type: recurring (daily)
- Layer: Atlas

Find wiki_pages that have a kg_node_id set but no corresponding entry in node_wiki_links, and create the missing links. Also attempt to match wiki pages lacking kg_node_id to KG entities via title/slug patterns.
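
As a minimal sketch of the gap predicate described above, the snippet below finds pages whose kg_node_id has no exact link row and inserts the missing rows. It runs against an in-memory SQLite database so that it is self-contained. The column names slug and title, the unique constraint, the sample rows, the helper name create_missing_links, and the choice to store the page slug in wiki_entity_name are all assumptions for illustration. It is not one of the retired scripts and not the continuous process itself.

```python
import sqlite3

# Hypothetical minimal schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE wiki_pages (slug TEXT PRIMARY KEY, title TEXT, kg_node_id TEXT);
CREATE TABLE node_wiki_links (
    kg_node_id TEXT NOT NULL,
    wiki_entity_name TEXT NOT NULL,
    UNIQUE (kg_node_id, wiki_entity_name)
);
"""

# Gap predicate: pages with kg_node_id set but no exact (kg_node_id, slug) link row.
MISSING_LINKS = """
SELECT wp.kg_node_id, wp.slug
FROM wiki_pages AS wp
LEFT JOIN node_wiki_links AS nwl
  ON nwl.kg_node_id = wp.kg_node_id
 AND nwl.wiki_entity_name = wp.slug
WHERE wp.kg_node_id IS NOT NULL
  AND wp.kg_node_id <> ''
  AND nwl.kg_node_id IS NULL
"""


def create_missing_links(conn: sqlite3.Connection) -> int:
    """Insert one link row per gap and return how many gaps were found."""
    rows = conn.execute(MISSING_LINKS).fetchall()
    # INSERT OR IGNORE is SQLite syntax; it keeps the write safe to repeat.
    conn.executemany(
        "INSERT OR IGNORE INTO node_wiki_links (kg_node_id, wiki_entity_name) "
        "VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO wiki_pages VALUES (?, ?, ?)",
    [
        ("example-page-a", "Example A", "node-a"),  # made-up rows
        ("example-page-b", "Example B", "node-b"),
        ("example-index", "Index", None),  # no kg_node_id, so outside the gap
    ],
)
conn.execute("INSERT INTO node_wiki_links VALUES ('node-a', 'example-page-a')")

first = create_missing_links(conn)
second = create_missing_links(conn)
print(first, second)  # prints: 1 0
```

Running the function twice creates one link and then none, which is the idempotency a recurring run relies on. On PostgreSQL the equivalent insert would use ON CONFLICT DO NOTHING and the driver's own placeholder style.
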
Acceptance criteria:
- All wiki_pages with kg_node_id set have a corresponding node_wiki_links(kg_node_id, title) entry
- Wiki pages without kg_node_id are matched to KG nodes where possible via title/slug heuristics

Tables involved:
- wiki_pages: has kg_node_id column (set for ~13K of 17K pages)
- node_wiki_links(kg_node_id, wiki_entity_name): join table for KG node ↔ wiki page links
- knowledge_edges(source_id, target_id, ...): defines KG node IDs

1. Find wiki_pages with kg_node_id set but no matching node_wiki_links entry
2. Find wiki_pages without kg_node_id (excluding index/navigation pages)
3. Run crosslink_wiki_remaining.py (or equivalent inline script) that:

Starting: Re-checked the PostgreSQL wiki/KG link backlog for this recurring run.
Found:
- […] kg_node_id; 906 have no kg_node_id.
- […] kg_node_id are missing exact node_wiki_links(kg_node_id, slug) rows.
- […] kg_node_id (entities-parkinson, phenotypes-neurodegeneration) already have exact existing node_wiki_links that can safely be promoted into wiki_pages.kg_node_id.
- […] scripts/ci_crosslink_wiki_kg.py Step 2 to promote only exact existing node-wiki links for candidate scientific pages, including phenotype and generic entity pages, then run the driver and verify idempotency.

Action: Updated scripts/ci_crosslink_wiki_kg.py so Step 1 verifies exact (kg_node_id, slug) joins and Step 2 promotes only exact existing node_wiki_links into wiki_pages.kg_node_id. Ran the driver and updated 2 pages: entities-parkinson -> parkinson, phenotypes-neurodegeneration -> neurodegeneration.
Verification: Re-ran python3 scripts/ci_crosslink_wiki_kg.py; second run reported step1=0 step2=0. Verified 0 exact missing node-wiki links, 16,671 wiki_pages with kg_node_id, 904 without, and 0 non-skipped unmatched pages.
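
The Step 2 behaviour and the idempotency check described in this entry can be sketched as follows. This is an illustration under assumptions, not the contents of scripts/ci_crosslink_wiki_kg.py: the function name promote_exact_links, the reduced two-column schema, the in-memory SQLite database, the extra hypothetical-unlinked-page row, and the rule that a slug linked to more than one node is skipped are all hypothetical. The two promoted pairs are the ones reported in this entry.

```python
import sqlite3


def promote_exact_links(conn: sqlite3.Connection) -> int:
    """Fill wiki_pages.kg_node_id from exact, unambiguous node_wiki_links rows.

    Only pages that currently lack a kg_node_id are touched, so a second
    call finds nothing left to do.
    """
    cur = conn.execute(
        """
        UPDATE wiki_pages
        SET kg_node_id = (
            SELECT MIN(nwl.kg_node_id)
            FROM node_wiki_links AS nwl
            WHERE nwl.wiki_entity_name = wiki_pages.slug
        )
        WHERE (kg_node_id IS NULL OR kg_node_id = '')
          AND (
            SELECT COUNT(DISTINCT nwl.kg_node_id)
            FROM node_wiki_links AS nwl
            WHERE nwl.wiki_entity_name = wiki_pages.slug
          ) = 1  -- assumption: skip slugs linked to several nodes
        """
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE wiki_pages (slug TEXT PRIMARY KEY, kg_node_id TEXT);
    CREATE TABLE node_wiki_links (kg_node_id TEXT, wiki_entity_name TEXT);
    INSERT INTO wiki_pages VALUES
        ('entities-parkinson', NULL),
        ('phenotypes-neurodegeneration', NULL),
        ('hypothetical-unlinked-page', NULL);
    INSERT INTO node_wiki_links VALUES
        ('parkinson', 'entities-parkinson'),
        ('neurodegeneration', 'phenotypes-neurodegeneration');
    """
)

first_run = promote_exact_links(conn)
second_run = promote_exact_links(conn)
print(f"first run step2={first_run}, second run step2={second_run}")
# prints: first run step2=2, second run step2=0
```

The page with no link row stays unmatched on both runs, since nothing is inferred beyond rows that already exist.
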
Starting: Investigating state of wiki_pages and node_wiki_links.
Found:
- ent-gene-apoe -> APOE Gene
- ent-gene-mapt -> MAPT Gene

Slug-pattern matching: Ran script to match wiki pages without kg_node_id to KG entities via title/slug heuristics. Found 0 high-confidence matches: the remaining 1737 wiki pages don't have canonical entity pages that match KG entities. Gene pages without kg_node_id (MITF Gene, ST6GALNAC5 Gene, TFEC Gene) don't correspond to any KG entity.
Created: scripts/crosslink_wiki_to_kg.py for future matching.
Result: Done — Primary acceptance criteria met. All wiki_pages with kg_node_id have corresponding node_wiki_links.
Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,346 total pages, 3,836 without kg_node_id).
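
For the title/slug matching attempted in this entry, a conservative heuristic might look like the sketch below. The function match_slug_to_node and its single rule (try the slug as-is, then with its first hyphen-separated segment dropped) are assumptions modelled on pairs that appear elsewhere in this log, such as entities-parkinson -> parkinson. The actual logic of scripts/crosslink_wiki_to_kg.py is not shown in this spec. The set of known node IDs is assumed to have been collected from knowledge_edges.source_id and target_id and to be lowercase, and the slug genes-mitf is a made-up example.

```python
from typing import Iterable, Optional


def match_slug_to_node(slug: str, known_node_ids: Iterable[str]) -> Optional[str]:
    """Return a KG node ID for a wiki slug, or None when no exact candidate exists."""
    known = set(known_node_ids)
    normalized = slug.strip().lower()  # assumption: node IDs are lowercase

    # Rule 1: the slug itself is already a node ID.
    if normalized in known:
        return normalized

    # Rule 2: drop the leading segment (e.g. "entities-") and retry.
    _prefix, sep, rest = normalized.partition("-")
    if sep and rest in known:
        return rest

    # No exact candidate: leave the page unmatched rather than guess.
    return None


known_nodes = {"parkinson", "neurodegeneration"}
for page_slug in ("entities-parkinson", "phenotypes-neurodegeneration", "genes-mitf"):
    print(page_slug, "->", match_slug_to_node(page_slug, known_nodes))
```

Pages for which the function returns None are left without a kg_node_id, so a run that finds 0 high-confidence matches is a valid outcome rather than a failure.
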
Runs executed:
- crosslink_wiki_to_kg.py: 16 new node_wiki_links (35,016 → 35,032)
- crosslink_wiki_all.py: 13,911 new artifact_links (1,220,543 → 1,234,454), unlinked pages 85 → 62
- crosslink_wiki_hypotheses.py: 1,905 new links/edges (381 hypothesis, 1,036 analysis, 407 KG edges, 81 entity-ID)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,435 total, 15,960 with kg_node_id, 1,475 without).
Runs executed:
- crosslink_wiki_to_kg.py: 0 new node_wiki_links (49,413 stable)
- crosslink_wiki_all.py: blocked by pre-existing DB corruption in artifacts table (sqlite database disk image is malformed), unrelated to node_wiki_links
- […] artifacts table (multiple bad indexes, tree page reference errors) affects crosslink_wiki_all.py experiment crosslinking but does NOT affect node_wiki_links or wiki_pages tables, which are intact.

Result: All acceptance criteria met. Primary task (node_wiki_links for kg_node_id pages) is satisfied.
Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,538 total, 15,991 with kg_node_id, 1,547 without).
Runs executed:
- crosslink_wiki_to_kg.py: 0 new node_wiki_links (50,653 stable)
- crosslink_wiki_all.py: 4,382 new artifact_links (72 hypothesis, 1,405 analysis, 2,784 experiment, 121 KG-edge)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,538 total, 15,991 with kg_node_id, 1,547 without).
Runs executed:
- crosslink_wiki_to_kg.py: 0 new node_wiki_links (50,653 stable)
- crosslink_wiki_all.py: 146 new artifact_links (0 hypothesis, 146 analysis, 0 experiment, 0 KG-edge)
- crosslink_wiki_hypotheses.py: 10 new links/edges (0 hypothesis, 10 analysis, 0 KG edges, 0 entity-ID)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,539 total, 16,637 with kg_node_id, 861 without).
Runs executed:
- ci_crosslink_wiki_kg.py: 0 new node_wiki_links (52,966 stable)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,539 total, 16,637 with kg_node_id, 902 without).
Runs executed:
- crosslink_wiki_to_kg.py: 0 new node_wiki_links (54,416 stable)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,539 total, 16,637 with kg_node_id, 861 without).
Runs executed:
- ci_crosslink_wiki_kg.py: 0 new node_wiki_links (54,416 stable)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links (17,539 total, 16,637 with kg_node_id, 902 without).
Runs executed:
- crosslink_wiki_all.py: 64 new artifact_links (57 hypothesis, 7 KG-edge via analysis)

Verification: 0 wiki_pages with kg_node_id missing node_wiki_links after run (16,664 with kg_node_id, DB corruption prevents total count).
Runs executed:
- ci_crosslink_wiki_kg.py: 2 new node_wiki_links (54,467 → 54,469)
- […] knowledge_edges table (malformed)
- wiki_pages table still has corruption from prior runs (total count fails), but node_wiki_links reads/writes succeed. artifact_entity_crosslink.py fails on corrupted artifacts table.

Result: 2 new links created. All pages with kg_node_id now have node_wiki_links (0 unlinked).
Starting state: 17,574 total wiki_pages, 16,663 with kg_node_id, 911 without. 5 pages with kg_node_id missing node_wiki_links.
Actions:
Result: 0 pages with kg_node_id missing node_wiki_links. All acceptance criteria met.
{
"requirements": {
"coding": 7,
"reasoning": 6,
"analysis": 6,
"safety": 6
},
"auto_tagged_at": "2026-04-03T22:29:52.509512",
"completion_shas": [
"0eae75275f62f86d9fe7c30c09ca127a522e6e6a"
],
"completion_shas_checked_at": "2026-04-13T11:07:04.444613+00:00",
"completion_shas_missing": [
"24b32a57363c9c7da6ec5167635d0eee72a6b880",
"9384a6400019ad8c350484ea7e205fc82f717f98",
"4b894f28cbdd836172db803736f54c27ad1fdc6c",
"5f82390774d9c0baef58ede30fbdde149409bb3e",
"6bb842ab3620e56a2cc64ee5c49d5318c60dd835",
"361ca8e7a74cef16376fbb2bae318f0a00203ca9",
"6429e830f65160a38c578cf5ec64f1d9a8d859fe",
"4107afc2cd7b8ce8b6da07fdbd02732cfb73a6a0",
"26d9368f52c539889ca9a9b66e529b2bb01c36ba",
"71439088c612f19cdd7bcecfe316b357486cbe13",
"e742b1fa029c8ff707b8236457401dfdf17bd4c6",
"fdc9fa50dda4da7a1106f509717e9a2bc1eba963",
"54d7d5c1fbadd5c1533e918f5b0470809f2b8d91",
"b97795747c87549108d9502c50baf6c5c01f0f0b",
"971dbbd3f9e581965bdc63e765bebd2200ce61cd",
"c6ef139ee964ad19aacf87a4dd137fe7a4643f09",
"fa9b3cd68146dbe4a07c240146da5e66dbb75567",
"f1ad344481a4efeae3aaa6aca0eadc4df41f0790",
"90a8119ec7970b03b047c254a0239bf586ef5f83",
"9ad996edc40c4e7195d1d644286d39e5d7e4986c",
"5326bb689e3ba7d5063085d8e6f9f9075d851ce4"
]
}