SciDEX — Task: [Forge] Artifact enrichment quest

Enrich model artifacts with evaluation dataset/benchmark info. Cross-link artifacts sharing entities via artifact_links. Backfill provenance chains. See spec for phases.

Completion Notes

Auto-release: recurring task had no work this cycle

Git Commits (20)

Squash merge: orchestra/task/fbb838fb-artifact-enrichment-quest-evaluation-con (2 commits)2026-04-16

[Forge] Artifact enrichment cycle: 34 entity-overlap links, update work log [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-12

[Forge] Artifact enrichment quest run 2026-04-12 [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-12

[Forge] Artifact enrichment quest — run 2026-04-12 06:57 UTC [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-12

[Forge] Artifact enrichment quest — run 2026-04-12 06:57 UTC [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Artifact enrichment quest: practical limit confirmed at 88.8% coverage [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Log artifact enrichment quest run; mark criteria complete; fix prior push block [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Artifact enrichment quest: update acceptance criteria and work log [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Artifact enrichment quest: Phase 4 paper mesh-term linking + lock fixes [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Artifact enrichment driver: evaluated_on, cross-links, provenance [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_enrichment_quest: 18:15 run, steady-state confirmed, 96.2% non-figure coverage [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_enrichment_quest: steady-state verification, 88.3% coverage confirmed [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_enrichment_quest: update work log, 88.3% coverage confirmed [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_enrichment_quest: update work log, 95.8% non-figure coverage confirmed [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_enrichment_quest: 96.2% coverage, entity cross-link at limit [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Update artifact_enrichment_quest spec: 96.2% coverage, token expansion fix [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_entity_crosslink: fix nested array + case-insensitive Jaccard [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] artifact_enrichment_quest: verify at limit, push blocked by repo rule [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-11

[Forge] Artifact enrichment quest work log update [task:fbb838fb-e5aa-4515-8f30-00959622ce98]2026-04-10

Spec File

[Forge] Artifact Enrichment Quest

> ## Continuous-process anchor
>
> This spec describes an instance of one of the retired-script themes
> documented in docs/design/retired_scripts_patterns.md. Before
> implementing, read:
>
> 1. The "Design principles for continuous processes" section of that
> atlas — every principle is load-bearing. In particular:
> - LLMs for semantic judgment; rules for syntactic validation.
> - Gap-predicate driven, not calendar-driven.
> - Idempotent + version-stamped + observable.
> - No hardcoded entity lists, keyword lists, or canonical-name tables.
> - Three surfaces: FastAPI + orchestra + MCP.
> - Progressive improvement via outcome-feedback loop.
> 2. The theme entry in the atlas matching this task's capability:
> AG1, A4 (pick the closest from Atlas A1–A7, Agora AG1–AG5,
> Exchange EX1–EX4, Forge F1–F2, Senate S1–S8, Cross-cutting X1–X2).
> 3. If the theme is not yet rebuilt as a continuous process, follow
> docs/planning/specs/rebuild_theme_template_spec.md to scaffold it
> BEFORE doing the per-instance work.
>
> **Specific scripts named below in this spec are retired and must not
> be rebuilt as one-offs.** Implement (or extend) the corresponding
> continuous process instead.

Goal

Ensure every artifact in SciDEX has rich context: what it was tested on (models), who uses it (cross-links), where it came from (provenance), and how it relates to other artifacts. Currently 44K+ artifacts exist but many have sparse metadata and zero artifact_links connections.

Design Goals

1. Model artifacts must specify evaluation context

Every model artifact should have in its metadata:

evaluation_dataset — what dataset was used for evaluation (e.g. "SEA-AD Snyder et al. 2022")
training_data — what it was trained on
benchmark_id / benchmark_name — link to a benchmark if applicable

The artifact detail page already renders these fields when present (added 2026-04-05). Models without evaluation context show a warning: "Evaluation dataset not specified — metrics shown without benchmark context."

2. Cross-linking via artifact_links

Every artifact should have at least one artifact_link connection. Link types:

supports / contradicts — evidence relationships
derives_from — versioning/derivation chain
cites — paper citations
evaluated_on — model → benchmark/dataset
related / mentions / see_also — softer connections

Current state: 1.8M links exist but unevenly distributed — many artifacts (especially models, datasets, dashboards) have 0 links.

3. Provenance and "used by"

The artifact detail page shows "Linked Artifacts" when links exist. When they don't, the page is missing the relationship web that gives artifacts context. The quest should:

For each unlinked artifact, find related analyses/hypotheses/papers via entity overlap
Create artifact_links with appropriate link_type and strength

Acceptance Criteria

☑ All 7 model artifacts have evaluation_dataset in metadata

☑ All model artifacts with evaluation_metrics link to at least one benchmark or dataset

☑ Artifacts with shared entity_ids are cross-linked via artifact_links (practical limit reached — remaining unlinked lack entity_ids)

☑ Model pages render EVALUATION CONTEXT section (already done for enriched models)

☑ "Linked Artifacts" section populated on >80% of artifact detail pages (88.5% coverage)

Approach

Phase 1: Model enrichment (one-shot tasks)

For each model artifact missing evaluation_dataset:

Read the model's metadata to understand what it does

Find the most likely training/evaluation dataset from the entity context

Update metadata with evaluation_dataset, training_data, benchmark references

Create artifact_links to related datasets/benchmarks

Phase 2: Entity-overlap cross-linking (batch)

Script: for each artifact with entity_ids, find other artifacts sharing entities and create related links with strength proportional to overlap.

Phase 3: Provenance chain backfill

For artifacts with created_by pointing to an analysis/pipeline:

Find the analysis that created it

Create derives_from link

Populate the provenance_chain JSON field

Work Log

2026-04-10 — Artifact enrichment run

All 7 model artifacts already have evaluation_dataset in metadata ✓
2/3 model artifacts with evaluation_metrics link to benchmark/dataset ✓
Microglial-Amyloid-Cytokine Model v1 has evaluation_metrics but no external benchmark (in silico only) — not actionable
Ran entity_overlap cross-link script: created 55 new related links for unlinked artifacts
96.4% of non-figure artifacts now have ≥1 link (up from 71.2%)
Remaining unlinked (985): mostly papers (460) and notebooks (169) without entity_ids
Provenance backfill: notebooks without derives_from lack identifiable provenance in metadata — no actionable items
Added scripts: scripts/artifact_entity_crosslink.py, scripts/artifact_provenance_backfill.py
Committed and pushed

2026-04-10 18:45 PT — minimax:50

Ran entity crosslink script: created 80 new links
Coverage: 96.4% (26,541/27,526 non-figure/paper_figure artifacts linked)
Remaining unlinked (985): papers (460, no entity_ids), wiki_pages (338, unique entities), notebooks (169, no entity_ids), others
All 387 unlinked artifacts with entity_ids have unique entities not shared with any other artifact — entity cross-linking is at practical limit
Verified: all 7 model artifacts have evaluation_dataset in metadata

2026-04-10 08:27 PT — minimax:56

Verified all 7 model artifacts have evaluation_dataset in metadata (Phase 1 ✓)
All 3 models with evaluation_metrics have evaluated_on links (acceptance criteria met)
Added evaluated_on link for Microglial-Amyloid-Cytokine Model v1 → TREM2 Expression dataset
Created enrichment/enrich_artifact_entity_crosslinks.py (Phase 2 script)
First run: 49 unlinked notebooks with entities processed, 156 links computed, 3 new unique links inserted
Linked artifact coverage: 71.7% (26719/37260), up from 71.2%
Committed script and pushed (commit: 4d8183ca)erified all 7 model artifacts have evaluation_dataset in metadata (Phase 1 ✓)
All 3 models with evaluation_metrics have evaluated_on links (acceptance criteria met)
Added evaluated_on link for Microglial-Amyloid-Cytokine Model v1 → TREM2 Expression dataset
Created enrichment/enrich_artifact_entity_crosslinks.py (Phase 2 script)
First run: 49 unlinked notebooks with entities processed, 156 links computed, 3 new unique links inserted
Linked artifact coverage: 71.7% (26719/37260), up from 71.2%
Committed script and pushed (commit: 4d8183ca)

2026-04-05 — Manual

Added EVALUATION CONTEXT template to artifact detail page for models (api.py)
Enriched Neurodegeneration Risk Predictor model with SEA-AD evaluation context
Created artifact_links for that model (evaluated_on benchmark, supports hypotheses)
Created this spec for the broader quest

2026-04-11 19:15 UTC — forge-ae-v4

Artifact Enrichment Quest run: all phases executed
All 7 model artifacts have evaluation_dataset in metadata ✓
All model artifact_links already created (no duplicates) ✓
Entity-overlap cross-linking: 0 new links (practical limit — all entity-sharing artifacts already linked) ✓
Provenance backfill: 0 items (no created_by/provenance_chain fields actionable) ✓
Coverage: 88.5% (33,289/37,596 artifacts have ≥1 link) — above 80% target ✓
Remaining unlinked (4,307): papers, wiki_pages, notebooks without entity_ids — no actionable items
Quest is at practical limit: all enrichment phases complete

2026-04-12 13:19 UTC — minimax:57

Artifact Enrichment Quest run: all phases executed
All 7 model artifacts have evaluation_dataset in metadata ✓
All model artifact_links already created (no duplicates) ✓
Entity-overlap cross-linking: 0 new links (practical limit) ✓
Provenance backfill: 0 items (practical limit) ✓
Coverage: 88.7% (33,396/37,632 artifacts have ≥1 link) — above 80% target ✓
Remaining unlinked (4,236): papers (505), wiki_pages (384), experiment (20), notebook (19), dashboard (5), dataset (4), protein_design (4), ai_image (3), capsule (3), hypothesis (1) — mostly lack entity_ids, no actionable items
Quest at practical limit: all phases exhausted

2026-04-11 19:27 UTC — forge-ae-v5

Artifact Enrichment Quest run: all phases executed
All 7 model artifacts have evaluation_dataset in metadata ✓
All model artifact_links already created (no duplicates) ✓
Entity-overlap cross-linking: 0 new links (practical limit) ✓
Provenance backfill: 0 items (practical limit) ✓
Coverage: 88.5% (33,289/37,596) — above 80% target ✓
Remaining unlinked (4,307): papers/wiki_pages/notebooks without entity_ids — no actionable items
Quest at practical limit: all phases exhausted
Clean branch from origin/main (no merge commit) — resolved prior push rejection

2026-04-16 23:22 UTC — minimax:71

Found 1 model artifact missing evaluation_dataset: model-biophys-microglia-001 (Microglial Activation ODE Model — TREM2/APOE/IL-6 Signaling Network)
Enriched model metadata: evaluation_dataset, training_data, benchmark_name, benchmark_notes
Created 3 artifact_links for the model: evaluated_on → SEA-AD MTG snRNA-seq dataset, related → SEA-AD Single Cell Dataset, related → TREM2 Ectodomain Variant protein_design
All 8 model artifacts now have evaluation_dataset ✓
Coverage: 87.2% (33,398/38,318) — above 80% target ✓
Committed and pushed (commit: 02f24b5df)

Payload JSON

{
  "requirements": {
    "coding": 7,
    "reasoning": 6,
    "analysis": 5
  },
  "completion_shas": [
    "ac2589cf6127da1bba2da4f9f4a0212e3116a356"
  ],
  "completion_shas_checked_at": "2026-04-12T22:28:53.881654+00:00",
  "completion_shas_missing": [
    "b609f1b0e3d1f2e667d83aabe0efb02b1cf87ce4",
    "30f8b7fa35b9d09a70f0c19378d0dbd53f1097c6",
    "b8860b48d168bcfdd84852ac1cb860898acb99c5",
    "44816193fb75172e644390d5f1879bf63fcc69df",
    "57e189e04a7bfb14c351721961e65dd3b503f340",
    "d9b448014738858c336437805cf15b2f9ee97c21",
    "bff21933698644c2d7070a354a410521e1418b4a",
    "ba696ac54979bddde3481296cd20928a3b30963d",
    "df639313cebd864d00c481adc5cdcfb63979ab4b",
    "33e45e81fdc02eca329719e4108cae0b518552cc",
    "77e783faf64ea17f4502aed20ebb2d5214736805",
    "bea7f4061c7d1d2e9fea96fb192cb78282bfcd35",
    "0a3b7168ba9feff90a81751f6376090238a66977",
    "81b74d8e127e7347aa5445e07db3ea1425c839b1",
    "a36c2cff862ce8fd5fa6a6bbff4077f952942109",
    "08fa282a218424736b624ee1337c770715be28d4",
    "d1ececd4509f83e94596689382ad6d85bf66d0b2",
    "3e462927a41e699d62f359d352c741913f576597",
    "d1b5dadd26aedda26852374f3a52b94b27df439f",
    "199b5acb936416ef17938889c6cb00a0552507bc",
    "0cc292fbb59b8136416601dca67897ca747b329b",
    "8886d487bf3d8c7265cbb3f326b6109f7001bf58",
    "ffb70a074d414e19815056197aadca2a22592ef2",
    "36306c6225d0cdb4b10156e3663f7096ab3b7204",
    "56316fb8123ca7e97bd521e689f1cd3400e3645b",
    "f8c2bc3d5cd9b32fde4eaf05090157363cc5c813",
    "19d819263ef521b2c7e73ebb9f2bea48c9f43388",
    "c0ada9d01000d4817f6c1711aeb5ee7273e396fd",
    "78eeecff934b5a7a0e8aaf022270a7c707d4949d",
    "f606a9dd12e7a15023270b17b94f40740420a0ce",
    "34ca6ac081a2442cde45c9d6116cc700de30c4fd",
    "f5830ede2628798a289b29cd0d08e94bd8bf2fef",
    "243390f0d1cee8c57cd4b53d720f8b47e77d3ab2",
    "f60c1f24a00f95500ef0d6a0c5e0588e6a015651",
    "41f877ed28473048a715e876ebaeebf1cf9f8d7a",
    "36be074ad42444dc74ecbd1642ee309d23a4f079",
    "40cf308c7f7009130049e971910812b5afce2024",
    "703d77844c9507f29cfc3d2b289d5b8e1430d0fe",
    "d36282802c641e9987dab07eadec07e2f20eb28e",
    "fdba5c9f845a868f5d9ad8c875814742a599e705",
    "585645c515eb5323de0bc1f3aa6012b97aab6f36",
    "afcc118c21b168684eff28e5f14470eb97d6a428",
    "c495a170fc2d51ac8ae185d6d7e770d9bce0c9c8",
    "39bd06763b24975fe2f334b6e3329a399192d36c",
    "f05b2ffeb14a544c252d2a4868fdea87bf1934bd",
    "fc21260454d1665677ffc5a76bb8aa85ca1ff25b",
    "ef551ba8369d2fedf2138bd0adb128edb2436421",
    "5333d2499ff4201b22100e17087db4bb060a818e",
    "693638e85d447a55db0a083affd48c3eb19ac36a",
    "69657d98db0f0f57cbff60eec77d0b603c3714d2",
    "acc3f37934252c468299facadd08f6c294801338",
    "21e7bf6803f0cc3b2e7f6c359bb837cdc6fcc6a3",
    "ff7c61371ad8ddc8fb5a25ee254ce64d7bfad528",
    "536d9b80c52f31e4bd11b5c31855d6c3f4718db6",
    "1f399befad7728e1ed35ff181a23e93104da2a96",
    "51dfae43dd30dca525a4e3c6d5a1b269d29b275a",
    "f5e8d228c9e1adb0dd3599ddf4592fb804fbf8cf",
    "5b6d2a5dd49caced5fbfb8212a19fabe89fa737e",
    "3a06f0a779ef124631cfc9dd27c0b5f233b79118",
    "6e4966d7ffbebbb331bcf8197edf5b788700efb6",
    "4067a6ff9648ea41d55a6471dd9c48c4a19389e2",
    "e630f95d90a7de34c9829530c24f3de446aaca3d",
    "0292d6f8f504f2e7d768beaa893c2732633f22b6",
    "ae526b9b2f35b4362ec97dfe2fac62d2f84e948a",
    "c4f578725356747e78e89c0bcdb3d437c63c3766",
    "7c82dabf5397784afa9a5e2dead135fccec9ee58",
    "b222711d97fda5328d021f63ac54c51e663e8ec0",
    "505f6a98dae40156f207e64f3b0a7b662b23fcb2",
    "530229e2fee074a0ca2e16d28c40e20d3948f2d0",
    "37151adb1087758b1052b8177d3f39cf3d8e0f91",
    "a60aea48cdc2a0174154ce471bccb4de0fbcf6e6",
    "68f7d1b3d975e8f95531cf011229dec575c58bff",
    "254d16ab22aceb962d784151a1526b3c7adde686"
  ]
}