Link isolated artifacts into the artifact governance graph. Artifacts without artifact_links cannot participate in provenance, lifecycle review, or discovery-dividend backpropagation.
- artifact_links edges or documented no-link rationale
- artifact_links through the standard DB path.
- 58079891-7a5 - Senate quest

Infrastructure blocker: The Bash tool is completely non-functional in this agent session.
Every shell command fails immediately with EROFS: read-only file system, mkdir
'/home/ubuntu/Orchestra/data/claude_creds/max_outlook/session-env/9c56830e-4629-43e4-ab78-e0bffcf06cb4'
The pre-exec harness hook cannot create the session-env directory because that path
lives on a read-only filesystem. Sub-agents spawned via the Agent tool have the same
issue. Python scripts, git commands, and orchestra CLI are all inaccessible.
Work completed despite blocker:
- artifacts, artifact_links, hypotheses, analyses, notebooks, knowledge_edges table schemas.
- scidex/atlas/artifact_registry.py, backfill/backfill_artifacts.py, scidex/core/database.py, and quest_engine.py (lines 1490–1544) to understand
- scripts/backfill_isolated_artifact_links.py. The script:
  - parent_version_id → derives_from (strength 1.0)
  - provenance_chain JSON entries → typed links (strength 0.9)
  - metadata.analysis_id / metadata.source_analysis_id → derives_from (1.0)
  - metadata.hypothesis_id → mentions (0.9)
  - metadata.gap_id → mentions (0.85)
  - metadata.source_notebook_id → derives_from (0.9)
  - hypotheses.analysis_id → derives_from (1.0)
  - analyses.gap_id → extends (0.9)
  - notebooks.associated_analysis_id → derives_from (1.0)
  - knowledge_edges.analysis_id → derives_from (1.0)
  - cites analysis (0.85)
  - entity_ids → wiki artifact mentions (0.8, only when wiki artifact confirmed)
  - ON CONFLICT DO NOTHING for safe upserts
  - scidex.core.database.JournalContext for provenance tracking
  - --dry-run and --limit N flags
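
The metadata rules listed above can be read as a small lookup table. The following is a minimal sketch under that reading, not the code in scripts/backfill_isolated_artifact_links.py: the names METADATA_RULES and infer_links are hypothetical, and the sketch returns raw referenced IDs without resolving them to existing artifacts, which the real script would still need to do.

```python
from typing import Any

# (metadata key, link type, strength), copied from the rules in this log.
METADATA_RULES = [
    ("analysis_id", "derives_from", 1.0),
    ("source_analysis_id", "derives_from", 1.0),
    ("hypothesis_id", "mentions", 0.9),
    ("gap_id", "mentions", 0.85),
    ("source_notebook_id", "derives_from", 0.9),
]


def infer_links(artifact: dict[str, Any]) -> list[tuple[str, str, str, float]]:
    """Return (source_id, target_ref, link_type, strength) candidates.

    target_ref is the raw ID found on the artifact; checking that a matching
    artifact exists is deliberately left out of this sketch.
    """
    source_id = artifact["id"]
    links = []
    # parent_version_id lives on the artifact row, not in metadata.
    parent = artifact.get("parent_version_id")
    if parent:
        links.append((source_id, str(parent), "derives_from", 1.0))
    metadata = artifact.get("metadata") or {}
    for key, link_type, strength in METADATA_RULES:
        target = metadata.get(key)
        if target:
            links.append((source_id, str(target), link_type, strength))
    return links
```

Keeping the rules in one table makes the strengths easy to audit against the list above; an artifact with no matching keys simply yields an empty list.
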
Script to run when bash is restored:
cd /home/ubuntu/scidex
python3 scripts/backfill_isolated_artifact_links.py --dry-run --limit 50
python3 scripts/backfill_isolated_artifact_links.py --limit 50

Before count: Unknown (could not query DB). Quest engine spawned this task because
count was > 0 at task creation time (2026-04-21T19:54:00Z).
Next steps for follow-on agent:
- session-env directory issue is resolved (try echo test in Bash)
- [Senate] Backfill artifact_links for 50 isolated artifacts [task:ebdcb998-cfec-4280-ba56-12f0ff280bea]

Infrastructure blocker persists. Root cause identified:
Orchestra sets CLAUDE_CONFIG_DIR=/home/ubuntu/Orchestra/data/claude_creds/max_outlook/
in the subprocess environment (see orchestra/auth.py lines 918–926). Claude Code's
bridge REPL v2 (tengu_bridge_repl_v2: true) then attempts to create
{CLAUDE_CONFIG_DIR}/session-env/<UUID>/ for shell-state persistence. This fails with
EROFS: read-only file system because that path lives on a read-only mount.
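
A follow-on agent or operator could confirm the condition before attempting any shell work. The sketch below is an assumption-laden probe, not part of Orchestra or Claude Code: the function name probe_session_env is hypothetical, and it simply attempts the same kind of directory creation described above and reports the errno name on failure (EROFS for a read-only mount). Note that a successful probe leaves the created directory in place.

```python
import errno
import os
import tempfile


def probe_session_env(config_dir: str, session_id: str = "probe") -> tuple[bool, str]:
    """Try to create {config_dir}/session-env/{session_id}; return (ok, reason)."""
    target = os.path.join(config_dir, "session-env", session_id)
    try:
        os.makedirs(target, exist_ok=True)
    except OSError as exc:
        # errno.EROFS is the code behind "read-only file system".
        return False, errno.errorcode.get(exc.errno, "UNKNOWN")
    return True, "ok"


if __name__ == "__main__":
    # Demonstration against a throwaway writable directory.
    with tempfile.TemporaryDirectory() as writable:
        print(probe_session_env(writable))
```

Pointing the probe at the value of CLAUDE_CONFIG_DIR would distinguish a read-only mount from other failures such as missing permissions.
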
Fix required (by human operator or supervisor):
- mkdir -p /home/ubuntu/Orchestra/data/claude_creds/max_outlook/session-env
- Set config_dir for the max_outlook account in Orchestra's DB to a writable path (e.g. /tmp/claude-max-outlook), OR
- Disable the tengu_bridge_repl_v2 feature for automated workers by setting CLAUDE_DISABLE_BRIDGE_REPL=1 (if that env var is respected).

Work completed this session:
- scripts/backfill_isolated_artifact_links.py exists and is complete (written
- CLAUDE_CONFIG_DIR pointing at read-only filesystem

- orchestra/task/98628b02-link-50-isolated-artifacts-into-the-gove; it had a separate verification-only spec update and was not present on origin/main for this task.
- artifacts.id and artifact_links.source_artifact_id / target_artifact_id; artifact_links has no natural unique constraint, so the backfill script checks duplicates explicitly before insert.
- scripts/backfill_isolated_artifact_links.py to scan isolated artifacts by usage_score, citation_count, and created_at, infer only high-confidence links from metadata, entity IDs, provenance/dependencies, and related rows, and stop after 50 artifacts gain links.
- python3 scripts/backfill_isolated_artifact_links.py --dry-run --limit 50 --scan-limit 1000 scanned 443 isolated artifacts and found 50 linkable figure artifacts, with 61 candidate links.
- python3 scripts/backfill_isolated_artifact_links.py --limit 50 --scan-limit 1000.
- artifact_links rows: 50 derives_from links from figure metadata analysis_id to existing analysis artifacts, plus 11 mentions links from direct entity_ids to existing wiki artifacts.
- figure-7f7b14e2f8bc -> analysis-sda-2026-04-01-gap-008, derives_from, evidence metadata.analysis_id = sess_sda-2026-04-01-gap-008
- figure-31940a5cb4cd -> analysis-sda-2026-04-01-gap-20260401231108, derives_from, and wiki-neurodegeneration, mentions
- figure-2ef32bff5b51 -> analysis-sda-2026-04-01-gap-v2-68d9c9c1, wiki-TAU, and wiki-TFEB
- artifact_links edges: 50 artifacts gained 61 links.
- metadata.analysis_id and direct entity_ids only.

Script executed against live PostgreSQL. No infrastructure blockers this run.
Execution results:
BEFORE: 17035 isolated artifacts
Scanned: 444 isolated artifacts (scan-limit=500)
Artifacts that gained links: 50
Total links inserted: 68
AFTER: 16985 isolated artifacts
Reduction: 50 (artifacts now connected to governance graph)

Links by type:
- derives_from (metadata.analysis_id, provenance_chain, parent_version_id): majority
- mentions (entity_ids → wiki artifacts): 18 links (TREM2, TYROBP, microglia, neurodegeneration)
- figure-2eeef7deaf70 → analysis-SDA-2026-04-01-gap-001 (derives_from, strength 1.0, metadata.analysis_id)
- figure-62c5cb7b0edc → wiki-TREM2, wiki-TYROBP (mentions, strength 0.8, entity_ids)

SELECT COUNT(*) FROM artifacts a
WHERE NOT EXISTS (SELECT 1 FROM artifact_links l
WHERE l.source_artifact_id = a.id OR l.target_artifact_id = a.id)
-- Result: 16985 (was 17035 before this run)

Acceptance criteria status:
- artifact_links edges: 50 gained links, 68 total edges

Bug fixed: _infer_from_paper_citations used LIKE on JSONB columns evidence_for and evidence_against, which fails silently with operator does not exist: jsonb ~~ unknown. Fixed by casting to text: evidence_for::text LIKE %s.
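
The cast described in the bug fix can be illustrated with a small query builder. This is only a sketch of the pattern, not the body of _infer_from_paper_citations: the function name citation_match_query, the hypotheses table as the home of these columns, and the substring pattern built from a PMID are all assumptions. The %s placeholder follows the style quoted in the log.

```python
def citation_match_query(column: str, pmid: str) -> tuple[str, tuple[str]]:
    """Build a parameterized query that matches a PMID inside a JSONB column."""
    # Column names cannot be bound as parameters, so restrict them to a fixed set.
    allowed = {"evidence_for", "evidence_against"}
    if column not in allowed:
        raise ValueError(f"unexpected column: {column}")
    # LIKE has no jsonb operator, so the column is cast to text first.
    sql = f"SELECT id FROM hypotheses WHERE {column}::text LIKE %s"
    return sql, (f"%{pmid}%",)


if __name__ == "__main__":
    # "12345678" is a placeholder PMID used only for illustration.
    print(citation_match_query("evidence_for", "12345678"))
```

A substring match on the text form is coarse, since a PMID that is a prefix of a longer number would also match; a stricter check would be needed where that matters.
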
Execution results:
BEFORE: 17101 isolated artifacts
Scanned: 561 isolated artifacts (scan-limit=2000)
Artifacts that gained links: 50
Total links inserted: 70
AFTER: 17050 isolated artifacts
Reduction: 51

Note: 116 new isolated artifacts were added since ebdcb998 ran (which ended at 16985), bringing the count to 17101 at the start of this run. This run's 50-artifact batch offsets 51 of those 116 and advances graph connectivity.
Links by type:
- derives_from (metadata.analysis_id): majority
- mentions (entity_ids → wiki artifacts): TREM2, TYROBP, neurodegeneration, PI3K, TFEB, APOE
- figure-c29e2fec5b3e → analysis-SDA-2026-04-01-gap-001 (derives_from, strength 1.0, metadata.analysis_id)
- figure-c29e2fec5b3e → wiki-neurodegeneration, wiki-TREM2 (mentions, strength 0.8, entity_ids)
- artifact_links edges: 50 gained links, 70 total edges

Task: Link 40 isolated artifacts into provenance governance graph.
Key findings:
- created_at DESC are rigor_score_cards and paper_figures with UUID-style PMIDs; most have no linkable targets
- scored_entity_id (hypothesis ID) exists in hypotheses table, but hypothesis artifact often doesn't exist; can link via hypothesis→analysis chain
- metadata.analysis_id but appear after 3K+ paper_figures in created_at ordering

scripts/backfill_task_e6e84211.py
- created_at DESC ordering (matching task query)
- scored_entity_id → hypothesis table → analysis_id → derives_from to analysis artifact (strength 1.0)
- pmid → paper artifact lookup (cites, strength 0.9)
- _artifact_candidates for sda-/SDA- analysis IDs

BEFORE: 19538 isolated artifacts
Scanned: 3005 isolated artifacts
Artifacts that gained links: 40
Total links inserted: 41
AFTER: 19497 isolated artifacts
Reduction: 41

Links by type:
-- Isolated count after run: 19497 (was 19538)
SELECT COUNT(*) FROM artifacts a
WHERE NOT EXISTS (
SELECT 1 FROM artifact_links l
WHERE l.source_artifact_id = a.id OR l.target_artifact_id = a.id
)
-- Result: 19497

Acceptance criteria status:
- artifact_links edges: 40 gained links, 41 total edges

Problem identified: The backfill script's ordering made it scan thousands of isolated paper_figures before reaching any linkable figure artifacts. These paper_figures all have a usage_score of 0.5, so they sit at the top of the usage_score DESC NULLS LAST ordering. Paper_figures with UUID PMIDs can't be linked to paper artifacts.

Fixes applied to scripts/backfill_isolated_artifact_links.py:
- _artifact_candidates for sess_SDA- prefix (stripped sess_ leaves SDA- uppercase; lowercased to sda- and added upper variant).
- usage_score DESC NULLS LAST, citation_count DESC NULLS LAST, created_at DESC NULLS LAST ordering, limiting to --scan-limit per type.

Execution results:
BEFORE: 19631 isolated artifacts
Scanned: 96 isolated figure artifacts
Artifacts that gained links: 50
Total links inserted: 67
AFTER: 19581 isolated artifacts
Reduction: 50

Links by type:
- figure → analysis: derives_from via metadata.analysis_id (strength 1.0 via general metadata handling + 0.95 via figure-specific handling)
- figure → wiki: mentions via entity_ids (strength 0.8)

SELECT COUNT(*) FROM artifacts a
WHERE NOT EXISTS (
SELECT 1 FROM artifact_links l
WHERE l.source_artifact_id = a.id OR l.target_artifact_id = a.id
)
-- Result: 19581 (was 19631 before this run)

Acceptance criteria status:
- artifact_links edges: 50 gained links, 67 total edges

Task: Link 50 isolated artifacts into the governance graph.
Problem identified: The backfill script's _infer_from_metadata only looked for paper_id in metadata, but paper_figure artifacts store the PMID under the key pmid (not paper_id). Additionally, figure artifacts (15K+ isolated) were not being linked despite having metadata.analysis_id that maps to existing analysis artifacts.
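
The pmid case can be sketched as a small helper. This is an illustration under stated assumptions rather than the actual change to _infer_from_metadata: the function name paper_link_from_pmid is hypothetical, the paper-{pmid} target format, cites link type, and 0.9 strength are taken from this log entry, and the check that the paper artifact really exists is omitted.

```python
import re
from typing import Optional

# A linkable PMID is assumed to be a plain run of ASCII digits; UUID-style
# values contain hyphens and hex letters and are rejected.
NUMERIC_PMID = re.compile(r"[0-9]+")


def paper_link_from_pmid(metadata: dict) -> Optional[tuple[str, str, float]]:
    """Return (target_artifact_id, link_type, strength) or None if unlinkable."""
    raw = metadata.get("pmid")  # paper_figure artifacts use "pmid", not "paper_id"
    if raw is None:
        return None
    pmid = str(raw).strip()
    if NUMERIC_PMID.fullmatch(pmid) is None:
        return None
    return (f"paper-{pmid}", "cites", 0.9)
```

Returning None for UUID-style values keeps those paper_figures out of the link set instead of pointing them at paper artifacts that cannot exist.
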
Fixes applied to scripts/backfill_isolated_artifact_links.py:
- pmid handling in _infer_from_metadata: when pmid is a numeric string (not a UUID), link paper_figure to paper-{pmid} artifact with cites link type (strength 0.9).
- figure artifact type handling in _infer_from_metadata: link figure artifacts to analysis via metadata.analysis_id with derives_from link type (strength 0.95).
- --scan-limit from 500 to 5000 because top-scored isolated artifacts are mostly unlinkable paper_figures with UUID PMIDs; need to scan deeper to find linkable figures and notebooks.

Execution results:
BEFORE: 19513 isolated artifacts
Scanned: 3072 isolated artifacts (scan-limit=5000)
Artifacts that gained links: 50
Total links inserted: 61
AFTER: 19462 isolated artifacts
Reduction: 51

Links by type:
- figure → analysis: derives_from via metadata.analysis_id (strength 0.95); 50 figures linked
- figure → wiki: mentions via entity_ids (strength 0.8); some figures also gained wiki mentions
- figure-311d9d1facc8 → analysis-sda-2026-04-01-gap-008 (derives_from, metadata.analysis_id)
- figure-cbaac6950f55 → analysis-sda-2026-04-01-gap-008, wiki-neurodegeneration (derives_from + mentions)
- figure-e86a28c571e5 → analysis-sda-2026-04-01-002, wiki-GBA (derives_from + mentions)

SELECT COUNT(*) FROM artifacts a
WHERE NOT EXISTS (
SELECT 1 FROM artifact_links l
WHERE l.source_artifact_id = a.id OR l.target_artifact_id = a.id
)
-- Result: 19462 (was 19513 before this run)

Acceptance criteria status:
- artifact_links edges: 50 gained links, 61 total edges
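
The isolated-artifact count and the explicit duplicate check used across these runs can be exercised without the live database. The sketch below uses an in-memory SQLite database purely for illustration; the production system is PostgreSQL. The two-table schema is reduced to the columns named in this log, the link_type and strength column names are assumptions, and count_isolated, link_once, and demo are hypothetical helper names.

```python
import sqlite3

ISOLATED_COUNT_SQL = """
SELECT COUNT(*) FROM artifacts a
WHERE NOT EXISTS (
    SELECT 1 FROM artifact_links l
    WHERE l.source_artifact_id = a.id OR l.target_artifact_id = a.id
)
"""


def count_isolated(conn: sqlite3.Connection) -> int:
    """Count artifacts that appear on neither end of any link."""
    return conn.execute(ISOLATED_COUNT_SQL).fetchone()[0]


def link_once(conn, source_id, target_id, link_type, strength) -> bool:
    """Insert a link unless an identical edge exists; return True if inserted.

    The explicit check stands in for a unique constraint the table lacks.
    """
    exists = conn.execute(
        "SELECT 1 FROM artifact_links WHERE source_artifact_id = ? "
        "AND target_artifact_id = ? AND link_type = ?",
        (source_id, target_id, link_type),
    ).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO artifact_links VALUES (?, ?, ?, ?)",
        (source_id, target_id, link_type, strength),
    )
    return True


def demo() -> tuple[int, int, bool]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artifacts (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE artifact_links (source_artifact_id TEXT, "
        "target_artifact_id TEXT, link_type TEXT, strength REAL)"
    )
    conn.executemany(
        "INSERT INTO artifacts VALUES (?)",
        [("figure-a",), ("analysis-a",), ("figure-b",)],
    )
    before = count_isolated(conn)
    link_once(conn, "figure-a", "analysis-a", "derives_from", 1.0)
    repeated = link_once(conn, "figure-a", "analysis-a", "derives_from", 1.0)
    after = count_isolated(conn)
    return before, after, repeated
```

In this toy data a single link removes two artifacts from the isolated count, because the target was isolated as well. That may be one reason a run can report a reduction one higher than the number of artifacts it linked, though the log does not confirm the cause.
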