[Forge] scidex rerun-artifact - re-execute the original processing chain (done)

Walk processing_steps backward, dispatch per-step_type replay handlers, enforce frozen inputs+env, byte-level match manifest.

Completion Notes

Auto-release: work already on origin/main

Git Commits (4)

Squash merge: orchestra/task/402dd97b-scidex-rerun-artifact-id-re-execute-the (3 commits) (#725) - 2026-04-27
[Forge] Fix triggerArtifactRerun fetch call — double-brace POST method [task:402dd97b-9476-4d6c-9efc-8e17af395221] - 2026-04-27
[Forge] Update spec acceptance criteria and work log [task:402dd97b-9476-4d6c-9efc-8e17af395221] - 2026-04-27
[Forge] Add scidex rerun-artifact CLI + API + web button [task:402dd97b-9476-4d6c-9efc-8e17af395221] - 2026-04-27
Spec File

Effort: thorough

Goal

scidex/atlas/artifact_registry.py:240-340 already records processing_steps (source/target artifact, method, parameters,
input_hash, started_at, completed_at) on every transformation, and
the deterministic-replay sandbox from wave-1
(q-sand-deterministic-replay) captures env_hash + requirements.lock per analysis. The two halves don't yet meet:
nothing walks processing_steps backwards from a target artifact,
gathers the recorded inputs/methods/env, and re-executes the chain
to produce a byte-identical replay. Ship scidex rerun-artifact <id>
that closes the loop.

Acceptance Criteria

CLI scidex rerun-artifact <id> [--from <step_id>]
[--out <dir>] [--strict] [--diff] walks the
processing_steps graph rooted at target_artifact_id=<id>
breadth-first to the original sources; for each step, calls
the registered replay-handler for step_type (see below) to
reproduce the step.
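The backward walk could look like the following minimal sketch. The dict-based step records and the `walk_chain` name are assumptions for illustration; the field names (`step_id`, `source_artifact_id`, `target_artifact_id`) follow the columns named in the Goal:

```python
from collections import deque

def walk_chain(steps, target_artifact_id):
    """Walk processing_steps breadth-first from the target artifact back
    to the original sources; return steps in source-first replay order."""
    # Index steps by the artifact they produced.
    by_target = {}
    for step in steps:
        by_target.setdefault(step["target_artifact_id"], []).append(step)

    visited, ordered = set(), []
    queue = deque([target_artifact_id])
    while queue:
        artifact_id = queue.popleft()
        for step in by_target.get(artifact_id, []):
            if step["step_id"] in visited:
                continue
            visited.add(step["step_id"])
            ordered.append(step)
            queue.append(step["source_artifact_id"])
    # BFS discovers target-first; replay needs source-first.
    return list(reversed(ordered))
```

Original sources fall out naturally: an artifact that no step produced terminates the walk.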
Replay handler registry in
scidex/atlas/replay_handlers.py. One handler per
step_type currently in use (read distinct values from prod;
at minimum "wiki_render", "kg_extract", "score",
"debate_round", "figure_render", "analysis_run"). Each
handler has the signature
def handler(step, source_payload) -> ReplayedArtifact.
Unknown step_type → mark the step unreplayable in the
report with a reason.
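A registry of that shape could be sketched as below. The `ReplayedArtifact` fields and the placeholder handler body are assumptions; a real handler would re-run the recorded method with `step["parameters"]`:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class ReplayedArtifact:
    content: bytes
    content_hash: str

REPLAY_HANDLERS = {}

def replay_handler(step_type):
    """Register one handler per step_type."""
    def decorate(fn):
        REPLAY_HANDLERS[step_type] = fn
        return fn
    return decorate

@replay_handler("wiki_render")
def replay_wiki_render(step, source_payload):
    # Identity stand-in; the real handler would re-render with the
    # parameters recorded on the step.
    content = source_payload
    return ReplayedArtifact(content, hashlib.sha256(content).hexdigest())

def replay_step(step, source_payload):
    """Dispatch to the registered handler; an unknown step_type is
    reported as unreplayable with a reason rather than raised."""
    handler = REPLAY_HANDLERS.get(step["step_type"])
    if handler is None:
        return None, f"unreplayable: no handler for {step['step_type']!r}"
    return handler(step, source_payload), None
```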
Frozen-input check. Before each step replays, compute the
source artifact's current content_hash. If it differs from
the input_hash recorded in the step, --strict aborts;
default mode warns and proceeds with the live version.
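The strict/warn split could be factored like this sketch (function and exception names are assumptions; SHA-256 is assumed as the content-hash algorithm):

```python
import hashlib

class FrozenInputError(RuntimeError):
    pass

def check_frozen_input(step, source_content, strict=False, warn=print):
    """Compare the source artifact's current content hash against the
    input_hash recorded when the step originally ran."""
    current = hashlib.sha256(source_content).hexdigest()
    if current == step["input_hash"]:
        return True
    if strict:
        raise FrozenInputError(
            f"step {step['step_id']}: input drifted "
            f"({step['input_hash'][:12]} -> {current[:12]})"
        )
    warn(f"step {step['step_id']}: input drifted; replaying with live version")
    return False
```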
Frozen env. For steps whose source analysis has an
env_hash (from the wave-1 deterministic sandbox), require
the host env to match before running; if not, switch to a
forge/runtime.py deterministic-mode subprocess that pins
requirements.lock. Fall through to a warning if neither
is available. *(v1: stub returns True; full impl requires
env_hash column on processing_steps — wired to
forge/runtime.py for v2.)*
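Since v1 ships a stub, the decision logic alone can be sketched; hashing `sys.version` is a stand-in assumption for the real env fingerprint (requirements.lock plus interpreter, per the wave-1 sandbox):

```python
import hashlib
import sys

def current_env_hash():
    # Stand-in: a real implementation would hash requirements.lock
    # content plus the interpreter version.
    return hashlib.sha256(sys.version.encode()).hexdigest()

def check_frozen_env(step):
    """Return (ok, note) for the env gate: match the host env when an
    env_hash was recorded; otherwise warn and run in the host env."""
    recorded = step.get("env_hash")
    if recorded is None:
        return True, "no env_hash recorded; running in host env"
    if recorded == current_env_hash():
        return True, "host env matches recorded env_hash"
    return False, "env mismatch; fall back to deterministic-mode subprocess"
```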
Diff output. When --diff, byte-compare the replayed
artifact against the live row's content; print the
first 100 lines of the unified diff for text artifacts;
for binary artifacts, print the SHA-256 mismatch.
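One way to produce that output, returning lines instead of printing so it is testable (the function name and UTF-8-decode heuristic for "text artifact" are assumptions):

```python
import difflib
import hashlib
from itertools import islice

def diff_report(live: bytes, replayed: bytes, max_lines: int = 100):
    """Return the lines --diff would print: nothing on a byte match,
    a truncated unified diff for text, a hash-mismatch line for binary."""
    if live == replayed:
        return []
    try:
        a = live.decode("utf-8").splitlines(keepends=True)
        b = replayed.decode("utf-8").splitlines(keepends=True)
    except UnicodeDecodeError:
        return [
            "binary mismatch: live sha256 %s != replayed sha256 %s"
            % (hashlib.sha256(live).hexdigest(),
               hashlib.sha256(replayed).hexdigest())
        ]
    return list(islice(difflib.unified_diff(a, b, "live", "replayed"), max_lines))
```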
Replay manifest written to --out/manifest.json: every
step's (step_id, step_type, input_hash, output_hash_live,
output_hash_replayed, ok, reason). The audit-trail
completeness scanner consumes this.
Web button. Artifact detail pages get a "Rerun" link
under the existing version history that triggers a
background job and emails the operator on completion.
Tests in tests/test_rerun_artifact.py:
- Three-step synthetic chain → replay produces matching
output_hashes for all steps.
- Modify a source mid-chain → strict aborts; default warns.
- Unknown step_type → marked unreplayable with reason.
- --from skips upstream steps and replays from the named
step.
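The --from behavior in the last test reduces to slicing a source-first step list; a sketch under that assumption (the helper name is invented for illustration):

```python
def steps_from(ordered_steps, from_step_id=None):
    """Given steps in source-first replay order, drop everything
    upstream of --from <step_id>; replay starts at that step."""
    if from_step_id is None:
        return ordered_steps
    for i, step in enumerate(ordered_steps):
        if step["step_id"] == from_step_id:
            return ordered_steps[i:]
    raise ValueError(f"unknown step_id: {from_step_id!r}")
```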

Approach

  • Inventory step_type values in prod
    (SELECT step_type, COUNT(*) FROM processing_steps GROUP BY step_type);
    pick the top 5 to ship handlers for; the rest are unreplayable
    in v1.
  • Replay handler registry pattern — one module per layer
    (replay_handlers/atlas.py, replay_handlers/agora.py).
  • Frozen-input + frozen-env logic shares helpers with the wave-1
    q-sand-deterministic-replay work; cite that spec in the work
    log.
  • CLI in cli.py; web button is one template change to
    templates/artifact_detail_base.html.

Dependencies

  • q-sand-deterministic-replay (wave-1) — env_hash + requirements.lock
    capture.
  • scidex/atlas/artifact_registry.py:338 _upsert_processing_step
    is the source of truth.
  • q-obs-trace-id-propagation — trace_id used to find the original
    causal chain when processing_steps is incomplete.

Dependents

  • q-repro-audit-trail-completeness consumes the replay manifest.

Work Log

  • 2026-04-27 13:30 UTC — Committed implementation:
    - scidex/atlas/rerun_artifact.py (353 lines): BFS chain walking, frozen-input check, manifest generation
    - scidex/atlas/replay_handlers.py (155 lines): registry with cite/fork handlers, unreplayable stubs
    - tests/test_rerun_artifact.py (355 lines): 5 tests covering all acceptance criteria
    - cli.py: added rerun-artifact subcommand with --from/--out/--strict/--diff flags
    - api.py: added POST /api/artifacts/{artifact_id}/rerun endpoint with background execution; added Rerun button to artifact detail page (version panel + versions tab)
    - Commit: 0ac5ca425 — [Forge] Add scidex rerun-artifact CLI + API + web button [task:402dd97b-9476-4d6c-9efc-8e17af395221]
    - Note: v1 frozen-env is a stub (returns True); full implementation requires env_hash column on processing_steps
