[Forge] scidex rerun-artifact - re-execute the original processing chain (done)

Walk processing_steps backward, dispatch per-step_type replay handlers, enforce frozen inputs+env, byte-level match manifest.

Completion Notes

Auto-release: work already on origin/main

Git Commits (4)

Squash merge: orchestra/task/402dd97b-scidex-rerun-artifact-id-re-execute-the (3 commits) (#725) - 2026-04-27
[Forge] Fix triggerArtifactRerun fetch call — double-brace POST method [task:402dd97b-9476-4d6c-9efc-8e17af395221] - 2026-04-27
[Forge] Update spec acceptance criteria and work log [task:402dd97b-9476-4d6c-9efc-8e17af395221] - 2026-04-27
[Forge] Add scidex rerun-artifact CLI + API + web button [task:402dd97b-9476-4d6c-9efc-8e17af395221] - 2026-04-27
Spec File

Effort: thorough

Goal

scidex/atlas/artifact_registry.py:240-340 already records processing_steps (source/target artifact, method, parameters,
input_hash, started_at, completed_at) on every transformation, and
the deterministic-replay sandbox from wave-1
(q-sand-deterministic-replay) captures env_hash + requirements.lock per analysis. The two halves don't yet meet:
nothing walks processing_steps backwards from a target artifact,
gathers the recorded inputs/methods/env, and re-executes the chain
to produce a byte-identical replay. Ship scidex rerun-artifact <id>
that closes the loop.

Acceptance Criteria

CLI scidex rerun-artifact <id> [--from <step_id>]
[--out <dir>] [--strict] [--diff] walks the
processing_steps graph rooted at target_artifact_id=<id>
breadth-first to the original sources; for each step, calls
the registered replay-handler for step_type (see below) to
reproduce the step.
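The backward walk could look like the following minimal sketch. The dict-based step records and the `walk_chain` name are assumptions for illustration; the field names (`step_id`, `source_artifact_id`, `target_artifact_id`) follow the columns named in the Goal:

```python
from collections import deque

def walk_chain(steps, target_artifact_id):
    """Walk processing_steps breadth-first from the target artifact back
    to the original sources; return steps in source-first replay order."""
    # Index steps by the artifact they produced.
    by_target = {}
    for step in steps:
        by_target.setdefault(step["target_artifact_id"], []).append(step)

    visited, ordered = set(), []
    queue = deque([target_artifact_id])
    while queue:
        artifact_id = queue.popleft()
        for step in by_target.get(artifact_id, []):
            if step["step_id"] in visited:
                continue
            visited.add(step["step_id"])
            ordered.append(step)
            queue.append(step["source_artifact_id"])
    # BFS discovers target-first; replay needs source-first.
    return list(reversed(ordered))
```

Original sources fall out naturally: an artifact that no step produced terminates the walk.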
Replay handler registry in
scidex/atlas/replay_handlers.py. One handler per
step_type currently in use (read distinct values from prod;
at minimum "wiki_render", "kg_extract", "score",
"debate_round", "figure_render", "analysis_run"). Each
handler has the signature
def handler(step, source_payload) -> ReplayedArtifact.
Unknown step_type → mark the step unreplayable in the
report with a reason.
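A registry of that shape could be sketched as below. The `ReplayedArtifact` fields and the placeholder handler body are assumptions; a real handler would re-run the recorded method with `step["parameters"]`:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class ReplayedArtifact:
    content: bytes
    content_hash: str

REPLAY_HANDLERS = {}

def replay_handler(step_type):
    """Register one handler per step_type."""
    def decorate(fn):
        REPLAY_HANDLERS[step_type] = fn
        return fn
    return decorate

@replay_handler("wiki_render")
def replay_wiki_render(step, source_payload):
    # Identity stand-in; the real handler would re-render with the
    # parameters recorded on the step.
    content = source_payload
    return ReplayedArtifact(content, hashlib.sha256(content).hexdigest())

def replay_step(step, source_payload):
    """Dispatch to the registered handler; an unknown step_type is
    reported as unreplayable with a reason rather than raised."""
    handler = REPLAY_HANDLERS.get(step["step_type"])
    if handler is None:
        return None, f"unreplayable: no handler for {step['step_type']!r}"
    return handler(step, source_payload), None
```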
Frozen-input check. Before each step replays, compute the
source artifact's current content_hash. If it differs from
the input_hash recorded in the step, --strict aborts;
default mode warns and proceeds with the live version.
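The strict/warn split could be factored like this sketch (function and exception names are assumptions; SHA-256 is assumed as the content-hash algorithm):

```python
import hashlib

class FrozenInputError(RuntimeError):
    pass

def check_frozen_input(step, source_content, strict=False, warn=print):
    """Compare the source artifact's current content hash against the
    input_hash recorded when the step originally ran."""
    current = hashlib.sha256(source_content).hexdigest()
    if current == step["input_hash"]:
        return True
    if strict:
        raise FrozenInputError(
            f"step {step['step_id']}: input drifted "
            f"({step['input_hash'][:12]} -> {current[:12]})"
        )
    warn(f"step {step['step_id']}: input drifted; replaying with live version")
    return False
```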
Frozen env. For steps whose source analysis has an
env_hash (from the wave-1 deterministic sandbox), require
the host env to match before running; if not, switch to a
forge/runtime.py deterministic-mode subprocess that pins
requirements.lock. Fall through to a warning if neither
is available. *(v1: stub returns True; full impl requires
env_hash column on processing_steps — wired to
forge/runtime.py for v2.)*
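Since v1 ships a stub, the decision logic alone can be sketched; hashing `sys.version` is a stand-in assumption for the real env fingerprint (requirements.lock plus interpreter, per the wave-1 sandbox):

```python
import hashlib
import sys

def current_env_hash():
    # Stand-in: a real implementation would hash requirements.lock
    # content plus the interpreter version.
    return hashlib.sha256(sys.version.encode()).hexdigest()

def check_frozen_env(step):
    """Return (ok, note) for the env gate: match the host env when an
    env_hash was recorded; otherwise warn and run in the host env."""
    recorded = step.get("env_hash")
    if recorded is None:
        return True, "no env_hash recorded; running in host env"
    if recorded == current_env_hash():
        return True, "host env matches recorded env_hash"
    return False, "env mismatch; fall back to deterministic-mode subprocess"
```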
Diff output. When --diff, byte-compare the replayed
artifact against the live row's content; print the
first 100 lines of the unified diff for text artifacts;
for binary artifacts, print the SHA-256 mismatch.
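One way to produce that output, returning lines instead of printing so it is testable (the function name and UTF-8-decode heuristic for "text artifact" are assumptions):

```python
import difflib
import hashlib
from itertools import islice

def diff_report(live: bytes, replayed: bytes, max_lines: int = 100):
    """Return the lines --diff would print: nothing on a byte match,
    a truncated unified diff for text, a hash-mismatch line for binary."""
    if live == replayed:
        return []
    try:
        a = live.decode("utf-8").splitlines(keepends=True)
        b = replayed.decode("utf-8").splitlines(keepends=True)
    except UnicodeDecodeError:
        return [
            "binary mismatch: live sha256 %s != replayed sha256 %s"
            % (hashlib.sha256(live).hexdigest(),
               hashlib.sha256(replayed).hexdigest())
        ]
    return list(islice(difflib.unified_diff(a, b, "live", "replayed"), max_lines))
```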
Replay manifest written to --out/manifest.json: every
step's (step_id, step_type, input_hash, output_hash_live,
output_hash_replayed, ok, reason). The audit-trail
completeness scanner consumes this.
Web button. Artifact detail pages get a "Rerun" link
under the existing version history that triggers a
background job and emails the operator on completion.
Tests in tests/test_rerun_artifact.py:
- Three-step synthetic chain → replay produces matching
output_hashes for all steps.
- Modify a source mid-chain → strict aborts; default warns.
- Unknown step_type → marked unreplayable with reason.
- --from skips upstream steps and replays from the named
step.
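The --from behavior in the last test reduces to slicing a source-first step list; a sketch under that assumption (the helper name is invented for illustration):

```python
def steps_from(ordered_steps, from_step_id=None):
    """Given steps in source-first replay order, drop everything
    upstream of --from <step_id>; replay starts at that step."""
    if from_step_id is None:
        return ordered_steps
    for i, step in enumerate(ordered_steps):
        if step["step_id"] == from_step_id:
            return ordered_steps[i:]
    raise ValueError(f"unknown step_id: {from_step_id!r}")
```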

Approach

  • Inventory step_type values in prod
    (SELECT step_type, COUNT(*) FROM processing_steps GROUP BY step_type);
    pick the top 5 to ship handlers for; the rest are unreplayable
    in v1.
  • Replay handler registry pattern — one module per layer
    (replay_handlers/atlas.py, replay_handlers/agora.py).
  • Frozen-input + frozen-env logic shares helpers with the wave-1
    q-sand-deterministic-replay work; cite that spec in the work
    log.
  • CLI in cli.py; web button is one template change to
    templates/artifact_detail_base.html.

Dependencies

  • q-sand-deterministic-replay (wave-1) — env_hash + requirements.lock
    capture.
  • scidex/atlas/artifact_registry.py:338 _upsert_processing_step
    is the source of truth.
  • q-obs-trace-id-propagation — trace_id used to find the original
    causal chain when processing_steps is incomplete.

Dependents

  • q-repro-audit-trail-completeness consumes the replay manifest.

Work Log

  • 2026-04-27 13:30 UTC — Committed implementation:
    - scidex/atlas/rerun_artifact.py (353 lines): BFS chain walking, frozen-input check, manifest generation
    - scidex/atlas/replay_handlers.py (155 lines): registry with cite/fork handlers, unreplayable stubs
    - tests/test_rerun_artifact.py (355 lines): 5 tests covering all acceptance criteria
    - cli.py: added rerun-artifact subcommand with --from/--out/--strict/--diff flags
    - api.py: added POST /api/artifacts/{artifact_id}/rerun endpoint with background execution; added Rerun button to artifact detail page (version panel + versions tab)
    - Commit: 0ac5ca425 — [Forge] Add scidex rerun-artifact CLI + API + web button [task:402dd97b-9476-4d6c-9efc-8e17af395221]
    - Note: v1 frozen-env is a stub (returns True); full implementation requires env_hash column on processing_steps
