[Atlas] Processing step lineage — track transforms in provenance chains open

processing_steps table capturing agent, method, parameters, timing, hashes for reproducibility

Last Error

cli-reopen-manual: reopened — task was marked 'done' but has no task_runs row in (done/completed/success)

Git Commits (7)

[Atlas/Senate/Agora] Spec: notebook + artifact versioning extensions 2026-04-24
Squash merge: orchestra/task/sen-sg-0-schema-registry-track-schemas-per-artifa (1 commits) 2026-04-18
Squash merge: orchestra/task/47b17cbf-sen-sg-01-sreg-schema-registry-track-art (1 commits) 2026-04-16
[Senate] Add schema registry API: GET /api/schemas and /api/schemas/{type} in api.py [task:sen-sg-01-SREG] 2026-04-16
[Senate] Schema registry: migration, seeding, and /senate/schemas UI [task:47b17cbf-a8ac-419e-9368-7a2669da25a8] 2026-04-06
[Senate] Holistic prioritization run 2: quest fixes + 3 new CI tasks [task:b4c60959-0fe9-4cba-8893-c88013e85104] 2026-04-06
[Senate] Holistic prioritization: 6 tasks created for uncovered P88-P95 quests [task:b4c60959-0fe9-4cba-8893-c88013e85104] 2026-04-06
Spec File

Goal

Extend the provenance system to capture not just parent-child artifact relationships but
the processing steps between them. When an experiment is extracted from a paper, the
provenance should record: "Paper 12345 was processed by extraction-agent using
llm_structured_extraction method with schema v2, producing experiment artifact X."

This creates a full audit trail of how every artifact was constructed.
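
As a rough illustration of the example above, a single processing step could be modeled as a small record that renders back into that audit-trail sentence. This is only a sketch: the ProcessingStep class, the describe() helper, the artifact ID formats ("paper-12345", "experiment-X"), and the "schema_version" parameter key are assumptions made for illustration, not part of the existing provenance code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessingStep:
    """One transform between a source artifact and the artifact it produced.

    Hypothetical record shape; field names follow the spec's acceptance criteria.
    """
    source_artifact_id: str
    target_artifact_id: str
    agent_id: str
    method: str
    parameters: dict = field(default_factory=dict)

    def describe(self) -> str:
        """Render the step as a one-line audit-trail sentence."""
        params = ", ".join(f"{k}={v}" for k, v in sorted(self.parameters.items()))
        suffix = f" with {params}" if params else ""
        return (
            f"{self.source_artifact_id} was processed by {self.agent_id} "
            f"using {self.method}{suffix}, producing {self.target_artifact_id}."
        )


# Illustrative values only; real artifact IDs may look different.
step = ProcessingStep(
    source_artifact_id="paper-12345",
    target_artifact_id="experiment-X",
    agent_id="extraction-agent",
    method="llm_structured_extraction",
    parameters={"schema_version": "v2"},
)
print(step.describe())
```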

Current State

  • artifact_links captures derives_from, cites, extends relationships
  • provenance_chain JSON in artifacts captures parent artifacts
  • Neither captures the transform applied (what method, what agent, what parameters)

Acceptance Criteria

processing_steps table or extended artifact_links metadata:
  - source_artifact_id: input artifact
  - target_artifact_id: output artifact
  - step_type: extraction, analysis, aggregation, transformation, validation, debate
  - agent_id: which agent performed the step
  - method: what method/tool was used
  - parameters: JSON of method parameters
  - started_at, completed_at: timing
  - input_hash, output_hash: for reproducibility verification
record_processing_step() function called during artifact creation
☐ Processing steps shown in provenance graph visualization
☐ Reproducibility check: verify that the same input, method, and parameters yield the same output hash
☐ API: GET /api/artifact/{id}/processing-history — full transform chain
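
One way the criteria above might fit together is sketched below, assuming the dedicated-table option and a SQLite backend (the spec does not say which database is used). The column types, the SHA-256 content hash, the canonical-JSON encoding of parameters, and the helper names content_hash(), is_reproducible(), and processing_history() are all assumptions for illustration; only record_processing_step() and the column names come from the spec. processing_history() stands in for the query behind GET /api/artifact/{id}/processing-history and is not wired to any web framework here.

```python
import hashlib
import json
import sqlite3
from typing import Optional

# Assumed DDL: column names come from the spec, types and constraints are guesses.
SCHEMA = """
CREATE TABLE IF NOT EXISTS processing_steps (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_artifact_id TEXT NOT NULL,
    target_artifact_id TEXT NOT NULL,
    step_type TEXT NOT NULL CHECK (step_type IN (
        'extraction', 'analysis', 'aggregation',
        'transformation', 'validation', 'debate')),
    agent_id TEXT NOT NULL,
    method TEXT NOT NULL,
    parameters TEXT NOT NULL,
    started_at TEXT,
    completed_at TEXT,
    input_hash TEXT,
    output_hash TEXT
)
"""


def content_hash(payload: bytes) -> str:
    """Hash artifact content (SHA-256 is an assumption, not mandated by the spec)."""
    return hashlib.sha256(payload).hexdigest()


def _canonical(parameters: dict) -> str:
    """Serialize parameters deterministically so equal dicts compare equal as text."""
    return json.dumps(parameters, sort_keys=True, separators=(",", ":"))


def record_processing_step(conn, *, source_artifact_id, target_artifact_id,
                           step_type, agent_id, method, parameters,
                           started_at=None, completed_at=None,
                           input_hash=None, output_hash=None) -> int:
    """Insert one step row and return its row id."""
    cur = conn.execute(
        "INSERT INTO processing_steps (source_artifact_id, target_artifact_id,"
        " step_type, agent_id, method, parameters, started_at, completed_at,"
        " input_hash, output_hash) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (source_artifact_id, target_artifact_id, step_type, agent_id, method,
         _canonical(parameters), started_at, completed_at, input_hash, output_hash),
    )
    conn.commit()
    return cur.lastrowid


def is_reproducible(conn, input_hash: str, method: str,
                    parameters: dict) -> Optional[bool]:
    """True if all matching runs agree on output_hash, False if they differ,
    None if no matching run has been recorded."""
    rows = conn.execute(
        "SELECT DISTINCT output_hash FROM processing_steps"
        " WHERE input_hash = ? AND method = ? AND parameters = ?",
        (input_hash, method, _canonical(parameters)),
    ).fetchall()
    return None if not rows else len(rows) == 1


def processing_history(conn, artifact_id: str) -> list:
    """Walk back from an artifact through its sources, nearest step first.

    Each entry is a tuple (source_artifact_id, target_artifact_id,
    step_type, agent_id, method).
    """
    history, seen, frontier = [], set(), [artifact_id]
    while frontier:
        current = frontier.pop()
        if current in seen:  # guard against cycles in the lineage data
            continue
        seen.add(current)
        for row in conn.execute(
            "SELECT source_artifact_id, target_artifact_id, step_type,"
            " agent_id, method FROM processing_steps"
            " WHERE target_artifact_id = ? ORDER BY id",
            (current,),
        ).fetchall():
            history.append(row)
            frontier.append(row[0])
    return history
```

Storing parameters as canonical JSON keeps the reproducibility query a plain equality match. If the extended artifact_links option is chosen instead, the same fields could presumably live in that table's metadata, at the cost of harder querying.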

Dependencies

  • None (parallel with schema governance, integrates with provenance system)

Dependents

  • a17-24-REPR0001 — Reproducible analysis chains use processing steps
  • d16-24-PROV0001 — Provenance demo showcases processing lineage

Work Log
