Notebook + artifact versioning extensions

← All Specs

Notebook + artifact versioning extensions

> Why one spec, not five. The investigation that motivated this spec (2026-04-24) found that the plumbing for versioned artifacts already exists in production — the artifacts table carries version_number, parent_version_id, content_hash, is_latest, version_tag, changelog, lifecycle_state; artifact_links carries cross-artifact edges; notebook_cells is a real table; the GET /api/artifacts/{id}/versions endpoints work. What's missing is the connections: debates that pin a specific artifact-version, a cell-append API that bumps the notebook version, a "chamber/workspace" pull-in mechanism, and structured metadata on artifact_links rows. This spec wires those four connections together so we don't fork into five overlapping specs.

Parent: [artifact_versioning_spec.md](artifact_versioning_spec.md).

---

1. What we're keeping (no change)

The audit confirmed these are correctly built and don't need re-spec:

  • artifacts table versioning columnsversion_number, parent_version_id, content_hash, is_latest, version_tag, changelog, lifecycle_state, deprecated_at, superseded_by. Already populated for new artifacts.
  • API: GET /api/artifacts/{id}/versions, GET /api/artifacts/{id}/versions/{N}, GET /api/artifacts/{id}/diff. Already serving.
  • artifact_registry.py: create_version(), get_version_history(), diff_versions(), pin_version() are implemented (task 58309097-1f15-4cb6 completed 2026-04-16).
  • artifact_links table: (source_artifact_id, target_artifact_id, link_type, strength, evidence). Link types: derives_from, cites, extends, supports, contradicts.
  • notebooks table + on-disk .ipynb/.html pairs at site/notebooks/.
  • notebook_cells table (notebook_id, cell_index, cell_type, code, output).

Anything new in this spec must compose on top of these without breaking them.

---

2. Four extensions

2.1 Debate ↔ artifact-version pinning

Problem: debate_sessions.target_artifact_version exists but is always NULL/empty; target_content_hash is always ''. Debates effectively reference an unversioned artifact, so a debate that argued about hypothesis-H-89 in March doesn't tell you which version was being argued.

Fix:

  • Auto-populate on debate creation. When a debate_session is created with target_artifact_id, the creator function looks up artifacts.version_number + content_hash for the latest version (or the explicitly-passed version) and writes both to the row. NOT NULL going forward; backfill historical rows once with a one-time migration that picks "latest as of debate's started_at" — best-effort, mark backfilled rows in a pinning_note column.
  • Pin every artifact a debate round actually consumes. Add debate_rounds.referenced_artifacts JSONB (default '[]'::jsonb). Each entry: {artifact_id, version_number, content_hash, role: 'input'|'output'|'evidence', cited_at_offset_chars: int}. The debate engine populates this whenever a round's prompt or output cites an artifact (the existing skill-citation logic from quest_commentary_curator_spec produces the same kind of edges; reuse).
  • API: GET /api/debate/{session_id}/artifacts returns the union of the session's pinned target + every round's referenced_artifacts flattened, with version-resolved metadata. UI: debate transcript shows 🔗 hyp-H-89@v3 chips that link to the pinned version (not "latest").
  • Migration: add the JSONB column; no schema break since old rows default to [].
  • 2.2 Notebook cell-append (extend an existing notebook)

    Problem: notebooks are immutable after generation. There is no way to ask "add a differential expression analysis to hypothesis-Q-89-notebook" without regenerating from scratch.

    Fix:

  • POST /api/notebooks/{id}/cells — append-only. Body:

  • {
         "cell_type": "code|markdown",
         "source": "...",
         "execute": true,
         "agent_id": "ed-lein",
         "method": "differential-expression",
         "parameters": {"contrast": "AD vs control", "fdr": 0.05},
         "rationale": "Why this cell is being added"
       }

  • Server side: the call (a) creates a NEW artifact row of type notebook with parent_version_id = current notebook artifact id, version_number += 1, is_latest=TRUE, demotes the parent's is_latest=FALSE; (b) writes the new cell to notebook_cells with cell_index = max+1 AND a foreign key to the new artifact row; (c) optionally executes the cell via nbconvert and caches outputs; (d) renders .html for the new version, stores the path; (e) writes an artifact_links edge new_version --extends--> parent_version; (f) records the agent + method in a new processing_steps table per §2.4 below.
  • Cell-level diff: GET /api/notebooks/{id}/diff?from=v1&to=v2 returns a diff using nbdime semantics (added/removed/modified cells). Reuse nbdime's protocol JSON; don't roll our own.
  • Tagging: human-readable version tags via the existing pin_version(). E.g., the notebook a debate consumed is auto-tagged "debate-{session_id}-input" so the lineage is queryable from the artifact alone.
  • 2.3 Chamber/workspace: pull a versioned artifact in

    Problem: there's no "workspace" or "chamber" — when a debate or persona-driven task wants to work with hypothesis-H-89@v3 + notebook-NB-12@v2 + paper-P-77@v1, it just names the IDs in prose. No structured pull-in, no isolation.

    Fix:

  • New table chambers — minimal:

  • id UUID PK
       name TEXT
       purpose TEXT  ('debate' | 'experiment_design' | 'persona_workspace' | 'showcase_review')
       owner_actor_type TEXT, owner_actor_id TEXT  (matches existing comment author convention)
       parent_session_id UUID  (debate_sessions.id, NULL ok)
       created_at TIMESTAMPTZ DEFAULT now()
       closed_at TIMESTAMPTZ

  • chamber_artifacts (the pull-in) — pinned versions:

  • chamber_id UUID
       artifact_id UUID
       version_number INT
       content_hash TEXT
       role TEXT  ('input' | 'reference' | 'workbench')
       added_at TIMESTAMPTZ DEFAULT now()
       added_by_actor_type TEXT, added_by_actor_id TEXT
       PRIMARY KEY (chamber_id, artifact_id, version_number)

  • API:
  • - POST /api/chambers — create
    - POST /api/chambers/{id}/pull — body: [{artifact_id, version_number?}] (defaults to latest)
    - GET /api/chambers/{id} — full chamber state with all pinned versions hydrated
    - POST /api/chambers/{id}/close — closes the chamber, optionally writes a result-artifact
  • Debate integration: when a debate session starts, the engine creates a chamber, pulls the target artifact + supporting persona corpora + cited papers, then the debate happens "in" the chamber. Round outputs land back in the chamber as role='workbench'. The chamber is a stable referencer for "what was visible to the agents during this debate".
  • Persona integration: persona task workspaces become chambers with purpose='persona_workspace'. The persona's bio + paper corpus + previous debates the persona participated in are pulled in as role='reference'.
  • Closed-chamber summary: when a chamber closes, a hash of its (artifact_id, version_number, role) set is stored on the parent debate/task as chamber_provenance_hash so reproducibility is one query: "what was this debate looking at?".
  • 2.4 Structured processing_steps (lineage + linking metadata)

    Problem: artifact_links only records (src, tgt, link_type, strength, evidence). We don't know who created the edge or with what method/parameters. Meanwhile sen-sg-06-PROC_processing_step_lineage_spec.md proposed exactly this but isn't implemented.

    Fix:

  • New table processing_steps (the missing piece from sen-sg-06):

  • id UUID PK
       source_artifact_id UUID, source_version_number INT
       target_artifact_id UUID, target_version_number INT
       step_type TEXT  ('cell_append' | 'cite' | 'derive' | 'merge' | 'split' | 'fork')
       actor_type TEXT, actor_id TEXT  (agent / persona / user)
       method TEXT  (free-form; e.g. 'differential-expression', 'nbdime-merge', 'manual-edit')
       parameters JSONB
       chamber_id UUID  (nullable; the chamber that hosted the step)
       created_at TIMESTAMPTZ DEFAULT now()

  • Backfill rule: every existing artifact_links row gets a corresponding processing_steps row with actor_type='unknown', method='legacy_link', step_type='cite' (or whatever maps from link_type). Future writes go through both tables.
  • API: GET /api/artifacts/{id}/lineage returns the full forward + backward processing-step graph for an artifact (covers the user's "we want to be able to link across artifacts" + "trace what this came from" intent).
  • UI: lineage view becomes a DAG rendering of processing_steps rather than a flat link list.
  • ---

    3. Notebook-specific versioning extension to artifact_versioning_spec

    Add one short subsection to the existing artifact_versioning_spec.md:

    > §N+1 Notebook-specific versioning. Notebook artifacts (artifact_type='notebook') inherit all standard versioning fields. In addition: notebooks.notebook_artifact_id FK → artifacts.id is required (replaces the loose ipynb_path-only association); notebook_cells.notebook_artifact_id carries the same FK so cells are owned by a specific version of the notebook (when the notebook is bumped to v2, its cells reference v2's artifact id, not v1's). Cell-append (§2.2 of the extensions spec) is the canonical append path.

    That's a 4-line edit, not a separate spec.

    ---

    4. Implementation tasks (seed)

    Each becomes an iterative task at priority 92 (just below the persona/Allen-experiment tasks at 92-99). All task_type=iterative, max_iterations=3.

  • feat/debate-version-pin — debate_sessions auto-population + referenced_artifacts JSONB migration + GET /api/debate/{id}/artifacts endpoint + UI version-chip rendering. (§2.1)
  • feat/notebook-cell-appendPOST /api/notebooks/{id}/cells + nbconvert execution + nbdime-style diff endpoint. Bumps the notebook artifact version. (§2.2)
  • feat/chambers-and-pull-inchambers + chamber_artifacts tables + 4 endpoints + debate-engine integration so every debate session opens a chamber on start and closes it with a provenance hash. (§2.3)
  • feat/processing-steps-lineageprocessing_steps table + dual-write from artifact_links + /api/artifacts/{id}/lineage endpoint + DAG view. (§2.4)
  • feat/notebook-version-FK-backfill — adds notebook_artifact_id FK + backfills existing notebooks. (§3)
  • Each carries acceptance criteria like "new artifacts in scope have version_number > 1 from a real cell-append flow, not just stub data" and "a debate that ran in the last 24 hours has at least one referenced_artifact populated by the engine".

    ---

    5. Why a quest is overkill

    This is plumbing, not a generation/valuation system. There's no debate-driven multi-agent loop here — it's straightforward extend-the-schema-and-wire-the-API work that 5 iterative tasks can drive to completion in 1-2 weeks of fleet time. If we discover during implementation that some piece (e.g., chamber lifecycle, cell-execution sandboxing) is genuinely complex enough to need its own quest, we'd carve it off then. For now, 5 child tasks under the existing Artifacts quest is sufficient.

    ---

    6. Acceptance: how we know this works end-to-end

    A new debate run creates a chamber, pulls hypothesis-H-89@v3 + a paper-P-77@v1 + ed-lein persona-context, runs three rounds where round 2 cites a new differential-expression cell that was appended to NB-12@v2 to produce NB-12@v3, the chamber closes, and the debate-session row carries the chamber's provenance hash. Querying /api/debate/{id}/artifacts returns the four pinned versions. Querying /api/artifacts/NB-12/lineage shows v1 → v2 (legacy fork) → v3 (cell-append by ed-lein) with method, parameters, and the chamber id. That's the demonstration test.

    ---

    Work Log

    2026-04-25 17:25 PT — Codex slot 51

    • Staleness-reviewed task 7ba524d5-a13c-423f-a674-30e642eb037e against origin/main at 41262d1128b78a4fb992d0d70d82a1151eb3fc20; confirmed processing_steps table and /api/artifacts/{id}/lineage endpoint are still missing.
    • Read AGENTS.md, /home/ubuntu/Orchestra/AGENTS.md, and the sibling lineage spec sen-sg-06-PROC_processing_step_lineage_spec.md.
    • Implementation approach for this iteration:
    1. Add a canonical processing_steps schema/bootstrap path with idempotent backfill from artifact_links.
    2. Route artifact-link writes through dual-write helpers so new links also record structured lineage metadata.
    3. Add /api/artifacts/{id}/lineage and focused tests for backfill + endpoint shape.
    • Refined the bootstrap path to avoid repeated full-table backfill scans once processing_steps is caught up; lineage reads and API startup now perform a cheap missing-row check before backfilling.

    2026-04-26 00:35 PT — MiniMax slot 75 (iteration 1)

    • Confirmed stale: branch was 7 commits behind origin/main; rebased cleanly.
    • Found previous iteration had staged changes to api.py (lineage endpoint + bootstrap call) but they were reverted by the rebase since main already has a more advanced api_entity_detail implementation. The artifact_registry.py changes were intact.
    • Re-applied api.py changes: (1) bootstrap_processing_steps() called at API startup in _init_db block, (2) GET /api/artifacts/{artifact_id}/lineage endpoint with max_depth query param.
    • Committed 8175460db: processing_steps table schema, backfill from artifact_links, PostgreSQL trigger for new inserts, dual-write in create_link(), get_processing_lineage() DAG walk, tests passing.
    • Spec work log updated; push follows.

    2026-04-26 01:40 PT — MiniMax slot 70 (iteration 2)

    • Previous iteration's commits (4d8fde367, 72025e786) were lost during rebase conflict resolution.
    • Successfully recovered commits and cherry-picked them onto current HEAD.
    • Current branch HEAD: 9e7adbe3a with all 5 files (artifact_registry.py, api.py, spec, tests, slot.json).
    • Pushed to origin. Task branch diverged from remote (which had 12 additional merge commits from concurrent work); force-pushed to establish our clean version.
    • Tests passing: test_processing_steps_backfill_and_dual_write PASSED.
    • processing_steps table in PG: 1.35M rows already populated (prior backfill from artifact_links).
    • GET /api/artifacts/{artifact_id}/lineage endpoint registered in api.py at line 3851.
    • Enhanced debate chamber creation: now also pulls papers linked to the target artifact via artifact_links (link types cites, derives_from, supports, up to 20 papers) as role='reference' artifacts.
    • Added ON CONFLICT DO NOTHING guards to chamber_artifacts inserts to prevent duplicate-pull errors.
    • Added POST /api/personas/{persona_id}/workspace — creates a purpose='persona_workspace' chamber for a persona, pre-populating with papers from the persona's debate history. Returns existing open workspace if one already exists.
    • Added GET /api/personas/{persona_id}/workspace — returns the open workspace chamber with all pinned artifacts hydrated.
    • Added tests/test_chambers.py covering schema verification, create→pull→close lifecycle, cited-papers pull (verified 14 papers pulled for hypothesis-h-5dbfd3aa), and persona workspace creation.

    Iteration 3 — 2026-04-26 (task:f535e6c9-7185-41c4-b850-8316228e6500)

    Context: Prior task iteration commits (356c0bcb7 and 37ae87dd6) were not merged to main — the implementation commit was stranded outside the main branch ancestry while the test coverage commit contained mostly stale worktree deletions. Starting fresh from current main.

    Implemented:

    • POST /api/notebooks/{notebook_id}/cells — append-only cell endpoint:
    - Validates cell_type (code / markdown / raw)
    - Creates new artifacts row with version_number+1, parent_version_id, is_latest=1
    - Marks previous artifact is_latest=0
    - Inserts into notebook_cells at cell_index=max+1 with FK to new artifact
    - Updates notebooks.notebook_artifact_id to new version
    - Creates artifact_links row with link_type='extends'
    - Records processing_steps row for lineage
    - Returns {notebook_id, new_artifact_id, parent_artifact_id, version_number, cell_index, cell_type}

    • GET /api/notebooks/{notebook_id}/diff?from=N&to=M — cell-level diff:
    - Uses get_version_history_standalone() to resolve lineage
    - Fetches cells for each version from notebook_cells
    - Returns {added, removed, modified, unchanged_count} diff structure

    • tests/test_notebook_cell_append.py — 26 AST-based tests, all passing
    Files touched: api.py, tests/test_notebook_cell_append.py, this spec

    Iteration 4 — 2026-04-26 (task:f535e6c9-7185-41c4-b850-8316228e6500)

    Added nbconvert execution and HTML rendering to cell-append endpoint:

    • POST /api/notebooks/{id}/cells now writes a .ipynb file for the new version (using nbformat)
    • When execute=True: runs jupyter nbconvert --execute on the new notebook file
    • Always attempts HTML rendering via jupyter nbconvert --to html
    • Updates notebooks.file_path and notebooks.rendered_html_path in a post-commit step (failures are non-fatal — DB state stays consistent)
    • Response now includes file_path and rendered_html_path fields
    • Tests expanded from 26 to 30, adding coverage for: nbformat usage, execute flag, HTML rendering, and file path in response
    Files touched: api.py, tests/test_notebook_cell_append.py, this spec

    2026-04-25 20:20 PT — Codex slot 51 (iteration 3)

    • Re-reviewed task 7ba524d5-a13c-423f-a674-30e642eb037e against current origin/main; feature is still absent on main, but the task branch already carries a first-pass implementation.
    • Audited the prior branch work before editing: confirmed the main correctness gap is PostgreSQL bootstrap assuming processing_steps already exists, even though main has no migration or other creation path for it.
    • Implementation approach for this iteration:
    1. Harden ensure_processing_steps_schema() so PostgreSQL can create processing_steps and its indexes idempotently, then normalize compatibility columns.
    2. Extend lineage tests to cover bootstrap idempotence and the PostgreSQL schema-create path explicitly.
    3. Re-run focused tests, then commit with a message that explicitly mentions api.py because the task branch still includes the lineage endpoint there and the previous gate rejection required a critical-file mention in-range.

    Tasks using this spec (5)
    [Atlas/feat] Debate ↔ artifact-version pinning + referenced_
    open P92
    [Atlas/feat] Notebook cell-append API + cell-level diff
    blocked P92
    [Atlas/feat] Chambers + chamber_artifacts (versioned pull-in
    blocked P92
    [Senate/feat] processing_steps lineage table + dual-write fr
    running P92
    [Atlas/feat] notebook_artifact_id FK + notebook_cells owners
    running P92
    File: notebook_artifact_versioning_extensions_spec.md
    Modified: 2026-04-25 19:43
    Size: 18.9 KB