SciDEX — Task: [Atlas/feat] notebook_artifact_id FK + notebook

Per §3 of the notebook+versioning extensions spec: add notebooks.notebook_artifact_id FK -> artifacts.id (NOT NULL after backfill), and notebook_cells.notebook_artifact_id FK so cells are owned by a specific notebook *version* rather than the loose ipynb_path-only association we have today. Backfill existing notebooks: each gets a type='notebook' artifact row at version_number=1 if it doesn't already, and existing cells point to it. This is the prerequisite for the cell-append flow in §2.2 to work without orphaning cells.
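The backfill described above can be sketched roughly as follows. This is a minimal illustration, not the migration that shipped: it assumes a trimmed-down schema on SQLite as a stand-in for the production database, and the `artifact_type` column name, the `SCHEMA` constant, and the `backfill_notebook_artifacts` helper are all hypothetical. Tightening the column to NOT NULL after the backfill is left out.

```python
import sqlite3
import uuid

# Hypothetical, trimmed-down schema: only the columns the backfill touches.
SCHEMA = """
CREATE TABLE artifacts (
    id TEXT PRIMARY KEY,
    artifact_type TEXT NOT NULL,      -- assumed column name for type='notebook'
    version_number INTEGER NOT NULL,
    is_latest INTEGER NOT NULL
);
CREATE TABLE notebooks (
    id TEXT PRIMARY KEY,
    notebook_artifact_id TEXT REFERENCES artifacts(id)
);
CREATE TABLE notebook_cells (
    notebook_id TEXT NOT NULL REFERENCES notebooks(id),
    cell_index INTEGER NOT NULL,
    notebook_artifact_id TEXT REFERENCES artifacts(id)
);
"""


def backfill_notebook_artifacts(conn: sqlite3.Connection) -> int:
    """Give each unlinked notebook a version-1 artifact and point its cells at it.

    Returns the number of artifact rows created; a second run creates none.
    """
    with conn:  # one transaction: commit on success, roll back on error
        missing = conn.execute(
            "SELECT id FROM notebooks WHERE notebook_artifact_id IS NULL"
        ).fetchall()
        for (notebook_id,) in missing:
            artifact_id = str(uuid.uuid4())
            # Insert the artifact first so the notebook FK can be satisfied.
            conn.execute(
                "INSERT INTO artifacts (id, artifact_type, version_number, is_latest) "
                "VALUES (?, 'notebook', 1, 1)",
                (artifact_id,),
            )
            conn.execute(
                "UPDATE notebooks SET notebook_artifact_id = ? WHERE id = ?",
                (artifact_id, notebook_id),
            )
        # Cells without an owner inherit their notebook's artifact.
        conn.execute(
            "UPDATE notebook_cells SET notebook_artifact_id = ("
            "  SELECT n.notebook_artifact_id FROM notebooks n"
            "  WHERE n.id = notebook_cells.notebook_id"
            ") WHERE notebook_artifact_id IS NULL"
        )
    return len(missing)
```

Because the function only touches rows whose `notebook_artifact_id` is still NULL, running it twice is harmless, which is convenient for a migration that may be retried.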

Git Commits (11)

[Atlas] Iteration 7 verification: task fully on main, add work log entry [task:9b6823f6-a10b-445b-bed5-14ddcfd1d212] (#630) 2026-04-27

[Atlas] Add forward-path creation test + register integration marker [task:9b6823f6-a10b-445b-bed5-14ddcfd1d212] (#584) 2026-04-27

Squash merge: orchestra/task/80ffb77b-quest-engine-generate-tasks-from-quests (144 commits) (#479) 2026-04-26

[Atlas] Add FK constraint enforcement tests for notebook_artifact_id [task:9b6823f6-a10b-445b-bed5-14ddcfd1d212] (#464) 2026-04-26

[Atlas] Iteration 4 verification: all notebook_artifact_id FK deliverables confirmed complete [task:9b6823f6-a10b-445b-bed5-14ddcfd1d212] (#454) 2026-04-26

[Atlas] Add §3 spec update + FK integration tests for notebook_artifact_id [task:9b6823f6-a10b-445b-bed5-14ddcfd1d212] 2026-04-25

Squash merge: orchestra/task/9b6823f6-notebook-artifact-id-fk-notebook-cells-o (1 commits) 2026-04-25

[Atlas] Fix _register_stub: insert artifact before notebook FK, add notebook_artifact_id [task:9b6823f6-a10b-445b-bed5-14ddcfd1d212] 2026-04-25

Squash merge: orchestra/task/9b6823f6-notebook-artifact-id-fk-notebook-cells-o (4 commits) 2026-04-24

Spec File

Notebook + artifact versioning extensions

> Why one spec, not five. The investigation that motivated this spec (2026-04-24) found that the plumbing for versioned artifacts already exists in production — the artifacts table carries version_number, parent_version_id, content_hash, is_latest, version_tag, changelog, lifecycle_state; artifact_links carries cross-artifact edges; notebook_cells is a real table; the GET /api/artifacts/{id}/versions endpoints work. What's missing is the connections: debates that pin a specific artifact-version, a cell-append API that bumps the notebook version, a "chamber/workspace" pull-in mechanism, and structured metadata on artifact_links rows. This spec wires those four connections together so we don't fork into five overlapping specs.

Parent: [artifact_versioning_spec.md](artifact_versioning_spec.md).

---

1. What we're keeping (no change)

The audit confirmed these are correctly built and don't need re-spec:

- artifacts table versioning columns: version_number, parent_version_id, content_hash, is_latest, version_tag, changelog, lifecycle_state, deprecated_at, superseded_by. Already populated for new artifacts.
- API: GET /api/artifacts/{id}/versions, GET /api/artifacts/{id}/versions/{N}, GET /api/artifacts/{id}/diff. Already serving.
- artifact_registry.py: create_version(), get_version_history(), diff_versions(), pin_version() are implemented (task 58309097-1f15-4cb6 completed 2026-04-16).
- artifact_links table: (source_artifact_id, target_artifact_id, link_type, strength, evidence). Link types: derives_from, cites, extends, supports, contradicts.
- notebooks table + on-disk .ipynb/.html pairs at site/notebooks/.
- notebook_cells table (notebook_id, cell_index, cell_type, code, output).

Anything new in this spec must compose on top of these without breaking them.

---

2. Four extensions

2.1 Debate ↔ artifact-version pinning

Problem: debate_sessions.target_artifact_version exists but is always NULL/empty; target_content_hash is always ''. Debates effectively reference an unversioned artifact, so a debate that argued about hypothesis-H-89 in March doesn't tell you which version was being argued.

Fix:

Auto-populate on debate creation. When a debate_session is created with target_artifact_id, the creator function looks up artifacts.version_number + content_hash for the latest version (or the explicitly-passed version) and writes both to the row. NOT NULL going forward; backfill historical rows once with a one-time migration that picks "latest as of debate's started_at" — best-effort, mark backfilled rows in a pinning_note column.

Pin every artifact a debate round actually consumes. Add debate_rounds.referenced_artifacts JSONB (default '[]'::jsonb). Each entry: {artifact_id, version_number, content_hash, role: 'input'|'output'|'evidence', cited_at_offset_chars: int}. The debate engine populates this whenever a round's prompt or output cites an artifact (the existing skill-citation logic from quest_commentary_curator_spec produces the same kind of edges; reuse).

API: GET /api/debate/{session_id}/artifacts returns the union of the session's pinned target + every round's referenced_artifacts flattened, with version-resolved metadata. UI: debate transcript shows 🔗 hyp-H-89@v3 chips that link to the pinned version (not "latest").

Migration: add the JSONB column; no schema break since old rows default to [].
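The pinning and flattening rules above can be sketched as two small pure functions. This is an illustration under stated assumptions rather than the engine's actual code: `pin_target` and `session_artifacts` are hypothetical names, version rows are modelled as plain dicts, and deduplicating by (artifact_id, version_number) is one plausible reading of "union".

```python
from typing import Optional


def pin_target(versions: list, version_number: Optional[int] = None) -> dict:
    """Pick the explicitly requested version, or the row flagged is_latest."""
    for row in versions:
        if version_number is None:
            wanted = bool(row["is_latest"])
        else:
            wanted = row["version_number"] == version_number
        if wanted:
            return {
                "artifact_id": row["artifact_id"],
                "version_number": row["version_number"],
                "content_hash": row["content_hash"],
            }
    raise LookupError("no matching artifact version")


def session_artifacts(target_pin: dict, rounds: list) -> list:
    """Union of the pinned target and every round's referenced_artifacts.

    Entries are deduplicated by (artifact_id, version_number); the first
    occurrence wins, so the session's target pin always comes first.
    """
    seen = set()
    flattened = []
    entries = [target_pin]
    for debate_round in rounds:
        entries.extend(debate_round.get("referenced_artifacts", []))
    for entry in entries:
        key = (entry["artifact_id"], entry["version_number"])
        if key not in seen:
            seen.add(key)
            flattened.append(entry)
    return flattened
```

Keying on the version number as well as the artifact id means a debate that cites both v2 and v3 of the same hypothesis reports both, which is the point of pinning.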

2.2 Notebook cell-append (extend an existing notebook)

Problem: notebooks are immutable after generation. There is no way to ask "add a differential expression analysis to hypothesis-Q-89-notebook" without regenerating from scratch.

Fix:

POST /api/notebooks/{id}/cells — append-only. Body:

    {
      "cell_type": "code|markdown",
      "source": "...",
      "execute": true,
      "agent_id": "ed-lein",
      "method": "differential-expression",
      "parameters": {"contrast": "AD vs control", "fdr": 0.05},
      "rationale": "Why this cell is being added"
    }

Server side, the call:

- (a) creates a NEW artifact row of type notebook with parent_version_id = the current notebook artifact id, version_number = the parent's version_number + 1, and is_latest=TRUE, and demotes the parent to is_latest=FALSE;
- (b) writes the new cell to notebook_cells with cell_index = max+1 AND a foreign key to the new artifact row;
- (c) optionally executes the cell via nbconvert and caches outputs;
- (d) renders .html for the new version and stores the path;
- (e) writes an artifact_links edge new_version --extends--> parent_version;
- (f) records the agent + method in a new processing_steps table per §2.4 below.
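Steps (a), (b) and (e) above amount to one transactional version bump, sketched below. This is a minimal illustration on SQLite with a hypothetical trimmed schema; `append_cell`, the `artifact_type` column name, and the choice to repoint the notebooks row at the newest version are assumptions, and execution, rendering, and processing_steps logging (steps c, d, f) are omitted.

```python
import sqlite3
import uuid

# Hypothetical, trimmed-down schema covering only the columns used below.
SCHEMA = """
CREATE TABLE artifacts (
    id TEXT PRIMARY KEY,
    artifact_type TEXT NOT NULL,
    version_number INTEGER NOT NULL,
    parent_version_id TEXT REFERENCES artifacts(id),
    is_latest INTEGER NOT NULL
);
CREATE TABLE notebooks (
    id TEXT PRIMARY KEY,
    notebook_artifact_id TEXT NOT NULL REFERENCES artifacts(id)
);
CREATE TABLE notebook_cells (
    notebook_id TEXT NOT NULL REFERENCES notebooks(id),
    notebook_artifact_id TEXT NOT NULL REFERENCES artifacts(id),
    cell_index INTEGER NOT NULL,
    cell_type TEXT NOT NULL,
    code TEXT NOT NULL
);
CREATE TABLE artifact_links (
    source_artifact_id TEXT NOT NULL REFERENCES artifacts(id),
    target_artifact_id TEXT NOT NULL REFERENCES artifacts(id),
    link_type TEXT NOT NULL
);
"""


def append_cell(conn: sqlite3.Connection, notebook_id: str, cell_type: str, source: str):
    """Append one cell and bump the notebook to a new artifact version.

    Returns (new_artifact_id, new_version_number, cell_index).
    """
    with conn:  # all writes commit together or roll back together
        row = conn.execute(
            "SELECT a.id, a.version_number FROM notebooks n "
            "JOIN artifacts a ON a.id = n.notebook_artifact_id WHERE n.id = ?",
            (notebook_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"unknown notebook {notebook_id!r}")
        parent_id, parent_version = row
        new_id = str(uuid.uuid4())
        new_version = parent_version + 1

        # (a) new version row; the parent stops being the latest.
        conn.execute("UPDATE artifacts SET is_latest = 0 WHERE id = ?", (parent_id,))
        conn.execute(
            "INSERT INTO artifacts (id, artifact_type, version_number, parent_version_id, is_latest) "
            "VALUES (?, 'notebook', ?, ?, 1)",
            (new_id, new_version, parent_id),
        )
        # Assumption: the notebooks row always tracks the newest version.
        conn.execute(
            "UPDATE notebooks SET notebook_artifact_id = ? WHERE id = ?",
            (new_id, notebook_id),
        )

        # (b) cell_index = max + 1. Test against None, not truthiness:
        # an existing maximum of 0 is a valid index, not "no cells yet".
        (max_index,) = conn.execute(
            "SELECT MAX(cell_index) FROM notebook_cells WHERE notebook_id = ?",
            (notebook_id,),
        ).fetchone()
        cell_index = 0 if max_index is None else max_index + 1
        conn.execute(
            "INSERT INTO notebook_cells (notebook_id, notebook_artifact_id, cell_index, cell_type, code) "
            "VALUES (?, ?, ?, ?, ?)",
            (notebook_id, new_id, cell_index, cell_type, source),
        )

        # (e) new_version --extends--> parent_version
        conn.execute(
            "INSERT INTO artifact_links (source_artifact_id, target_artifact_id, link_type) "
            "VALUES (?, ?, 'extends')",
            (new_id, parent_id),
        )
    return new_id, new_version, cell_index
```

Wrapping every write in a single transaction keeps the invariant that exactly one version per notebook has is_latest set, even if a later step fails.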

Cell-level diff: GET /api/notebooks/{id}/diff?from=v1&to=v2 returns a diff using nbdime semantics (added/removed/modified cells). Reuse nbdime's protocol JSON; don't roll our own.

Tagging: human-readable version tags via the existing pin_version(). E.g., the notebook a debate consumed is auto-tagged "debate-{session_id}-input" so the lineage is queryable from the artifact alone.

2.3 Chamber/workspace: pull a versioned artifact in

Problem: there's no "workspace" or "chamber" — when a debate or persona-driven task wants to work with hypothesis-H-89@v3 + notebook-NB-12@v2 + paper-P-77@v1, it just names the IDs in prose. No structured pull-in, no isolation.

Fix:

New table chambers — minimal:

    id UUID PK
    name TEXT
    purpose TEXT  ('debate' | 'experiment_design' | 'persona_workspace' | 'showcase_review')
    owner_actor_type TEXT, owner_actor_id TEXT  (matches existing comment author convention)
    parent_session_id UUID  (debate_sessions.id, NULL ok)
    created_at TIMESTAMPTZ DEFAULT now()
    closed_at TIMESTAMPTZ

chamber_artifacts (the pull-in) — pinned versions:

    chamber_id UUID
    artifact_id UUID
    version_number INT
    content_hash TEXT
    role TEXT  ('input' | 'reference' | 'workbench')
    added_at TIMESTAMPTZ DEFAULT now()
    added_by_actor_type TEXT, added_by_actor_id TEXT
    PRIMARY KEY (chamber_id, artifact_id, version_number)

API:

- POST /api/chambers — create
- POST /api/chambers/{id}/pull — body: [{artifact_id, version_number?}] (defaults to latest)
- GET /api/chambers/{id} — full chamber state with all pinned versions hydrated
- POST /api/chambers/{id}/close — closes the chamber, optionally writes a result-artifact

Debate integration: when a debate session starts, the engine creates a chamber, pulls the target artifact + supporting persona corpora + cited papers, then the debate happens "in" the chamber. Round outputs land back in the chamber as role='workbench'. The chamber is a stable reference for "what was visible to the agents during this debate".

Persona integration: persona task workspaces become chambers with purpose='persona_workspace'. The persona's bio + paper corpus + previous debates the persona participated in are pulled in as role='reference'.

Closed-chamber summary: when a chamber closes, a hash of its (artifact_id, version_number, role) set is stored on the parent debate/task as chamber_provenance_hash for later dispute/replay/fork detection. The hash captures exactly which artifact-versions were in scope, without requiring a full copy. A chamber_summary JSON blob (who participated, key turns, final score) is written to the chamber row and linked to the parent. A GET /api/chambers/{id}/replay endpoint returns the chamber's full state (all pinned artifact-versions + summary) so any agent can rehydrate the chamber context and replay or fork from a clean checkpoint.
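One way to compute such a hash is sketched below. The spec does not name an algorithm or a canonical encoding, so SHA-256 over sorted, compact JSON is an assumption here, as is the `chamber_provenance_hash` function name.

```python
import hashlib
import json


def chamber_provenance_hash(pins: list) -> str:
    """Hash the set of (artifact_id, version_number, role) triples in a chamber.

    Building a set and sorting it makes the digest independent of insertion
    order and of duplicate rows, so two chambers with the same artifact
    versions in scope produce the same hash.
    """
    triples = sorted(
        {(pin["artifact_id"], pin["version_number"], pin["role"]) for pin in pins}
    )
    # Compact separators give one canonical byte string per set of triples.
    payload = json.dumps(triples, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Comparing two digests is then enough to detect that a replay or fork started from a different set of artifact versions, without copying the artifacts themselves.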

2.4 Structured provenance on artifact_links

Problem: artifact_links rows have no record of what operation created the link, or what agent/model did it.

Fix:

Add to artifact_links:

- method TEXT — the operation that produced the link (cell_append, debate_output, cite, derives_from, extends, reproduced_with_diff, contribution_attribution, etc.)
- agent_id TEXT — which agent contributed the link (null for auto-detected links)
- processing_step_id UUID — FK to a new processing_steps table (see below)
- link_metadata JSONB — method-specific extra fields (e.g., for cell_append: {notebook_id, cell_index, cell_hash}, for debate_output: {session_id, round_number})

New processing_steps table — immutable log of operations that create/extend artifacts:

    id UUID PK
    method TEXT  ('cell_append' | 'artifact_fork' | 'debate_consolidation' | 'notebook_regeneration' | 'agent_contribution' | 'manual_edit')
    artifact_id UUID
    artifact_version_number INT
    actor_type TEXT, actor_id TEXT
    inputs JSONB   -- [{artifact_id, version_number, role, content_hash}]
    outputs JSONB  -- [{artifact_id, version_number, role, content_hash}]
    parameters JSONB
    rationale TEXT
    started_at TIMESTAMPTZ DEFAULT now()
    completed_at TIMESTAMPTZ
    status TEXT  ('running' | 'completed' | 'failed')
    error TEXT

Reuse existing patterns: processing_steps.method='cell_append' is written by the cell-append API (§2.2); method='debate_consolidation' by the debate engine; method='agent_contribution' when a persona edits an artifact directly.
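As an in-memory picture of one processing_steps row, a frozen dataclass is one option. The sketch below is hypothetical (`ProcessingStep`, `METHODS`, and `finish` are illustrative names, not the project's API); it mirrors the columns listed above, validates `method` against the enumerated values, and models completion as a new object rather than an in-place change.

```python
import uuid
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone
from typing import Optional

# The method values enumerated for processing_steps.method.
METHODS = frozenset({
    "cell_append", "artifact_fork", "debate_consolidation",
    "notebook_regeneration", "agent_contribution", "manual_edit",
})


@dataclass(frozen=True)
class ProcessingStep:
    method: str
    artifact_id: str
    artifact_version_number: int
    actor_type: str
    actor_id: str
    inputs: tuple = ()    # ({artifact_id, version_number, role, content_hash}, ...)
    outputs: tuple = ()
    parameters: dict = field(default_factory=dict)
    rationale: str = ""
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Timezone-aware UTC timestamp, matching TIMESTAMPTZ semantics.
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    completed_at: Optional[datetime] = None
    status: str = "running"
    error: Optional[str] = None

    def __post_init__(self) -> None:
        if self.method not in METHODS:
            raise ValueError(f"unknown processing method: {self.method!r}")


def finish(step: ProcessingStep, error: Optional[str] = None) -> ProcessingStep:
    """Return a completed (or failed) copy; the original object is untouched."""
    return replace(
        step,
        completed_at=datetime.now(timezone.utc),
        status="failed" if error else "completed",
        error=error,
    )
```

Whether the stored rows themselves should be insert-only is still an open question in this spec; the frozen dataclass only prevents accidental mutation in application code.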

Query: GET /api/artifacts/{id}/lineage?depth=5&method=cell_append returns the chain of cells + who added each, using processing_steps.inputs and artifact_links traversed together. The depth param prevents infinite loops on cyclic graphs.
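The traversal behind such a lineage query can be sketched as a depth-bounded breadth-first walk. This is an illustration only: `lineage` is a hypothetical helper, links are plain dicts carrying the `method` column from this section, and the visited set is an extra guard added here on top of the depth limit the spec describes.

```python
from collections import deque
from typing import Optional


def lineage(links: list, start: str, depth: int, method: Optional[str] = None) -> list:
    """Collect artifact_links edges reachable from `start` within `depth` hops.

    Edges are followed from source to target, so starting at a new version
    walks back toward the versions it extends.
    """
    by_source = {}
    for link in links:
        if method is None or link.get("method") == method:
            by_source.setdefault(link["source_artifact_id"], []).append(link)

    visited = {start}
    queue = deque([(start, 0)])
    chain = []
    while queue:
        node, hops = queue.popleft()
        if hops >= depth:
            continue  # depth budget spent: do not expand this node
        for link in by_source.get(node, []):
            chain.append(link)
            target = link["target_artifact_id"]
            if target not in visited:  # never expand the same node twice
                visited.add(target)
                queue.append((target, hops + 1))
    return chain
```

With the visited set in place the walk terminates on cyclic graphs regardless of depth; the depth parameter then mainly bounds how much history a single request returns.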

---

3. Implementation order

1. Chambers (2.3): simplest new table, no dependencies, enables debate isolation early.

2. Debate pinning (2.1): low-risk column additions, backfill migration, read-heavy API.

3. Cell-append (2.2): more complex write path; depends on (1) for workspace isolation.

4. Structured provenance (2.4): wires everything together; depends on (2) and (3).

---

4. Open questions

| # | Question | Resolution owner |
|---|---|---|
| 4.1 | Should chamber `close` auto-tag the result artifact with `chamber-{id}-output`, or leave tagging to the caller? | Atlas team |
| 4.2 | For backfill of `debate_rounds.referenced_artifacts`, should we replay the LLM prompt extraction off raw logs, or accept best-effort and mark `pinning_note='backfill-best-effort'`? | Senate + Agora |
| 4.3 | Who can query `GET /api/chambers/{id}/replay`: anyone with the chamber ID, or only participants? | Senate + security review |
| 4.4 | Should `processing_steps` rows be immutable inserts only (no UPDATE), so audit trails are tamper-evident? | Senate + legal |
| 4.5 | For `cell_append` where execution fails, do we still create the versioned artifact + cell row with `status='failed'`, or rollback entirely? | Atlas team |

---

Work Log

2026-04-26 01:37 PT — Iteration 1 (claude-auto:42)

Summary: [Atlas] Notebook cell-append API + cell-level diff [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: 5fbf44c32
Notes: Initial implementation of POST /api/notebooks/{id}/cells (append-only) + GET /api/notebooks/{id}/diff

2026-04-26 01:54 PT — Iteration 1 (claude-auto:42)

Summary: [Atlas] Add nbconvert execution + HTML rendering to notebook cell-append API [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: 7042ecdc8
Notes: nbconvert execution, HTML rendering, processing_steps recording

2026-04-27 06:24 PT — Iteration 2 (minimax:76)

Summary: [Atlas] Work log: verify feature complete on main [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: none
Notes: Verified §2.2 cell-append and diff already on main; no new work needed this cycle

2026-04-27 06:46 PT — Iteration 3 (claude-auto:47)

Summary: [Atlas] Work log: iteration 3 live verification of notebook cell-append + diff [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: none
Notes: Live API test: POST /api/notebooks/{id}/cells → 200, GET /api/notebooks/{id}/diff → correct added cell diff

2026-04-27 08:31 PT — Iteration 4 (claude-auto:41)

Summary: [Atlas] Fix cell_index=0 falsy bug in notebook cell-append; add integration tests [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: none
Notes: Fixed bug where cell_index=0 was treated as missing; added integration tests

2026-04-27 08:39 PT — Iteration 5 (claude-auto:41)

Summary: [Atlas] Iteration 5 final verification: all notebook cell-append + diff tests pass [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: none
Notes: All tests pass; coverage confirmed for §2.2

2026-04-27 09:03 PT — Iteration 6 (minimax:78)

Summary: [Atlas] Work log: iteration 6 verification of §2.2 notebook cell-append + diff [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: none
Notes: Verified feature complete; live diff test confirmed correct

2026-04-27 09:18 PT — Iteration 7 (minimax:78)

Summary: [Atlas] Replace deprecated datetime.utcnow() with timezone-aware datetime.now() in notebook cell-append [task:f535e6c9-7185-41c4-b850-8316228e6500]
Commits: d9b81033
Notes: Fix: datetime.utcnow() → datetime.now(timezone.utc) in api.py cell-append

2026-04-27 09:53 PT — Iteration 7 (minimax:77)

Summary: Verification pass: task fully on main, no work needed
Commits: none
Notes: Verified: notebooks.notebook_artifact_id NOT NULL FK → artifacts(id) present; notebook_cells.notebook_artifact_id NOT NULL FK → artifacts(id) present; 590/590 notebooks have artifact, 241/241 cells have artifact; 12 migration runner + 17 cell-append integration tests all pass; no dangling refs. Task is complete, worktree at origin/main with no local changes.

2026-04-27 09:39 PT — Iteration 8 (minimax:79)

Summary: Verification: all 48 tests pass (§2.2 cell-append+diff confirmed on main at d9b81033); §2.1 referenced_artifacts + debate artifacts endpoint live; feature complete
Commits: none
Notes: 31 unit + 17 integration tests all pass. Live diff API confirmed working. §2.2 (cell-append + diff) fully on main. §2.1 referenced_artifacts JSONB column and GET /api/debates/{session_id}/artifacts endpoint live on main.

2026-04-27 09:50 PT — Iteration 9 (minimax:79)

Summary: Work log: confirm all three sub-features (§2.1 debate pinning, §2.2 cell-append+diff, §2.3 chambers pull-in) verified complete on main; add iteration 9 work log entry
Commits: none
Notes: §2.1: debate_rounds.referenced_artifacts JSONB + GET /api/debates/{session_id}/artifacts + auto-populate target_artifact_version on debate creation. §2.2: POST /api/notebooks/{id}/cells (append-only) + GET /api/notebooks/{id}/diff?from=v1&to=v2 using nbdime semantics; all tests pass. §2.3: chambers + chamber_artifacts tables, POST /api/chambers, POST /api/chambers/{id}/pull, GET /api/chambers/{id}, POST /api/chambers/{id}/close, persona workspace endpoints all live. Feature complete; no new commits needed.

2026-04-27 10:50 PT — Iteration 11 (minimax:74)

Summary: Final verification: chambers §2.3 implementation confirmed complete on main; no new commits needed
Commits: none
Notes: Live API verification: POST /api/chambers → 200 + returns chamber ID; GET /api/chambers/{id} → 200 + returns chamber with artifacts array; POST /api/chambers/{id}/close → 200 + returns provenance_hash; debate engine wiring at scidex_orchestrator.py:2645-2917 creates chamber at session start, pulls target artifact + cited papers + persona corpus, closes with provenance_hash written to debate_sessions.chamber_provenance_hash. All four §2.3 endpoints verified functional. Task complete.

[Atlas/feat] notebook_artifact_id FK + notebook_cells ownership backfill done
