[Atlas] Replication tracking: match experiments testing same hypothesis (done)

Cluster experiments by target/relation/direction, track replication status, flag conflicts

Completion Notes

Auto-release: work already on origin/main

Git Commits (16)

Squash merge: orchestra/task/atl-ex-0-api-endpoints-for-experiment-browsing-se (7 commits), 2026-04-26
Squash merge: atlas/atl-ex-04-QUAL-push (2 commits), 2026-04-26
[Atlas] Update spec work log for extraction quality scoring [task:atl-ex-04-QUAL], 2026-04-25
[Atlas] Extraction quality scoring and confidence calibration [task:atl-ex-04-QUAL], 2026-04-25
Squash merge: orchestra/task/atl-ex-0-meta-analysis-support-aggregate-results (2 commits), 2026-04-25
[Atlas] Update spec work log for extraction quality scoring [task:atl-ex-04-QUAL], 2026-04-25
[Atlas] Extraction quality scoring and confidence calibration [task:atl-ex-04-QUAL], 2026-04-25
[Atlas] Replication tracking: clustering module + /api/experiments/replication/{entity} [task:atl-ex-05-REPL], 2026-04-25
Squash merge: orchestra/task/atl-ex-0-build-llm-extraction-pipeline-from-paper (2 commits), 2026-04-15
[Atlas] Update spec work log for atl-ex-02-PIPE [task:atl-ex-02-PIPE], 2026-04-15
[Atlas] Improve experiment extraction pipeline with schema-aligned prompts, graceful missing data, and quota-aware rate limiting, 2026-04-15
Squash merge: orchestra/task/atl-ex-0-backfill-188-existing-experiment-artifac (1 commit), 2026-04-15
[Atlas] Backfill 188 experiment artifacts with structured metadata [task:atl-ex-07-BKFL], 2026-04-15
[Atlas] Auto-link extracted experiments to KG entities [task:atl-ex-03-LINK], 2026-04-13
[Docs] Update atl-ex-01-SCHM work log: implementation complete [task:atl-ex-01-SCHM], 2026-04-13
[Atlas] Add experiment extraction constants and validate_experiment_metadata() [task:atl-ex-01-SCHM], 2026-04-13

Spec File

Goal

Identify when multiple experiments test the same claim and track whether results replicate,
conflict, or are inconclusive. This is critical for evidence quality — a replicated finding
has much higher trust than a single observation.

Acceptance Criteria

☑ Clustering algorithm that groups experiments by: target entity + relation + direction
replication_groups table or metadata linking replicated experiments
☑ Replication status per group: replicated, conflicting, single_study, inconclusive
☑ Effect size comparison across replication group (forest plot data)
☑ Replication status feeds back into experiment quality scores (replicated = higher trust)
☑ API: GET /api/experiments/replication/{entity} — show replication landscape
☑ Conflicting results flagged as high-priority debate targets
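
The four statuses listed above could be derived in several ways. The following is a minimal sketch, assuming the status comes from comparing the reported effect directions of experiments that test the same target and relation. The function name classify_replication_status, the direction labels ('up', 'down', 'unknown'), and the min_agreement threshold are hypothetical and are not taken from the project code.

```python
from collections import Counter
from typing import Iterable


def classify_replication_status(
    directions: Iterable[str], min_agreement: float = 0.8
) -> str:
    """Map per-experiment effect directions to a replication status.

    Hypothetical rule set for illustration only; the real module may
    use different inputs and thresholds.
    """
    dirs = list(directions)
    if not dirs:
        raise ValueError("at least one experiment is required")
    if len(dirs) == 1:
        return "single_study"

    # Only experiments with a known direction can agree or disagree.
    counts = Counter(d for d in dirs if d in ("up", "down"))
    known = sum(counts.values())
    if known < 2:
        return "inconclusive"

    # Share of known-direction experiments that back the majority direction.
    agreement = max(counts.values()) / known
    return "replicated" if agreement >= min_agreement else "conflicting"
```

Under these assumptions, three experiments that all report 'up' would be classified as replicated, while one 'up' and one 'down' would be classified as conflicting.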

Dependencies

  • atl-ex-04-QUAL — Need quality-filtered experiments for meaningful comparison

Dependents

  • atl-ex-06-META — Meta-analysis builds on replication groups
  • q-artifact-debates — Conflicting experiments are prime debate targets

Work Log

2026-04-25 — Implementation

State at start: replication_groups table (80 rows), replication_clusters/replication_cluster_members tables (empty), 228/632 experiments had replication_group_id set (all via a one-off 2026-04-17 batch). No Python module for clustering, no API endpoint, no effect-size data.

Implemented:

  • scidex/atlas/replication_clustering.py — new module with:
    - cluster_experiments(db, entity_filter=None): groups experiments by normalised target_gene + inferred relation_type (from experiment_type) + inferred direction (from title/primary_outcome keyword matching). Upserts replication_groups rows with experiment_ids (JSONB), replication_status, agreement_score, and effect_sizes (forest-plot data: per-experiment effect_estimate, 95% CI, weight). Applies a composite_score nudge: +0.05 for replicated, −0.05 for conflicting, +0.01 for inconclusive.
    - get_replication_landscape(entity, db): fetches all replication groups matching an entity, annotates conflict flag, returns effect_sizes and slim experiment details. Falls back to per-experiment listing when no groups exist.

  • api.py — added GET /api/experiments/replication/{entity} endpoint (after the existing priority-queue route). Accepts optional ?refresh=true to re-cluster on demand. Returns entity, groups, conflict_count, replicated_count, single_study_count, experiment_count, summary.
  • Conflict flagging already handled by scidex/agora/debate_trigger.py rule 4 (_rule4_conflicting_replication) — experiments with replication_status='conflicting' are queued as high-priority debate targets.
Verified:

    • get_replication_landscape('TREM2') returns 3 groups, 38 experiments, effect_sizes populated.
    • cluster_experiments(db, entity_filter='TREM2') writes 1 group, updates 19 experiments, composite_score nudged +0.05.
    • python3 -m py_compile api.py passes.
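
The grouping and score nudge described in this work log can be sketched roughly as follows. This is not the code in scidex/atlas/replication_clustering.py: it works on plain dictionaries rather than a database handle, the helper names cluster_key, group_experiments, and nudged_score are hypothetical, the record fields are assumed to be already inferred, and clamping the score to [0, 1] is an assumption. Only the nudge values come from the log.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Nudge values as stated in the work log; single_study gets no nudge.
SCORE_NUDGE = {"replicated": 0.05, "conflicting": -0.05, "inconclusive": 0.01}

Key = Tuple[str, str, str]


def cluster_key(exp: dict) -> Key:
    """Build the grouping key: normalised target gene, relation, direction."""
    return (
        exp["target_gene"].strip().upper(),
        exp["relation_type"].strip().lower(),
        exp["direction"].strip().lower(),
    )


def group_experiments(
    experiments: List[dict], entity_filter: Optional[str] = None
) -> Dict[Key, List[str]]:
    """Group experiment ids by cluster key, optionally for a single entity."""
    wanted = entity_filter.strip().upper() if entity_filter else None
    groups: Dict[Key, List[str]] = defaultdict(list)
    for exp in experiments:
        key = cluster_key(exp)
        if wanted is not None and key[0] != wanted:
            continue
        groups[key].append(exp["id"])
    return dict(groups)


def nudged_score(composite_score: float, status: str) -> float:
    """Apply the status nudge; clamping to [0, 1] is an assumption."""
    return min(1.0, max(0.0, composite_score + SCORE_NUDGE.get(status, 0.0)))
```

Keeping the key normalisation in one small function means that spelling variants of the same target (for example different casing or stray whitespace) land in the same group.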
