{"quest":{"id":"q-experiment-extraction","name":"Experiment Extraction","description":"Extract structured experimental findings from papers with full lineage — methods, results, statistics, context","layer":"Atlas","priority":93,"status":"active","created_at":"2026-04-03 23:24:40","updated_at":"2026-04-03 23:24:40"},"tasks":[{"id":"atl-ex-04-QUAL","title":"[Atlas] Extraction quality scoring and confidence calibration","description":"Quality scoring for extracted experiments: completeness, statistical rigor, consistency, calibrated confidence","status":"open","priority":68,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":null,"updated_at":"2026-04-25T07:47:18.847881+00:00","summary":"","completion_notes":"","last_error":"cli-reopen-manual: reopened — task was marked 'done' but has no task_runs row in (done/completed/success)","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-04-QUAL_extraction_quality_scoring_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 5}}"},{"id":"atl-ex-08-API","title":"[Atlas] API endpoints for experiment browsing, search, and filtering","description":"REST + HTML endpoints for experiments: list, detail, entity-based, hypothesis-based queries","status":"open","priority":67,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":null,"updated_at":"2026-04-25T07:47:18.965406+00:00","summary":"","completion_notes":"","last_error":"cli-reopen-manual: reopened — task was marked 'done' but has no task_runs row in (done/completed/success)","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-08-API_experiment_api_endpoints_spec.md","provider":"any","payload_json":"{}"},{"id":"atl-ex-05-REPL","title":"[Atlas] Replication tracking — match experiments testing same hypothesis","description":"Cluster experiments by target/relation/direction, track replication status, flag 
conflicts","status":"open","priority":66,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":null,"updated_at":"2026-04-25T07:47:18.876207+00:00","summary":"","completion_notes":"","last_error":"cli-reopen-manual: reopened — task was marked 'done' but has no task_runs row in (done/completed/success)","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-05-REPL_replication_tracking_spec.md","provider":"any","payload_json":"{}"},{"id":"atl-ex-06-META","title":"[Atlas] Meta-analysis support — aggregate results across experiments","description":"Pooled effect sizes with inverse-variance weighting, heterogeneity assessment, forest plot data","status":"open","priority":64,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":null,"updated_at":"2026-04-25T07:47:18.882961+00:00","summary":"","completion_notes":"","last_error":"cli-reopen-manual: reopened — task was marked 'done' but has no task_runs row in (done/completed/success)","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-06-META_meta_analysis_support_spec.md","provider":"any","payload_json":"{\"requirements\": {\"coding\": 7, \"reasoning\": 7, \"analysis\": 8}}"},{"id":"atl-ex-07-BKFL","title":"[Atlas] Backfill 188 existing experiment artifacts with structured metadata","description":"Use extraction pipeline to backfill NULL metadata on existing experiment artifacts\n\n\n## REOPENED TASK — CRITICAL CONTEXT\n\nThis task was previously marked 'done' but the audit could not verify\nthe work actually landed on main. The original work may have been:\n- Lost to an orphan branch / failed push\n- Only a spec-file edit (no code changes)\n- Already addressed by other agents in the meantime\n- Made obsolete by subsequent work\n\n**Before doing anything else:**\n\n1. **Re-evaluate the task in light of CURRENT main state.** Read the\n   spec and the relevant files on origin/main NOW. 
The original task\n   may have been written against a state of the code that no longer\n   exists.\n\n2. **Verify the task still advances SciDEX's aims.** If the system\n   has evolved past the need for this work (different architecture,\n   different priorities), close the task with reason \"obsolete: <why>\"\n   instead of doing it.\n\n3. **Check if it's already done.** Run `git log --grep='<task-id>'`\n   and read the related commits. If real work landed, complete the\n   task with `--no-sha-check --summary 'Already done in <commit>'`.\n\n4. **Make sure your changes don't regress recent functionality.** Many\n   agents have been working on this codebase. Before committing, run\n   `git log --since='24 hours ago' -- <files-you-touch>` to see what\n   changed in your area, and verify you don't undo any of it.\n\n5. **Stay scoped.** Only do what this specific task asks for. Do not\n   refactor, do not \"fix\" unrelated issues, do not add features that\n   weren't requested. Scope creep at this point is regression risk.\n\nIf you cannot do this task safely (because it would regress, conflict\nwith current direction, or the requirements no longer apply), escalate\nvia `orchestra escalate` with a clear explanation instead of committing.\n","status":"done","priority":93,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-16T01:14:02.774430+00:00","updated_at":"2026-04-16T01:14:02.774430+00:00","summary":"","completion_notes":"","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-07-BKFL_backfill_experiment_metadata_spec.md","provider":"any","payload_json":"{\"_stall_skip_providers\": [], \"_stall_requeued_by\": \"minimax\", \"_stall_requeued_at\": \"2026-04-14 04:59:40\", \"_stall_skip_at\": {}, \"_stall_skip_pruned_at\": \"2026-04-14T10:37:14.022390+00:00\"}"},{"id":"atl-ex-03-LINK","title":"[Atlas] Auto-link extracted experiments to KG entities","description":"Entity 
recognition and fuzzy matching to link experiments to genes, proteins, pathways, diseases in the KG\n\n\n## REOPENED TASK — CRITICAL CONTEXT\n\nThis task was previously marked 'done' but the audit could not verify\nthe work actually landed on main. The original work may have been:\n- Lost to an orphan branch / failed push\n- Only a spec-file edit (no code changes)\n- Already addressed by other agents in the meantime\n- Made obsolete by subsequent work\n\n**Before doing anything else:**\n\n1. **Re-evaluate the task in light of CURRENT main state.** Read the\n   spec and the relevant files on origin/main NOW. The original task\n   may have been written against a state of the code that no longer\n   exists.\n\n2. **Verify the task still advances SciDEX's aims.** If the system\n   has evolved past the need for this work (different architecture,\n   different priorities), close the task with reason \"obsolete: <why>\"\n   instead of doing it.\n\n3. **Check if it's already done.** Run `git log --grep='<task-id>'`\n   and read the related commits. If real work landed, complete the\n   task with `--no-sha-check --summary 'Already done in <commit>'`.\n\n4. **Make sure your changes don't regress recent functionality.** Many\n   agents have been working on this codebase. Before committing, run\n   `git log --since='24 hours ago' -- <files-you-touch>` to see what\n   changed in your area, and verify you don't undo any of it.\n\n5. **Stay scoped.** Only do what this specific task asks for. Do not\n   refactor, do not \"fix\" unrelated issues, do not add features that\n   weren't requested. 
Scope creep at this point is regression risk.\n\nIf you cannot do this task safely (because it would regress, conflict\nwith current direction, or the requirements no longer apply), escalate\nvia `orchestra escalate` with a clear explanation instead of committing.\n","status":"done","priority":93,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-14T04:09:23.422448+00:00","updated_at":"2026-04-14T04:09:23.422448+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-03-LINK_experiment_kg_entity_linking_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 5}, \"completion_shas\": [\"0fc03c8121fd1220182137fabdfab1765cb56bf7\"], \"completion_shas_checked_at\": \"2026-04-14T04:09:23.401881+00:00\"}"},{"id":"atl-ex-01-SCHM","title":"[Atlas] Define experiment extraction schemas per experiment type","description":"Define JSON schemas for 7 neuroscience experiment types with structured fields for methods, results, statistics, and provenance\n\n\n## REOPENED TASK — CRITICAL CONTEXT\n\nThis task was previously marked 'done' but the audit could not verify\nthe work actually landed on main. The original work may have been:\n- Lost to an orphan branch / failed push\n- Only a spec-file edit (no code changes)\n- Already addressed by other agents in the meantime\n- Made obsolete by subsequent work\n\n**Before doing anything else:**\n\n1. **Re-evaluate the task in light of CURRENT main state.** Read the\n   spec and the relevant files on origin/main NOW. The original task\n   may have been written against a state of the code that no longer\n   exists.\n\n2. 
**Verify the task still advances SciDEX's aims.** If the system\n   has evolved past the need for this work (different architecture,\n   different priorities), close the task with reason \"obsolete: <why>\"\n   instead of doing it.\n\n3. **Check if it's already done.** Run `git log --grep='<task-id>'`\n   and read the related commits. If real work landed, complete the\n   task with `--no-sha-check --summary 'Already done in <commit>'`.\n\n4. **Make sure your changes don't regress recent functionality.** Many\n   agents have been working on this codebase. Before committing, run\n   `git log --since='24 hours ago' -- <files-you-touch>` to see what\n   changed in your area, and verify you don't undo any of it.\n\n5. **Stay scoped.** Only do what this specific task asks for. Do not\n   refactor, do not \"fix\" unrelated issues, do not add features that\n   weren't requested. Scope creep at this point is regression risk.\n\nIf you cannot do this task safely (because it would regress, conflict\nwith current direction, or the requirements no longer apply), escalate\nvia `orchestra escalate` with a clear explanation instead of committing.\n","status":"done","priority":93,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-14T03:52:05.773376+00:00","updated_at":"2026-04-14T03:52:05.773376+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-01-SCHM_experiment_extraction_schemas_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 5}, \"completion_shas\": [\"f949cb3865b3261f5644d891652a3307ec44dd49\", \"e8278bd573fffd4286a5d02e6a36a6d467d128c0\"], \"completion_shas_checked_at\": \"2026-04-14T03:52:05.754968+00:00\"}"},{"id":"atl-ex-02-PIPE","title":"[Atlas] Build LLM extraction pipeline from paper abstracts and full text","description":"Build Claude-powered 
pipeline that reads papers and produces structured experiment artifacts with full metadata\n\n\n## REOPENED TASK — CRITICAL CONTEXT\n\nThis task was previously marked 'done' but the audit could not verify\nthe work actually landed on main. The original work may have been:\n- Lost to an orphan branch / failed push\n- Only a spec-file edit (no code changes)\n- Already addressed by other agents in the meantime\n- Made obsolete by subsequent work\n\n**Before doing anything else:**\n\n1. **Re-evaluate the task in light of CURRENT main state.** Read the\n   spec and the relevant files on origin/main NOW. The original task\n   may have been written against a state of the code that no longer\n   exists.\n\n2. **Verify the task still advances SciDEX's aims.** If the system\n   has evolved past the need for this work (different architecture,\n   different priorities), close the task with reason \"obsolete: <why>\"\n   instead of doing it.\n\n3. **Check if it's already done.** Run `git log --grep='<task-id>'`\n   and read the related commits. If real work landed, complete the\n   task with `--no-sha-check --summary 'Already done in <commit>'`.\n\n4. **Make sure your changes don't regress recent functionality.** Many\n   agents have been working on this codebase. Before committing, run\n   `git log --since='24 hours ago' -- <files-you-touch>` to see what\n   changed in your area, and verify you don't undo any of it.\n\n5. **Stay scoped.** Only do what this specific task asks for. Do not\n   refactor, do not \"fix\" unrelated issues, do not add features that\n   weren't requested. 
Scope creep at this point is regression risk.\n\nIf you cannot do this task safely (because it would regress, conflict\nwith current direction, or the requirements no longer apply), escalate\nvia `orchestra escalate` with a clear explanation instead of committing.\n","status":"done","priority":92,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-16T05:15:03.459697+00:00","updated_at":"2026-04-16T05:15:03.459697+00:00","summary":"","completion_notes":"Work verified on main. Code committed in 5e964d0b6 (281 lines). Squash-merged to main in 3517c2356. Spec work-log in 8861d5442. 428 extracted experiments, 647 experiment artifacts.","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/atl-ex-02-PIPE_experiment_extraction_pipeline_spec.md","provider":"any","payload_json":"{\"requirements\": {\"coding\": 7, \"reasoning\": 6}}"},{"id":"8f7dc2dc-829a-41e0-8824-c5c872cde977","title":"[Atlas] CI: Verify experiment extraction quality metrics and extract from new papers","description":"Check experiment extraction quality: (1) count papers with structured experiments (goal: >500), (2) check field completeness ratio (goal: >80%), (3) check KG entity link precision via sampling, (4) run extraction pipeline on papers added since last run that have 0 experiment records, (5) generate summary report. See: experiment_extractor.py, ci_notebook_coverage.py pattern for CI style. 
Spec: docs/planning/specs/quest_experiment_extraction_spec.md","status":"blocked","priority":88,"task_type":"recurring","frequency":"weekly","assigned_slot":"","started_at":null,"completed_at":"2026-04-20T19:43:08.797991+00:00","updated_at":"2026-04-24T13:13:27.031417+00:00","summary":"","completion_notes":"Auto-release: recurring task had no work this cycle","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/8f7dc2dc_829_spec.md","provider":"any","payload_json":"{\"requirements\": {\"coding\": 7, \"reasoning\": 6, \"analysis\": 6, \"safety\": 6}}"},{"id":"113042e0-ef07-4f90-9d75-3f17052419dc","title":"[Atlas] Score 30 open knowledge gaps with quality rubric","description":"3104 open knowledge gaps lack gap_quality_score values. Gap quality scores drive prioritization, missions, and debate routing.\n\nVerification:\n- 30 open gaps have gap_quality_score between 0 and 1\n- Specificity, evidence coverage, hypothesis density, debate depth, and actionability are populated where possible\n- Remaining unscored open gap count is <= 3074\n\nStart by reading this task spec and checking for duplicate recent work.","status":"done","priority":84,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T21:43:23.412104+00:00","updated_at":"2026-04-21T21:43:23.412104+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_gap_quality_scoring_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 7, \"reasoning\": 6}}"},{"id":"b51f1e58-f591-4612-b784-883b584aff8b","title":"[Atlas] Score 30 open knowledge gaps with quality rubric","description":"3134 open knowledge gaps lack gap_quality_score values. 
Gap quality scores drive prioritization, missions, and debate routing.\n\nVerification:\n- 30 open gaps have gap_quality_score between 0 and 1\n- Specificity, evidence coverage, hypothesis density, debate depth, and actionability are populated where possible\n- Remaining unscored open gap count is <= 3104\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":84,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T19:18:46.628804+00:00","updated_at":"2026-04-21T19:18:46.628804+00:00","summary":"","completion_notes":"","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_gap_quality_scoring_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 7, \"reasoning\": 6}, \"completion_shas\": [\"a8fbbb11c\", \"95b042f40\", \"83ae9af5e\"], \"completion_shas_checked_at\": \"2026-04-21T19:18:46.604971+00:00\"}"},{"id":"eac11b69-d3b9-4403-948c-ac8b5c2d63eb","title":"[Atlas] Add resolution criteria to 25 open knowledge gaps","description":"3349 open knowledge gaps lack usable resolution criteria. 
Without criteria, agents cannot tell when a gap has been addressed.\n\nVerification:\n- 25 open gaps gain substantive resolution_criteria\n- Criteria are testable and tied to evidence, debate, dataset, or KG deliverables\n- Remaining open gaps missing criteria is <= 3324\n\nStart by reading this task spec and checking for duplicate recent work.","status":"done","priority":83,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T23:16:55.708408+00:00","updated_at":"2026-04-21T23:16:55.708408+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_gap_resolution_criteria_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 6, \"reasoning\": 6}}"},{"id":"298fdf05-1596-4028-b36d-804c5cd22179","title":"[Atlas] Add pathway diagrams to 20 hypotheses missing mechanism maps","description":"228 active hypotheses lack substantive pathway_diagram values. 
Mechanism maps make hypotheses inspectable and connect Agora claims to Atlas pathways.\n\nVerification:\n- 20 active hypotheses gain pathway_diagram content grounded in KG edges or cited mechanisms\n- Diagrams render as Mermaid or the existing pathway format without syntax errors\n- Remaining active hypotheses missing pathway diagrams is <= 208\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":83,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T21:44:33.515137+00:00","updated_at":"2026-04-21T21:44:33.515137+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_hypothesis_pathway_diagram_backfill_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 6, \"reasoning\": 6}}"},{"id":"32b205dd-086b-4f09-8864-d8398f4a1336","title":"[Atlas] Add resolution criteria to 25 open knowledge gaps","description":"3374 open knowledge gaps lack usable resolution criteria. 
Without criteria, agents cannot tell when a gap has been addressed.\n\nVerification:\n- 25 open gaps gain substantive resolution_criteria\n- Criteria are testable and tied to evidence, debate, dataset, or KG deliverables\n- Remaining open gaps missing criteria is <= 3349\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":83,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T20:42:43.808271+00:00","updated_at":"2026-04-21T20:42:43.808271+00:00","summary":"","completion_notes":"","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_gap_resolution_criteria_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 6, \"reasoning\": 6}}"},{"id":"dc0e6675-c427-4e53-a4df-dbad0f27446a","title":"[Atlas] Add pathway diagrams to 20 hypotheses missing mechanism maps","description":"177 active hypotheses lack substantive pathway_diagram values. 
Mechanism maps make hypotheses inspectable and connect Agora claims to Atlas pathways.\n\nVerification:\n- 20 active hypotheses gain pathway_diagram content grounded in KG edges or cited mechanisms\n- Diagrams render as Mermaid or the existing pathway format without syntax errors\n- Remaining active hypotheses missing pathway diagrams is <= 157\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":83,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T20:41:37.230689+00:00","updated_at":"2026-04-21T20:41:37.230689+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_hypothesis_pathway_diagram_backfill_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 6, \"reasoning\": 6}}"},{"id":"a59648bb-e89a-4eba-aba1-327e82ba29bd","title":"[Atlas] Link 50 evidence entries to target artifacts","description":"9826 evidence entries have no evidence_links rows. 
Evidence links make support and contradiction navigable across hypotheses, papers, datasets, and KG entities.\n\nVerification:\n- 50 evidence entries gain evidence_links rows or documented no-target rationale\n- Each link has target_type, target_id, link_type, and strength grounded in the evidence entry\n- Remaining unlinked evidence entry count is <= 9776\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":81,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T22:34:24.081940+00:00","updated_at":"2026-04-21T22:34:24.081940+00:00","summary":"","completion_notes":"","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_evidence_link_backfill_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 6, \"reasoning\": 6}}"},{"id":"7cfc4b69-7332-4ac6-a117-32c570c45fad","title":"[Atlas] Extract figures from 30 papers missing figure metadata","description":"18562 papers have figures_extracted = 0. 
Figure metadata improves paper inspection, visual artifacts, and evidence review.\n\nVerification:\n- 30 papers have figures_extracted = 1 or documented no-figure/provider-skip metadata\n- Extracted figures include captions and paper provenance where available\n- Remaining papers without figure extraction is <= 18532\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":81,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T21:05:36.777826+00:00","updated_at":"2026-04-21T21:05:36.777826+00:00","summary":"","completion_notes":"Auto-completed by supervisor after successful deploy to main","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_paper_figure_extraction_backfill_spec.md","provider":"any","payload_json":"{\"requirements\": {\"coding\": 6, \"analysis\": 5}}"},{"id":"97023181-8d36-4cce-8d84-6489a7432d72","title":"[Atlas] Link 50 evidence entries to target artifacts","description":"9876 evidence entries have no evidence_links rows. 
Evidence links make support and contradiction navigable across hypotheses, papers, datasets, and KG entities.\n\nVerification:\n- 50 evidence entries gain evidence_links rows or documented no-target rationale\n- Each link has target_type, target_id, link_type, and strength grounded in the evidence entry\n- Remaining unlinked evidence entry count is <= 9826\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":81,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T20:34:14.986203+00:00","updated_at":"2026-04-21T20:34:14.986203+00:00","summary":"","completion_notes":"","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_evidence_link_backfill_spec.md","provider":"any","payload_json":"{\"requirements\": {\"analysis\": 6, \"reasoning\": 6}, \"completion_shas\": [\"e7c3cb597\"], \"completion_shas_checked_at\": \"2026-04-21T20:34:14.951906+00:00\"}"},{"id":"1bd49424-d51c-4888-b9e0-7a0ae45128e7","title":"[Atlas] Extract figures from 30 papers missing figure metadata","description":"17602 papers have figures_extracted = 0. 
Figure metadata improves paper inspection, visual artifacts, and evidence review.\n\nVerification:\n- 30 papers have figures_extracted = 1 or documented no-figure/provider-skip metadata\n- Extracted figures include captions and paper provenance where available\n- Remaining papers without figure extraction is <= 17572\n\nStart by reading this task's spec and checking for duplicate recent work.","status":"done","priority":81,"task_type":"one_shot","frequency":"","assigned_slot":"","started_at":null,"completed_at":"2026-04-21T19:52:39.991607+00:00","updated_at":"2026-04-21T19:52:39.991607+00:00","summary":"","completion_notes":"","last_error":"","time_estimate_hours":0.0,"completion_count":0,"spec_path":"docs/planning/specs/quest_engine_paper_figure_extraction_backfill_spec.md","provider":"any","payload_json":"{\"requirements\": {\"coding\": 6, \"analysis\": 5}}"}],"reviews":[],"effectiveness":{},"spec_content":"---\ntitle: \"Quest: Experiment Extraction & Evidence Atoms\"\ndescription: \"Extract structured experimental findings from papers with full lineage — methods, results, statistics, context — as rich artifacts that anchor all evidence chains\"\ntype: quest\nlayer: Atlas\npriority: 93\nstatus: active\nquest_id: q-experiment-extraction\nspec_path: docs/planning/specs/quest_experiment_extraction_spec.md\n---\n\n# Quest: Experiment Extraction & Evidence Atoms\n\n**Layer:** Atlas\n**Priority:** P93\n**Status:** active\n\n## Vision\n\nPapers are the bedrock of scientific evidence, but SciDEX currently treats them as opaque\nblobs — a title, abstract, PMID, and maybe some JSON evidence claims. 
We have 520 papers\nand 188 experiment artifacts, but the experiment artifacts have **no structured metadata**.\n\nThis quest transforms paper-derived knowledge from unstructured citations into **rich,\nstructured experiment records** — each one a first-class artifact with full lineage:\n\n- **What was done**: experimental design, model system, methods, controls\n- **What was found**: measurements, p-values, effect sizes, confidence intervals, sample sizes\n- **What it means**: conclusions drawn, limitations acknowledged, context within the field\n- **Where it came from**: paper source (PMID, section, figure/table references)\n- **How it was extracted**: which agent, what methodology, extraction confidence\n\nThese structured experiments become the **ground truth anchors** for SciDEX's entire\nevidence system. When a hypothesis claims \"TREM2 variants increase AD risk by 2-4x\",\nthe evidence chain traces through an experiment artifact that contains the actual\nodds ratio, confidence interval, sample size, and study design.\n\n### Why This Matters\n\n1. **Evidence grounding**: Claims without structured experiment backing are unverifiable\n2. **Replication tracking**: Multiple experiments testing the same hypothesis can be compared\n3. **Meta-analysis**: Structured results enable systematic aggregation across studies\n4. **Debate quality**: Skeptic agents can challenge specific methodological details\n5. 
**KG enrichment**: Extracted entities, relations, and measurements grow the Atlas\n\n### Neuro Focus (Preventing Sprawl)\n\nExtraction schemas are scoped to neuroscience-relevant experiment types:\n- **Genetic association** (GWAS, candidate gene studies, Mendelian genetics)\n- **Protein interaction** (co-IP, mass spec, yeast two-hybrid, proximity labeling)\n- **Gene expression** (RNA-seq, qPCR, microarray, single-cell)\n- **Animal model** (transgenic mice, behavioral assays, histology)\n- **Cell biology** (cell culture, organoids, iPSC-derived neurons)\n- **Clinical** (biomarker studies, imaging, cognitive assessments, drug trials)\n- **Neuropathology** (histology, immunostaining, electron microscopy)\n\nAdditional types can be added through Schema Governance (q-schema-governance).\n\n## Experiment Artifact Schema\n\n```python\n# Type-specific metadata for artifact_type='experiment'\n{\n    # Source provenance\n    \"paper_pmid\": \"12345678\",\n    \"paper_section\": \"Results, Figure 3A\",\n    \"paper_doi\": \"10.1038/s41586-023-...\",\n    \n    # Experimental design\n    \"experiment_type\": \"genetic_association\",  # From controlled vocabulary\n    \"model_system\": \"human cohort\",            # mouse, rat, human, cell_line, organoid, etc.\n    \"species\": \"Homo sapiens\",\n    \"tissue\": \"prefrontal cortex\",\n    \"sample_size\": 1500,\n    \"control_description\": \"Age-matched healthy controls (n=800)\",\n    \"methods_summary\": \"Genome-wide association study of AD risk variants...\",\n    \n    # Results (structured)\n    \"results\": {\n        \"primary_finding\": \"TREM2 R47H variant associated with increased AD risk\",\n        \"measurements\": [\n            {\n                \"metric\": \"odds_ratio\",\n                \"value\": 2.92,\n                \"ci_lower\": 2.09,\n                \"ci_upper\": 4.09,\n                \"p_value\": 3.4e-12,\n                \"comparison\": \"R47H carriers vs non-carriers\"\n            }\n        
],\n        \"effect_direction\": \"risk_increasing\",\n        \"replication_status\": \"replicated\"  # replicated, not_replicated, awaiting, conflicting\n    },\n    \n    # Context\n    \"disease_context\": \"Alzheimer's disease\",\n    \"entities_mentioned\": [\"TREM2\", \"R47H\", \"microglia\", \"neuroinflammation\"],\n    \"conclusions\": \"TREM2 R47H is a significant risk factor for late-onset AD...\",\n    \"limitations\": \"European ancestry cohort only; effect size may vary...\",\n    \n    # Extraction provenance\n    \"extracted_by\": \"agent-atlas-extractor\",\n    \"extraction_method\": \"llm_structured_extraction\",\n    \"extraction_confidence\": 0.85,\n    \"extraction_timestamp\": \"2026-04-03T12:00:00Z\",\n    \"human_verified\": false\n}\n```\n\n## Open Tasks\n\n- [ ] atl-ex-01-SCHM: Define experiment extraction schemas per experiment type (P93)\n- [ ] atl-ex-02-PIPE: Build LLM extraction pipeline from paper abstracts/full text (P92)\n- [ ] atl-ex-03-LINK: Auto-link extracted experiments to KG entities (P90)\n- [ ] atl-ex-04-QUAL: Extraction quality scoring and confidence calibration (P88)\n- [ ] atl-ex-05-REPL: Replication tracking — match experiments testing same hypothesis (P86)\n- [ ] atl-ex-06-META: Meta-analysis support — aggregate results across experiments (P84)\n- [ ] atl-ex-07-BKFL: Backfill 188 existing experiment artifacts with structured metadata (P91)\n- [ ] atl-ex-08-API: API endpoints for experiment browsing, search, and filtering (P87)\n\n## Dependency Chain\n\n```\natl-ex-01-SCHM (Schema definition)\n    ↓\natl-ex-02-PIPE (Extraction pipeline) ──→ atl-ex-07-BKFL (Backfill existing)\n    ↓\natl-ex-03-LINK (KG entity linking)\n    ↓\natl-ex-04-QUAL (Quality scoring) ──→ atl-ex-05-REPL (Replication tracking)\n    ↓                                        ↓\natl-ex-08-API (API endpoints)        atl-ex-06-META (Meta-analysis)\n```\n\n## Integration Points\n\n- **Evidence Chains** (b5298ea7): Extracted experiments become ground-truth 
evidence entries\n- **Knowledge Units** (08c73de3): Experiment results become atomic, composable evidence blocks\n- **Artifact Debates** (q-artifact-debates): Experiments are debatable — methodology can be challenged\n- **Schema Governance** (q-schema-governance): Experiment schemas evolve through governance\n- **Epistemic Rigor** (q-epistemic-rigor): Experiments anchor the falsifiability chain\n\n## Hypothesis Ranking Feedback\n\nExtracted experiments create a **bidirectional scoring relationship** with hypotheses:\n\n1. **Hypothesis → Experiment**: Hypotheses with explicit falsifiable predictions attract experiment design tasks. The system should proactively generate experiment proposals for high-scoring hypotheses that lack associated experiments.\n\n2. **Experiment → Hypothesis**: An experiment's quality scores (feasibility, impact, information gain) feed back into the linked hypothesis's composite score:\n   - Hypotheses with feasible, high-impact associated experiments rank higher\n   - Hypotheses with no testable experiments are penalized in relative ranking\n   - When experiment results confirm or falsify predictions, Bayesian updates adjust hypothesis confidence\n\n3. 
**Experiment Quality Dimensions** (for ranking feedback):\n   - **Feasibility** (0-1): Can this experiment actually be executed with available resources?\n   - **Impact** (0-1): How much would the result change our world model?\n   - **Information gain** (0-1): How much uncertainty does this experiment resolve?\n   - **Novelty** (0-1): Does this test something not yet tested?\n\nThese dimensions should be computed during experiment extraction (atl-ex-04-QUAL) and stored in experiment metadata for consumption by the hypothesis scoring pipeline.\n\n## Success Criteria\n\n- [ ] >500 papers have at least one structured experiment extracted\n- [ ] Experiment artifacts have >80% field completeness (non-null structured metadata)\n- [ ] Extracted experiments link to KG entities with >90% precision\n- [ ] Extraction confidence correlates with human verification (calibration)\n- [ ] Replication tracking identifies conflicting results for >10 hypothesis pairs\n\n## Work Log\n\n_No entries yet._\n","spec_html":"<div style=\"font-size:0.85rem\"><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h2 style=\"color:#4fc3f7;margin:1.5rem 0 0.6rem;font-size:1.2rem;font-weight:700\">Quest: Experiment Extraction &amp; Evidence Atoms</h2></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><strong style=\"color:#e0e0e0\">Layer:</strong> Atlas\n<strong style=\"color:#e0e0e0\">Priority:</strong> P93\n<strong style=\"color:#e0e0e0\">Status:</strong> active</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Vision</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\">Papers are the bedrock of scientific evidence, but SciDEX currently treats them as opaque<br>blobs — a title, abstract, PMID, and maybe some JSON evidence claims. 
We have 520 papers<br>and 188 experiment artifacts, but the experiment artifacts have <strong style=\"color:#e0e0e0\">no structured metadata</strong>.</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\">This quest transforms paper-derived knowledge from unstructured citations into <strong style=\"color:#e0e0e0\">rich,<br>structured experiment records</strong>, each one a first-class artifact with full lineage:</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><ul style=\"padding-left:1.5rem;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">What was done</strong>: experimental design, model system, methods, controls</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">What was found</strong>: measurements, p-values, effect sizes, confidence intervals, sample sizes</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">What it means</strong>: conclusions drawn, limitations acknowledged, context within the field</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Where it came from</strong>: paper source (PMID, section, figure/table references)</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">How it was extracted</strong>: which agent, what methodology, extraction confidence</li>\n</ul><br>These structured experiments become the <strong style=\"color:#e0e0e0\">ground truth anchors</strong> for SciDEX&#x27;s entire<br>evidence system. 
When a hypothesis claims &quot;TREM2 variants increase AD risk by 2-4x&quot;,<br>the evidence chain traces through an experiment artifact that contains the actual<br>odds ratio, confidence interval, sample size, and study design.</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h4 style=\"color:#e0e0e0;margin:1.2rem 0 0.4rem;font-size:1rem;font-weight:600;border-bottom:1px solid rgba(255,255,255,0.08);padding-bottom:0.2rem\">Why This Matters</h4></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Evidence grounding</strong>: Claims without structured experiment backing are unverifiable</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Replication tracking</strong>: Multiple experiments testing the same hypothesis can be compared</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Meta-analysis</strong>: Structured results enable systematic aggregation across studies</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Debate quality</strong>: Skeptic agents can challenge specific methodological details</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">KG enrichment</strong>: Extracted entities, relations, and measurements grow the Atlas</li></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h4 style=\"color:#e0e0e0;margin:1.2rem 0 0.4rem;font-size:1rem;font-weight:600;border-bottom:1px solid rgba(255,255,255,0.08);padding-bottom:0.2rem\">Neuro Focus (Preventing Sprawl)</h4></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\">Extraction schemas are scoped to neuroscience-relevant experiment types:\n<ul style=\"padding-left:1.5rem;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Genetic association</strong> (GWAS, candidate gene studies, Mendelian genetics)</li>\n<li style=\"margin:0.15rem 
0;color:#bbb\"><strong style=\"color:#e0e0e0\">Protein interaction</strong> (co-IP, mass spec, yeast two-hybrid, proximity labeling)</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Gene expression</strong> (RNA-seq, qPCR, microarray, single-cell)</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Animal model</strong> (transgenic mice, behavioral assays, histology)</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Cell biology</strong> (cell culture, organoids, iPSC-derived neurons)</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Clinical</strong> (biomarker studies, imaging, cognitive assessments, drug trials)</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Neuropathology</strong> (histology, immunostaining, electron microscopy)</li>\n</ul><br>Additional types can be added through Schema Governance (q-schema-governance).</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Experiment Artifact Schema</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><pre style=\"background:#0a0a14;padding:0.8rem;border-radius:6px;border:1px solid rgba(79,195,247,0.15);color:#e0e0e0;font-size:0.8rem;overflow-x:auto;margin:0.5rem 0;line-height:1.5\"><code># Type-specific metadata for artifact_type=&#x27;experiment&#x27;\n{\n    # Source provenance\n    &quot;paper_pmid&quot;: &quot;12345678&quot;,\n    &quot;paper_section&quot;: &quot;Results, Figure 3A&quot;,\n    &quot;paper_doi&quot;: &quot;10.1038/s41586-023-...&quot;,\n    \n    # Experimental design\n    &quot;experiment_type&quot;: &quot;genetic_association&quot;,  # From controlled vocabulary\n    &quot;model_system&quot;: &quot;human cohort&quot;,            # mouse, rat, human, cell_line, organoid, 
etc.\n    &quot;species&quot;: &quot;Homo sapiens&quot;,\n    &quot;tissue&quot;: &quot;prefrontal cortex&quot;,\n    &quot;sample_size&quot;: 1500,\n    &quot;control_description&quot;: &quot;Age-matched healthy controls (n=800)&quot;,\n    &quot;methods_summary&quot;: &quot;Genome-wide association study of AD risk variants...&quot;,\n    \n    # Results (structured)\n    &quot;results&quot;: {\n        &quot;primary_finding&quot;: &quot;TREM2 R47H variant associated with increased AD risk&quot;,\n        &quot;measurements&quot;: [\n            {\n                &quot;metric&quot;: &quot;odds_ratio&quot;,\n                &quot;value&quot;: 2.92,\n                &quot;ci_lower&quot;: 2.09,\n                &quot;ci_upper&quot;: 4.09,\n                &quot;p_value&quot;: 3.4e-12,\n                &quot;comparison&quot;: &quot;R47H carriers vs non-carriers&quot;\n            }\n        ],\n        &quot;effect_direction&quot;: &quot;risk_increasing&quot;,\n        &quot;replication_status&quot;: &quot;replicated&quot;  # replicated, not_replicated, awaiting, conflicting\n    },\n    \n    # Context\n    &quot;disease_context&quot;: &quot;Alzheimer&#x27;s disease&quot;,\n    &quot;entities_mentioned&quot;: [&quot;TREM2&quot;, &quot;R47H&quot;, &quot;microglia&quot;, &quot;neuroinflammation&quot;],\n    &quot;conclusions&quot;: &quot;TREM2 R47H is a significant risk factor for late-onset AD...&quot;,\n    &quot;limitations&quot;: &quot;European ancestry cohort only; effect size may vary...&quot;,\n    \n    # Extraction provenance\n    &quot;extracted_by&quot;: &quot;agent-atlas-extractor&quot;,\n    &quot;extraction_method&quot;: &quot;llm_structured_extraction&quot;,\n    &quot;extraction_confidence&quot;: 0.85,\n    &quot;extraction_timestamp&quot;: &quot;2026-04-03T12:00:00Z&quot;,\n    &quot;human_verified&quot;: false\n}</code></pre></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 
0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Open Tasks</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-01-SCHM: Define experiment extraction schemas per experiment type (P93)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-02-PIPE: Build LLM extraction pipeline from paper abstracts/full text (P92)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-03-LINK: Auto-link extracted experiments to KG entities (P90)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-04-QUAL: Extraction quality scoring and confidence calibration (P88)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-05-REPL: Replication tracking — match experiments testing same hypothesis (P86)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-06-META: Meta-analysis support — aggregate results across experiments (P84)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-07-BKFL: Backfill 188 existing experiment artifacts with structured metadata (P91)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; atl-ex-08-API: API endpoints for experiment browsing, search, and filtering (P87)</div></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Dependency Chain</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><pre style=\"background:#0a0a14;padding:0.8rem;border-radius:6px;border:1px solid rgba(79,195,247,0.15);color:#e0e0e0;font-size:0.8rem;overflow-x:auto;margin:0.5rem 0;line-height:1.5\"><code>atl-ex-01-SCHM (Schema definition)\n    ↓\natl-ex-02-PIPE (Extraction pipeline) ──→ atl-ex-07-BKFL (Backfill existing)\n    ↓\natl-ex-03-LINK (KG entity linking)\n    ↓\natl-ex-04-QUAL (Quality scoring) ──→ atl-ex-05-REPL 
(Replication tracking)\n    ↓                                        ↓\natl-ex-08-API (API endpoints)        atl-ex-06-META (Meta-analysis)</code></pre></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Integration Points</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><ul style=\"padding-left:1.5rem;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Evidence Chains</strong> (b5298ea7): Extracted experiments become ground-truth evidence entries</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Knowledge Units</strong> (08c73de3): Experiment results become atomic, composable evidence blocks</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Artifact Debates</strong> (q-artifact-debates): Experiments are debatable — methodology can be challenged</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Schema Governance</strong> (q-schema-governance): Experiment schemas evolve through governance</li>\n<li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Epistemic Rigor</strong> (q-epistemic-rigor): Experiments anchor the falsifiability chain</li>\n</ul>\n<h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Hypothesis Ranking Feedback</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\">Extracted experiments create a <strong style=\"color:#e0e0e0\">bidirectional scoring relationship</strong> with hypotheses:</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Hypothesis → Experiment</strong>: Hypotheses with explicit falsifiable predictions attract experiment design 
tasks. The system should proactively generate experiment proposals for high-scoring hypotheses that lack associated experiments.</li></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Experiment → Hypothesis</strong>: An experiment&#x27;s quality scores (feasibility, impact, information gain) feed back into the linked hypothesis&#x27;s composite score:</li>\n   - Hypotheses with feasible, high-impact associated experiments rank higher<br>   - Hypotheses with no testable experiments are penalized in relative ranking<br>   - When experiment results confirm or falsify predictions, Bayesian updates adjust hypothesis confidence</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><li style=\"margin:0.15rem 0;color:#bbb\"><strong style=\"color:#e0e0e0\">Experiment Quality Dimensions</strong> (for ranking feedback):</li>\n   - <strong style=\"color:#e0e0e0\">Feasibility</strong> (0-1): Can this experiment actually be executed with available resources?<br>   - <strong style=\"color:#e0e0e0\">Impact</strong> (0-1): How much would the result change our world model?<br>   - <strong style=\"color:#e0e0e0\">Information gain</strong> (0-1): How much uncertainty does this experiment resolve?<br>   - <strong style=\"color:#e0e0e0\">Novelty</strong> (0-1): Does this test something not yet tested?</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\">These dimensions should be computed during experiment extraction (atl-ex-04-QUAL) and stored in experiment metadata for consumption by the hypothesis scoring pipeline.</p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Success Criteria</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><div style=\"margin:0.2rem 0;color:#bbb\">&#9744; &gt;500 papers have at least one 
structured experiment extracted</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; Experiment artifacts have &gt;80% field completeness (non-null structured metadata)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; Extracted experiments link to KG entities with &gt;90% precision</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; Extraction confidence correlates with human verification (calibration)</div>\n<div style=\"margin:0.2rem 0;color:#bbb\">&#9744; Replication tracking identifies conflicting results for &gt;10 hypothesis pairs</div></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\"><h3 style=\"color:#4fc3f7;margin:1.4rem 0 0.5rem;font-size:1.1rem;font-weight:700;border-bottom:2px solid rgba(79,195,247,0.3);padding-bottom:0.2rem\">Work Log</h3></p><p style=\"color:#bbb;line-height:1.6;margin:0.4rem 0\">_No entries yet._<br></p></div>","spec_file":"quest_experiment_extraction_spec.md","commits":[{"hash":"59405c7c5","message":"docs: AGENTS.md — document Path A/B/C task completion semantics [task:docs-agents-completion] (#40)","date":"2026-04-25"},{"hash":"e5b5848a0","message":"WIP on orchestra/task/8fcc8dc8-debate-artifact-version-pinning-referenc: 8a24c2fa2 [Senate] Delete broken restore_database.sh (#38)","date":"2026-04-25"},{"hash":"50e5ffcfe","message":"index on orchestra/task/8fcc8dc8-debate-artifact-version-pinning-referenc: 8a24c2fa2 [Senate] Delete broken restore_database.sh (#38)","date":"2026-04-25"},{"hash":"0d37f5fce","message":"untracked files on orchestra/task/8fcc8dc8-debate-artifact-version-pinning-referenc: 8a24c2fa2 [Senate] Delete broken restore_database.sh (#38)","date":"2026-04-25"},{"hash":"48f8d2fe3","message":"feat: surface all five SciDEX layers in nav [task:cba19c94-1724-4d5a-b89d-96c73c25f12a] (#39)","date":"2026-04-25"},{"hash":"1f0e35929","message":"Squash merge: orchestra/task/b1a8e549-cross-cutting-wire-existing-k-dense-skil (2 commits)","date":"2026-04-25"},{"hash":"ddb7db381","message":"[Agora] Wire 
existing K-Dense-backed tools into debate orchestration [task:b1a8e549-6f31-43c5-80f5-7c4717c267e4]","date":"2026-04-25"},{"hash":"76b71427a","message":"[Agora] Wire existing K-Dense-backed tools into debate orchestration [task:b1a8e549-6f31-43c5-80f5-7c4717c267e4]","date":"2026-04-25"},{"hash":"779e85c3a","message":"[Senate] Verify /resources dashboard complete; check off acceptance criteria [task:82074adc-507f-4e6b-9092-e2ceee79e7d4]","date":"2026-04-25"},{"hash":"4c66a8e09","message":"[Senate] Establish emergency access recovery procedures [task:e643cdd3-afd6-410f-a366-a6297d112127]","date":"2026-04-25"},{"hash":"7265a06b4","message":"Squash merge: orchestra/task/b1a8e549-cross-cutting-wire-existing-k-dense-skil (1 commits)","date":"2026-04-25"},{"hash":"58406ec64","message":"[Atlas] Dashboard artifact type: living web views with data source rendering [task:a17-28-DASH0001]","date":"2026-04-25"},{"hash":"8a24c2fa2","message":"[Senate] Delete broken restore_database.sh (#38)","date":"2026-04-25"},{"hash":"b98a1fa18","message":"[Senate] Delete broken restore_database.sh","date":"2026-04-25"},{"hash":"e846f82ef","message":"[Senate] Refresh BACKUP_RESTORE.md + docs/runbooks/emergency_restore.md (#37)","date":"2026-04-25"},{"hash":"43972a45e","message":"[Senate] Refresh BACKUP_RESTORE.md + docs/runbooks/emergency_restore.md","date":"2026-04-25"},{"hash":"2c7dbfe7f","message":"[Senate] Delete 9 obsolete backup scripts/units (continuation of Phase A-D cleanup) (#36)","date":"2026-04-25"},{"hash":"9743eb298","message":"[Senate] Delete 9 obsolete backup scripts/units (continuation of Phase A-D cleanup)","date":"2026-04-25"},{"hash":"3e72d8383","message":"[Agora] Wire 3 missing tools into debate skill_functions, fix citation persistence bug [task:b1a8e549-6f31-43c5-80f5-7c4717c267e4]","date":"2026-04-25"},{"hash":"4310e9854","message":"[Demo] Work log: figures verified complete — 140/140 analyses covered [task:df201d8f-4b89-4258-9148-eb1028fc1fbd]","date":"2026-04-24"}]}