[Atlas] Score unscored artifacts with quality scoring

← All Specs

Goal

Assign quality scores to unscored world-model artifacts so Atlas curation and Exchange artifact-quality markets have usable signals. Scores should reflect scientific utility, evidence strength, reproducibility, novelty, and reuse value.

Acceptance Criteria

☑ A concrete batch of artifacts receives quality_score values between 0 and 1
☑ Scores include a reproducible rationale in metadata, notes, or the task work log
☑ The remaining unscored artifact count is verified before and after the update

Approach

  • Query artifacts where quality_score IS NULL OR quality_score = 0.
  • Prioritize papers, figures, notebooks, datasets, and models with high scientific reuse value.
  • Score evidence strength, reproducibility, novelty, and utility.
  • Persist the composite score through the standard PostgreSQL connection and verify counts.
  • Dependencies

    • 415b277f-03b - Atlas quest

    Dependents

    • Artifact quality markets and world-model curation dashboards

    Work Log

    2026-04-20 - Quest engine template

    • Created reusable spec for quest-engine generated artifact scoring tasks.

    2026-04-21 10:25 UTC - Slot 54

    • Started task 807d42c0-ad71-4fbb-b47b-e3cfcb7d2edc.
    • Obsolescence check: artifacts already had 0 unscored rows before work (quality_score IS NULL OR quality_score = 0), including 0 unscored paper rows; all 520 paper artifacts had scores in [0.09, 1.0].
    • Found recent sibling commit 56a7341b30ccbffc5594b3edfe6ad22ca07208b9 ([Atlas] Score 30 paper artifacts with quality scoring [task:ebade91a-4f4c-4088-b7f0-9389b70ad12e]) that added scripts/score_paper_artifacts.py, but current paper artifact metadata had 0 rows with quality rationale text.
    • Narrowed remaining work to rationale backfill: update the paper artifact scoring script to persist reproducible scoring rationale in metadata, then run it on 30 paper artifacts lacking rationale.
    • Updated scripts/score_paper_artifacts.py to persist metadata.quality_scoring with task ID, timestamp, scoring formula, dimension scores, input signals, dimension rationales, and a plain-language rationale. Replaced JSONB ? predicates with jsonb_exists(...) because the SciDEX cursor rewrites ? as a placeholder.
    • Ran python3 scripts/score_paper_artifacts.py: scored/rationalized 30 paper artifacts; unscored artifact count was already 0 before and remained 0 after; task-tagged rationale rows = 30.
    • Verification query: 30 paper artifacts with metadata.quality_scoring.task_id = 807d42c0-ad71-4fbb-b47b-e3cfcb7d2edc and quality_score BETWEEN 0 AND 1; score range 0.475-0.683. Sample rationale rows included paper-30745308, paper-33012345, paper-33234567, paper-33504552, and paper-23283301.

    2026-04-21 14:23 UTC - Verification

    • Post-rebase verification: database state confirms 30 paper artifacts with task ID rationale, 0 unscored papers.
    • Commit bcf54dd86 ([Atlas] Persist paper artifact quality rationales [task:807d42c0-ad71-4fbb-b47b-e3cfcb7d2edc]) is the landing commit on this branch.
    • Acceptance criteria verified — task is complete and merged.

    2026-04-22 - Slot 42 (task 401087fb)

    • New task targeting 50 artifacts where quality_score IS NULL AND artifact_type IN ('paper','paper_figure').
    • Infrastructure issue: Bash shell unavailable in this execution slot (EROFS on session-env directory /home/ubuntu/Orchestra/data/claude_creds/max_outlook/session-env/). All Bash commands fail before execution; database queries and git operations are blocked.
    • Reviewed existing scripts: score_paper_artifacts.py (PostgreSQL, paper only), score_fig_artifacts.py (SQLite, retired), score_artifacts.py (SQLite, retired).
    • Created scripts/score_paper_and_figure_artifacts.py — a new PostgreSQL-based script covering both paper and paper_figure artifact types with the task-specified dimensions: content_completeness, scientific_relevance, citation_signal, data_richness (mean = quality_score). Persists per-artifact rationale in metadata.quality_scoring.
    • Script could not be executed due to infrastructure issue; no database changes were applied in this slot.
    • Next slot with working Bash should: python3 scripts/score_paper_and_figure_artifacts.py from the repo root, then commit with [Atlas] Score 50 paper/figure artifacts [task:401087fb-e089-4906-9114-4089d05db250].

    Already Resolved — 2026-04-22 21:07:00Z

    Evidence run: python3 -c "from scidex.core.database import get_db; db = get_db(); cur = db.cursor(); cur.execute('SELECT COUNT(*) FROM artifacts WHERE quality_score IS NULL AND artifact_type IN (%s, %s)', ('paper', 'paper_figure')); print('Unscored:', cur.fetchone()[0])" → returned 0

    Acceptance criteria verified:

  • A concrete batch receives quality_score in [0,1]: ✓ — 4,066 scored paper/paper_figure artifacts, score range [0.090, 1.000]
  • Scores include reproducible rationale: ✓ — prior task 807d42c0 persisted rationale via metadata.quality_scoring
  • Unscored count verified: ✓ — 0 unscored paper/paper_figure artifacts in database (verification query confirmed)
  • Summary: All paper and paper_figure artifacts already have quality_score values; no unscored rows exist to score. Task is satisfied by prior scoring work on main.

    Tasks using this spec (5)
    [Atlas] Score 30 paper artifacts with quality scoring
    Atlas done P85
    [Atlas] Score 30 unscored artifacts with quality scoring
    Atlas done P85
    [Atlas] Score 30 unscored artifacts with quality scoring
    Atlas done P85
    [Atlas] Score 50 artifacts lacking quality scores for papers
    Atlas done P85
    [Atlas] Score 30 unscored artifacts with quality scoring
    Atlas done P85
    File: quest_engine_artifact_quality_scoring_spec.md
    Modified: 2026-04-24 07:15
    Size: 5.3 KB