[Atlas] Score registered datasets for quality and provenance


Goal

Populate quality scores for registered datasets so dataset reuse, citation rewards, and governance can prioritize well-documented scientific data. Scores should consider schema completeness, provenance, license clarity, citation coverage, and reuse readiness.
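One way to combine the five dimensions named above into a single value is a weighted mean of per-dimension sub-scores. The sketch below is illustrative only: the `WEIGHTS` table and the `quality_score` helper are hypothetical, and the spec does not state how the dimensions should be weighted against each other.

```python
# Hypothetical weights (assumption): the spec lists the five dimensions
# but does not say how much each one should count.
WEIGHTS = {
    "schema": 0.25,
    "provenance": 0.25,
    "license": 0.15,
    "citation": 0.15,
    "reuse": 0.20,
}


def quality_score(checks: dict[str, float]) -> float:
    """Return the weighted mean of per-dimension sub-scores in [0, 1].

    A dimension missing from `checks` counts as 0, so an undocumented
    dataset cannot score well by omission.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = checks.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} sub-score {value!r} is outside [0, 1]")
        total += weight * value
    return round(total, 3)
```

Because the weights sum to 1 and every sub-score is bounded, the result always lands in the 0 to 1 range that the acceptance criteria require.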

Acceptance Criteria

☐ The selected datasets have quality_score values between 0 and 1
☐ Each score is justified by schema, provenance, citation, license, and reuse checks
☐ No dataset receives a high score without real provenance or schema evidence
☐ The before/after unscored-dataset count is recorded
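The first three criteria can be checked mechanically once scores exist. The following is a minimal sketch under stated assumptions: the record fields `id`, `rationale`, `has_provenance`, and `has_schema`, and the `HIGH_SCORE` threshold, are hypothetical names chosen for illustration. It also reads the third criterion as "a high score needs at least one of provenance or schema evidence"; a stricter reading would require both.

```python
# Assumption: the spec does not define what counts as a "high" score.
HIGH_SCORE = 0.7


def acceptance_violations(scored: list[dict]) -> list[str]:
    """Return human-readable problems found in a batch of scored datasets."""
    problems = []
    for d in scored:
        score = d.get("quality_score")
        # Criterion 1: the score exists and lies between 0 and 1.
        if score is None or not 0.0 <= score <= 1.0:
            problems.append(f"{d['id']}: score {score!r} outside [0, 1]")
            continue
        # Criterion 2: every score carries a justification.
        if not d.get("rationale"):
            problems.append(f"{d['id']}: missing rationale")
        # Criterion 3: no high score without provenance or schema evidence.
        has_evidence = d.get("has_provenance") or d.get("has_schema")
        if score >= HIGH_SCORE and not has_evidence:
            problems.append(f"{d['id']}: high score without evidence")
    return problems
```

An empty return value means the batch passes these three checks; the before/after count in the fourth criterion has to be recorded separately at write time.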

Approach

  • Inspect registered datasets and their schema_json, canonical_path, license, and citation metadata.
  • Evaluate each dataset against a consistent quality rubric.
  • Persist the score and a concise rationale using existing database write patterns.
  • Verify that every score falls between 0 and 1 and that the unscored-dataset count has dropped.
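
The four steps above can be sketched end to end with Python's standard `sqlite3` module. This is a stand-in, not the project's actual write path: the table name `datasets` and the columns `id`, `citation_count`, and `quality_rationale` are assumptions, and only `schema_json`, `canonical_path`, `license`, and `quality_score` come from this spec. The toy rubric uses four pass/fail checks and leaves out reuse readiness for brevity.

```python
import json
import sqlite3


def has_schema(schema_json):
    """True when schema_json parses to a non-empty JSON value."""
    try:
        return bool(json.loads(schema_json or "null"))
    except json.JSONDecodeError:
        return False


def unscored_count(conn):
    """Count datasets that still have no quality_score."""
    sql = "SELECT COUNT(*) FROM datasets WHERE quality_score IS NULL"
    return conn.execute(sql).fetchone()[0]


def score_row(row):
    """Toy rubric: the fraction of four evidence checks that pass."""
    checks = {
        "schema": has_schema(row["schema_json"]),
        "provenance": bool(row["canonical_path"]),
        "license": bool(row["license"]),
        "citation": (row["citation_count"] or 0) > 0,
    }
    score = sum(checks.values()) / len(checks)
    missing = [name for name, ok in checks.items() if not ok]
    rationale = "missing: " + ", ".join(missing) if missing else "all checks passed"
    return score, rationale


def score_unscored(conn, limit):
    """Score up to `limit` unscored datasets; return (before, after) counts."""
    conn.row_factory = sqlite3.Row
    before = unscored_count(conn)
    rows = conn.execute(
        "SELECT * FROM datasets WHERE quality_score IS NULL ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    with conn:  # commits on success, rolls back if an update raises
        for row in rows:
            score, rationale = score_row(row)
            conn.execute(
                "UPDATE datasets SET quality_score = ?, quality_rationale = ? WHERE id = ?",
                (score, rationale, row["id"]),
            )
    return before, unscored_count(conn)
```

Returning both counts from the same call makes it straightforward to record the before/after figure alongside the batch that produced it.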

Dependencies

  • quest-engine-ci - Generates this task when queue depth is low and unscored datasets exist.

Dependents

  • Dataset citation rewards, quality markets, and Atlas governance depend on dataset quality scores.

Work Log

File: quest_engine_dataset_quality_scoring_spec.md
Modified: 2026-04-24 07:15
Size: 1.3 KB