[Atlas] Score 10 registered datasets for quality and provenance done

Registered datasets lack quality_score values. Dataset quality scoring supports citation rewards, reuse, and governance.

Verification:

  • 10 datasets have quality_score between 0 and 1
  • Scores consider schema completeness, provenance, citations, license, and reuse readiness
  • Remaining unscored dataset count is reduced

Start by reading this task's spec and checking for duplicate recent work.
Spec File

Goal

Populate quality scores for registered datasets so dataset reuse, citation rewards, and governance can prioritize well-documented scientific data. Scores should consider schema completeness, provenance, license clarity, citation coverage, and reuse readiness.

Acceptance Criteria

☐ The selected datasets have quality_score values between 0 and 1
☐ Each score is justified by schema, provenance, citation, license, and reuse checks
☐ No dataset receives a high score without real provenance or schema evidence
☐ The before/after unscored-dataset count is recorded
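
The first three criteria can be made concrete with a small scoring function. The sketch below is one possible rubric, not the project's actual one: the weights, the `EVIDENCE_CAP` value, and the name `score_dataset` are all hypothetical, and it assumes each of the five checks has already been reduced to a number in [0, 1]. It reads the third criterion as "cap the score when either schema or provenance evidence is absent", which is an interpretation rather than something the spec states.

```python
# Hypothetical weights: the spec names the five criteria but not their weights.
WEIGHTS = {
    "schema": 0.25,
    "provenance": 0.25,
    "citation": 0.20,
    "license": 0.15,
    "reuse": 0.15,
}

# Assumed ceiling applied when schema or provenance evidence is missing.
EVIDENCE_CAP = 0.5


def score_dataset(checks: dict[str, float]) -> float:
    """Combine per-criterion checks (each in [0, 1]) into one quality_score."""
    for name in WEIGHTS:
        value = checks[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} check must be in [0, 1], got {value}")

    # Weighted average; the weights sum to 1, so the result stays in [0, 1].
    raw = sum(WEIGHTS[name] * checks[name] for name in WEIGHTS)

    # A dataset with no schema or no provenance evidence cannot score high,
    # however well it does on the other criteria.
    if checks["schema"] == 0.0 or checks["provenance"] == 0.0:
        raw = min(raw, EVIDENCE_CAP)

    return round(raw, 3)
```

Keeping the per-criterion values alongside the final number gives the rationale for each score without extra work.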

Approach

  • Inspect registered datasets and their schema_json, canonical_path, license, and citation metadata.
  • Evaluate each dataset against a consistent quality rubric.
  • Persist the score and concise rationale using existing database write patterns.
  • Verify score ranges and count reduction.
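
The persist-and-verify steps above might look like the following sketch. It uses Python's standard `sqlite3` module as a stand-in, since the spec does not say which database or write helpers the project uses. The `datasets` table name, its `id` and `quality_rationale` columns, and the function names are assumptions for illustration; only `quality_score` comes from the spec.

```python
import sqlite3


def count_unscored(conn: sqlite3.Connection) -> int:
    """Return how many rows in the (assumed) datasets table have no score yet."""
    row = conn.execute(
        "SELECT COUNT(*) FROM datasets WHERE quality_score IS NULL"
    ).fetchone()
    return row[0]


def persist_scores(
    conn: sqlite3.Connection,
    scored: dict[str, tuple[float, str]],
) -> tuple[int, int]:
    """Write scores and rationales, returning (unscored_before, unscored_after).

    `scored` maps a dataset id to a (quality_score, rationale) pair.
    """
    # Reject out-of-range scores before touching the database.
    for dataset_id, (score, _rationale) in scored.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{dataset_id}: score {score} is outside [0, 1]")

    before = count_unscored(conn)

    # `with conn` commits on success and rolls back if an update fails.
    with conn:
        conn.executemany(
            "UPDATE datasets SET quality_score = ?, quality_rationale = ? "
            "WHERE id = ?",
            [(score, rationale, dataset_id)
             for dataset_id, (score, rationale) in scored.items()],
        )

    after = count_unscored(conn)
    return before, after
```

The returned pair is the before/after unscored-dataset count that the acceptance criteria ask to record; in the real task the writes would go through whatever write path the codebase already provides.
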

Dependencies

  • quest-engine-ci - Generates this task when queue depth is low and unscored datasets exist.

Dependents

  • Dataset citation rewards, quality markets, and Atlas governance depend on dataset quality scores.

Work Log
