Quality Standards & Busywork Prevention

Vision

Centralize quality-over-quantity governance across all SciDEX layers. Define what counts as scientifically meaningful output vs busywork, and enforce those standards via automated scanners and governance review.

Principles

No stubs. Accepted deliverables must not be placeholder pages, empty notebooks, 0-edge entities, or unimplemented agents.

No busywork CI. Recurring tasks that iterate counters, touch timestamps, or run checks with no user-visible output are archived — replaced by monitors only when they surface actionable signal.

Quality over quantity. Each accepted artifact (hypothesis, analysis, challenge, debate, notebook, wiki page) must demonstrate scientific utility and non-obvious insight. Volume without quality is a negative signal.

Showcase orientation. Every layer should produce ≥3 demo-grade showcase examples before scaling breadth.

Parallel agents for scale. Tasks and quests that touch many items (≥10 updates, ≥10 backfills, ≥10 specs) should explicitly invoke parallel agents (3–5 concurrent) rather than sequential loops — both for speed and for diversity of approach.

Per-layer quality bars

Forge: no stub tools; each tool registered must cite ≥1 use in a published analysis/hypothesis within 30 days
Atlas: no 0-edge entities; no <50-word wiki pages; citations take priority over aesthetic edits
Exchange: no trivial tokens or bot-pumped markets; price moves require real evidence
Agora: no <3-turn debates; no one-sided debates; counter-arguments required
Senate: no governance theater; every rule must have enforcement code
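
Two of these bars (Atlas and Agora) lend themselves to direct mechanical checks. The following is a minimal sketch only: the `WikiPage` and `Debate` records, their field names, and the "pro"/"con" stance labels are assumptions made for illustration, not the SciDEX schema.

```python
from dataclasses import dataclass, field


@dataclass
class WikiPage:
    # Hypothetical record; field names are illustrative only.
    text: str
    edge_count: int


@dataclass
class Debate:
    # Hypothetical record: each turn is a (stance, text) pair.
    turns: list = field(default_factory=list)


def atlas_violations(page: WikiPage) -> list:
    """Return the Atlas bars this page fails (empty list means it passes)."""
    problems = []
    if page.edge_count == 0:
        problems.append("0-edge entity")
    if len(page.text.split()) < 50:
        problems.append("<50-word wiki page")
    return problems


def agora_violations(debate: Debate) -> list:
    """Return the Agora bars this debate fails (empty list means it passes)."""
    problems = []
    if len(debate.turns) < 3:
        problems.append("<3-turn debate")
    # A debate with fewer than two distinct stances has no counter-argument.
    stances = {stance for stance, _ in debate.turns}
    if len(stances) < 2:
        problems.append("one-sided debate")
    return problems
```

The Forge, Exchange, and Senate bars depend on usage history, market evidence, and enforcement code, so they are left out of this sketch.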

Acceptance criteria

Template "Quality Over Quantity Clause" added to 10+ high-risk quests
Automated scanner identifies stubs (<50KB notebooks, <1-citation wiki pages, 0-link artifacts, unimplemented agents) — produces a weekly report
Per-layer quality dashboard: stub ratio, mean artifact size, citation density, parallel-agent usage ratio
Governance approval required for any quest >100 tasks; must declare quality bar upfront
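
The scanner thresholds and the dashboard's stub ratio can be illustrated with a short sketch. The `Artifact` record and its fields are hypothetical, and treating 50KB as 50 × 1024 bytes is an assumption; detecting unimplemented agents is not covered here.

```python
from dataclasses import dataclass

# "<50KB notebooks"; reading 1 KB as 1024 bytes is an assumption.
NOTEBOOK_MIN_BYTES = 50 * 1024


@dataclass
class Artifact:
    # Hypothetical record; field names are illustrative only.
    kind: str        # e.g. "notebook" or "wiki_page"
    size_bytes: int
    citations: int
    links: int


def stub_reasons(a: Artifact) -> list:
    """Return every stub criterion the artifact trips (empty list means not a stub)."""
    reasons = []
    if a.kind == "notebook" and a.size_bytes < NOTEBOOK_MIN_BYTES:
        reasons.append("notebook under 50KB")
    if a.kind == "wiki_page" and a.citations < 1:
        reasons.append("wiki page without citations")
    if a.links == 0:
        reasons.append("0-link artifact")
    return reasons


def stub_ratio(artifacts: list) -> float:
    """Fraction of artifacts flagged as stubs; 0.0 for an empty layer."""
    if not artifacts:
        return 0.0
    stubs = sum(1 for a in artifacts if stub_reasons(a))
    return stubs / len(artifacts)
```

Returning the reasons rather than a bare boolean lets a weekly report say why each artifact was flagged.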

Parallel-agent guideline (copy into new tasks)

When a task operates over ≥10 items (backfill, repair, enrichment, spec creation, etc.), the executing agent should spawn 3–5 parallel sub-agents, each handling a disjoint slice. Sequential single-agent execution is only acceptable for <10-item tasks or tasks with strict ordering constraints.
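
One way to picture the disjoint-slice fan-out is the sketch below. It uses threads from the Python standard library as a stand-in; `handle_slice` is a hypothetical placeholder for whatever mechanism actually spawns a sub-agent, and the round-robin split is just one possible slicing strategy.

```python
from concurrent.futures import ThreadPoolExecutor


def disjoint_slices(items: list, n_workers: int) -> list:
    """Round-robin split: every item lands in exactly one slice."""
    return [items[i::n_workers] for i in range(n_workers)]


def run_parallel(items: list, handle_slice, n_workers: int = 4) -> list:
    """Fan out to sub-agents for tasks with >=10 items; otherwise run sequentially.

    `handle_slice` is a hypothetical stand-in for spawning one sub-agent.
    """
    if not 3 <= n_workers <= 5:
        raise ValueError("guideline calls for 3-5 concurrent sub-agents")
    if len(items) < 10:
        # Small task: a single sequential pass is acceptable.
        return [handle_slice(items)]
    slices = disjoint_slices(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves slice order in the returned results.
        return list(pool.map(handle_slice, slices))
```

Tasks with strict ordering constraints would bypass this helper entirely and run sequentially.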

When to choose `one_shot` vs `recurring` vs `iterative`

Orchestra supports three task types. Agents writing new tasks (via orchestra create or via spec front-matter) MUST explicitly declare
which one applies. The decision tree:

Single discrete action, no validation loop → one_shot

- Examples: "fix broken link X", "backfill column Y for one table",
"author spec Z", "write parity report section".
- Closes via orchestra task complete after commits land.

Runs forever on a schedule / driver cycle → recurring

- Examples: "CI health check", "link scanner", "market price
re-scorer", "driver queue processor".
- Never ends; the scheduler re-dispatches on the configured
frequency (every-5-min / hourly / every-6h / daily / weekly).
- Zero-commit runs are legal no-ops (the supervisor handles that).

Has a definite end state but needs multiple claims + a validator gate → iterative

- Examples: "port 15 Biomni use cases to SciDEX", "backfill 50
wiki pages each with ≥800 words and 3 citations", "grow
knowledge graph from 5K → 20K edges under quality audit",
"port K-Dense skill suite (10 skills) with acceptance tests".
- Runs across many claim-commit-validate cycles.
- Closes when a separate validator agent (different persona and/or
model) returns verdict=complete min_passes_in_row times in a row
against the stated completion_criteria.
- Required front-matter fields:

    type: iterative
    max_iterations: <int>        # hard cap; supervisor blocks beyond this
    validator:
      model: claude-opus-4-6
      persona: skeptic           # one of: skeptic|theorist|expert|synthesizer
      min_passes_in_row: 1       # raise to 2+ for high-stakes gates
      validator: debate:skeptic
    completion_criteria:
      <criterion_key>:
        description: <plain-English what done means>
        checks: [ ... ]          # machine-checkable if possible

- The validator reads the append-only iteration_work_log on the
task row plus the completion_criteria_json. Workers MUST NOT
call orchestra task complete on iterative tasks — the
validator gate is the only way they close. See
/home/ubuntu/Orchestra/docs/iterative_tasks.md for the full
state machine.
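
The closing rule (consecutive passing verdicts, bounded by max_iterations) can be sketched as follows. This is an illustration under assumptions, not the Orchestra state machine: the state names "closed", "blocked", and "open" and the non-passing verdict string are invented here, and the authoritative behavior is in the iterative_tasks.md document referenced above.

```python
def should_close(verdicts: list, min_passes_in_row: int = 1) -> bool:
    """True when the most recent `min_passes_in_row` verdicts are all "complete"."""
    if min_passes_in_row < 1 or len(verdicts) < min_passes_in_row:
        return False
    return all(v == "complete" for v in verdicts[-min_passes_in_row:])


def next_state(verdicts: list, iterations_used: int, max_iterations: int,
               min_passes_in_row: int = 1) -> str:
    """Hypothetical state names; only the validator verdicts can close the task."""
    if should_close(verdicts, min_passes_in_row):
        return "closed"
    if iterations_used >= max_iterations:
        # Hard cap reached without enough consecutive passes.
        return "blocked"
    return "open"
```

For example, with min_passes_in_row set to 2, a verdict history of complete, incomplete, complete does not close the task, because the passing streak was interrupted.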

Quick examples

| Goal | Type | Why |
| --- | --- | --- |
| Fix hero image on /analysis/12345 | `one_shot` | One thing, trivial to verify. |
| Nightly stub-scanner report | `recurring` | Runs forever, no end state. |
| Backfill debate traces for 40 analyses to quality ≥0.65 | `iterative` | Bounded; each iteration processes a batch; validator checks mean quality. |
| Port the 15 Biomni use cases | `iterative` | Bounded at 15; per-use-case criteria; Skeptic validator gate. |
| Daily KG edge-growth monitor | `recurring` | Never ends. |
| Author governance quest spec | `one_shot` | Single artifact. |
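
The decision tree and the examples above reduce to two questions. The sketch below is a simplification for illustration; the function and its two boolean parameters are hypothetical and not part of Orchestra.

```python
def choose_task_type(has_end_state: bool, needs_validator_gate: bool) -> str:
    """Map the two decision-tree questions onto the three Orchestra task types."""
    if not has_end_state:
        # Runs forever on a schedule or driver cycle.
        return "recurring"
    if needs_validator_gate:
        # Bounded, but needs multiple claims plus a validator gate.
        return "iterative"
    # Single discrete action, no validation loop.
    return "one_shot"


# The six quick examples, expressed as (has_end_state, needs_validator_gate).
examples = {
    "Fix hero image on /analysis/12345": (True, False),
    "Nightly stub-scanner report": (False, False),
    "Backfill debate traces for 40 analyses": (True, True),
    "Port the 15 Biomni use cases": (True, True),
    "Daily KG edge-growth monitor": (False, False),
    "Author governance quest spec": (True, False),
}
chosen = {goal: choose_task_type(*answers) for goal, answers in examples.items()}
```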

Guidance for task creators

If you choose iterative, write the completion_criteria BEFORE
creating the task: the criteria are the contract the validator
enforces. Vague criteria will lead the validator to either
rubber-stamp (if too lenient) or block forever (if impossible).

Set min_passes_in_row >= 2 whenever a single passing verdict
would be insufficient proof (e.g. anything touching real-money
markets, KG merges, published artifacts).

Cap max_iterations realistically. Default is 20; lower it if the
scope is obviously tighter (e.g. max_iterations=15 for a
15-item port).

Use orchestra task promote-to-quest <id> to escalate an
iterative task into a quest when its scope grows mid-flight.
Never use it to skip the validator gate.
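
A task creator could lint front-matter against this guidance before submitting. The sketch below assumes the front-matter has already been parsed into a plain dict; the function name and message wording are hypothetical, and this is not a check that Orchestra itself is known to perform.

```python
VALID_TYPES = {"one_shot", "recurring", "iterative"}
# Top-level fields listed as required for iterative tasks in this spec.
REQUIRED_ITERATIVE = ("max_iterations", "validator", "completion_criteria")


def front_matter_problems(fm: dict) -> list:
    """Return human-readable problems; an empty list means the front-matter looks sane."""
    problems = []
    task_type = fm.get("type")
    if task_type not in VALID_TYPES:
        problems.append(f"type must be one of {sorted(VALID_TYPES)}")
        return problems
    if task_type != "iterative":
        # Only iterative tasks carry the extra required fields.
        return problems
    for key in REQUIRED_ITERATIVE:
        if key not in fm:
            problems.append(f"missing required field: {key}")
    max_iter = fm.get("max_iterations")
    if max_iter is not None and (not isinstance(max_iter, int) or max_iter < 1):
        problems.append("max_iterations must be a positive integer")
    if "completion_criteria" in fm and not fm["completion_criteria"]:
        # Criteria are the contract the validator enforces, so they cannot be empty.
        problems.append("completion_criteria must not be empty")
    return problems
```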

File: quest_quality_standards_spec.md

Modified: 2026-04-24 07:15

Size: 6.1 KB
