Quality Standards & Busywork Prevention
Vision
Centralize quality-over-quantity governance across all SciDEX layers. Define what counts as scientifically meaningful output vs busywork, and enforce those standards via automated scanners and governance review.
Principles
No stubs. Accepted deliverables must not be placeholder pages, empty notebooks, 0-edge entities, or unimplemented agents.
No busywork CI. Recurring tasks that iterate counters, touch timestamps, or run checks with no user-visible output are archived. A monitor may replace such a task only if it surfaces actionable signal.
Quality over quantity. Each accepted artifact (hypothesis, analysis, challenge, debate, notebook, wiki page) must demonstrate scientific utility and non-obvious insight. Volume without quality is a negative signal.
Showcase orientation. Every layer should produce ≥3 demo-grade showcase examples before scaling breadth.
Parallel agents for scale. Tasks and quests that touch many items (≥10 updates, ≥10 backfills, ≥10 specs) should explicitly invoke parallel agents (3–5 concurrent) rather than sequential loops, both for speed and for diversity of approach.
Per-layer quality bars
- Forge: no stub tools; each registered tool must be cited by ≥1 published analysis or hypothesis within 30 days
- Atlas: no 0-edge entities; no <50-word wiki pages; adding citations takes priority over aesthetic edits
- Exchange: no trivial tokens or bot-pumped markets; price moves require real evidence
- Agora: no <3-turn debates; no one-sided debates; counter-arguments required
- Senate: no governance theater; every rule must have enforcement code
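As one illustration of pairing a rule with enforcement code, the Agora bar above could be checked roughly as follows. This is a minimal sketch, not the actual SciDEX implementation: the `Turn` record and its `speaker` and `stance` fields are hypothetical, and "one-sided" is approximated as "every turn takes the same stance".

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # hypothetical field: who produced the turn
    stance: str   # hypothetical field: e.g. "pro" or "con"


def agora_violations(turns: list[Turn]) -> list[str]:
    """Return the Agora quality-bar violations found in one debate."""
    problems = []
    if len(turns) < 3:
        problems.append("fewer than 3 turns")
    # A debate with fewer than two distinct stances has no counter-argument.
    if len({t.stance for t in turns}) < 2:
        problems.append("one-sided: no counter-argument")
    return problems
```

An empty result means the debate clears the bar; a real checker would likely need a richer notion of stance than a single label.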
Acceptance criteria
- Template "Quality Over Quantity Clause" added to 10+ high-risk quests
- Automated scanner identifies stubs (notebooks under 50 KB, wiki pages with no citations, 0-link artifacts, unimplemented agents) and produces a weekly report
- Per-layer quality dashboard: stub ratio, mean artifact size, citation density, parallel-agent usage ratio
- Governance approval required for any quest with more than 100 tasks; the quest must declare its quality bar upfront
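The stub rules and the dashboard's stub ratio could be sketched as below. The `Artifact` record and its field names are hypothetical, 50 KB is read as 50 × 1024 bytes (an assumption), and the "unimplemented agents" rule is left out because the source does not say how that would be detected.

```python
from dataclasses import dataclass

NOTEBOOK_MIN_BYTES = 50 * 1024  # assumption: "50 KB" means 50 KiB


@dataclass
class Artifact:
    kind: str            # hypothetical labels: "notebook", "wiki_page", ...
    size_bytes: int = 0
    citations: int = 0
    links: int = 0


def is_stub(a: Artifact) -> bool:
    """Apply the scanner rules from the acceptance criteria to one artifact."""
    if a.links == 0:  # 0-link artifact of any kind
        return True
    if a.kind == "notebook" and a.size_bytes < NOTEBOOK_MIN_BYTES:
        return True
    if a.kind == "wiki_page" and a.citations < 1:
        return True
    return False


def stub_ratio(artifacts: list[Artifact]) -> float:
    """Fraction of artifacts flagged as stubs (0.0 for an empty list)."""
    if not artifacts:
        return 0.0
    return sum(is_stub(a) for a in artifacts) / len(artifacts)
```

A weekly report could then be as simple as grouping `is_stub` results by layer and emitting `stub_ratio` per group.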
Parallel-agent guideline (copy into new tasks)
When a task operates over ≥10 items (backfill, repair, enrichment, spec creation, etc.), the executing agent should spawn 3–5 parallel sub-agents, each handling a disjoint slice. Sequential single-agent execution is acceptable only for tasks with fewer than 10 items or tasks with strict ordering constraints.
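A minimal sketch of the slicing described above, with threads standing in for sub-agents (an assumption; how Orchestra actually spawns sub-agents is not specified here). `disjoint_slices`, `run_parallel`, and the `handle_slice` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def disjoint_slices(items: list, n_workers: int) -> list[list]:
    """Split items round-robin into n_workers slices with no overlap."""
    return [items[i::n_workers] for i in range(n_workers)]


def run_parallel(items: list, handle_slice, n_workers: int = 4) -> list:
    """Run handle_slice over disjoint slices; small tasks stay sequential."""
    if len(items) < 10:
        return handle_slice(items)
    if not 3 <= n_workers <= 5:
        raise ValueError("guideline asks for 3-5 parallel sub-agents")
    slices = disjoint_slices(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(handle_slice, slices))
    # Flatten per-slice results; order follows the slices, not the input.
    return [result for part in parts for result in part]
```

Round-robin slicing keeps slice sizes within one item of each other; tasks with strict ordering constraints should skip this path entirely, as the guideline says.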
When to choose one_shot vs recurring vs iterative
Orchestra supports three task types. Agents writing new tasks (via `orchestra create` or via spec front-matter) MUST explicitly declare which one applies. The decision tree:
- Single discrete action, no validation loop → `one_shot`
  - Examples: "fix broken link X", "backfill column Y for one table", "author spec Z", "write parity report section".
  - Closes via `orchestra task complete` after commits land.
- Runs forever on a schedule / driver cycle → `recurring`
  - Examples: "CI health check", "link scanner", "market price re-scorer", "driver queue processor".
  - Never ends; the scheduler re-dispatches on the configured frequency (every-5-min / hourly / every-6h / daily / weekly).
  - Zero-commit runs are legal no-ops (the supervisor handles that).
- **Has a definite end state but needs multiple claims + a validator gate** → `iterative`
  - Examples: "port 15 Biomni use cases to SciDEX", "backfill 50 wiki pages each with ≥800 words and 3 citations", "grow knowledge graph from 5K → 20K edges under quality audit", "port K-Dense skill suite (10 skills) with acceptance tests".
  - Runs across many claim-commit-validate cycles.
  - Closes when a separate validator agent (different persona and/or model) returns `verdict=complete` N times in a row (N = `min_passes_in_row`) against the stated `completion_criteria`.
  - Required front-matter fields:

        type: iterative
        max_iterations: <int>      # hard cap; supervisor blocks beyond this
        validator:
          model: claude-opus-4-6
          persona: skeptic         # one of: skeptic|theorist|expert|synthesizer
          min_passes_in_row: 1     # raise to 2+ for high-stakes gates
        validator: debate:skeptic
        completion_criteria:
          <criterion_key>:
            description: <plain-English what done means>
            checks: [ ... ]        # machine-checkable if possible

  - The validator reads the append-only `iteration_work_log` on the task row plus the `completion_criteria_json`. Workers MUST NOT call `orchestra task complete` on iterative tasks; the validator gate is the only way they close. See /home/ubuntu/Orchestra/docs/iterative_tasks.md for the full state machine.
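The closing rule for iterative tasks can be sketched as a small function. This is an illustration only, not the supervisor's actual state machine (see the linked document for that): the state names `open`, `closed`, and `blocked` and any verdict value other than `complete` are hypothetical.

```python
def gate_state(verdicts: list[str],
               min_passes_in_row: int = 1,
               max_iterations: int = 20) -> str:
    """Derive a task state from validator verdicts, one per iteration.

    Returns "closed" once `min_passes_in_row` consecutive "complete"
    verdicts are seen, "blocked" once the iteration cap is exhausted
    without that streak, and "open" otherwise.
    """
    streak = 0
    for iteration, verdict in enumerate(verdicts, start=1):
        if iteration > max_iterations:
            return "blocked"
        # Any non-complete verdict resets the consecutive-pass streak.
        streak = streak + 1 if verdict == "complete" else 0
        if streak >= min_passes_in_row:
            return "closed"
    return "blocked" if len(verdicts) >= max_iterations else "open"
```

Note that with `min_passes_in_row=2` a lone passing verdict followed by a failure leaves the task open, which is the point of requiring passes in a row.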
Quick examples
| Goal | Type | Why |
|---|---|---|
| Fix hero image on /analysis/12345 | one_shot | One thing, trivial to verify. |
| Nightly stub-scanner report | recurring | Runs forever, no end state. |
| Backfill debate traces for 40 analyses to quality ≥0.65 | iterative | Bounded; each iteration processes a batch; validator checks mean quality. |
| Port the 15 Biomni use cases | iterative | Bounded at 15; per-use-case criteria; Skeptic validator gate. |
| Daily KG edge-growth monitor | recurring | Never ends. |
| Author governance quest spec | one_shot | Single artifact. |
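The decision tree behind the table reduces to two questions. A minimal sketch, where the function and parameter names are hypothetical and only the three returned type names come from Orchestra:

```python
def choose_task_type(has_end_state: bool, needs_validator_gate: bool) -> str:
    """Map the two decision-tree questions onto an Orchestra task type."""
    if not has_end_state:
        return "recurring"   # runs forever on a schedule
    if needs_validator_gate:
        return "iterative"   # bounded, multiple claims, validator-gated
    return "one_shot"        # single discrete action
```

For example, "Nightly stub-scanner report" has no end state, so it is `recurring` regardless of the second answer.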
Guidance for task creators
- If you choose `iterative`, write the `completion_criteria` BEFORE creating the task: the criteria are the contract the validator enforces. Vague criteria will lead the validator to either rubber-stamp (if too lenient) or block forever (if impossible).
- Set `min_passes_in_row >= 2` whenever a single passing verdict would be insufficient proof (e.g. anything touching real money markets, KG merges, published artifacts).
- Cap `max_iterations` realistically. Default is 20; lower if the scope is obviously tighter (e.g. `max_iterations=15` for a 15-item port).
- Use `orchestra task promote-to-quest <id>` to escalate an iterative task into a quest when its scope grows mid-flight. Never use it to skip the validator gate.
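A task creator could sanity-check an iterative spec against this guidance before creating the task. The sketch below assumes the front-matter has already been parsed into a plain dict and that `min_passes_in_row` sits under `validator`; `lint_iterative_spec` and the `high_stakes` flag are hypothetical and not part of Orchestra.

```python
def lint_iterative_spec(spec: dict, high_stakes: bool = False) -> list[str]:
    """Return guidance violations for an iterative task spec."""
    if spec.get("type") != "iterative":
        return ["type is not iterative"]
    problems = []
    criteria = spec.get("completion_criteria") or {}
    if not criteria:
        problems.append("completion_criteria is empty")
    for key, criterion in criteria.items():
        if not (criterion or {}).get("description"):
            problems.append(f"{key}: missing description")
    max_iter = spec.get("max_iterations")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(max_iter, bool) or not isinstance(max_iter, int) or max_iter < 1:
        problems.append("max_iterations must be a positive integer")
    passes = (spec.get("validator") or {}).get("min_passes_in_row", 1)
    if high_stakes and passes < 2:
        problems.append("min_passes_in_row should be >= 2 for high-stakes tasks")
    return problems
```

A check like this catches missing fields, not vague wording; whether a criterion is specific enough still needs human or validator review.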