Quest: LLM Routing & Provider Management

Layer: Forge | Priority: P88 | Status: active

Vision

SciDEX and Orchestra have access to multiple LLMs with different cost/quality/speed profiles.
This quest implements capability-based routing: models declare their strengths on a 1-10 scale
across 6 dimensions, tasks declare minimum requirements, and Orchestra automatically matches
tasks to capable models.

Model Capabilities (1-10 scale)

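| Model    | coding | safety | reasoning | analysis | speed | instruction_following |
|----------|--------|--------|-----------|----------|-------|-----------------------|
| opus     | 10     | 10     | 10        | 9        | 3     | 10                    |
| sonnet   | 8      | 9      | 8         | 8        | 7     | 8                     |
| haiku    | 5      | 7      | 5         | 6        | 10    | 5                     |
| minimax  | 4      | 3      | 5         | 7        | 9     | 4                     |
| glm      | 3      | 3      | 4         | 5        | 7     | 3                     |
| codex    | 9      | 8      | 6         | 5        | 6     | 7                     |
| cline    | 7      | 7      | 6         | 5        | 6     | 6                     |
| mini_swe | 6      | 5      | 4         | 4        | 8     | 4                     |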

Matching Logic

A model can execute a task only if its rating meets or exceeds the task's requirement on every
dimension the task declares. Dimensions the task does not mention are not checked.
Tasks without requirements are available to any model (backwards-compatible).

Example: Task with {"coding": 8, "safety": 9} → only opus (10/10) and sonnet (8/9) qualify.
Codex (9/8) fails because safety 8 < 9. MiniMax (4/3) fails on both dimensions.
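
The matching rule can be illustrated with a minimal Python sketch. The ratings are copied from the capability table; the function names `model_meets_requirements` and `qualifying_models` are hypothetical and are not taken from orchestra_cli.py, and treating an unknown model or unrated dimension as 0 is an assumption of this sketch.

```python
from typing import Dict, List

# Ratings copied from the capability table (1-10 scale).
MODEL_CAPABILITIES: Dict[str, Dict[str, int]] = {
    "opus":     {"coding": 10, "safety": 10, "reasoning": 10, "analysis": 9, "speed": 3,  "instruction_following": 10},
    "sonnet":   {"coding": 8,  "safety": 9,  "reasoning": 8,  "analysis": 8, "speed": 7,  "instruction_following": 8},
    "haiku":    {"coding": 5,  "safety": 7,  "reasoning": 5,  "analysis": 6, "speed": 10, "instruction_following": 5},
    "minimax":  {"coding": 4,  "safety": 3,  "reasoning": 5,  "analysis": 7, "speed": 9,  "instruction_following": 4},
    "glm":      {"coding": 3,  "safety": 3,  "reasoning": 4,  "analysis": 5, "speed": 7,  "instruction_following": 3},
    "codex":    {"coding": 9,  "safety": 8,  "reasoning": 6,  "analysis": 5, "speed": 6,  "instruction_following": 7},
    "cline":    {"coding": 7,  "safety": 7,  "reasoning": 6,  "analysis": 5, "speed": 6,  "instruction_following": 6},
    "mini_swe": {"coding": 6,  "safety": 5,  "reasoning": 4,  "analysis": 4, "speed": 8,  "instruction_following": 4},
}


def model_meets_requirements(model: str, requirements: Dict[str, int]) -> bool:
    """Return True if the model meets every minimum the task declares.

    An empty requirements dict matches any model. Unknown models and
    unrated dimensions count as 0 (an assumption of this sketch).
    """
    ratings = MODEL_CAPABILITIES.get(model, {})
    return all(ratings.get(dim, 0) >= minimum for dim, minimum in requirements.items())


def qualifying_models(requirements: Dict[str, int]) -> List[str]:
    """List the models that qualify, in table order."""
    return [m for m in MODEL_CAPABILITIES if model_meets_requirements(m, requirements)]


print(qualifying_models({"coding": 8, "safety": 9}))  # ['opus', 'sonnet']
```

Only the declared dimensions are compared, so a task that requires nothing about speed never excludes a slow model such as opus.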

Architecture

Orchestra get-next (capability-aware)
    |
    +-- SQL fetches top 20 candidates (status, eligibility, project filtering)
    |
    +-- Python-side: _task_matches_model(row, model_name)
    |     +-- Check payload_json["requirements"] → capability matching
    |     +-- Fallback: legacy provider string matching
    |
    +-- First matching candidate is returned to the slot
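
The flow in the diagram might look roughly like the following sketch. It is not the actual `_task_matches_model` from Orchestra: the row shape (a dict with `payload_json` and `provider` keys), the trimmed `CAPS` table, and the legacy rule that a provider of 'any' or an exact model-name match qualifies are all assumptions made for illustration. The `candidates` list stands in for the rows returned by the SQL query.

```python
import json
from typing import Dict, Iterable, Optional

# Trimmed capability table, for illustration only.
CAPS: Dict[str, Dict[str, int]] = {
    "opus": {"coding": 10, "safety": 10},
    "minimax": {"coding": 4, "safety": 3},
}


def task_matches_model(row: dict, model_name: str) -> bool:
    """Capability check first, then the legacy provider-string fallback."""
    payload = json.loads(row.get("payload_json") or "{}")
    requirements = payload.get("requirements")
    if requirements:
        ratings = CAPS.get(model_name, {})
        return all(ratings.get(dim, 0) >= need for dim, need in requirements.items())
    # Legacy fallback (assumed semantics): 'any' or an exact name match.
    provider = row.get("provider") or "any"
    return provider in ("any", model_name)


def get_next(candidates: Iterable[dict], model_name: str) -> Optional[dict]:
    """Return the first candidate the model can execute, or None."""
    for row in candidates:
        if task_matches_model(row, model_name):
            return row
    return None


candidates = [
    {"id": 1, "payload_json": '{"requirements": {"coding": 8, "safety": 9}}', "provider": "any"},
    {"id": 2, "payload_json": "{}", "provider": "any"},
]
print(get_next(candidates, "minimax"))  # skips task 1, returns task 2
```

Filtering in Python after a bounded SQL fetch keeps the query simple, at the cost that a model may find no match among the fetched candidates even when a matching task exists further down the queue.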

Task Requirements (stored in payload_json)

{"requirements": {"coding": 8, "safety": 9}}

Set via CLI:

orchestra task create --title "Implement X" --requires '{"coding":8,"safety":9}' --project SciDEX
orchestra task update --id $ID --requires '{"coding":7,"reasoning":6}'
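
The spec does not describe how the `--requires` value is validated. As one possible approach, the sketch below parses the JSON string and rejects unknown dimensions or out-of-range values before they reach payload_json. The function name `parse_requires` and the error messages are hypothetical.

```python
import json
from typing import Dict

# The six dimensions from the capability table.
DIMENSIONS = {"coding", "safety", "reasoning", "analysis", "speed", "instruction_following"}


def parse_requires(raw: str) -> Dict[str, int]:
    """Parse and validate a --requires JSON string."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"--requires is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("--requires must be a JSON object")
    for dim, value in data.items():
        if dim not in DIMENSIONS:
            raise ValueError(f"unknown capability dimension: {dim!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 10:
            raise ValueError(f"{dim} must be an integer from 1 to 10, got {value!r}")
    return data


print(parse_requires('{"coding":8,"safety":9}'))  # {'coding': 8, 'safety': 9}
```

Rejecting a misspelled dimension at creation time matters because a requirement on a dimension no model is rated for could leave the task unclaimable.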

Key Rotation

MiniMax supports key rotation across slots:

  • Even-numbered slots use MINIMAX_API_KEY
  • Odd-numbered slots use MINIMAX_API_KEY_2
  • Keys stored in ~/.env (never in git)
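
The slot-parity rule can be sketched as follows. The environment variable names come from the list above; the function names are hypothetical, and the sketch assumes the contents of ~/.env have already been loaded into the process environment.

```python
import os
from typing import Mapping, Optional


def minimax_key_var(slot: int) -> str:
    """Even slots use MINIMAX_API_KEY, odd slots use MINIMAX_API_KEY_2."""
    return "MINIMAX_API_KEY" if slot % 2 == 0 else "MINIMAX_API_KEY_2"


def minimax_key_for_slot(slot: int, env: Optional[Mapping[str, str]] = None) -> str:
    """Look up the key for a slot, failing loudly if it is missing."""
    env = os.environ if env is None else env
    var = minimax_key_var(slot)
    key = env.get(var)
    if not key:
        raise KeyError(f"{var} is not set")
    return key


fake_env = {"MINIMAX_API_KEY": "key-a", "MINIMAX_API_KEY_2": "key-b"}
print(minimax_key_for_slot(4, fake_env))  # key-a
print(minimax_key_for_slot(7, fake_env))  # key-b
```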

Task Status

☑ frg-lr-01-FILT: Wire provider filtering into Orchestra get-next (P91)
☑ frg-lr-02-SLOT: Pass provider identity from slot to get-next query (P90)
☐ frg-lr-03-AUTO: Auto-tagger CI task — classify tasks by capability requirements (P87)
☑ frg-lr-04-MIGR: Migrate existing tasks from provider='claude' to capability-based (P89)
☑ frg-lr-05-KEYS: Secure key management — remove hardcoded keys, use ~/.env (P93)
☐ frg-lr-06-HIST: Audit git history for leaked keys before open-sourcing (P92)
☑ frg-lr-07-CAPS: Capability registry — MODEL_CAPABILITIES dict in orchestra_cli.py (P84)

Security Requirements

  • API keys MUST NOT appear in any git-tracked file
  • Keys stored in ~/.env with chmod 600
  • Both repos (Orchestra + SciDEX) need history audit before open-sourcing
  • Compromised keys should be rotated after history cleanup
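
The file-permission requirement can be checked programmatically. The sketch below is a hypothetical helper, not part of Orchestra, and assumes a POSIX filesystem. It reports whether a file grants any access to group or other users, which is what chmod 600 rules out.

```python
import os
import stat


def env_file_is_private(path: str) -> bool:
    """Return True if group and other have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


env_path = os.path.expanduser("~/.env")
if os.path.exists(env_path) and not env_file_is_private(env_path):
    print(f"warning: {env_path} is readable by other users; run chmod 600 on it")
```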

Success Criteria

☑ Orchestra get-next uses capability-based matching
☑ Models only pick up tasks they're capable of executing
☐ Auto-tagger correctly sets requirements for >80% of tasks
☑ No API keys in any git-tracked file in either repo
☐ Git history clean of leaked keys (verified by automated scan)
☑ Key rotation working across MiniMax slots
orchestra models command shows capability matrix
orchestra models --requires checks which models qualify
☑ UI shows capability badges on task and quest detail pages

Work Log

  • 2026-04-03: Implemented full capability-based routing system
      - Added MODEL_CAPABILITIES dict (8 models x 6 dimensions)
      - Rewrote get-next to use Python-side capability matching (LIMIT 20 + filter)
      - Added --requires flag to cmd_create and cmd_update
      - Migrated all existing tasks: 25 with keyword-based requirements, rest as provider='any'
      - Added capability badges to task detail, quest detail, and resources dashboard
      - Added orchestra models command

File: quest_llm_routing_spec.md
Modified: 2026-04-24 07:15
Size: 4.3 KB