Goal
If more than 70% of the hypotheses for a gap share the same mechanism, trigger exploration of alternatives during the next debate cycle. This prevents the system from getting stuck on a single mechanistic hypothesis family for a given research gap.
Acceptance Criteria
☑ Function check_mechanism_diversity(db, gap_id) returns (is_monoculture, dominant_mechanism, total_hypotheses, mechanism_distribution)
☑ run_debate in agent.py checks mechanism diversity before generating hypotheses
☑ When monoculture detected (>70% same mechanism), Theorist prompt includes directive to explore alternative mechanisms
☑ diversity_score on knowledge_gaps table is updated after each debate
☐ Unit test verifies the anti-monoculture logic (deferred — would require DB fixture)
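The deferred test may not need a heavyweight fixture. The following is a minimal sketch, assuming the database is SQLite (the `?` placeholders in the query suggest this) and using a hypothetical, stripped-down schema that holds only the columns the query touches. The standalone `check_mechanism_diversity(db, gap_id)` mirrors the signature in the first criterion and is a stand-in for the committed code; `make_fixture` and the test names are illustrative.

```python
import sqlite3


def check_mechanism_diversity(db, gap_id):
    """Stand-in for the real function; returns the tuple shape from the spec."""
    rows = db.execute(
        """
        SELECT COALESCE(h.mechanism_category, h.target_pathway, 'unknown'),
               COUNT(*)
        FROM hypotheses h
        JOIN analyses a ON h.analysis_id = a.id
        WHERE a.gap_id = ?
        GROUP BY COALESCE(h.mechanism_category, h.target_pathway, 'unknown')
        """,
        (gap_id,),
    ).fetchall()
    if not rows:
        return False, None, 0, {}
    distribution = {r[0]: r[1] for r in rows}
    total = sum(distribution.values())
    dominant = max(distribution, key=distribution.get)
    return distribution[dominant] / total > 0.70, dominant, total, distribution


def make_fixture(mechanisms):
    """Build an in-memory DB (hypothetical minimal schema) holding one gap.

    `mechanisms` is a list of (mechanism_category, target_pathway) pairs.
    """
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE analyses (id TEXT PRIMARY KEY, gap_id TEXT);
        CREATE TABLE hypotheses (
            id INTEGER PRIMARY KEY,
            analysis_id TEXT,
            mechanism_category TEXT,
            target_pathway TEXT
        );
        INSERT INTO analyses VALUES ('a1', 'gap-1');
        """
    )
    db.executemany(
        "INSERT INTO hypotheses (analysis_id, mechanism_category, target_pathway)"
        " VALUES ('a1', ?, ?)",
        mechanisms,
    )
    return db


def test_monoculture_detected():
    # 4 of 5 hypotheses (80%) share a mechanism; one arrives via target_pathway.
    db = make_fixture([
        ("amyloid", None), ("amyloid", "other"), (None, "amyloid"),
        ("amyloid", None), ("tau", None),
    ])
    result = check_mechanism_diversity(db, "gap-1")
    assert result == (True, "amyloid", 5, {"amyloid": 4, "tau": 1})


def test_exactly_seventy_percent_is_not_monoculture():
    # The threshold is strict (> 0.70), so 7 of 10 does not trigger.
    db = make_fixture([("amyloid", None)] * 7 + [("tau", None)] * 3)
    is_mono, dominant, total, _ = check_mechanism_diversity(db, "gap-1")
    assert (is_mono, dominant, total) == (False, "amyloid", 10)


def test_empty_gap():
    db = make_fixture([])
    assert check_mechanism_diversity(db, "gap-1") == (False, None, 0, {})
```

The functions follow pytest naming conventions, so a runner would collect them automatically if the sketch were saved as a test module. The boundary case is included because the criterion says ">70%", which makes exactly 70% a non-monoculture.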
Approach
In agent.py, add check_mechanism_diversity(gap_id) method that:
- Queries hypotheses for the gap via analyses join
- Uses COALESCE(mechanism_category, target_pathway, 'unknown') as the mechanism identifier
- Computes percentage distribution
- Returns (is_monoculture, dominant_mechanism, total, distribution_dict)
In run_debate (before Theorist round), call the check and if monoculture:
- Inject into theorist prompt: "IMPORTANT: Previous hypotheses for this gap are predominantly about [{dominant_mechanism}]. Actively explore alternative mechanisms distinct from this dominant one."
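The injection step can be sketched as a small pure function. This is an illustration only: `build_theorist_prompt` and `DIRECTIVE_TEMPLATE` are hypothetical names, and appending the directive after the base prompt (rather than prepending it) is an assumption, since the spec only says "inject".

```python
from typing import Optional

# Directive text taken from the spec; the placeholder is filled at call time.
DIRECTIVE_TEMPLATE = (
    "IMPORTANT: Previous hypotheses for this gap are predominantly about "
    "[{dominant_mechanism}]. Actively explore alternative mechanisms "
    "distinct from this dominant one."
)


def build_theorist_prompt(
    base_prompt: str,
    is_monoculture: bool,
    dominant_mechanism: Optional[str],
) -> str:
    """Return the base prompt, plus the directive when a monoculture is flagged."""
    if not is_monoculture or not dominant_mechanism:
        # No monoculture (or nothing to name): leave the prompt untouched.
        return base_prompt
    directive = DIRECTIVE_TEMPLATE.format(dominant_mechanism=dominant_mechanism)
    # Assumption: the directive is appended after a blank line.
    return f"{base_prompt}\n\n{directive}"
```

Keeping the directive in a helper like this would let the prompt wording be tested without running a debate.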
In post_process.py, after hypothesis extraction, update diversity_score on the knowledge_gap:
- Set to the fraction of hypotheses covering non-dominant mechanisms (0 = monoculture, values approaching 1 = diverse)
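As a worked illustration of the score, here is a minimal sketch that computes it from a mechanism distribution. The helper name `diversity_score` is hypothetical, and the 1.0 result for an empty distribution mirrors the fallback used in the post_process.py snippet in this spec.

```python
def diversity_score(distribution: dict) -> float:
    """Fraction of hypotheses that do NOT belong to the dominant mechanism."""
    total = sum(distribution.values())
    if total == 0:
        # No hypotheses yet: same fallback as the post_process.py snippet.
        return 1.0
    return 1 - max(distribution.values()) / total


# A gap where every hypothesis shares one mechanism scores 0.
all_same = diversity_score({"amyloid": 5})

# Four mechanisms with one hypothesis each: 1 - 1/4.
evenly_spread = diversity_score({"amyloid": 1, "tau": 1, "inflammation": 1, "lipid": 1})

# 4 of 5 on one mechanism (80% dominant) leaves roughly 0.2 for the rest.
skewed = diversity_score({"amyloid": 4, "tau": 1})
```

Because the score is one minus the dominant share, the ">70% same mechanism" trigger corresponds to a score below 0.30, and the score can approach but never reach 1 while any hypothesis exists.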
Implementation Details
agent.py changes
    def check_mechanism_diversity(self, gap_id: str) -> tuple:
        """Check if hypotheses for a gap are dominated by a single mechanism.

        Returns: (is_monoculture: bool, dominant_mechanism: str,
                  total_hypotheses: int, distribution: dict)
        """
        # Select the same COALESCE expression that is grouped on, so each row's
        # label is exactly its group key rather than an arbitrary row's columns.
        rows = self.db.execute("""
            SELECT COALESCE(h.mechanism_category, h.target_pathway, 'unknown') AS mechanism,
                   COUNT(*) AS cnt
            FROM hypotheses h
            JOIN analyses a ON h.analysis_id = a.id
            WHERE a.gap_id = ?
            GROUP BY COALESCE(h.mechanism_category, h.target_pathway, 'unknown')
        """, (gap_id,)).fetchall()

        if not rows:
            return False, None, 0, {}

        distribution = {r[0]: r[1] for r in rows}
        total = sum(distribution.values())
        dominant = max(distribution, key=distribution.get)
        dominant_pct = distribution[dominant] / total
        # Strictly more than 70% on one mechanism counts as a monoculture.
        return dominant_pct > 0.70, dominant, total, distribution
post_process.py changes
After hypotheses are inserted (around line 1425), compute and update diversity_score:
    # Update gap diversity score
    try:
        gap_id = meta.get('gap_id')
        if gap_id:
            is_mono, dominant, total, dist = check_mechanism_diversity(db, gap_id)
            diversity = 1 - (max(dist.values()) / total) if total > 0 else 1.0
            db.execute("""
                UPDATE knowledge_gaps SET diversity_score = ? WHERE id = ?
            """, (diversity, gap_id))
    except Exception as e:
        print(f" → Diversity score update skipped: {e}")
Dependencies
- None (uses existing hypothesis and analysis tables)
Dependents
- Follow-up task to auto-categorize hypotheses by mechanism using LLM
- Integration with quest tasks that audit gap quality
Work Log
2026-04-17
- Created spec file
- Investigated codebase structure (agent.py, post_process.py, DB schema)
- Found mechanism_category field exists in hypotheses but is not populated
- Will use COALESCE(mechanism_category, target_pathway, 'unknown') as mechanism proxy
- Implemented check_mechanism_diversity() in agent.py (lines 831-860)
- Implemented anti-monoculture directive injection in run_debate() (lines 1255-1278)
- Implemented diversity_score update in post_process.py after hypothesis loop (lines 1750-1786)
- Syntax verified: agent.py and post_process.py both compile cleanly
- Committed and pushed to orchestra/task/t-antimo-anti-monoculture-enforcement
- Result: Done — anti-monoculture enforcement implemented