**Problem:** SciDEX covers Alzheimer's, Parkinson's, ALS, FTD, and general neurodegeneration, but debates are siloed by disease. No mechanism extracts hypotheses from one disease and proposes analogs in others, so a major source of novel insight goes unused. For example, a validated mitochondrial dysfunction mechanism in ALS may have an untested analog in Alzheimer's that involves the same pathway but different upstream triggers.
**Goal:** Mine the hypothesis + debate database for mechanistic patterns that appear in one disease context and generate analogical hypotheses in related disease contexts.
**Implementation:**
1. Cluster high-scoring hypotheses (composite_score >= 0.7) by mechanism_category and target_pathway:
`SELECT mechanism_category, target_pathway, disease, COUNT(*), AVG(composite_score) FROM hypotheses WHERE composite_score >= 0.7 GROUP BY mechanism_category, target_pathway, disease ORDER BY AVG(composite_score) DESC`
2. Identify mechanism-pathway pairs that are well-validated (high debate depth and high score) in Disease A but absent or thin in Disease B
3. For each identified cross-disease opportunity, use the LLM to generate an analogical hypothesis:
- Input: source disease hypothesis title+description + mechanism + target pathway
- Output: analogical hypothesis for target disease with adapted gene targets and mechanistic rationale
4. Insert new hypotheses: `INSERT INTO hypotheses (title, description, disease, target_pathway, mechanism_category, composite_score, origin_type, hypothesis_type) VALUES (..., ..., ..., ..., ..., ..., 'cross_disease_analogy', 'cross_disease_analogy')`
5. Generate 20+ analogical hypotheses across the disease matrix
6. Link new hypotheses to source hypotheses via parent_hypothesis_id
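
Steps 1 and 2 could be sketched roughly as follows. This is a minimal illustration, not the SciDEX implementation: it uses an in-memory `sqlite3` connection as a stand-in for whatever `scidex.core.database.get_db()` returns, the helper name `find_cross_disease_gaps` and its `min_source` / `max_target` thresholds are hypothetical, and the count of high-scoring hypotheses is used as a crude proxy for "well-validated" because the debate-depth column is not specified here. The rows are toy data for demonstration only.

```python
import sqlite3
from collections import defaultdict


def find_cross_disease_gaps(conn, diseases, min_score=0.7, min_source=2, max_target=0):
    """Return (mechanism_category, target_pathway, source_disease, target_disease)
    tuples where the source disease has at least `min_source` high-scoring
    hypotheses and the target disease has at most `max_target`."""
    rows = conn.execute(
        """
        SELECT mechanism_category, target_pathway, disease, COUNT(*) AS n
        FROM hypotheses
        WHERE composite_score >= ?
        GROUP BY mechanism_category, target_pathway, disease
        """,
        (min_score,),
    ).fetchall()

    # counts[(mechanism, pathway)][disease] -> number of high-scoring hypotheses
    counts = defaultdict(dict)
    for mech, pathway, disease, n in rows:
        counts[(mech, pathway)][disease] = n

    gaps = []
    for (mech, pathway), by_disease in sorted(counts.items()):
        for source in diseases:
            if by_disease.get(source, 0) < min_source:
                continue  # not well covered enough to serve as a source
            for target in diseases:
                if target != source and by_disease.get(target, 0) <= max_target:
                    gaps.append((mech, pathway, source, target))
    return gaps


# Toy data (assumed schema, illustrative values only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypotheses (id INTEGER PRIMARY KEY, title TEXT, disease TEXT, "
    "mechanism_category TEXT, target_pathway TEXT, composite_score REAL)"
)
conn.executemany(
    "INSERT INTO hypotheses (title, disease, mechanism_category, target_pathway, "
    "composite_score) VALUES (?, ?, ?, ?, ?)",
    [
        ("h1", "ALS", "mitochondrial_dysfunction", "example_pathway", 0.82),
        ("h2", "ALS", "mitochondrial_dysfunction", "example_pathway", 0.75),
        ("h3", "Alzheimer's", "mitochondrial_dysfunction", "example_pathway", 0.55),
    ],
)
gaps = find_cross_disease_gaps(conn, ["ALS", "Alzheimer's"])
print(gaps)
```

With this toy data the only opportunity found is ALS as the source and Alzheimer's as the target, since the single Alzheimer's row falls below the 0.7 cutoff. A real run would likely need looser thresholds for "thin" coverage and an actual debate-depth signal.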
**Target:** 20+ new cross-disease analogical hypotheses that can seed new debate sessions.
**Use `scidex.core.database.get_db()` and `llm.py` for hypothesis generation. Log each analogy pair: source_hypothesis_id → generated_hypothesis_id.**
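
The generate, insert, link, and log steps might fit together as in the sketch below. Several things here are assumptions rather than the project's actual API: the `generate` callable stands in for whatever `llm.py` exposes, the prompt wording and the `create_analogy` / `build_prompt` helpers are hypothetical, the connection is plain `sqlite3` (so the `?` placeholder style may differ from the real driver behind `get_db()`), and new hypotheses are inserted with a NULL `composite_score` on the assumption that they stay unscored until debated.

```python
import logging
import sqlite3

logger = logging.getLogger("cross_disease_analogy")


def build_prompt(source, target_disease):
    """Assemble the LLM input from the source hypothesis (hypothetical wording)."""
    return (
        f"Source disease: {source['disease']}\n"
        f"Hypothesis: {source['title']}. {source['description']}\n"
        f"Mechanism: {source['mechanism_category']}\n"
        f"Target pathway: {source['target_pathway']}\n"
        f"Propose an analogous hypothesis for {target_disease}, with adapted "
        "gene targets and a mechanistic rationale."
    )


def create_analogy(conn, source, target_disease, generate, initial_score=None):
    """Generate one analogical hypothesis, insert it, and log the pair.

    `generate` is any callable taking a prompt string and returning a dict
    with 'title' and 'description' keys (assumed interface)."""
    draft = generate(build_prompt(source, target_disease))
    cur = conn.execute(
        "INSERT INTO hypotheses (title, description, disease, target_pathway, "
        "mechanism_category, composite_score, origin_type, hypothesis_type, "
        "parent_hypothesis_id) "
        "VALUES (?, ?, ?, ?, ?, ?, 'cross_disease_analogy', 'cross_disease_analogy', ?)",
        (draft["title"], draft["description"], target_disease,
         source["target_pathway"], source["mechanism_category"],
         initial_score, source["id"]),
    )
    conn.commit()
    new_id = cur.lastrowid
    logger.info("analogy pair: %s → %s", source["id"], new_id)
    return new_id


# Demo with an in-memory database and a fake generator (no real LLM call).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hypotheses (id INTEGER PRIMARY KEY, title TEXT, description TEXT, "
    "disease TEXT, target_pathway TEXT, mechanism_category TEXT, composite_score REAL, "
    "origin_type TEXT, hypothesis_type TEXT, parent_hypothesis_id INTEGER)"
)
conn.execute(
    "INSERT INTO hypotheses (title, description, disease, target_pathway, "
    "mechanism_category, composite_score) VALUES (?, ?, ?, ?, ?, ?)",
    ("source title", "source description", "ALS", "example_pathway",
     "mitochondrial_dysfunction", 0.82),
)
source = {
    "id": 1, "title": "source title", "description": "source description",
    "disease": "ALS", "target_pathway": "example_pathway",
    "mechanism_category": "mitochondrial_dysfunction",
}


def fake_generate(prompt):
    return {"title": "analog title", "description": "analog description"}


new_id = create_analogy(conn, source, "Alzheimer's", fake_generate)
```

Passing the generator in as a callable keeps the database logic testable without network access; in practice it would wrap the real `llm.py` call and parse its response into the expected dict.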