Neuroinflammation and microglial priming in early AD¶
Notebook ID: nb-SDA-2026-04-04-gap-neuroinflammation-microglial-20260404 · Analysis: SDA-2026-04-04-gap-neuroinflammation-microglial-20260404 · Generated: 2026-04-10
Research question¶
How does microglial priming contribute to early Alzheimer's disease pathology? Focus on the mechanisms by which peripheral inflammation, aging, and genetic risk factors (e.g., APOE4, TREM2) prime microglia toward an inflammatory phenotype. Investigate the role of cytokines, damage-associated molecular patterns (DAMPs), and metabolic shifts in microglial activation states during the prodromal phase of AD.
Approach¶
This notebook is generated programmatically from real Forge tool calls and SciDEX debate data. Code cells load cached evidence bundles from data/forge_cache/seaad/*.json and query live data from scidex.db. Re-run python3 scripts/regenerate_notebooks.py --analysis SDA-2026-04-04-gap-neuroinflammation-microglial-20260404 --force to refresh.
7 hypotheses were generated and debated. The knowledge graph has 0 edges.
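Because the code cells fall back to an empty result when a cache file is absent, it can help to check up front which bundles are actually present. The following is a minimal sketch and not part of the generated notebook: `cache_coverage` is a hypothetical helper, and the `<name>.json` naming is assumed from the `load` function in section 1.

```python
import json
import tempfile
from pathlib import Path


def cache_coverage(cache_dir, names):
    """Map each bundle name to True if <cache_dir>/<name>.json exists and parses as JSON."""
    coverage = {}
    for name in names:
        path = Path(cache_dir) / f"{name}.json"
        try:
            json.loads(path.read_text())
            coverage[name] = True
        except (OSError, ValueError):
            # Missing, unreadable, or malformed file: treat the bundle as absent.
            coverage[name] = False
    return coverage


# Demonstration against a throwaway directory, not the real forge cache.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "mygene_APOE.json").write_text('{"name": "apolipoprotein E"}')
    demo = cache_coverage(tmp, ["mygene_APOE", "mygene_CLOCK"])

print(demo)
```

Pointing the helper at the real cache directory with the bundle names used below would show which tables are empty because data is missing rather than because a tool returned nothing.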
Debate Summary¶
Quality score: 0.63 · Rounds: 4 · Personas: Theorist, Skeptic, Domain_Expert, Synthesizer
1. Forge tool provenance¶
import json, sys, sqlite3
from pathlib import Path
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
matplotlib.rcParams['figure.dpi'] = 110
matplotlib.rcParams['figure.facecolor'] = 'white'
REPO = Path('.').resolve()
sys.path.insert(0, str(REPO))
CACHE_SUB = 'seaad'
CACHE = REPO / 'data' / 'forge_cache' / CACHE_SUB
def load(name):
    """Return the cached JSON bundle for `name`, or {} if the file is missing."""
    p = CACHE / f'{name}.json'
    if p.exists():
        return json.loads(p.read_text())
    return {}

db_path = Path('/home/ubuntu/scidex/scidex.db')
try:
    db = sqlite3.connect(str(db_path))
    prov = pd.read_sql_query('''
        SELECT skill_id, status, COUNT(*) AS n_calls,
               ROUND(AVG(duration_ms),0) AS mean_ms
        FROM tool_calls
        WHERE created_at >= date('now','-30 days')
        GROUP BY skill_id, status
        ORDER BY n_calls DESC
    ''', db)
    db.close()
    prov['tool'] = prov['skill_id'].str.replace('tool_', '', regex=False)
    print(f'{len(prov)} tool-call aggregates (last 30 days):')
    # A bare expression inside try/except is not rendered, so call display() (a Jupyter built-in).
    display(prov[['tool','status','n_calls','mean_ms']].head(20))
except Exception as e:
    print(f'Provenance unavailable: {e}')
77 tool-call aggregates (last 30 days):
2. Target gene annotations¶
ann_rows = []
for g in ['APOE', 'CASCADE', 'CHANNELS', 'CLOCK', 'COMPLEMENT', 'COMPONENTS', 'CONSTRUCTS', 'GFAP', 'ION', 'LDLR', 'OPTOGENETIC', 'PROTEINS', 'SPIB']:
    mg = load(f'mygene_{g}')
    hpa = load(f'hpa_{g}')
    if not mg and not hpa:
        ann_rows.append({'gene': g, 'name': '—', 'protein_class': '—',
                         'disease_involvement': '—'})
        continue
    ann_rows.append({
        'gene': g,
        'name': (mg.get('name') or '')[:55],
        'protein_class': ', '.join((hpa.get('protein_class') or [])[:2])[:55]
                         if isinstance(hpa.get('protein_class'), list)
                         else str(hpa.get('protein_class') or '—')[:55],
        'disease_involvement': ', '.join((hpa.get('disease_involvement') or [])[:2])[:55]
                               if isinstance(hpa.get('disease_involvement'), list)
                               else str(hpa.get('disease_involvement') or '')[:55],
    })
pd.DataFrame(ann_rows)
| | gene | name | protein_class | disease_involvement |
|---|---|---|---|---|
| 0 | APOE | apolipoprotein E | Cancer-related genes, Candidate cardiovascular... | Alzheimer disease, Amyloidosis |
| 1 | CASCADE | — | — | — |
| 2 | CHANNELS | — | — | — |
| 3 | CLOCK | — | — | — |
| 4 | COMPLEMENT | — | — | — |
| 5 | COMPONENTS | — | — | — |
| 6 | CONSTRUCTS | — | — | — |
| 7 | GFAP | glial fibrillary acidic protein | Candidate cardiovascular disease genes, Diseas... | Disease variant, Leukodystrophy |
| 8 | ION | — | — | — |
| 9 | LDLR | — | — | — |
| 10 | OPTOGENETIC | — | — | — |
| 11 | PROTEINS | — | — | — |
| 12 | SPIB | — | — | — |
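Eight of the thirteen entries in the table (CASCADE, CHANNELS, COMPLEMENT, COMPONENTS, CONSTRUCTS, ION, OPTOGENETIC, PROTEINS) are ordinary English words rather than gene symbols, while symbols containing digits that the hypotheses name as targets (for example TREM2, ABCA1, BMAL1) are absent. This pattern is consistent with free-text target descriptions being split into words, but that origin is an assumption; the notebook does not show how the list was built. A minimal sketch of a stricter parser follows, using a hypothetical `parse_target_genes` helper and a deliberately simple symbol pattern.

```python
import re

# Deliberately simple pattern: an upper-case letter followed by upper-case letters,
# digits, or hyphens. This is an illustrative assumption, not an HGNC validator.
SYMBOL_RE = re.compile(r"^[A-Z][A-Z0-9-]*$")


def parse_target_genes(target_lines):
    """Split comma-separated target strings; separate symbol-like entries from free text."""
    kept, skipped = [], []
    for line in target_lines:
        for entry in (part.strip() for part in line.split(",")):
            if not entry:
                continue
            bucket = kept if SYMBOL_RE.match(entry) else skipped
            if entry not in bucket:
                bucket.append(entry)
    return kept, skipped


# Target strings copied from the hypothesis headers in section 10.
targets = [
    "GFAP, S100B",
    "APOE, ABCA1, LDLR",
    "GP2, SPIB",
    "CLOCK, BMAL1, PER2",
    "HMGB1, S100 proteins",
    "TREM2, complement cascade components",
    "Optogenetic constructs, ion channels",
]
genes, free_text = parse_target_genes(targets)
print(genes)
print(free_text)
```

Entries that land in `free_text` would need manual mapping to concrete genes before annotation lookups can return anything useful.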
3. GO Biological Process enrichment (Enrichr)¶
go_bp = load('enrichr_GO_Biological_Process')
if isinstance(go_bp, list) and go_bp:
    go_df = pd.DataFrame(go_bp[:10])[['term','p_value','odds_ratio','genes']]
    go_df['p_value'] = go_df['p_value'].apply(lambda p: f'{p:.2e}')
    go_df['odds_ratio'] = go_df['odds_ratio'].round(1)
    go_df['term'] = go_df['term'].str[:60]
    go_df['n_hits'] = go_df['genes'].apply(len)
    go_df['genes'] = go_df['genes'].apply(lambda g: ', '.join(g))
    # A bare expression inside an if-block is not rendered, so call display() (a Jupyter built-in).
    display(go_df[['term','n_hits','p_value','odds_ratio','genes']])
else:
    print('No GO:BP enrichment data')
# Visualize top GO BP enrichment
go_bp = load('enrichr_GO_Biological_Process')
if isinstance(go_bp, list) and go_bp:
    top = go_bp[:8]
    terms = [t['term'][:45] for t in top][::-1]
    # Floor p-values at 1e-300 so that log10 never receives zero.
    neglogp = [-np.log10(max(t['p_value'], 1e-300)) for t in top][::-1]
    fig, ax = plt.subplots(figsize=(9, 4.5))
    ax.barh(terms, neglogp, color='#4fc3f7')
    ax.set_xlabel('-log10(p-value)')
    ax.set_title('Top GO:BP enrichment (Enrichr)')
    ax.grid(axis='x', alpha=0.3)
    plt.tight_layout(); plt.show()
else:
    print('No GO:BP data to plot')
4. KEGG pathway enrichment¶
kegg = load('enrichr_KEGG_Pathways')
if isinstance(kegg, list) and kegg:
    kegg_df = pd.DataFrame(kegg[:10])[['term','p_value','odds_ratio','genes']]
    kegg_df['genes'] = kegg_df['genes'].apply(lambda g: ', '.join(g))
    kegg_df['p_value'] = kegg_df['p_value'].apply(lambda p: f'{p:.2e}')
    kegg_df['odds_ratio'] = kegg_df['odds_ratio'].round(1)
    # A bare expression inside an if-block is not rendered, so call display() (a Jupyter built-in).
    display(kegg_df)
else:
    print('No KEGG enrichment data')
No KEGG enrichment data
5. STRING protein interaction network¶
ppi = load('string_network')
if isinstance(ppi, list) and ppi:
    ppi_df = pd.DataFrame(ppi).sort_values('score', ascending=False)
    display_cols = [c for c in ['protein1','protein2','score','escore','tscore'] if c in ppi_df.columns]
    print(f'{len(ppi_df)} STRING edges')
    # A bare expression inside an if-block is not rendered, so call display() (a Jupyter built-in).
    display(ppi_df[display_cols].head(20))
else:
    print('No STRING edges returned')
11 STRING edges
# Network figure
ppi = load('string_network')
if isinstance(ppi, list) and ppi:
    import math
    # Place the nodes evenly on a unit circle.
    nodes = sorted({p for e in ppi for p in (e['protein1'], e['protein2'])})
    n = len(nodes)
    pos = {n_: (math.cos(2*math.pi*i/n), math.sin(2*math.pi*i/n)) for i, n_ in enumerate(nodes)}
    fig, ax = plt.subplots(figsize=(7, 7))
    # Edge opacity and width scale with the STRING score.
    for e in ppi:
        x1,y1 = pos[e['protein1']]; x2,y2 = pos[e['protein2']]
        ax.plot([x1,x2],[y1,y2], color='#888', alpha=0.3+0.5*e['score'],
                linewidth=0.5+2*e['score'])
    for name,(x,y) in pos.items():
        ax.scatter([x],[y], s=450, color='#ffd54f', edgecolors='#333', zorder=3)
        ax.annotate(name, (x,y), ha='center', va='center', fontsize=8, fontweight='bold', zorder=4)
    ax.set_aspect('equal'); ax.axis('off')
    ax.set_title(f'STRING PPI network ({len(ppi)} edges)')
    plt.tight_layout(); plt.show()
else:
    print('No STRING data to visualize')
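A circular layout shows which proteins are connected but makes hubs hard to spot. One way to complement the figure is a per-node degree and summed-score table computed from the same edge list. The sketch below assumes only the `protein1`, `protein2`, and `score` keys used by the plotting code; `degree_summary` is a hypothetical helper, and the toy edges are invented for illustration and are not the cached STRING results.

```python
from collections import defaultdict


def degree_summary(edges):
    """Return (protein, degree, summed_score) tuples sorted by degree, then summed score."""
    degree = defaultdict(int)
    strength = defaultdict(float)
    for e in edges:
        for node in (e["protein1"], e["protein2"]):
            degree[node] += 1
            strength[node] += e["score"]
    return sorted(
        ((n, degree[n], round(strength[n], 3)) for n in degree),
        key=lambda row: (-row[1], -row[2], row[0]),
    )


# Made-up edges with placeholder node names, for illustration only.
toy_edges = [
    {"protein1": "A", "protein2": "B", "score": 0.9},
    {"protein1": "A", "protein2": "C", "score": 0.5},
    {"protein1": "B", "protein2": "C", "score": 0.4},
    {"protein1": "A", "protein2": "D", "score": 0.7},
]
for row in degree_summary(toy_edges):
    print(row)
```

In the notebook, calling `degree_summary(ppi)` on the loaded edge list would give the same summary for the real network.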
6. Reactome pathway footprint¶
pw_rows = []
for g in ['APOE', 'CASCADE', 'CHANNELS', 'CLOCK', 'COMPLEMENT', 'COMPONENTS', 'CONSTRUCTS', 'GFAP', 'ION', 'LDLR', 'OPTOGENETIC', 'PROTEINS', 'SPIB']:
    pws = load(f'reactome_{g}')
    if isinstance(pws, list):
        pw_rows.append({'gene': g, 'n_pathways': len(pws),
                        'top_pathway': (pws[0]['name'] if pws else '—')[:70]})
    else:
        pw_rows.append({'gene': g, 'n_pathways': 0, 'top_pathway': '—'})
pd.DataFrame(pw_rows).sort_values('n_pathways', ascending=False)
| | gene | n_pathways | top_pathway |
|---|---|---|---|
| 0 | APOE | 8 | Nuclear signaling by ERBB4 |
| 7 | GFAP | 2 | Nuclear signaling by ERBB4 |
| 1 | CASCADE | 0 | — |
| 3 | CLOCK | 0 | — |
| 2 | CHANNELS | 0 | — |
| 4 | COMPLEMENT | 0 | — |
| 5 | COMPONENTS | 0 | — |
| 6 | CONSTRUCTS | 0 | — |
| 8 | ION | 0 | — |
| 9 | LDLR | 0 | — |
| 10 | OPTOGENETIC | 0 | — |
| 11 | PROTEINS | 0 | — |
| 12 | SPIB | 0 | — |
7. Allen Brain Atlas ISH regional expression¶
ish_rows = []
for g in ['APOE', 'CASCADE', 'CHANNELS', 'CLOCK', 'COMPLEMENT', 'COMPONENTS', 'CONSTRUCTS', 'GFAP', 'ION', 'LDLR', 'OPTOGENETIC', 'PROTEINS', 'SPIB']:
    ish = load(f'allen_ish_{g}')
    regions = (ish.get('regions') or []) if isinstance(ish, dict) else []
    ish_rows.append({
        'gene': g,
        'n_ish_regions': len(regions),
        'top_region': (regions[0].get('structure','') if regions else '—')[:45],
        'top_energy': round(regions[0].get('expression_energy',0), 2) if regions else None,
    })
pd.DataFrame(ish_rows)
| | gene | n_ish_regions | top_region | top_energy |
|---|---|---|---|---|
| 0 | APOE | 0 | — | — |
| 1 | CASCADE | 0 | — | — |
| 2 | CHANNELS | 0 | — | — |
| 3 | CLOCK | 0 | — | — |
| 4 | COMPLEMENT | 0 | — | — |
| 5 | COMPONENTS | 0 | — | — |
| 6 | CONSTRUCTS | 0 | — | — |
| 7 | GFAP | 0 | — | — |
| 8 | ION | 0 | — | — |
| 9 | LDLR | 0 | — | — |
| 10 | OPTOGENETIC | 0 | — | — |
| 11 | PROTEINS | 0 | — | — |
| 12 | SPIB | 0 | — | — |
8. Hypothesis ranking (7 hypotheses)¶
hyp_data = [('Astrocyte-Mediated Microglial Memory Erasure', 0.579), ('APOE4-Specific Microglial Metabolic Rescue', 0.56), ('Gut-Brain Axis M-Cell Modulation', 0.547), ('Circadian-Metabolic Microglial Reprogramming', 0.547), ('DAMP-Scavenging Microglial Reset', 0.547), ('Peripheral-Central Immune Decoupling Therapy', 0.547), ('Temporal Microglial State Switching', 0.547)]
titles = [h[0] for h in hyp_data][::-1]
scores = [h[1] for h in hyp_data][::-1]
fig, ax = plt.subplots(figsize=(10, max(8, len(titles)*0.4)))
colors = ['#ef5350' if s >= 0.6 else '#ffa726' if s >= 0.5 else '#66bb6a' for s in scores]
ax.barh(range(len(titles)), scores, color=colors)
ax.set_yticks(range(len(titles))); ax.set_yticklabels(titles, fontsize=7)
ax.set_xlabel('Composite Score'); ax.set_title('Neuroinflammation and microglial priming in early AD')
ax.grid(axis='x', alpha=0.3)
plt.tight_layout(); plt.show()
9. Score dimension heatmap (top 10)¶
labels = ['Astrocyte-Mediated Microglial Memory Era', 'APOE4-Specific Microglial Metabolic Resc', 'Gut-Brain Axis M-Cell Modulation', 'Circadian-Metabolic Microglial Reprogram', 'DAMP-Scavenging Microglial Reset', 'Peripheral-Central Immune Decoupling The', 'Temporal Microglial State Switching']
matrix = np.array([[0.7, 0.65, 0.7, 0.65, 0, 0.65, 0.65, 0.65, 0.65], [0.65, 0.8, 0.85, 0.75, 0, 0.75, 0.7, 0.9, 0.7], [0.85, 0.4, 0.7, 0.65, 0, 0.4, 0.45, 0.3, 0.45], [0.7, 0.65, 0.7, 0.65, 0, 0.65, 0.65, 0.65, 0.65], [0.7, 0.65, 0.7, 0.65, 0, 0.65, 0.65, 0.65, 0.65], [0.7, 0.65, 0.7, 0.65, 0, 0.65, 0.65, 0.65, 0.65], [0.7, 0.65, 0.7, 0.65, 0, 0.65, 0.65, 0.65, 0.65]])
dims = ['novelty_score', 'feasibility_score', 'impact_score', 'mechanistic_plausibility_score', 'clinical_relevance_score', 'data_availability_score', 'reproducibility_score', 'druggability_score', 'safety_profile_score']
if matrix.size:
    fig, ax = plt.subplots(figsize=(10, 5))
    im = ax.imshow(matrix, cmap='RdYlGn', aspect='auto', vmin=0, vmax=1)
    ax.set_xticks(range(len(dims)))
    ax.set_xticklabels([d.replace('_score','').replace('_',' ').title() for d in dims],
                       rotation=45, ha='right', fontsize=8)
    ax.set_yticks(range(len(labels))); ax.set_yticklabels(labels, fontsize=7)
    ax.set_title('Score dimensions — top hypotheses')
    plt.colorbar(im, ax=ax, shrink=0.8)
    plt.tight_layout(); plt.show()
else:
    print('No score data available')
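In the matrix above, the clinical relevance column is 0 for every hypothesis and five of the seven rows are identical, which the heatmap renders without comment. Whether this reflects missing scores or default values is not stated in the notebook. A minimal sketch of a check that surfaces both patterns is shown below; `audit_score_matrix` is a hypothetical helper, the score values are copied from the cell above, and the labels `H1` to `H7` follow the hypothesis numbering in section 10.

```python
def audit_score_matrix(matrix, dims, labels):
    """Report dimensions that are zero in every row and groups of identical rows."""
    zero_dims = [
        dims[j] for j in range(len(dims)) if all(row[j] == 0 for row in matrix)
    ]
    groups = {}
    for label, row in zip(labels, matrix):
        groups.setdefault(tuple(row), []).append(label)
    duplicate_groups = [names for names in groups.values() if len(names) > 1]
    return zero_dims, duplicate_groups


dims = ['novelty_score', 'feasibility_score', 'impact_score',
        'mechanistic_plausibility_score', 'clinical_relevance_score',
        'data_availability_score', 'reproducibility_score',
        'druggability_score', 'safety_profile_score']
default_row = [0.7, 0.65, 0.7, 0.65, 0, 0.65, 0.65, 0.65, 0.65]
matrix = [
    default_row,
    [0.65, 0.8, 0.85, 0.75, 0, 0.75, 0.7, 0.9, 0.7],
    [0.85, 0.4, 0.7, 0.65, 0, 0.4, 0.45, 0.3, 0.45],
    default_row, default_row, default_row, default_row,
]
labels = [f'H{i}' for i in range(1, 8)]

zero_dims, duplicate_groups = audit_score_matrix(matrix, dims, labels)
print('All-zero dimensions:', zero_dims)
print('Identical rows:', duplicate_groups)
```

Running such a check before plotting makes it easier to tell genuinely scored hypotheses apart from ones that may carry placeholder values.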
10. PubMed evidence per hypothesis¶
Hypothesis 1: Astrocyte-Mediated Microglial Memory Erasure¶
Target genes: GFAP, S100B · Composite score: 0.579
Astrocyte-Mediated Microglial Memory Erasure¶
Mechanistic Hypothesis Overview¶
This hypothesis proposes a disease-modifying strategy centered on Astrocyte-Mediated Microglial Memory Erasure as a mechanistic intervention point in neurodegeneration. The core claim is that the biological process represented by astrocyte-mediated microglial memory erasure is not a passive disease byproduct, but a functional bottleneck that shapes how quickly neurons lose homeostasis under chronic stress.
hid = 'h-3bfa414a'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 2: APOE4-Specific Microglial Metabolic Rescue¶
Target genes: APOE, ABCA1, LDLR · Composite score: 0.56
APOE4-Specific Microglial Metabolic Rescue Hypothesis¶
Molecular Mechanism and Rationale¶
The APOE4-specific microglial metabolic rescue hypothesis centers on the fundamental disruption of lipid metabolism and cholesterol homeostasis in microglia carrying the APOE4 allele. Unlike APOE2 and APOE3, the APOE4 isoform exhibits reduced binding affinity to the low-density lipoprotein receptor (LDLR) and altered interactions with ATP-binding cassette transporter A1 (ABCA1). This molecular dysfunct
hid = 'h-64d92165'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 3: Gut-Brain Axis M-Cell Modulation¶
Target genes: GP2, SPIB · Composite score: 0.547
Molecular Mechanism and Rationale¶
Microfold (M) cells in Peyer's patches serve as specialized antigen-sampling cells that transport luminal antigens and bacterial products across the intestinal epithelial barrier through transcytosis mechanisms regulated by glycoprotein 2 (GP2) and Spi-B transcription factor (SPIB). GP2 functions as a receptor for bacterial adhesion and uptake, particularly recognizing type 1 pili from pathogenic bacteria, while SPIB acts as the master transcriptional regula
hid = 'h-40ad6ac6'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 4: Circadian-Metabolic Microglial Reprogramming¶
Target genes: CLOCK, BMAL1, PER2 · Composite score: 0.547
Circadian-Metabolic Microglial Reprogramming¶
Molecular Mechanism and Rationale¶
The circadian-metabolic microglial reprogramming hypothesis centers on the intricate relationship between circadian clock machinery and microglial metabolic states in neurodegeneration. The core molecular clock components CLOCK, BMAL1, and PER2 orchestrate not only temporal gene expression but also fundamental metabolic processes within microglia. Under homeostatic conditions, CLOCK-BMAL1 heterodimers activate
hid = 'h-de52344d'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 5: DAMP-Scavenging Microglial Reset¶
Target genes: HMGB1, S100 proteins · Composite score: 0.547
DAMP-Scavenging Microglial Reset¶
Mechanistic Hypothesis Overview¶
The "DAMP-Scavenging Microglial Reset" hypothesis proposes that Alzheimer's disease is driven in part by the accumulation of damage-associated molecular patterns (DAMPs), including extracellular ATP, HMGB1, S100A8/A9 (calprotectin), urate crystals, and oxidized lipids, that chronically activate the NLRP3 inflammasome and NF-κB pathway in microglia, and that enhancing microglial DAMP-scavenging capacity can reset the neuro
hid = 'h-03e31e80'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 6: Peripheral-Central Immune Decoupling Therapy¶
Target genes: TREM2, complement cascade components · Composite score: 0.547
Peripheral-Central Immune Decoupling Therapy¶
Mechanistic Hypothesis Overview¶
The "Peripheral-Central Immune Decoupling Therapy" hypothesis proposes that the immune system outside the CNS (peripheral immunity) contributes to Alzheimer's disease pathology through trafficking of activated immune cells into the brain and through systemic cytokine signaling that activates CNS microglia, and that decoupling peripheral immunity from CNS inflammation represents a viable therapeutic strategy. Th
hid = 'h-7f2d0e21'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 7: Temporal Microglial State Switching¶
Target genes: Optogenetic constructs, ion channels · Composite score: 0.547
Temporal Microglial State Switching¶
Mechanistic Hypothesis Overview¶
The "Temporal Microglial State Switching" hypothesis proposes that microglia exist in multiple discrete activation states (beyond the simple M1/M2 dichotomy) and that the progression from homeostatic surveillance to disease-associated microglia (DAM) represents a therapeutic opportunity — specifically, that pharmacological manipulation of the molecular switches governing microglial state transitions can restore the home
hid = 'h-21cd4ba1'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not write to a view
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit = lit.sort_values('year', ascending=False)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
11. Knowledge graph edges (0 total)¶
edge_data = []
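if edge_data:
    # A bare expression inside an if-block is not rendered, so call display() (a Jupyter built-in).
    display(pd.DataFrame(edge_data).head(25))
else:
    print('No KG edge data available')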
No KG edge data available
12. Caveats¶
This notebook uses real Forge tool calls cached from live APIs, but:
- Enrichment is computed against curated gene-set libraries, not genome-wide screens
- STRING, Reactome, HPA, and MyGene reflect curated knowledge, so they can lag behind or omit recent findings
- PubMed results are ranked by search relevance and are not a systematic review
The cached evidence bundle is the minimum viable real-data analysis for this topic.