Which metabolic biomarkers can distinguish therapeutic response from disease progression in neurodegeneration trials?
Notebook ID: nb-SDA-2026-04-04-gap-debate-20260403-222618-c698b06a · Analysis: SDA-2026-04-04-gap-debate-20260403-222618-c698b06a · Generated: 2026-04-17
Research question
The debate discussed various metabolic interventions but lacked clear endpoints for clinical translation. Without validated biomarkers linking metabolic changes to neuronal survival, therapeutic development remains empirical rather than mechanism-guided.
Source: Debate session sess_SDA-2026-04-02-gap-v2-5d0e3052 (Analysis: SDA-2026-04-02-gap-v2-5d0e3052)
Approach
This notebook is generated programmatically from real Forge tool calls and SciDEX debate data. Code cells load cached evidence bundles from data/forge_cache/seaad/*.json and query live data from scidex.db. Re-run python3 scripts/regenerate_notebooks.py --analysis SDA-2026-04-04-gap-debate-20260403-222618-c698b06a --force to refresh.
Seven hypotheses were generated and debated. The knowledge graph has 15 edges.
Debate Summary
Quality score: 0.92 · Rounds: 4 · Personas: Theorist, Skeptic, Domain_Expert, Synthesizer
1. Forge tool provenance
import json, sys, sqlite3
from pathlib import Path
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
matplotlib.rcParams['figure.dpi'] = 110
matplotlib.rcParams['figure.facecolor'] = 'white'
REPO = Path('.').resolve()
sys.path.insert(0, str(REPO))
CACHE_SUB = 'seaad'
CACHE = REPO / 'data' / 'forge_cache' / CACHE_SUB
def load(name):
    # Return the cached JSON bundle for `name`, or {} when no cache file exists.
    p = CACHE / f'{name}.json'
    if p.exists():
        return json.loads(p.read_text())
    return {}
db_path = Path('/home/ubuntu/scidex/scidex.db')
try:
    db = sqlite3.connect(str(db_path))
    # One row per (skill_id, status) pair over the last 30 days.
    prov = pd.read_sql_query('''
        SELECT skill_id, status, COUNT(*) AS n_calls,
               ROUND(AVG(duration_ms),0) AS mean_ms
        FROM tool_calls
        WHERE created_at >= date('now','-30 days')
        GROUP BY skill_id, status
        ORDER BY n_calls DESC
    ''', db)
    db.close()
    prov['tool'] = prov['skill_id'].str.replace('tool_', '', regex=False)
    print(f'{len(prov)} tool-call aggregates (last 30 days):')
    prov[['tool','status','n_calls','mean_ms']].head(20)
except Exception as e:
    print(f'Provenance unavailable: {e}')
181 tool-call aggregates (last 30 days):
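To see what the provenance query computes, the same aggregation can be run against a small in-memory table. This is a minimal sketch: the table layout is inferred only from the columns the query references, and the skill names, statuses, and timings are hypothetical, not values from scidex.db.

```python
import sqlite3

# Hypothetical stand-in for scidex.db: only the columns the query touches.
con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE tool_calls '
    '(skill_id TEXT, status TEXT, duration_ms REAL, created_at TEXT)'
)
# Illustrative rows; the last one falls outside the 30-day window.
con.executemany(
    "INSERT INTO tool_calls VALUES (?, ?, ?, datetime('now', ?))",
    [
        ('tool_pubmed_search', 'success', 100.0, '-1 days'),
        ('tool_pubmed_search', 'success', 300.0, '-2 days'),
        ('tool_pubmed_search', 'error', 50.0, '-3 days'),
        ('tool_string_network', 'success', 80.0, '-45 days'),
    ],
)
# Same aggregation as the provenance cell.
result = con.execute('''
    SELECT skill_id, status, COUNT(*) AS n_calls,
           ROUND(AVG(duration_ms), 0) AS mean_ms
    FROM tool_calls
    WHERE created_at >= date('now', '-30 days')
    GROUP BY skill_id, status
    ORDER BY n_calls DESC
''').fetchall()
con.close()
for row in result:
    print(row)
```

Each output row is one (skill_id, status) pair, so the count printed by the provenance cell is a number of pairs, not a number of individual tool calls.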
2. Target gene annotations
ann_rows = []
for g in ['CHKA', 'CKB']:
    mg = load(f'mygene_{g}')
    hpa = load(f'hpa_{g}')
    if not mg and not hpa:
        # No cached annotation for this gene: emit a placeholder row.
        ann_rows.append({'gene': g, 'name': '—', 'protein_class': '—',
                         'disease_involvement': '—'})
        continue
    ann_rows.append({
        'gene': g,
        'name': (mg.get('name') or '')[:55],
        'protein_class': ', '.join((hpa.get('protein_class') or [])[:2])[:55]
            if isinstance(hpa.get('protein_class'), list)
            else str(hpa.get('protein_class') or '—')[:55],
        'disease_involvement': ', '.join((hpa.get('disease_involvement') or [])[:2])[:55]
            if isinstance(hpa.get('disease_involvement'), list)
            else str(hpa.get('disease_involvement') or '')[:55],
    })
pd.DataFrame(ann_rows)
| | gene | name | protein_class | disease_involvement |
|---|---|---|---|---|
| 0 | CHKA | — | — | — |
| 1 | CKB | — | — | — |
3. GO Biological Process enrichment (Enrichr)
go_bp = load('enrichr_GO_Biological_Process')
if isinstance(go_bp, list) and go_bp:
    # Keep the ten top-ranked terms and format them for display.
    go_df = pd.DataFrame(go_bp[:10])[['term','p_value','odds_ratio','genes']]
    go_df['p_value'] = go_df['p_value'].apply(lambda p: f'{p:.2e}')
    go_df['odds_ratio'] = go_df['odds_ratio'].round(1)
    go_df['term'] = go_df['term'].str[:60]
    go_df['n_hits'] = go_df['genes'].apply(len)
    go_df['genes'] = go_df['genes'].apply(lambda g: ', '.join(g))
    go_df[['term','n_hits','p_value','odds_ratio','genes']]
else:
    print('No GO:BP enrichment data')
# Visualize top GO BP enrichment
go_bp = load('enrichr_GO_Biological_Process')
if isinstance(go_bp, list) and go_bp:
    top = go_bp[:8]
    # Reverse so the most significant term is drawn at the top of the chart.
    terms = [t['term'][:45] for t in top][::-1]
    # Clamp p-values so log10 never receives zero.
    neglogp = [-np.log10(max(t['p_value'], 1e-300)) for t in top][::-1]
    fig, ax = plt.subplots(figsize=(9, 4.5))
    ax.barh(terms, neglogp, color='#4fc3f7')
    ax.set_xlabel('-log10(p-value)')
    ax.set_title('Top GO:BP enrichment (Enrichr)')
    ax.grid(axis='x', alpha=0.3)
    plt.tight_layout(); plt.show()
else:
    print('No GO:BP data to plot')
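The table and chart above rank terms by the raw `p_value` field. When many terms are tested at once, a multiple-testing adjustment is usually worth reporting alongside it. The following is a minimal sketch of the Benjamini-Hochberg procedure, not something the notebook itself runs: the function name and the example p-values are hypothetical, and the adjustment should be computed over every term tested, not only the displayed top rows.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four terms (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f'raw={p:.3g}  adjusted={q:.3g}')
```

An adjusted value is never smaller than its raw p-value, so terms that pass a threshold after adjustment also pass it before.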
4. KEGG pathway enrichment
kegg = load('enrichr_KEGG_Pathways')
if isinstance(kegg, list) and kegg:
    kegg_df = pd.DataFrame(kegg[:10])[['term','p_value','odds_ratio','genes']]
    kegg_df['genes'] = kegg_df['genes'].apply(lambda g: ', '.join(g))
    kegg_df['p_value'] = kegg_df['p_value'].apply(lambda p: f'{p:.2e}')
    kegg_df['odds_ratio'] = kegg_df['odds_ratio'].round(1)
    kegg_df
else:
    print('No KEGG enrichment data')
No KEGG enrichment data
5. STRING protein interaction network
ppi = load('string_network')
if isinstance(ppi, list) and ppi:
    ppi_df = pd.DataFrame(ppi).sort_values('score', ascending=False)
    # Show only the score columns that are present in the cached bundle.
    display_cols = [c for c in ['protein1','protein2','score','escore','tscore'] if c in ppi_df.columns]
    print(f'{len(ppi_df)} STRING edges')
    ppi_df[display_cols].head(20)
else:
    print('No STRING edges returned')
11 STRING edges
# Network figure
ppi = load('string_network')
if isinstance(ppi, list) and ppi:
    import math
    nodes = sorted({p for e in ppi for p in (e['protein1'], e['protein2'])})
    n = len(nodes)
    # Place the nodes evenly on a unit circle.
    pos = {n_: (math.cos(2*math.pi*i/n), math.sin(2*math.pi*i/n)) for i, n_ in enumerate(nodes)}
    fig, ax = plt.subplots(figsize=(7, 7))
    # Edge opacity and width both scale with the STRING score.
    for e in ppi:
        x1,y1 = pos[e['protein1']]; x2,y2 = pos[e['protein2']]
        ax.plot([x1,x2],[y1,y2], color='#888', alpha=0.3+0.5*e['score'],
                linewidth=0.5+2*e['score'])
    for name,(x,y) in pos.items():
        ax.scatter([x],[y], s=450, color='#ffd54f', edgecolors='#333', zorder=3)
        ax.annotate(name, (x,y), ha='center', va='center', fontsize=8, fontweight='bold', zorder=4)
    ax.set_aspect('equal'); ax.axis('off')
    ax.set_title(f'STRING PPI network ({len(ppi)} edges)')
    plt.tight_layout(); plt.show()
else:
    print('No STRING data to visualize')
6. Reactome pathway footprint
pw_rows = []
for g in ['CHKA', 'CKB']:
    pws = load(f'reactome_{g}')
    if isinstance(pws, list):
        pw_rows.append({'gene': g, 'n_pathways': len(pws),
                        'top_pathway': (pws[0]['name'] if pws else '—')[:70]})
    else:
        # load() returns {} when no cache file exists, so record zero pathways.
        pw_rows.append({'gene': g, 'n_pathways': 0, 'top_pathway': '—'})
pd.DataFrame(pw_rows).sort_values('n_pathways', ascending=False)
| | gene | n_pathways | top_pathway |
|---|---|---|---|
| 0 | CHKA | 0 | — |
| 1 | CKB | 0 | — |
7. Allen Brain Atlas ISH regional expression
ish_rows = []
for g in ['CHKA', 'CKB']:
    ish = load(f'allen_ish_{g}')
    # Explicit parentheses: .get() is only evaluated when ish is a dict.
    regions = (ish.get('regions') or []) if isinstance(ish, dict) else []
    ish_rows.append({
        'gene': g,
        'n_ish_regions': len(regions),
        'top_region': (regions[0].get('structure','') if regions else '—')[:45],
        'top_energy': round(regions[0].get('expression_energy',0), 2) if regions else None,
    })
pd.DataFrame(ish_rows)
| | gene | n_ish_regions | top_region | top_energy |
|---|---|---|---|---|
| 0 | CHKA | 0 | — | — |
| 1 | CKB | 0 | — | — |
8. Hypothesis ranking (7 hypotheses)
hyp_data = [
    ('Ketone Utilization Index as Metabolic Flexibility Bioma', 0.803),
    ('Creatine Kinase System Capacity as Neural Energy Reserv', 0.716),
    ('Choline Kinase Activity as Membrane Integrity Response ', 0.688),
    ('Dynamic Lactate-Pyruvate Ratio as Therapeutic Stratific', 0.688),
    ('GLUT1-Mediated Glucose Flux Coefficient as Neuroprotect', 0.687),
    ('Mitochondrial ATP/ADP Carrier Activity as Bioenergetic ', 0.642),
    ('Purine Salvage Pathway Flux as Neuroprotection Efficacy', 0.565),
]
# Reverse so the highest-scoring hypothesis is drawn at the top.
titles = [h[0] for h in hyp_data][::-1]
scores = [h[1] for h in hyp_data][::-1]
fig, ax = plt.subplots(figsize=(10, max(8, len(titles)*0.4)))
colors = ['#ef5350' if s >= 0.6 else '#ffa726' if s >= 0.5 else '#66bb6a' for s in scores]
ax.barh(range(len(titles)), scores, color=colors)
ax.set_yticks(range(len(titles))); ax.set_yticklabels(titles, fontsize=7)
ax.set_xlabel('Composite Score'); ax.set_title('Which metabolic biomarkers can distinguish therapeutic response from disease progression in neurodegeneration trials?')
ax.grid(axis='x', alpha=0.3)
plt.tight_layout(); plt.show()
9. Score dimension heatmap (top 10)
labels = ['Ketone Utilization Index as Metabolic Fl', 'Creatine Kinase System Capacity as Neura',
          'Choline Kinase Activity as Membrane Inte', 'Dynamic Lactate-Pyruvate Ratio as Therap',
          'GLUT1-Mediated Glucose Flux Coefficient ', 'Mitochondrial ATP/ADP Carrier Activity a',
          'Purine Salvage Pathway Flux as Neuroprot']
# One row per hypothesis, one column per entry in dims.
matrix = np.array([
    [0.85, 0.75, 0.65, 0.7, 0, 0.45, 0.35, 0.6, 0.8],
    [0.65, 0.5, 0.6, 0.75, 0, 0.4, 0.3, 0.4, 0.9],
    [0.7, 0.6, 0.5, 0.6, 0, 0.25, 0.2, 0.6, 0.7],
    [0.8, 0.45, 0.55, 0.65, 0, 0.35, 0.3, 0.5, 0.25],
    [0.75, 0.3, 0.7, 0.8, 0, 0.5, 0.4, 0.1, 0.15],
    [0.65, 0.35, 0.6, 0.7, 0, 0.3, 0.25, 0.25, 0.2],
    [0.75, 0.2, 0.4, 0.5, 0, 0.2, 0.15, 0.3, 0.4]])
dims = ['novelty_score', 'feasibility_score', 'impact_score', 'mechanistic_plausibility_score',
        'clinical_relevance_score', 'data_availability_score', 'reproducibility_score',
        'druggability_score', 'safety_profile_score']
if matrix.size:
    fig, ax = plt.subplots(figsize=(10, 5))
    im = ax.imshow(matrix, cmap='RdYlGn', aspect='auto', vmin=0, vmax=1)
    ax.set_xticks(range(len(dims)))
    ax.set_xticklabels([d.replace('_score','').replace('_',' ').title() for d in dims],
                       rotation=45, ha='right', fontsize=8)
    ax.set_yticks(range(len(labels))); ax.set_yticklabels(labels, fontsize=7)
    ax.set_title('Score dimensions — top hypotheses')
    plt.colorbar(im, ax=ax, shrink=0.8)
    plt.tight_layout(); plt.show()
else:
    print('No score data available')
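Before reading the heatmap, it can help to check whether any dimension is zero for every hypothesis, since such a column may indicate a score that was never populated. The sketch below copies the values from the cell above into plain lists; the interpretation of an all-zero column as missing data is an assumption, not something the notebook states.

```python
# Values copied from the heatmap cell (rows = hypotheses, columns = dims).
dims = ['novelty_score', 'feasibility_score', 'impact_score',
        'mechanistic_plausibility_score', 'clinical_relevance_score',
        'data_availability_score', 'reproducibility_score',
        'druggability_score', 'safety_profile_score']
matrix = [
    [0.85, 0.75, 0.65, 0.7, 0, 0.45, 0.35, 0.6, 0.8],
    [0.65, 0.5, 0.6, 0.75, 0, 0.4, 0.3, 0.4, 0.9],
    [0.7, 0.6, 0.5, 0.6, 0, 0.25, 0.2, 0.6, 0.7],
    [0.8, 0.45, 0.55, 0.65, 0, 0.35, 0.3, 0.5, 0.25],
    [0.75, 0.3, 0.7, 0.8, 0, 0.5, 0.4, 0.1, 0.15],
    [0.65, 0.35, 0.6, 0.7, 0, 0.3, 0.25, 0.25, 0.2],
    [0.75, 0.2, 0.4, 0.5, 0, 0.2, 0.15, 0.3, 0.4],
]

# A dimension is flagged when every hypothesis scores exactly zero on it.
empty_dims = [d for j, d in enumerate(dims)
              if all(row[j] == 0 for row in matrix)]
print('All-zero dimensions:', empty_dims)

# Per-hypothesis mean over the dimensions that carry any signal.
kept = [j for j, d in enumerate(dims) if d not in empty_dims]
row_means = [sum(row[j] for j in kept) / len(kept) for row in matrix]
```

For this matrix the check flags `clinical_relevance_score`, so the uniformly red column in the heatmap may reflect an unscored dimension and not a judgment that every hypothesis lacks clinical relevance.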
10. PubMed evidence per hypothesis
Hypothesis 1: Ketone Utilization Index as Metabolic Flexibility Biomarker
Target genes: HMGCS2 · Composite score: 0.803
Therapeutic interventions that enhance neuronal survival improve ketone body utilization capacity, measured through 11C-β-hydroxybutyrate PET imaging. Progressive neurodegeneration shows impaired ketone uptake despite adequate ketone availability, indicating metabolic inflexibility.
hid = 'h-2f3fa14b'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 2: Creatine Kinase System Capacity as Neural Energy Reserve Biomarker
Target genes: CKB · Composite score: 0.716
Therapeutic interventions that preserve cognitive function maintain brain creatine kinase system capacity, measured through phosphocreatine recovery kinetics using 31P-MRS. Disease progression shows impaired phosphocreatine regeneration despite stable total creatine levels.
hid = 'h-587ea473'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 3: Choline Kinase Activity as Membrane Integrity Response Indicator
Target genes: CHKA · Composite score: 0.688
Neuroprotective therapies preserve neuronal membrane integrity through maintained choline kinase activity and phosphatidylcholine synthesis. Progressive neurodegeneration shows declining choline kinase despite treatment, reflecting ongoing membrane breakdown.
hid = 'h-5b0ebb1f'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 4: Dynamic Lactate-Pyruvate Ratio as Therapeutic Stratification Biomarker
Target genes: SLC16A1 · Composite score: 0.688
CSF lactate-to-pyruvate ratios follow distinct temporal patterns during therapeutic response and disease progression. Successful interventions show normalized ratios within 12 weeks, whereas progressive disease maintains elevated lactate despite treatment.
hid = 'h-ea5794f9'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 5: GLUT1-Mediated Glucose Flux Coefficient as Neuroprotection Indicator
Target genes: SLC2A1 · Composite score: 0.687
Therapeutic interventions that preserve neuronal function maintain consistent glucose uptake efficiency measured through dynamic PET-glucose tracers. Progressive neurodegeneration shows declining glucose flux coefficients despite stable blood glucose, indicating compromised blood-brain barrier glucose transport.
hid = 'h-31980740'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 6: Mitochondrial ATP/ADP Carrier Activity as Bioenergetic Recovery Metric
Target genes: SLC25A4 · Composite score: 0.642
Successful neuroprotective therapies restore mitochondrial ADP/ATP carrier (AAC3) function, measurable through peripheral blood mitochondrial respiratory assays. Disease progression shows persistent AAC3 dysfunction despite treatment, reflecting ongoing bioenergetic failure.
hid = 'h-f7da6372'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
Hypothesis 7: Purine Salvage Pathway Flux as Neuroprotection Efficacy Marker
Target genes: HPRT1 · Composite score: 0.565
Effective neuroprotective therapies maintain efficient purine salvage pathway activity, measured through CSF adenosine/inosine ratios and HPRT1 enzymatic activity. Disease progression shows accumulating purine metabolites indicating impaired salvage despite treatment.
hid = 'h-b2706086'
papers = load(f'pubmed_{hid}')
if isinstance(papers, list) and papers:
    lit = pd.DataFrame(papers)
    cols = [c for c in ['year','journal','title','pmid'] if c in lit.columns]
    if cols:
        lit = lit[cols].copy()  # copy so the edits below do not modify a view
        # Guard each column: a cached record may lack any of these fields.
        if 'title' in lit.columns:
            lit['title'] = lit['title'].str[:80]
        if 'journal' in lit.columns:
            lit['journal'] = lit['journal'].str[:30]
        if 'year' in lit.columns:
            lit.sort_values('year', ascending=False, inplace=True)
        display_df = lit
    else:
        display_df = pd.DataFrame(papers[:5])
else:
    display_df = pd.DataFrame([{'note':'no PubMed results'}])
display_df
| | note |
|---|---|
| 0 | no PubMed results |
11. Knowledge graph edges (15 total)
edge_data = [
    {'source': 'h-2f3fa14b', 'relation': 'targets', 'target': 'HMGCS2', 'strength': 0.5},
    {'source': 'h-587ea473', 'relation': 'targets', 'target': 'CKB', 'strength': 0.5},
    {'source': 'h-5b0ebb1f', 'relation': 'targets', 'target': 'CHKA', 'strength': 0.5},
    {'source': 'h-31980740', 'relation': 'targets', 'target': 'SLC2A1', 'strength': 0.5},
    {'source': 'h-ea5794f9', 'relation': 'targets', 'target': 'SLC16A1', 'strength': 0.5},
    {'source': 'h-f7da6372', 'relation': 'targets', 'target': 'SLC25A4', 'strength': 0.5},
    {'source': 'h-b2706086', 'relation': 'targets', 'target': 'HPRT1', 'strength': 0.5},
    {'source': 'HMGCS2', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'CKB', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'CHKA', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'SLC2A1', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'SLC16A1', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'SLC25A4', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'HPRT1', 'relation': 'associated_with', 'target': 'translational_neuroscience', 'strength': 0.4},
    {'source': 'SLC2A1', 'relation': 'co_associated_with', 'target': 'GLUT1', 'strength': 0.3},
]
if edge_data:
    pd.DataFrame(edge_data).head(25)
else:
    print('No KG edge data available')
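A quick tally by relation type shows how the 15 edges are distributed and whether any edge links two target genes directly. This is a small sketch using only the standard library; the triples are copied from `edge_data` above with the `strength` field dropped, and the variable names are illustrative.

```python
from collections import Counter

# (source, relation, target) triples copied from edge_data above.
edges = [
    ('h-2f3fa14b', 'targets', 'HMGCS2'),
    ('h-587ea473', 'targets', 'CKB'),
    ('h-5b0ebb1f', 'targets', 'CHKA'),
    ('h-31980740', 'targets', 'SLC2A1'),
    ('h-ea5794f9', 'targets', 'SLC16A1'),
    ('h-f7da6372', 'targets', 'SLC25A4'),
    ('h-b2706086', 'targets', 'HPRT1'),
    ('HMGCS2', 'associated_with', 'translational_neuroscience'),
    ('CKB', 'associated_with', 'translational_neuroscience'),
    ('CHKA', 'associated_with', 'translational_neuroscience'),
    ('SLC2A1', 'associated_with', 'translational_neuroscience'),
    ('SLC16A1', 'associated_with', 'translational_neuroscience'),
    ('SLC25A4', 'associated_with', 'translational_neuroscience'),
    ('HPRT1', 'associated_with', 'translational_neuroscience'),
    ('SLC2A1', 'co_associated_with', 'GLUT1'),
]

# Number of edges per relation type.
relation_counts = Counter(rel for _, rel, _ in edges)
print(dict(relation_counts))

# Genes that some hypothesis targets.
targeted = {t for _, rel, t in edges if rel == 'targets'}

# Edges whose two endpoints are both target genes.
gene_gene_edges = [(s, t) for s, _, t in edges
                   if s in targeted and t in targeted]
print('gene-gene edges:', gene_gene_edges)
```

The tally gives seven `targets` edges, seven `associated_with` edges, and one `co_associated_with` edge, and no edge connects two target genes. The single `co_associated_with` edge links SLC2A1 to GLUT1, the protein that gene encodes.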
12. Caveats
This notebook uses real Forge tool calls cached from live APIs, but:
- Enrichment is against curated gene-set libraries, not genome-wide screens
- STRING/Reactome/HPA/MyGene reflect curated knowledge
- PubMed literature is ranked by search relevance and is not a systematic review
The cached evidence bundle is the minimum viable real-data analysis for this topic.