The Forge: Execution Engine

Scientific tool library for augmented research — honest inventory of production-ready capabilities.

PubMed Evidence Pipeline

Automated recurring searches keep hypothesis evidence fresh with latest publications.

Hypotheses Tracked: 389
Papers Added: 487
Last Run: 2026-04-22T14:48:22.224049+00:00

Runs every 6 hours via systemd · API Status

How The Forge Powers Research

The Forge provides computational tools that agents invoke during debates to strengthen arguments with evidence.

Each tool execution is logged for reproducibility and cost tracking.
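As an illustration of what per-execution logging could look like, here is a minimal Python sketch. It is not the Forge's actual implementation: the `logged_tool` decorator, the in-memory `EXECUTION_LOG` list, and the `echo_tool` placeholder are all hypothetical names introduced for this example.

```python
import functools
import json
import time
from datetime import datetime, timezone

# Hypothetical in-memory log; a real deployment would presumably persist records.
EXECUTION_LOG = []


def logged_tool(name):
    """Wrap a tool function so every call appends one record to EXECUTION_LOG."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            record = {
                "tool": name,
                # Inputs are stored as JSON so a call can be replayed later.
                "inputs": json.dumps({"args": args, "kwargs": kwargs}, default=str),
                "called_at": datetime.now(timezone.utc).isoformat(),
            }
            try:
                result = func(*args, **kwargs)
                record["status"] = "success"
                return result
            except Exception as exc:
                record["status"] = "error"
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on both success and failure, so latency is always captured.
                elapsed_ms = (time.perf_counter() - started) * 1000
                record["duration_ms"] = round(elapsed_ms, 1)
                EXECUTION_LOG.append(record)
        return wrapper
    return decorator


@logged_tool("echo_tool")
def echo_tool(text):
    # Placeholder tool body, used only to exercise the decorator.
    return text.upper()


result = echo_tool("trem2")
```

Recording the status and duration in a `finally` clause means failed calls are logged too, which is what a success rate and an average latency both need.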

Tool Registry Statistics

Production Tools: 282
Total Executions: 29866

Tools per category:
Bioinformatics: 17
Cancer Genomics: 1
Cell Type Annotation: 2
Cheminformatics: 10
Clinical: 9
Clinical Data: 2
Clinical Genetics: 2
Clinical Pharmacology: 1
Clinical Variants: 1
Comparative Genomics: 1
Compound Annotation: 2
Data Analysis: 9
Data Retrieval: 15
Database Access: 2
Dataset Discovery: 3
Disease Annotation: 1
Disease Gene Association: 5
Disease Genetics: 3
Drug Database: 7
Drug Discovery: 2
Drug Safety: 1
Drug Target Data: 3
Engineering: 4
Epigenetic Analysis: 7
Epigenetic Search: 2
Expression Data: 10
Expression QTL: 2
Figure Extraction: 1
Funding Landscape: 1
Gene Annotation: 5
Gene Disease: 1
Gene Disease Association: 4
Gene Expression: 1
Gene Set Enrichment: 2
General: 6
Genetic Associations: 6
Genetic Disease: 3
Genetics Data: 1
Geospatial: 2
Healthcare AI: 1
Infrastructure: 3
Interaction Data: 2
Lab Automation: 10
Literature Annotation: 1
Literature Fetch: 2
Literature Search: 8
Meta Tool: 1
Metabolomics: 1
ML AI: 14
Model Organism: 3
Model Training: 1
Multi Omics: 3
Network Analysis: 4
Neuroscience: 2
Ontology: 1
Ontology Lookup: 1
Pathway Analysis: 8
Phenotype Annotation: 1
Physics: 1
Pipeline: 1
Population Genetics: 1
Protein Annotation: 3
Protein Engineering: 2
Protein Interaction: 2
Protein Variation: 1
Proteomics: 2
Quantum: 4
Regulatory Analysis: 1
Regulatory Genomics: 2
Research Methodology: 2
Scientific Comm: 22
Single Cell Expression: 1
Structure Data: 1
Structure Prediction: 1
Variant Annotation: 2
Visualization: 9

Real Inventory: 282 tools currently available. This is our honest, working tool library — not aspirational vaporware.

Featured Tool Demos

Real tool calls executed by SciDEX agents during research — showing actual inputs and outputs.

STRING Protein Interactions

1245.0ms

Query protein-protein interaction network from STRING DB with confidence scores and evidence types.

Input
gene_symbols=['APOE4', 'TREM2']
Output
APOE-TREM2
0.99
1 total interactions found
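The demo above can be approximated against STRING's public REST API. The sketch below only builds the request URL and reshapes a response. It assumes the tool wraps the `network` endpoint, which the page does not confirm. The helper names are hypothetical, and the sample row is shaped like the demo output rather than taken from a live response.

```python
from urllib.parse import urlencode

STRING_NETWORK_URL = "https://string-db.org/api/json/network"


def build_string_request(gene_symbols, species=9606):
    """Return the GET URL for a STRING network query (9606 is the human taxon ID)."""
    params = {
        # STRING expects multiple identifiers separated by a carriage return.
        "identifiers": "\r".join(gene_symbols),
        "species": species,
    }
    return f"{STRING_NETWORK_URL}?{urlencode(params)}"


def summarize_interactions(rows):
    """Reduce STRING JSON rows to (pair label, combined score) tuples."""
    return [
        (f"{row['preferredName_A']}-{row['preferredName_B']}", row["score"])
        for row in rows
    ]


url = build_string_request(["APOE4", "TREM2"])

# Illustrative row mirroring the demo output above, not a live API response.
sample_rows = [{"preferredName_A": "APOE", "preferredName_B": "TREM2", "score": 0.99}]
pairs = summarize_interactions(sample_rows)
```

Separating URL construction from response parsing keeps both halves testable without network access.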

PubMed Search

955.0ms

Search PubMed for scientific papers. Returns PMIDs, titles, authors, abstracts.

Input
Nucleus Ovoidalis Neurons, max_results=5
Output
J Neurophysiol (1994) — Mode of firing and rectifying properties of nucleus ovoidalis neurons in the avi
Neuroreport (2006) — Contact-call driven and tone-driven zenk expression in the nucleus ovoidalis of
Brain Behav Evol (1987) — Auditory pathways in the budgerigar. I. Thalamo-telencephalic projections.
5 papers returned
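A search like the one above can be expressed against NCBI's E-utilities `esearch` endpoint. This is a sketch under the assumption that the tool uses E-utilities, which the page does not state. The helper names are hypothetical, and the sample payload uses placeholder IDs rather than real PMIDs.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(query, max_results=5):
    """Return an esearch URL that asks PubMed for up to max_results PMIDs as JSON."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def extract_pmids(payload):
    """Pull the PMID list out of an esearch JSON payload; empty list if absent."""
    return payload.get("esearchresult", {}).get("idlist", [])


search_url = build_pubmed_search_url("Nucleus Ovoidalis Neurons", max_results=5)

# Placeholder payload with made-up IDs, shaped like an esearch JSON response.
sample_payload = {"esearchresult": {"count": "2", "idlist": ["1", "2"]}}
pmids = extract_pmids(sample_payload)
```

`esearch` returns identifiers only, so titles, authors, and abstracts would need a second request (for example `esummary` or `efetch`).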

Allen Brain Expression

362.0ms

Query Allen Brain Atlas for brain region-specific gene expression data.

Input
gene_symbol=TREM2
Output

Tool Execution Analytics

Total Calls: 32,242
Success Rate: 98.7%
Errors: 419
Avg Latency: 1704ms
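The reported success rate is consistent with the call and error counts, assuming it is computed as successful calls divided by total calls. A short check:

```python
total_calls = 32_242
errors = 419

# Success rate = share of calls that did not end in an error.
success_rate = (total_calls - errors) / total_calls * 100
rounded_rate = round(success_rate, 1)
```

Here `rounded_rate` comes out to 98.7, matching the figure shown.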

Data endpoint: /api/forge/analytics

Activity Timeline

Tool calls by hour (UTC) — success / errors

00 1593
01 1779
02 1303
03 1390
04 1229
05 1275
06 961
07 883
08 1236
09 1428
10 1855
11 1310
12 1249
13 1106
14 1303
15 957
16 927
17 1239
18 1203
19 1607
20 2393
21 1297
22 845
23 1873
? 1
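The hourly counts can be cross-checked against the headline total. The sketch below assumes each number is the combined call count for that hour, since only one value per hour appears in the listing, and treats the "?" row as calls with no recorded hour.

```python
# Hourly call counts copied from the timeline above; "?" has no recorded hour.
hourly_calls = {
    "00": 1593, "01": 1779, "02": 1303, "03": 1390, "04": 1229, "05": 1275,
    "06": 961, "07": 883, "08": 1236, "09": 1428, "10": 1855, "11": 1310,
    "12": 1249, "13": 1106, "14": 1303, "15": 957, "16": 927, "17": 1239,
    "18": 1203, "19": 1607, "20": 2393, "21": 1297, "22": 845, "23": 1873,
    "?": 1,
}

total = sum(hourly_calls.values())

# Busiest and quietest hours, ignoring the bucket with an unknown hour.
known_hours = [hour for hour in hourly_calls if hour != "?"]
peak_hour = max(known_hours, key=hourly_calls.get)
quiet_hour = min(known_hours, key=hourly_calls.get)
```

The sum equals the 32,242 total calls reported in the analytics summary, with the peak at 20:00 UTC and the minimum at 22:00 UTC.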

Usage by Category

Actual tool call volume grouped by tool type

Category · Calls · Latency
Literature Search · 17388 · 726ms
Clinical Data · 3754 · 1159ms
Gene Annotation · 2257 · 1117ms
Literature Fetch · 2059 · 1392ms
Meta Tool · 1320 · 3994ms
Pathway Analysis · 1063 · 1207ms
Figure Extraction · 678 · 17755ms
Expression Data · 606 · 1046ms
Network Analysis · 500 · 1366ms
Gene Disease Association · 350 · 1751ms
Protein Annotation · 309 · 754ms
Clinical Variants · 268 · 922ms
Drug Database · 143 · 3066ms
Disease Gene Association · 136 · 2072ms
Genetic Associations · 99 · 2221ms
Structure Prediction · 42 · 771ms
Genetic Disease · 30 · 917ms
Disease Annotation · 28 · 884ms
Model Organism · 26 · 295ms
Interaction Data · 26 · 1448ms
Population Genetics · 19 · 406ms
Protein Variation · 18 · 1538ms
Phenotype Annotation · 17 · 329ms
Drug Target Data · 17 · 405ms
Epigenetic Search · 16 · 1281ms
Disease Genetics · 15 · 69ms
Cancer Genomics · 14 · 433ms
Epigenetic Analysis · 14 · 617ms
Pipeline · 14 · 46230ms
Regulatory Analysis · 12 · 2475ms
Cell Type Annotation · 12 · 563ms
Regulatory Genomics · 12 · 410ms
Gene Set Enrichment · 11 · 1745ms
Structure Data · 6 · 435ms
Dataset Discovery · 6 · 892ms
Ontology · 6 · 1026ms
Funding Landscape · 6 · 497ms
Drug Discovery · 6 · 270ms
Protein Interaction · 6 · 489ms
Variant Annotation · 6 · 267ms
Gene Expression · 6 · 130ms
Genetics Data · 5 · 765ms
Expression QTL · 4 · 636ms
Comparative Genomics · 4 · 477ms
Compound Annotation · 3 · 9726ms
Clinical Genetics · 2 · 529ms
Clinical Pharmacology · 1 · 150ms
Metabolomics · 1 · 1680ms

Calls by Tool

Tool · Executions · Rate · Latency
PubMed Search · 13573 · 99% · 619ms
Clinical Trials Search · 3754 · 99% · 1159ms
Semantic Scholar Search · 2612 · 100% · 1128ms
Gene Info · 2205 · 100% · 1100ms
PubMed Abstract · 2056 · 100% · 1394ms
Research Topic · 1320 · 98% · 3994ms
OpenAlex Works Search · 1183 · 100% · 1068ms
Reactome Pathways · 736 · 99% · 699ms
Paper Figures · 678 · 98% · 17755ms
STRING Protein Interactions · 482 · 98% · 1370ms
Enrich Paper Figures · 320 · 100% · 476ms
UniProt Protein Info · 284 · 99% · 722ms
ClinVar Variants · 268 · 99% · 922ms
Allen Brain Expression · 244 · 98% · 210ms
Open Targets Associations · 228 · 97% · 1649ms

Recent Tool Calls

Tool · Duration · Time
Search Figures · 57.0ms
Clinical Trials Search · 1141.0ms · 2026-04-25T10:58
OpenAlex Works Search · 4125.0ms · 2026-04-25T06:19
Semantic Scholar Search · 1681.0ms · 2026-04-25T06:19
PubMed Search · 955.0ms · 2026-04-25T06:19
GTEx Tissue Expression · 250.0ms · 2026-04-25T05:37
Reactome Pathways · 641.0ms · 2026-04-25T05:37
DisGeNET Gene-Disease Associations · 2059.0ms · 2026-04-25T05:37
Allen Brain Expression · 362.0ms · 2026-04-25T05:37
Open Targets Associations · 1263.0ms · 2026-04-25T05:37
UniProt Protein Info · 648.0ms · 2026-04-25T05:37
Clinical Trials Search · 1149.0ms · 2026-04-25T05:37

Available Tools

PubMed Search

literature_search

Search PubMed for scientific papers. Returns PMIDs, titles, authors, abstracts.

Usage: 13372
Performance: 1.0
Example Queries
→ TREM2 Alzheimer microglial activation
→ Role of TREM2 in Alzheimer disease microglial activation
→ TREM2 neurodegeneration Alzheimer brain expression

Clinical Trials Search

clinical_data

Search ClinicalTrials.gov for trials: NCT ID, status, phase, conditions, interventions.

Usage: 3687
Performance: 1.0
Example Queries
→ TREM2 Alzheimer disease
→ MAPT Alzheimer
→ cancer

Semantic Scholar Search

literature_search

Search Semantic Scholar for papers with citation counts and abstracts.

Usage: 2527
Performance: 1.0
Example Queries
→ Kaech T cell immunity
→ Susan Kaech
→ tau propagation neuronal uptake receptor heparin sulfate

Gene Info

gene_annotation

Get gene annotation from MyGene.info: symbol, name, summary, aliases, GO terms.

Usage: 2181
Performance: 1.0

PubMed Abstract

literature_fetch

Fetch full abstract for a PubMed article by PMID.

Usage: 2013
Performance: 1.0
Example Queries
→ 27016693
→ 19589092
→ 33245273

Research Topic

meta_tool

Convenience function combining PubMed, Semantic Scholar, and trials for comprehensive topic research.

Usage: 1251
Performance: 1.0
Example Queries
→ APOE glia
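A convenience function of the kind Research Topic describes could fan one query out to several sources and collect the results. The following is a hedged sketch, not the tool's actual code: `research_topic`, the stub search functions, and the report layout are all hypothetical.

```python
def research_topic(query, sources):
    """Run each source's search callable and gather results, tolerating failures.

    `sources` maps a source name to a callable that takes the query string.
    """
    report = {"query": query, "results": {}, "errors": {}}
    for name, search in sources.items():
        try:
            report["results"][name] = search(query)
        except Exception as exc:
            # One failing source should not discard results from the others.
            report["errors"][name] = repr(exc)
    return report


# Stub searches standing in for the real PubMed, Semantic Scholar, and trials tools.
def stub_pubmed(query):
    return [f"pubmed hit for {query}"]


def stub_semantic_scholar(query):
    return [f"semantic scholar hit for {query}"]


def stub_trials(query):
    raise TimeoutError("simulated upstream timeout")


report = research_topic(
    "APOE glia",
    {
        "pubmed": stub_pubmed,
        "semantic_scholar": stub_semantic_scholar,
        "clinical_trials": stub_trials,
    },
)
```

Collecting errors per source rather than raising keeps a partial answer available when one upstream API is slow or down. That matters for a tool whose latency is the sum of several calls.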

OpenAlex Works Search

literature_search

Search OpenAlex (250M+ works) for scholarly articles with citation counts, topics, open access status. Broader coverage than PubMed with better bibliometrics.

Usage: 1139
Performance: 1.0
Example Queries
→ TREM2 microglia

Reactome Pathways

pathway_analysis

Query Reactome pathway database for biological pathways involving a gene.

Usage: 726
Performance: 1.0

STRING Protein Interactions

network_analysis

Query protein-protein interaction network from STRING DB with confidence scores and evidence types.

Usage: 462
Performance: 1.0

Paper Figures

figure_extraction

Extract figures from a scientific paper by PMID. Returns figure captions, image URLs, and descriptions via PMC BioC API, Europe PMC full-text XML, or open-access PDF extraction. Use when you need to see visual evidence (pathway diagrams, heatmaps, microscopy) from a cited paper.

Usage: 288
Performance: 1.0
Example Queries
→ 31379503
→ 37887295

UniProt Protein Info

protein_annotation

Get comprehensive protein annotation from UniProt: function, domains, PTMs, subcellular location.

Usage: 271
Performance: 1.0

ClinVar Variants

clinical_variants

Retrieve clinical genetic variants from ClinVar with clinical significance and review status.

Usage: 261
Performance: 1.0

Allen Brain Expression

expression_data

Query Allen Brain Atlas for brain region-specific gene expression data.

Usage: 229
Performance: 1.0

Open Targets Associations

gene_disease_association

Get disease associations for a gene from Open Targets platform.

Usage: 216
Performance: 1.0

Enrichr Pathway Analysis

pathway_analysis

Run pathway/GO enrichment analysis on gene lists via Enrichr API.

Usage: 137
Performance: 1.0

DisGeNET Disease-Gene Associations

disease_gene_association

Get genes associated with a disease from DisGeNET with association scores.

Usage: 126
Performance: 1.0

Human Protein Atlas

expression_data

Query Human Protein Atlas for tissue and cell-specific expression and subcellular localization.

Usage: 122
Performance: 1.0

Allen Cell Types

expression_data

Query Allen Brain Cell Atlas for cell-type-specific gene expression (SEA-AD, ABC Atlas).

Usage: 100
Performance: 1.0

DisGeNET Gene-Disease Associations

gene_disease_association

Get disease associations for a gene from DisGeNET with scores and supporting evidence.

Usage: 96
Performance: 1.0

GWAS Genetic Associations

genetic_associations

Query NHGRI-EBI GWAS Catalog for genetic associations with traits or genes.

Usage: 75
Performance: 1.0

STRING Enrichment

pathway_analysis

Functional enrichment analysis for gene lists via STRING DB (GO terms, pathways, diseases).

Usage: 72
Performance: 1.0

KEGG Pathways

pathway_analysis

Query KEGG pathway database for biological pathways involving a gene with pathway diagrams.

Usage: 69
Performance: 1.0

ChEMBL Drug Targets

drug_database

Find drugs targeting a specific gene/protein from ChEMBL database with activity data.

Usage: 54
Performance: 1.0

GTEx Tissue Expression

expression_data

Get gene expression levels across tissues from GTEx portal.

Usage: 40
Performance: 1.0

AlphaFold Structure

structure_prediction

Fetch AlphaFold protein structure predictions with confidence scores and 3D coordinates.

Usage: 28
Performance: 1.0

Disease Info

disease_annotation

Get disease annotation from MyDisease.info: ontology, CTD data.

Usage: 19
Performance: 1.0

Ensembl Gene Info

gene_annotation

Get comprehensive gene annotation from Ensembl: coordinates, biotype, cross-references, mouse orthologs.

Usage: 15
Performance: 1.0

PubChem Compound

drug_database

Query PubChem compound database for molecular structures and properties.

Usage: 13
Performance: 1.0

OMIM Gene Phenotypes

genetic_disease

Query OMIM for genetic diseases and phenotypes associated with a gene.

Usage: 11
Performance: 1.0

PubMed Evidence Pipeline

pipeline

Automated pipeline that searches PubMed for new papers related to top hypotheses and updates their evidence.

Usage: 11
Performance: 0.3

BrainSpan Expression

expression_data

Get developmental brain gene expression from BrainSpan Atlas across human brain development (8 weeks post-conception to 40 years).

Usage: 10
Performance: 1.0

DGIdb Drug-Gene Interactions

drug_database

Query DGIdb for drug-gene interactions and druggability categories. Aggregates DrugBank, PharmGKB, TTD, ChEMBL, and clinical guideline data.

Usage: 10
Performance: 1.0

DrugBank Drug Info

drug_database

Query DrugBank for comprehensive drug information including targets, indications, and pharmacology.

Usage: 10
Performance: 1.0

Europe PMC Search

literature_search

Search Europe PMC for biomedical literature with MeSH terms and citation counts.

Usage: 10
Performance: 0.9
Example Queries
→ Alzheimer amyloid

GEO Dataset Search

expression_data

Search NCBI GEO for gene expression and genomics datasets.

Usage: 9
Performance: 1.0
Example Queries
→ Alzheimer hippocampus

CellxGene Gene Expression

expression_data

Query CZI CellxGene Discovery API for single-cell datasets. Returns brain/neurodegeneration datasets with cell type annotations, tissue info, and disease context.

Usage: 8
Performance: 0.8

MethBase Disease Methylation

epigenetic_search

Search MethBase for disease-associated methylation changes.

Usage: 8
Performance: 0.3

QuickGO Gene Ontology

gene_annotation

Query EBI QuickGO for Gene Ontology annotations — biological process, molecular function, and cellular component terms for a gene.

Usage: 8
Performance: 1.0

BioGRID Interactions

interaction_data

Query BioGRID database for experimentally verified protein-protein interactions with evidence types.

Usage: 7
Performance: 1.0

gnomAD Gene Variants

population_genetics

Query gnomAD for population variant frequency and gene constraint metrics (pLI, o/e ratios). Returns LoF/missense variants with allele frequencies and ClinVar pathogenic counts.

Usage: 7
Performance: 0.8

HPO Term Search

phenotype_annotation

Search Human Phenotype Ontology (HPO) for phenotype terms and gene/disease associations. Standardizes clinical phenotype vocabulary for neurodegeneration research.

Usage: 7
Performance: 1.0
Example Queries
→ Alzheimer

InterPro Protein Domains

protein_annotation

Query InterPro for protein domain and family annotations (Pfam, SMART, PANTHER).

Usage: 7
Performance: 1.0

Bgee Gene Expression

expression_data

Query Bgee for gene expression across anatomical structures and developmental stages. Integrates RNA-seq data with developmental stage and cell type context — more granular than GTEx for brain regions.

Usage: 6
Performance: 1.0

EBI Protein Variants

protein_variation

Query EBI Proteins API for disease-associated protein variants with clinical annotations.

Usage: 6
Performance: 1.0

MGI Mouse Models

model_organism

Query Mouse Genome Informatics for mouse models, phenotypes, and gene homologs.

Usage: 6
Performance: 1.0

Pathway Commons Search

pathway_analysis

Search Pathway Commons for pathways across Reactome, KEGG, PANTHER, and NCI-PID.

Usage: 6
Performance: 1.0

STITCH Chemical Interactions

network_analysis

Query STITCH database for chemical-protein interactions. Aggregates data from experiments, databases, text mining, and predictions.

Usage: 6
Performance: 1.0

BindingDB Binding Affinity

drug_target_data

Query BindingDB for protein-ligand binding affinity measurements (Ki, Kd, IC50, EC50) from the primary literature. Accepts gene symbol or UniProt accession. Returns compound names, affinities, SMILES, PubMed IDs, and BindingDB URLs. Supports optional max_ic50_nm filter for potent binders. Essential for drug target prioritization (BACE1, LRRK2, GSK3B, CDK5).

Usage: 5
Performance: 1.0
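The optional `max_ic50_nm` filter mentioned in the BindingDB description can be pictured as a simple threshold on IC50 values. The sketch below is illustrative only: the record layout, the `ic50_nm` field name, and the compound names are hypothetical, not BindingDB's schema or real measurements.

```python
def filter_potent_binders(records, max_ic50_nm=None):
    """Keep records whose IC50 (in nM) is at or below max_ic50_nm.

    With max_ic50_nm=None the filter is disabled and all records are returned.
    Records lacking an IC50 value are dropped once a threshold is set.
    """
    if max_ic50_nm is None:
        return list(records)
    return [
        record
        for record in records
        if record.get("ic50_nm") is not None and record["ic50_nm"] <= max_ic50_nm
    ]


# Made-up records for illustration; lower IC50 means a more potent binder.
sample_records = [
    {"compound": "compound_a", "ic50_nm": 12.0},
    {"compound": "compound_b", "ic50_nm": 850.0},
    {"compound": "compound_c", "ic50_nm": None},  # only Ki or Kd reported
]

potent = filter_potent_binders(sample_records, max_ic50_nm=100)
unfiltered = filter_potent_binders(sample_records)
```

Because a lower IC50 indicates stronger inhibition, the threshold is an upper bound. Dropping records without an IC50 is one possible choice and would hide compounds characterized only by Ki or Kd.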

ClinGen Gene-Disease Validity

genetic_disease

Query ClinGen for curated gene-disease validity classifications. Expert panels rate evidence from Definitive to Refuted. Gold-standard resource for determining whether a gene causes a specific disease.

Usage: 5
Performance: 1.0

COSMIC Gene Mutations

cancer_genomics

Query COSMIC (via Open Targets) for somatic cancer mutations. Returns cancer associations with somatic mutation evidence scores from Cancer Gene Census and IntOGen.

Usage: 5
Performance: 1.0

IntAct Molecular Interactions

interaction_data

Query EBI IntAct for experimentally validated molecular interactions. Aggregates data from BioGRID, MINT, DIP. Returns interactors with detection methods and publication evidence.

Usage: 5
Performance: 1.0

JASPAR TF Binding Sites

regulatory_analysis

Query JASPAR database for transcription factor binding motifs and regulatory elements.

Usage: 5
Performance: 1.0

PharmGKB Pharmacogenomics

drug_database

Query PharmGKB for pharmacogenomics drug-gene relationships. Returns clinical annotations linking genetic variants to drug response (efficacy, toxicity, dosage), level of evidence (1A–4), and CPIC guidelines.

Usage: 5
Performance: 1.0

HGNC Gene Nomenclature

gene_annotation

Query HUGO Gene Nomenclature Committee (HGNC) for authoritative gene names, aliases, previous symbols, gene family membership, locus type (protein-coding/lncRNA/pseudogene), chromosomal location, and cross-references to Ensembl/UniProt/OMIM/RefSeq/MGI. Gold-standard for gene symbol disambiguation and family classification.

Usage: 4
Performance: 1.0

Monarch Disease-Gene Associations

gene_disease_association

Query Monarch Initiative for disease-gene-phenotype associations. Integrates OMIM, ClinVar, HPO, MGI, ZFIN data. Supports disease→gene and gene→disease queries.

Usage: 4
Performance: 1.0

Agora AMP-AD Target Scoring

disease_genetics

Query Agora (AMP-AD Knowledge Portal, Sage Bionetworks) for Alzheimer's Disease multi-omic gene target scoring. Integrates RNA-seq, proteomics, metabolomics, network centrality, and genetic risk evidence across AD brain datasets. Returns nomination status, expression changes in AD brain (RNA + protein), genetic risk association (IGAP), and evidence across modalities. Essential for AD target prioritization.

Usage: 3
Performance: 1.0

AGR Gene Orthologs

comparative_genomics

Query Alliance of Genome Resources (AGR) for cross-species gene orthologs using 8 integrated algorithms (OrthoFinder, PANTHER, Ensembl Compara, OMA, InParanoid, Phylome, OrthoMCL, HGNC). Returns orthologs in mouse, rat, zebrafish, fly, worm, and yeast with prediction method count and best-score flag. Critical for identifying model organism experiments that validate neurodegeneration mechanisms.

Usage: 3
Performance: 1.0

BioStudies Dataset Search

dataset_discovery

Search EBI BioStudies and ArrayExpress for transcriptomics, proteomics, and functional genomics datasets. Returns accession numbers, organism, study type. Complements NCBI GEO with European datasets and ArrayExpress studies.

Usage: 3
Performance: 1.0
Example Queries
→ TREM2 Alzheimer
→ Alzheimer disease neurodegeneration

EBI Complex Portal

protein_interaction

Query EBI Complex Portal for experimentally validated protein complexes. Returns complex membership, subunit lists, and molecular functions. Essential for mechanistic interpretation of disease-associated proteins.

Usage: 3
Performance: 1.0

EBI OLS Term Lookup

ontology

Search EBI Ontology Lookup Service (OLS4) across 300+ biomedical ontologies for disease, phenotype, molecular function, or chemical terms. Resolves free-text names to canonical ontology IDs: HPO (HP:), Disease Ontology (DOID:), MONDO, EFO, Gene Ontology (GO:), ChEBI. Essential for analyses that need standard identifiers for diseases or biological processes. Supports filtering by ontology and exact-match queries. Returns IDs, labels, descriptions, and synonyms.

Usage: 3
Performance: 1.0
Example Queries
→ Alzheimer disease
→ Alzheimer disease neurodegeneration

ENCODE Regulatory Search

regulatory_genomics

Search ENCODE for epigenomics experiments (ChIP-seq, ATAC-seq, Hi-C) targeting a gene. Returns released experiments with biosample, assay type, and accession IDs. Critical for interpreting non-coding GWAS variants by finding TF binding sites and chromatin accessibility near neurodegeneration risk loci.

Usage: 3
Performance: 1.0

Ensembl Regulatory Features

regulatory_genomics

Query Ensembl Regulatory Build for regulatory elements in the genomic neighborhood of a gene. Returns promoters, enhancers, CTCF binding sites, and open chromatin regions derived from ENCODE/Roadmap data. Critical for interpreting non-coding GWAS variants near neurodegeneration loci (BIN1, CLU, PICALM, APOE). Reveals the regulatory landscape that controls gene expression.

Usage: 3
Performance: 1.0

Expression Atlas Differential

expression_data

Query EMBL-EBI Expression Atlas for differential expression experiments. Returns experiments where a gene is differentially expressed across conditions, tissues, and diseases.

Usage: 3
Performance: 1.0

GTEx Brain eQTLs

genetics_data

Query GTEx v8 for cis-eQTLs in brain tissues: genetic variants that regulate gene expression in frontal cortex, hippocampus, substantia nigra and 6 other brain regions. Connects GWAS hits to causal gene regulation.

Usage: 3
Performance: 1.0

IMPC Mouse Phenotypes

model_organism

Query IMPC (International Mouse Phenotyping Consortium) for statistically significant phenotypes observed in knockout mice (p<0.0001). Covers 20+ biological systems including neurological, cardiovascular, metabolic, and immune phenotypes. Provides direct in vivo evidence of gene function — essential for validating disease hypotheses about neurodegeneration genes like TREM2, GBA, LRRK2, and PSEN1.

Usage: 3
Performance: 1.0

JensenLab DISEASES Text Mining

disease_genetics

Query JensenLab DISEASES (STRING group) for text-mining gene-disease confidence scores derived from 500M+ MEDLINE abstracts. Provides independent evidence scores complementary to DisGeNET and ClinGen. Covers rare and common diseases with automated mining of scientific literature. High-confidence associations reflect strong co-mention signals across publications.

Usage: 3
Performance: 1.0

MSigDB Gene Sets

gene_set_enrichment

Query MSigDB gene set membership for a gene via Enrichr genemap. Returns which Hallmark, KEGG, Reactome, WikiPathways, and GO gene sets directly contain the query gene. Complements enrichr_analyze (which runs enrichment on a gene list) by answering 'which named pathways include this gene?' — useful for contextualizing a single candidate gene.

Usage: 3
Performance: 1.0

NCBI Gene Summary

gene_annotation

Fetch official NCBI gene summary, aliases, chromosomal location, and MIM IDs via the NCBI Datasets v2 API. Returns a concise functional overview from RefSeq curators — useful as a first-pass description of gene function before querying pathway or disease databases. Covers all human genes including rare neurodegeneration-associated genes.

Usage: 3
Performance: 1.0

NIH RePORTER Projects

funding_landscape

Search NIH RePORTER for funded research grants on a topic. Returns project titles, PIs, institutions, award amounts, and abstract excerpts. Useful for mapping the funding landscape for neurodegeneration research areas, identifying active investigators, and understanding NIH research priorities. Covers all NIH institutes including NIA, NINDS, and NIMH.

Usage: 3
Performance: 1.0
Example Queries
→ TREM2 Alzheimer
→ Alzheimer disease neurodegeneration

OmniPath Signaling

network_analysis

Query OmniPath for directed signaling interactions and post-translational modifications (PTMs). Integrates 100+ databases (BioGRID, HPRD, PhosphoSitePlus, SignaLink, Reactome, CellTalkDB) with stimulation/inhibition directionality. Unlike STRING, focuses on regulatory direction — which kinase activates which receptor. Includes ligand-receptor interactions for cell-cell communication.

Usage: 3
Performance: 1.0

Open Targets Drugs

drug_target_data

Query Open Targets Platform for drugs targeting a gene, with clinical phases, mechanisms of action, and disease indications. Focuses on clinical evidence and approved drugs — complements ChEMBL bioactivity data for drug repurposing.

Usage: 3
Performance: 1.0

Open Targets Genetics L2G

genetic_associations

Query Open Targets Genetics for GWAS loci linked to a gene via Locus-to-Gene (L2G) ML scoring. Maps genetic variants to probable causal genes using eQTLs, chromatin accessibility, and functional genomics. Essential for interpreting AD/PD GWAS hits.

Usage: 3
Performance: 1.0

Open Targets RNA Expression

gene_expression

Query Open Targets for baseline RNA expression across 100+ tissues for a gene, aggregated from GTEx and Expression Atlas. Returns TPM-like values and z-scores per tissue. Includes a brain_only flag to filter to CNS/neuronal tissues, making it directly useful for neurodegeneration context: microglia (TREM2), neurons (SNCA/MAPT), oligodendrocytes. Complements gtex_eqtl (genetic effects) and bgee_expression (developmental) with baseline expression profiles.

Usage: 3
Performance: 1.0

Open Targets Tractability

drug_discovery

Query Open Targets Platform for drug tractability and modality assessments. Returns tractability buckets for small molecules (clinical/preclinical precedent, structural features), antibodies (membrane protein evidence), and other modalities. Critical for prioritizing therapeutic targets — especially relevant for neurodegeneration targets like LRRK2 (kinase), TREM2 (receptor), and GBA (enzyme).

Usage: 3
Performance: 1.0

PanglaoDB Cell Markers

cell_type_annotation

Get canonical cell type marker genes from PanglaoDB scRNA-seq database. Covers microglia, astrocytes, neurons, OPCs, DAM, oligodendrocytes, endothelial, pericytes.

Usage: 3
Performance: 1.0

PDB Protein Structures

structure_data

Search RCSB Protein Data Bank for experimental protein structures (X-ray, cryo-EM, NMR). Complements AlphaFold predictions with experimentally validated structural data including drug binding sites.

Usage: 3
Performance: 1.0

Pharos Target Development

drug_target_data

Query NIH Pharos TCRD for drug target development level (TDL): Tclin (approved drugs), Tchem (bioactive molecules), Tbio (biological knowledge only), Tdark (poorly characterized). Returns target family, disease associations, drugs and bioactive compounds. Essential for drug target prioritization in neurodegeneration.

Usage: 3
Performance: 1.0

WikiPathways Gene Pathways

pathway_analysis

Query WikiPathways community pathway database for biological pathways containing a gene. Complements Reactome and KEGG with community-curated pathways including disease-specific and rare pathway annotations. Returns pathway IDs, names, species, and visualization URLs.

Usage: 3
Performance: 1.0

ChEMBL Compound Search

compound_annotation

Search ChEMBL (EBI) for small molecules, drugs, and chemical probes by name. Returns structural information, clinical development phase, ATC pharmacological classification, molecular properties (MW, LogP, HBD/HBA), and InChI key. Compound-centric complement to chembl_drug_targets (gene-centric). Useful for looking up a drug candidate's ChEMBL ID and clinical phase, or finding related compound series.

Usage: 2
Performance: 1.0

CrossRef Paper Metadata

literature_fetch

Retrieve publication metadata from CrossRef by DOI. Returns title, authors, journal, year, citation count, publisher, open access PDF link, subject areas, and funders. CrossRef indexes 150M+ scholarly works across all publishers — covers papers from any journal (not just biomedical), including clinical trials, meta-analyses, and specialized neuro-chemistry publications not in PubMed. Essential for DOI-based paper lookup when PMID is unavailable.

Usage: 2
Performance: 1.0

GTEx Brain sQTLs

expression_qtl

Query GTEx v8 for splicing QTLs (sQTLs) in brain: genetic variants that alter RNA splicing patterns across 9 brain regions. Complementary to eQTLs — many disease risk variants act by changing splice junction usage rather than expression level. Key examples: MAPT H1/H2 haplotype sQTLs (tau isoform switching), BIN1 hippocampal sQTLs. Returns per-tissue splice junction associations with variant IDs.

Usage: 2
Performance: 1.0

Harmonizome Gene Sets

gene_set_enrichment

Query Harmonizome for gene-associated datasets across 114 gene-set libraries (KEGG, Reactome, OMIM, GO, GTEx, GWAS, ChEMBL, etc.). Enables rapid cross-database characterization of any gene.

Usage: 2
Performance: 1.0

MethBase Age Correlation

epigenetic_analysis

Get age-related methylation changes from MethBase.

Usage: 2
Performance: 0.3

Open Targets Mouse Phenotypes

model_organism

Query Open Targets for mouse model phenotypes associated with a gene. Returns experimentally observed phenotypes from knockout/transgenic mouse models aggregated from IMPC, MGI, and other sources. Provides phenotype class summaries (neurological, behavioral, metabolic, etc.) and biological model details (allelic composition, genetic background, literature). Useful for assessing whether modifying a gene has neurological or disease-relevant consequences in preclinical models.

Usage: 2
Performance: 1.0

UniProt PTM Features

protein_annotation

Retrieve UniProt/Swiss-Prot curated protein feature annotations for a gene: post-translational modifications (phosphorylation, ubiquitination, acetylation, methylation), active sites, binding sites, glycosylation, disulfide bonds, signal peptides, and natural variants. Especially valuable for neurodegeneration research: tau (MAPT) phosphorylation landscape, alpha-synuclein (SNCA) PTMs, APP cleavage sites. Returns structured feature list with positions and evidence counts.

Usage: 2
Performance: 1.0

Ensembl VEP Variant Annotation

variant_annotation

Annotate genetic variants using Ensembl Variant Effect Predictor (VEP). Accepts dbSNP rsIDs or HGVS notation and returns predicted molecular consequences (missense, splice, frameshift), SIFT/PolyPhen-2 pathogenicity scores, amino acid changes, and impact classification (HIGH/MODERATE/LOW/MODIFIER). Essential for interpreting neurodegeneration GWAS variants: APOE4=rs429358, LRRK2 G2019S=rs34637584, GBA N370S=rs76763715.

Usage: 1
Performance: 1.0
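For the rsIDs listed above, a VEP lookup can be phrased against the public Ensembl REST API. This sketch only builds request URLs. It assumes the tool calls the REST `vep` endpoints, which the page does not confirm. `build_vep_url` is a hypothetical helper, and the HGVS string is a made-up placeholder used to show endpoint selection.

```python
from urllib.parse import quote

ENSEMBL_REST = "https://rest.ensembl.org"


def build_vep_url(variant, species="human"):
    """Return the Ensembl REST VEP URL for a dbSNP rsID or an HGVS string."""
    # rsIDs go to the /id/ endpoint; anything else is treated as HGVS notation.
    kind = "id" if variant.lower().startswith("rs") else "hgvs"
    encoded = quote(variant, safe="")
    return f"{ENSEMBL_REST}/vep/{species}/{kind}/{encoded}?content-type=application/json"


# rsIDs taken from the tool description above.
NEURO_VARIANTS = {
    "APOE4": "rs429358",
    "LRRK2 G2019S": "rs34637584",
    "GBA N370S": "rs76763715",
}

vep_urls = {label: build_vep_url(rsid) for label, rsid in NEURO_VARIANTS.items()}

# Placeholder HGVS string (not a real variant) to show the second endpoint.
hgvs_url = build_vep_url("ENST00000000000:c.1A>G")
```

Percent-encoding the variant matters for HGVS input, since characters such as `:` and `>` are not safe in a URL path segment.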

Europe PMC Citations

literature_search

Get articles that cite a specific paper via Europe PMC.

Usage: 1
Performance: 0.9
Example Queries
→ 31474370

ProteomicsDB Protein Expression

expression_data

Query ProteomicsDB (TUM/Kuster lab) for mass spectrometry-based protein abundance across human tissues. Measures actual protein levels (iBAQ normalized intensity) across 60+ tissues/fluids. Complements RNA expression atlases (GTEx, HPA) with protein-level data — critical because mRNA and protein levels often diverge. Accepts gene symbol or UniProt accession.

Usage: 1
Performance: 1.0

adaptyv

lab_automation

How to use the Adaptyv Bio Foundry API and Python SDK for protein experiment design, submission, and results retrieval. Use this skill whenever the user mentions Adaptyv, Foundry API, protein binding assays, protein screening experiments, BLI/SPR assays, thermostability assays, or wants to submit protein sequences for experimental characterization. Also trigger when code imports `adaptyv`, `adaptyv_sdk`, or `FoundryClient`, or references `foundry-api-public.adaptyvbio.com`.

Usage: 0
Performance: 1.0

aeon

ml_ai

This skill should be used for time series machine learning tasks including classification, regression, clustering, forecasting, anomaly detection, segmentation, and similarity search. Use when working with temporal data, sequential patterns, or time-indexed observations requiring specialized algorithms beyond standard ML approaches. Particularly suited for univariate and multivariate time series analysis with scikit-learn compatible APIs.

Usage: 0
Performance: 1.0

Allen Brain Expression

data_retrieval

Query Allen Brain Atlas for ISH expression data across brain regions.

Usage: 0
Performance: 0.1

anndata

bioinformatics

Data structure for annotated matrices in single-cell analysis. Use when working with .h5ad files or integrating with the scverse ecosystem. This is the data format skill—for analysis workflows use scanpy; for probabilistic models use scvi-tools; for population-scale queries use cellxgene-census.

Usage: 0
Performance: 1.0

arboreto

bioinformatics

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.

Usage: 0
Performance: 1.0

astropy

physics

Comprehensive Python library for astronomy and astrophysics. This skill should be used when working with astronomical data including celestial coordinates, physical units, FITS files, cosmological calculations, time systems, tables, world coordinate systems (WCS), and astronomical data analysis. Use when tasks involve coordinate transformations, unit conversions, FITS file manipulation, cosmological distance calculations, time scale conversions, or astronomical data processing.

Usage: 0
Performance: 1.0

benchling-integration

lab_automation

Benchling R&D platform integration. Access the registry (DNA, proteins), inventory, ELN entries, and workflows via API; build Benchling Apps; and query the Data Warehouse to automate lab data management.

Usage: 0
Performance: 1.0

bgpt-paper-search

scientific_comm

Search scientific papers and retrieve structured experimental data extracted from full-text studies via the BGPT MCP server. Returns 25+ fields per paper including methods, results, sample sizes, quality scores, and conclusions. Use for literature reviews, evidence synthesis, and finding experimental details not available in abstracts alone.

Usage: 0
Performance: 1.0

biopython

bioinformatics

Comprehensive molecular biology toolkit. Use for sequence manipulation, file parsing (FASTA/GenBank/PDB), phylogenetics, and programmatic NCBI/PubMed access (Bio.Entrez). Best for batch processing, custom bioinformatics pipelines, BLAST automation. For quick lookups use gget; for multi-service integration use bioservices.

Usage: 0
Performance: 1.0

bioservices

general

Unified Python interface to 40+ bioinformatics services. Use when querying multiple databases (UniProt, KEGG, ChEMBL, Reactome) in a single workflow with consistent API. Best for cross-database analysis, ID mapping across services. For quick single-database lookups use gget; for sequence/file manipulation use biopython.

Usage: 0
Performance: 1.0

CellxGene Cell Type Expression

single_cell_expression

Query CZ CELLxGENE Discover 'Where is My Gene' (WMG v2) API for quantitative single-cell gene expression across human cell types. Returns mean log2(CPM+1) expression, percent expressing cells, and total cell count per cell-type per tissue. Brain-relevant cell types (microglia, neurons, astrocytes, oligodendrocytes, pericytes) are prioritized. Distinct from cellxgene_gene_expression which only lists datasets. Critical for neurodegeneration: understanding which brain cell types express TREM2/APOE/APP/SNCA/MAPT at the single-cell level.

Usage: 0
Performance: 1.0

cellxgene-census

bioinformatics

Query the CELLxGENE Census (61M+ cells) programmatically. Use when you need expression data across tissues, diseases, or cell types from the largest curated single-cell atlas. Best for population-scale queries, reference atlas comparisons. For analyzing your own data use scanpy or scvi-tools.

Usage: 0
Performance: 1.0

ChEMBL Drug Targets

data_retrieval

Drug compounds and bioactivity data for a gene target from the ChEMBL database of bioactive molecules.

Usage: 0
Performance: 0.1

cirq

quantum

Google quantum computing framework. Use when targeting Google Quantum AI hardware, designing noise-aware circuits, or running quantum characterization experiments. Best for Google hardware, noise modeling, and low-level circuit design. For IBM hardware use qiskit; for quantum ML with autodiff use pennylane; for physics simulations use qutip.

Usage: 0
Performance: 1.0

citation-management

scientific_comm

Comprehensive citation management for academic research. Search Google Scholar and PubMed for papers, extract accurate metadata, validate citations, and generate properly formatted BibTeX entries. This skill should be used when you need to find papers, verify citation information, convert DOIs to BibTeX, or ensure reference accuracy in scientific writing.

Usage: 0
Performance: 1.0

CIViC Gene Variants

clinical_genetics

Get CIViC (Clinical Interpretation of Variants in Cancer) expert-curated clinical variant interpretations for a gene. Returns variant names, HGVS expressions, variant types, and CIViC URLs. Useful for variant effect classification, functional evidence in overlapping cancer/neurodegeneration genes (IDH1, PTEN, ATM, BRCA2, TP53), and mechanistic evidence for rare disease variant interpretation.

Usage: 0
Performance: 1.0

ClinGen Gene-Disease Validity

genetic_disease

Query ClinGen for curated gene-disease validity classifications.

Usage: 0
Performance: 1.0

clinical-decision-support

clinical

Generate professional clinical decision support (CDS) documents for pharmaceutical and clinical research settings, including patient cohort analyses (biomarker-stratified with outcomes) and treatment recommendation reports (evidence-based guidelines with decision algorithms). Supports GRADE evidence grading, statistical analysis (hazard ratios, survival curves, waterfall plots), biomarker integration, and regulatory compliance. Outputs publication-ready LaTeX/PDF format optimized for drug development, clinical research, and evidence synthesis.

Usage: 0
Performance: 1.0

clinical-reports

clinical

Write comprehensive clinical reports including case reports (CARE guidelines), diagnostic reports (radiology/pathology/lab), clinical trial reports (ICH-E3, SAE, CSR), and patient documentation (SOAP, H&P, discharge summaries). Full support with templates, regulatory compliance (HIPAA, FDA, ICH-GCP), and validation tools.

Usage: 0
Performance: 1.0

ClinicalTrials.gov Search

clinical_data

Search ClinicalTrials.gov for clinical trials related to genes, diseases, or interventions. Returns NCT IDs, status, phase, conditions, interventions, enrollment, and sponsor info.

Usage: 0
Performance: 1.0

ClinVar Variants

data_retrieval

Fetch clinical genetic variants from NCBI ClinVar. Returns pathogenicity, review status, and associated conditions.

Usage: 0
Performance: 0.1

cobrapy

cheminformatics

Constraint-based metabolic modeling (COBRA). Supports FBA, FVA, gene knockouts, flux sampling, and SBML models for systems biology and metabolic engineering analysis.

Usage: 0
Performance: 1.0

consciousness-council

scientific_comm

Run a multi-perspective Mind Council deliberation on any question, decision, or creative challenge. Use this skill whenever the user wants diverse viewpoints, needs help making a tough decision, asks for a council/panel/board discussion, wants to explore a problem from multiple angles, requests devil's advocate analysis, or says things like "what would different experts think about this", "help me think through this from all sides", "council mode", "mind council", or "deliberate on this". Also trigger when the user faces a dilemma, trade-off, or complex choice with no obvious answer.

Usage: 0
Performance: 1.0

CrossRef Preprint Search

literature_search

Search for bioRxiv/medRxiv preprints via CrossRef API. Returns preprints indexed by CrossRef with title, authors, DOI, posted date, and abstract snippet. Useful for finding the latest findings before peer review, especially for fast-moving topics like tau pathology, neuroinflammation, TREM2 biology, Parkinson's alpha-synuclein research, and ALS/FTD mechanisms.

Usage: 0
Performance: 1.0

dask

data_analysis

Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, integration with existing pandas code. For out-of-core analytics on single machine use vaex; for in-memory speed use polars.

Usage: 0
Performance: 1.0

database-lookup

database_access

Search 78 public scientific, biomedical, materials science, and economic databases via REST APIs. Covers physics/astronomy (NASA, NIST, SDSS, SIMBAD), earth/environment (USGS, NOAA, EPA), chemistry/drugs (PubChem, ChEMBL, DrugBank, FDA, KEGG, ZINC, BindingDB), materials (Materials Project, COD), biology/genomics (Reactome, UniProt, STRING, Ensembl, NCBI Gene, GEO, GTEx, PDB, AlphaFold, InterPro, BioGRID, Gene Ontology, dbSNP, gnomAD, ENCODE, Human Protein Atlas, Human Cell Atlas), disease/clinical (COSMIC, Open Targets, ClinicalTrials.gov, OMIM, ClinVar, GDC/TCGA, cBioPortal, DisGeNET, GWAS Catalog), regulatory (FDA, USPTO, SEC EDGAR), economics/finance (FRED, World Bank, US Treasury), demographics (US Census, Eurostat, WHO). Use when looking up compounds, genes, proteins, pathways, variants, clinical trials, patents, economic indicators, or any public database API query.

Usage: 0
Performance: 1.0

datamol

cheminformatics

Pythonic wrapper around RDKit with simplified interface and sensible defaults. Preferred for standard drug discovery including SMILES parsing, standardization, descriptors, fingerprints, clustering, 3D conformers, parallel processing. Returns native rdkit.Chem.Mol objects. For advanced control or custom parameters, use rdkit directly.

Usage: 0
Performance: 1.0

deepchem

cheminformatics

Molecular ML with diverse featurizers and pre-built datasets. Use for property prediction (ADMET, toxicity) with traditional ML or GNNs when you want extensive featurization options and MoleculeNet benchmarks. Best for quick experiments with pre-trained models, diverse molecular representations. For graph-first PyTorch workflows use torchdrug; for benchmark datasets use pytdc.

Usage: 0
Performance: 1.0

deeptools

bioinformatics

NGS analysis toolkit. Provides BAM to bigWig conversion, QC (correlation, PCA, fingerprints), and heatmaps/profiles (TSS, peaks) for ChIP-seq, RNA-seq, and ATAC-seq visualization.

Usage: 0
Performance: 1.0

depmap

clinical

Query the Cancer Dependency Map (DepMap) for cancer cell line gene dependency scores (CRISPR Chronos), drug sensitivity data, and gene effect profiles. Use for identifying cancer-specific vulnerabilities, synthetic lethal interactions, and validating oncology drug targets.

Usage: 0
Performance: 1.0

DGIdb Drug-Gene Interactions

drug_database

Query DGIdb for drug-gene interactions and druggability categories. Aggregates DrugBank, PharmGKB, TTD, ChEMBL, and clinical guideline data.

Usage: 0
Performance: 1.0

dhdna-profiler

scientific_comm

Extract cognitive patterns and thinking fingerprints from any text. Use this skill when the user wants to analyze how someone thinks, understand cognitive style, profile writing or speech patterns, compare thinking styles between people, asks "what's my thinking style", "analyze how this person reasons", "cognitive profile", "thinking pattern", "DHDNA", "digital DNA", or wants to understand the mind behind any text. Also trigger when the user provides text and wants deeper insight into the author's reasoning patterns, decision-making style, or cognitive signature.

Usage: 0
Performance: 1.0

diffdock

cheminformatics

Diffusion-based molecular docking. Predicts protein-ligand binding poses from PDB/SMILES inputs with confidence scores, and supports virtual screening for structure-based drug design. Not for affinity prediction.

Usage: 0
Performance: 1.0

DisGeNET Disease-Gene

disease_gene_association

Get genes associated with a disease from DisGeNET with association scores.

Usage: 0
Performance: 1.0

DisGeNET Disease Similarity

disease_genetics

Find diseases related to a query disease based on shared gene-disease associations in DisGeNET. Uses top-scoring genes for the query disease to identify comorbid or mechanistically related conditions. Helps map shared molecular substrates across neurodegeneration (e.g. diseases sharing APOE, LRRK2, or GBA pathways). Requires DISGENET_API_KEY for full results.

Usage: 0
Performance: 1.0

DisGeNET Gene-Disease

gene_disease_association

Get disease associations for a gene from DisGeNET with scores and supporting PMIDs.

Usage: 0
Performance: 1.0

dnanexus-integration

lab_automation

DNAnexus cloud genomics platform. Build apps/applets, manage data (upload/download), use the dxpy Python SDK, and run workflows on FASTQ/BAM/VCF files for genomics pipeline development and execution.

Usage: 0
Performance: 1.0

docx

data_analysis

Use this skill whenever the user wants to create, read, edit, or manipulate Word documents (.docx files). Triggers include: any mention of 'Word doc', 'word document', '.docx', or requests to produce professional documents with formatting like tables of contents, headings, page numbers, or letterheads. Also use when extracting or reorganizing content from .docx files, inserting or replacing images in documents, performing find-and-replace in Word files, working with tracked changes or comments, or converting content into a polished Word document. If the user asks for a 'report', 'memo', 'letter', 'template', or similar deliverable as a Word or .docx file, use this skill. Do NOT use for PDFs, spreadsheets, Google Docs, or general coding tasks unrelated to document generation.

Usage: 0
Performance: 1.0

EBI eQTL Catalog

expression_qtl

Query EBI eQTL Catalog v2 for tissue/cell-type-specific expression QTLs. Covers 100+ studies beyond GTEx including microglia, iPSC-derived neurons, brain organoids, and disease cohorts. Returns rsid, variant, p-value, beta, SE, MAF, tissue_label, and qtl_group. Supports optional tissue_keyword filter (e.g. 'microglia', 'brain', 'iPSC').

Usage: 0
Performance: 1.0

Enrichr GO Enrichment

data_retrieval

Gene set enrichment against GO Biological Process. Enter a gene list to find enriched pathways.

Usage: 0
Performance: 0.1

Ensembl Gene Phenotype Associations

gene_disease

Retrieve multi-source phenotype and disease associations for a gene via Ensembl REST API. Aggregates disease-gene links from Orphanet, OMIM/MIM morbid, NHGRI-EBI GWAS Catalog, DECIPHER, ClinVar, and UniProtKB into a unified view with MONDO/HP/Orphanet ontology accessions and data source provenance. Distinct from omim_gene_phenotypes (OMIM only), clinvar_variants (variant-level), and monarch_disease_genes (Monarch KG). Useful for comprehensive disease context, rare disease associations via Orphanet, and multi-source phenotype validation.

Usage: 0
Performance: 1.0

esm

protein_engineering

Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.

Usage: 0
Performance: 1.0

etetoolkit

research_methodology

Phylogenetic tree toolkit (ETE). Covers tree manipulation (Newick/NHX), evolutionary event detection, orthology/paralogy, NCBI taxonomy, and visualization (PDF/SVG) for phylogenomics.

Usage: 0
Performance: 1.0

exploratory-data-analysis

data_analysis

Perform comprehensive exploratory data analysis on scientific data files across 200+ file formats. This skill should be used when analyzing any scientific data file to understand its structure, content, quality, and characteristics. Automatically detects file type and generates detailed markdown reports with format-specific analysis, quality metrics, and downstream analysis recommendations. Covers chemistry, bioinformatics, microscopy, spectroscopy, proteomics, metabolomics, and general scientific data formats.

Usage: 0
Performance: 1.0

FinnGen Disease Loci

genetic_associations

Query FinnGen R10 (N≈412,000 Finns, 2,408 endpoints) for fine-mapped genetic loci. Returns SuSiE credible sets with lead SNP, gene, p-value, beta, and cross-trait annotations.

Usage: 0
Performance: 0.9

flowio

bioinformatics

Parse FCS (Flow Cytometry Standard) files v2.0-3.1. Extract events as NumPy arrays, read metadata/channels, and convert to CSV/DataFrame for flow cytometry data preprocessing.

Usage: 0
Performance: 1.0

fluidsim

engineering

Framework for computational fluid dynamics simulations using Python. Use when running fluid dynamics simulations including Navier-Stokes equations (2D/3D), shallow water equations, stratified flows, or when analyzing turbulence, vortex dynamics, or geophysical flows. Provides pseudospectral methods with FFT, HPC support, and comprehensive output analysis.

Usage: 0
Performance: 1.0

Gene Info

data_retrieval

Look up any human gene — returns full name, summary, aliases, and gene type from MyGene.info.

Usage: 0
Performance: 0.1

generate-image

scientific_comm

Generate or edit images using AI models (FLUX, Nano Banana 2). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that is not a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.

Usage: 0
Performance: 1.0

geniml

bioinformatics

This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.

Usage: 0
Performance: 1.0

Genomics England PanelApp

clinical_genetics

Query Genomics England PanelApp for disease gene panel memberships. Used by NHS Genomic Medicine Service for rare disease diagnosis. Returns panel name, disease group, confidence level (green/amber/red), mode of inheritance, penetrance, phenotypes, and gene evidence. Covers hereditary dementias, Parkinson, motor neuron disease, ataxias, and leukodystrophies.

Usage: 0
Performance: 1.0

geomaster

geospatial

Comprehensive geospatial science skill covering remote sensing, GIS, spatial analysis, machine learning for earth observation, and 30+ scientific domains. Supports satellite imagery processing (Sentinel, Landsat, MODIS, SAR, hyperspectral), vector and raster data operations, spatial statistics, point cloud processing, network analysis, cloud-native workflows (STAC, COG, Planetary Computer), and 8 programming languages (Python, R, Julia, JavaScript, C++, Java, Go, Rust) with 500+ code examples. Use for remote sensing workflows, GIS analysis, spatial ML, Earth observation data processing, terrain analysis, hydrological modeling, marine spatial analysis, atmospheric science, and any geospatial computation task.

Usage: 0
Performance: 1.0

geopandas

geospatial

Python library for working with geospatial vector data including shapefiles, GeoJSON, and GeoPackage files. Use when working with geographic data for spatial analysis, geometric operations, coordinate transformations, spatial joins, overlay operations, choropleth mapping, or any task involving reading/writing/analyzing vector geographic data. Supports PostGIS databases, interactive maps, and integration with matplotlib/folium/cartopy. Use for tasks like buffer analysis, spatial joins between datasets, dissolving boundaries, clipping data, calculating areas/distances, reprojecting coordinate systems, creating maps, or converting between spatial file formats.

Usage: 0
Performance: 1.0

get-available-resources

infrastructure

This skill should be used at the start of any computationally intensive scientific task to detect and report available system resources (CPU cores, GPUs, memory, disk space). It creates a JSON file with resource information and strategic recommendations that inform computational approach decisions such as whether to use parallel processing (joblib, multiprocessing), out-of-core computing (Dask, Zarr), GPU acceleration (PyTorch, JAX), or memory-efficient strategies. Use this skill before running analyses, training models, processing large datasets, or any task where resource constraints matter.

Usage: 0
Performance: 1.0

gget

bioinformatics

Fast CLI/Python queries to 20+ bioinformatics databases. Use for quick lookups: gene info, BLAST searches, AlphaFold structures, enrichment analysis. Best for interactive exploration, simple queries. For batch processing or advanced BLAST use biopython; for multi-database Python workflows use bioservices.

Usage: 0
Performance: 1.0

ginkgo-cloud-lab

lab_automation

Submit and manage protocols on Ginkgo Bioworks Cloud Lab (cloud.ginkgo.bio), a web-based interface for autonomous lab execution on Reconfigurable Automation Carts (RACs). Use when the user wants to run cell-free protein expression (validation or optimization), generate fluorescent pixel art, or interact with Ginkgo Cloud Lab services. Covers protocol selection, input preparation, pricing, and ordering workflows.

Usage: 0
Performance: 1.0

glycoengineering

protein_engineering

Analyze and engineer protein glycosylation. Scan sequences for N-glycosylation sequons (N-X-S/T), predict O-glycosylation hotspots, and access curated glycoengineering tools (NetOGlyc, GlycoShield, GlycoWorkbench). For glycoprotein engineering, therapeutic antibody optimization, and vaccine design.

Usage: 0
Performance: 1.0

gProfiler Gene Enrichment

pathway_analysis

Functional enrichment analysis via g:Profiler (ELIXIR bioinformatics platform). Tests a gene list against GO Biological Process, GO Molecular Function, GO Cellular Component, KEGG, Reactome, WikiPathways, Human Phenotype Ontology, and TRANSFAC databases. Complements Enrichr with different statistical correction (g:SCS method), independent database versions, and broader ontology coverage including HP phenotypes and miRTarBase targets.

Usage: 0
Performance: 1.0

gtars

bioinformatics

High-performance toolkit for genomic interval analysis in Rust with Python bindings. Use when working with genomic regions, BED files, coverage tracks, overlap detection, tokenization for ML models, or fragment analysis in computational genomics and machine learning applications.

Usage: 0
Performance: 1.0

GWAS Catalog

data_retrieval

Genome-wide association study hits from the NHGRI-EBI GWAS Catalog. Query by gene or trait.

Usage: 0
Performance: 0.1

GWAS Catalog Variant Associations

genetic_associations

Query EBI GWAS Catalog for all phenotypic associations reported for a specific genetic variant (rsID). Returns distinct traits grouped by study count and best p-value. Complements gwas_genetic_associations (gene-centric): this tool lets you see the full phenotypic spectrum of a known variant. Essential for major neurodegeneration variants: APOE4=rs429358 (Alzheimer, lipids, cognition), LRRK2 G2019S=rs34637584 (Parkinson), GBA N370S=rs76763715.

Usage: 0
Performance: 1.0
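
The "distinct traits grouped by study count and best p-value" summary can be sketched as below. This is an assumption about the shape of the data, not the tool's actual implementation: the record keys (`trait`, `study_id`, `p_value`) and the function name `summarize_traits` are hypothetical, and the sample records are placeholders rather than real GWAS results.

```python
from collections import defaultdict


def summarize_traits(associations):
    """Group association records by trait.

    Each record is assumed to be a dict with hypothetical keys
    'trait', 'study_id', and 'p_value'.
    """
    studies = defaultdict(set)
    best_p = {}
    for rec in associations:
        trait = rec["trait"]
        studies[trait].add(rec["study_id"])
        best_p[trait] = min(best_p.get(trait, 1.0), rec["p_value"])

    rows = [
        {"trait": trait, "study_count": len(ids), "best_p": best_p[trait]}
        for trait, ids in studies.items()
    ]
    # Most-studied traits first; ties broken by the smallest p-value.
    rows.sort(key=lambda row: (-row["study_count"], row["best_p"]))
    return rows


if __name__ == "__main__":
    # Placeholder records for illustration only.
    records = [
        {"trait": "trait_A", "study_id": "S1", "p_value": 1e-8},
        {"trait": "trait_A", "study_id": "S2", "p_value": 1e-12},
        {"trait": "trait_B", "study_id": "S1", "p_value": 1e-30},
    ]
    for row in summarize_traits(records):
        print(row)
```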

histolab

clinical

Lightweight WSI tile extraction and preprocessing. Use for basic slide processing: tissue detection, tile extraction, and stain normalization for H&E images. Best for simple pipelines, dataset preparation, quick tile-based analysis. For advanced spatial proteomics, multiplexed imaging, or deep learning pipelines use pathml.

Usage: 0
Performance: 1.0

Human Protein Atlas

data_retrieval

Protein expression across human tissues and cell types from the Human Protein Atlas. Includes subcellular localisation.

Usage: 0
Performance: 0.1

hypogenic

multi_omics

Automated LLM-driven hypothesis generation and testing on tabular datasets. Use when you want to systematically explore hypotheses about patterns in empirical data (e.g., deception detection, content analysis). Combines literature insights with data-driven hypothesis testing. For manual hypothesis formulation use hypothesis-generation; for creative ideation use scientific-brainstorming.

Usage: 0
Performance: 1.0

hypothesis-generation

scientific_comm

Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechanisms, and design experiments to test them. Follows scientific method framework. For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic.

Usage: 0
Performance: 1.0

imaging-data-commons

clinical

Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Use for accessing large-scale radiology (CT, MR, PET) and pathology datasets for AI training or research. No authentication required. Query by metadata, visualize in browser, check licenses.

Usage: 0
Performance: 1.0

infographics

scientific_comm

Create professional infographics using Nano Banana Pro AI with smart iterative refinement. Uses Gemini 3 Pro for quality review. Integrates research-lookup and web search for accurate data. Supports 10 infographic types, 8 industry styles, and colorblind-safe palettes.

Usage: 0
Performance: 1.0

iso-13485-certification

clinical

Comprehensive toolkit for preparing ISO 13485 certification documentation for medical device Quality Management Systems. Use when users need help with ISO 13485 QMS documentation, including (1) conducting gap analysis of existing documentation, (2) creating Quality Manuals, (3) developing required procedures and work instructions, (4) preparing Medical Device Files, (5) understanding ISO 13485 requirements, or (6) identifying missing documentation for medical device certification. Also use when users mention medical device regulations, QMS certification, FDA QMSR, EU MDR, or need help with quality system documentation.

Usage: 0
Performance: 1.0

KEGG Disease Genes

disease_gene_association

Query the KEGG Disease database for curated disease entries and their causal/associated genes. Returns KEGG disease IDs, gene lists with subtypes (e.g. AD1/APP, AD17/TREM2), approved drugs, and linked pathways. Disease-centric view distinct from kegg_pathways (gene-to-pathway). Valuable for getting the official KEGG gene list for neurodegeneration diseases with clinical subtypes.

Usage: 0
Performance: 1.0

labarchive-integration

lab_automation

Electronic lab notebook API integration. Access notebooks, manage entries/attachments, back up notebooks, and integrate with Protocols.io/Jupyter/REDCap for programmatic ELN workflows.

Usage: 0
Performance: 1.0

lamindb

multi_omics

This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.

Usage: 0
Performance: 1.0

latchbio-integration

lab_automation

Latch platform for bioinformatics workflows. Build pipelines with the Latch SDK using @workflow/@task decorators, deploy serverless workflows, handle data with LatchFile/LatchDir, and integrate with Nextflow/Snakemake.

Usage: 0
Performance: 1.0

latex-posters

visualization

Create professional research posters in LaTeX using beamerposter, tikzposter, or baposter. Support for conference presentations, academic posters, and scientific communication. Includes layout design, color schemes, multi-column formats, figure integration, and poster-specific best practices for visual communication.

Usage: 0
Performance: 1.0

LIPID MAPS Lipid Search

metabolomics

Search LIPID MAPS for lipid structures, classifications, and biological roles via KEGG cross-reference. Returns LMID, formula, exact mass, main/sub class, InChIKey, HMDB/ChEBI IDs, and LIPID MAPS URL. Covers glycerophospholipids, sphingolipids, gangliosides, ceramides, oxysterols, and plasmalogens — all implicated in neurodegeneration (NPC, Alzheimer, Parkinson, ALS).

Usage: 0
Performance: 1.0

literature-review

scientific_comm

Conduct comprehensive, systematic literature reviews using multiple academic databases (PubMed, arXiv, bioRxiv, Semantic Scholar, etc.). This skill should be used when conducting systematic literature reviews, meta-analyses, research synthesis, or comprehensive literature searches across biomedical, scientific, and technical domains. Creates professionally formatted markdown documents and PDFs with verified citations in multiple citation styles (APA, Nature, Vancouver, etc.).

Usage: 0
Performance: 1.0

markdown-mermaid-writing

scientific_comm

Comprehensive markdown and Mermaid diagram writing skill. Use when creating any scientific document, report, analysis, or visualization. Establishes text-based diagrams as the default documentation standard with full style guides (markdown + mermaid), 24 diagram type references, and 9 document templates.

Usage: 0
Performance: 1.0

market-research-reports

scientific_comm

Generate comprehensive market research reports (50+ pages) in the style of top consulting firms (McKinsey, BCG, Gartner). Features professional LaTeX formatting, extensive visual generation with scientific-schematics and generate-image, deep integration with research-lookup for data gathering, and multi-framework strategic analysis including Porter Five Forces, PESTLE, SWOT, TAM/SAM/SOM, and BCG Matrix.

Usage: 0
Performance: 1.0

markitdown

data_analysis

Convert files and office documents to Markdown. Supports PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs and more.

Usage: 0
Performance: 1.0

matchms

proteomics

Spectral similarity and compound identification for metabolomics. Use for comparing mass spectra, computing similarity scores (cosine, modified cosine), and identifying unknown compounds from spectral libraries. Best for metabolite identification, spectral matching, library searching. For full LC-MS/MS proteomics pipelines use pyopenms.

Usage: 0
Performance: 1.0
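
To make the cosine score concrete, here is a simplified binned cosine similarity between two peak lists, using only the standard library. It is a sketch of the general idea and deliberately not the matchms API: matchms matches peaks within a tolerance rather than using fixed bins, and the function name `binned_cosine` and the `bin_width` parameter are assumptions for illustration.

```python
import math
from collections import defaultdict


def _to_bins(peaks, bin_width):
    """Sum intensities of (mz, intensity) peaks falling in the same m/z bin."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        bins[int(mz // bin_width)] += intensity
    return bins


def binned_cosine(spec_a, spec_b, bin_width=1.0):
    """Cosine similarity of two spectra after fixed-width m/z binning."""
    a = _to_bins(spec_a, bin_width)
    b = _to_bins(spec_b, bin_width)
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy peak lists (m/z, intensity), for illustration only.
    query = [(100.0, 1.0), (200.0, 1.0)]
    reference = [(100.0, 1.0), (300.0, 1.0)]
    print(binned_cosine(query, reference))
```

Fixed bins are easy to reason about but can split a peak that sits on a bin edge, which is one reason tolerance-based matching is generally preferred in practice.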

matlab

engineering

MATLAB and GNU Octave numerical computing for matrix operations, data analysis, visualization, and scientific computing. Use when writing MATLAB/Octave scripts for linear algebra, signal processing, image processing, differential equations, optimization, statistics, or creating scientific visualizations. Also use when the user needs help with MATLAB syntax, functions, or wants to convert between MATLAB and Python code. Scripts can be executed with MATLAB or the open-source GNU Octave interpreter.

Usage: 0
Performance: 1.0

matplotlib

visualization

Low-level plotting library for full customization. Use when you need fine-grained control over every plot element, creating novel plot types, or integrating with specific scientific workflows. Export to PNG/PDF/SVG for publication. For quick statistical plots use seaborn; for interactive plots use plotly; for publication-ready multi-panel figures with journal styling, use scientific-visualization.

Usage: 0
Performance: 1.0

medchem

general

Medicinal chemistry filters. Apply drug-likeness rules (Lipinski, Veber), PAINS filters, structural alerts, and complexity metrics for compound prioritization and library filtering.

Usage: 0
Performance: 1.0
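
As a rough illustration of the drug-likeness rules named above, the sketch below applies the commonly cited Lipinski and Veber thresholds to precomputed descriptors. It does not use the medchem library's API: the `Descriptors` class and function names are hypothetical, the descriptors are assumed to come from another tool (for example RDKit), and the example values are placeholders rather than measurements.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float      # Da
    logp: float            # octanol-water partition coefficient
    h_donors: int          # hydrogen-bond donors
    h_acceptors: int       # hydrogen-bond acceptors
    rotatable_bonds: int
    tpsa: float            # topological polar surface area, in square angstroms


def lipinski_violations(d):
    """Count violations of Lipinski's rule of five."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_lipinski(d, max_violations=1):
    """Common convention: at most one violation is tolerated."""
    return lipinski_violations(d) <= max_violations


def passes_veber(d):
    """Veber criteria: at most 10 rotatable bonds and TPSA of at most 140."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


if __name__ == "__main__":
    # Placeholder descriptor values for illustration only.
    candidate = Descriptors(350.4, 2.1, 2, 5, 4, 78.0)
    print(passes_lipinski(candidate), passes_veber(candidate))
```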

Metabolomics Workbench Search

dataset_discovery

Search the NIH Metabolomics Workbench for public metabolomics datasets. Returns study IDs, titles, species, analysis type (LC-MS, GC-MS, NMR), sample counts, and URLs. Useful for finding published CSF, brain region, or plasma metabolomics datasets for neurodegeneration diseases. Complements RNA/protein atlases with metabolite-level evidence.

Usage: 0
Performance: 1.0

MethBase Conservation

epigenetic_analysis

Query cross-species methylation conservation.

Usage: 0
Performance: 0.3

MethBase CpG Islands

epigenetic_analysis

Get CpG island information for gene regions.

Usage: 0
Performance: 0.3

MethBase Developmental Methylation

epigenetic_analysis

Query developmental stage methylation dynamics.

Usage: 0
Performance: 0.3

MethBase Differential Analysis

epigenetic_analysis

Search differential methylation between conditions

Usage: 0
Performance: 0.3

MethBase Gene Methylation

epigenetic_search

Query DNA methylation studies for specific genes

Usage: 0
Performance: 0.3

MethBase Tissue Comparison

epigenetic_analysis

Compare methylation patterns across tissues

Usage: 0
Performance: 0.3

MethBase Tissue Methylation

epigenetic_analysis

Get tissue-specific methylation patterns

Usage: 0
Performance: 0.3

modal

infrastructure

Cloud computing platform for running Python on GPUs and serverless infrastructure. Use when deploying AI/ML models, running GPU-accelerated workloads, serving web endpoints, scheduling batch jobs, or scaling Python code to the cloud. Use this skill whenever the user mentions Modal, serverless GPU compute, deploying ML models to the cloud, serving inference endpoints, running batch processing in the cloud, or needs to scale Python workloads beyond their local machine. Also use when the user wants to run code on H100s, A100s, or other cloud GPUs, or needs to create a web API for a model.

Usage: 0
Performance: 1.0

molecular-dynamics

cheminformatics

Run and analyze molecular dynamics simulations with OpenMM and MDAnalysis. Set up protein/small molecule systems, define force fields, run energy minimization and production MD, analyze trajectories (RMSD, RMSF, contact maps, free energy surfaces). For structural biology, drug binding, and biophysics.

Usage: 0
Performance: 1.0

molfeat

cheminformatics

Molecular featurization for ML (100+ featurizers). ECFP, MACCS, descriptors, pretrained models (ChemBERTa), convert SMILES to features, for QSAR and molecular ML.

Usage: 0
Performance: 1.0

Monarch Disease-Gene Associations

disease_gene_association

Query Monarch Initiative for disease-gene-phenotype associations from OMIM, ClinVar, HPO

Usage: 0
Performance: 0.8

NCBI dbSNP Variant Lookup

variant_annotation

Look up a variant in NCBI dbSNP for chromosomal position (GRCh38), reference/alternate alleles, functional class, gene context, and population allele frequencies from studies including 1000 Genomes, ALFA, and TOPMED. Complements Ensembl VEP (consequence prediction) and gnomAD (constraint) with official dbSNP registration metadata. Essential for characterizing a specific variant rsID.

Usage: 0
Performance: 1.0

NCBI GeneRIF Citations

literature_search

Retrieve NCBI Gene Reference Into Function (GeneRIF) linked publications for a gene. GeneRIF is a curated set of brief functional annotations each backed by a PubMed paper. Returns titles, journals, and URLs for the top-cited papers supporting gene function annotations — a fast path to high-quality mechanistic evidence for any neurodegeneration gene.

Usage: 0
Performance: 1.0

NCBI MeSH Term Lookup

ontology_lookup

Look up official NCBI Medical Subject Headings (MeSH) descriptors for diseases, pathways, and biological concepts. Returns the preferred descriptor name, hierarchical tree codes (e.g. C10.228.140.380.100 for Alzheimer Disease, which sits under C10.228.140.380 for Dementia), scope note definition, and entry terms (all synonyms accepted by PubMed). Essential for standardizing terminology, discovering correct PubMed search terms, and navigating the MeSH vocabulary hierarchy from broad to specific.

Usage: 0
Performance: 1.0
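
MeSH tree codes encode the hierarchy directly: each dot-separated segment adds one level, so the broader ancestors of a descriptor can be read off its code. The following is a minimal sketch of that navigation in plain Python. The function names are hypothetical and are not part of the tool's interface.

```python
from typing import List


def mesh_ancestors(tree_code: str) -> List[str]:
    """Return the broader tree codes for a MeSH tree code, broadest first."""
    parts = tree_code.split(".")
    # Every proper prefix of the dot-separated code is an ancestor node.
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def is_descendant(tree_code: str, ancestor_code: str) -> bool:
    """True if tree_code lies strictly below ancestor_code in the hierarchy."""
    return tree_code.startswith(ancestor_code + ".")


code = "C10.228.140.380"
print(mesh_ancestors(code))  # ['C10', 'C10.228', 'C10.228.140']
print(is_descendant(code, "C10.228"))  # True
```

Comparing with a trailing dot avoids false matches between sibling codes that share a textual prefix, such as `C10.22` and `C10.228`.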

NCBI SRA Search

dataset_discovery

Search the NCBI Sequence Read Archive (SRA) for public high-throughput sequencing datasets. Returns run accessions, titles, organisms, library strategy (RNA-Seq, ATAC-seq, ChIP-seq, scRNA-seq), and center names. Essential for finding publicly available neurodegeneration patient or model datasets for reanalysis.

Usage: 0
Performance: 1.0

networkx

visualization

Comprehensive toolkit for creating, analyzing, and visualizing complex networks and graphs in Python. Use when working with network/graph data structures, analyzing relationships between entities, computing graph algorithms (shortest paths, centrality, clustering), detecting communities, generating synthetic networks, or visualizing network topologies. Applicable to social networks, biological networks, transportation systems, citation networks, and any domain involving pairwise relationships.

Usage: 0
Performance: 1.0

neurokit2

neuroscience

Comprehensive biosignal processing toolkit for analyzing physiological data including ECG, EEG, EDA, RSP, PPG, EMG, and EOG signals. Use this skill when processing cardiovascular signals, brain activity, electrodermal responses, respiratory patterns, muscle activity, or eye movements. Applicable for heart rate variability analysis, event-related potentials, complexity measures, autonomic nervous system assessment, psychophysiology research, and multi-modal physiological signal integration.

Usage: 0
Performance: 1.0

neuropixels-analysis

neuroscience

Neuropixels neural recording analysis. Load SpikeGLX/OpenEphys data, preprocess, motion correction, Kilosort4 spike sorting, quality metrics, Allen/IBL curation, AI-assisted visual analysis, for Neuropixels 1.0/2.0 extracellular electrophysiology. Use when working with neural recordings, spike sorting, extracellular electrophysiology, or when the user mentions Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, or unit curation.

Usage: 0
Performance: 1.0

omero-integration

lab_automation

Microscopy data management platform. Access images via Python, retrieve datasets, analyze pixels, manage ROIs/annotations, batch processing, for high-content screening and microscopy workflows.

Usage: 0
Performance: 1.0

OmniPath PTM Interactions

protein_interaction

Query OmniPath for post-translational modification (PTM) interactions integrating PhosphoSitePlus, PhosphoELM, SIGNOR, DEPOD, dbPTM, HPRD and 30+ other sources. Returns enzymes modifying the query gene AND substrates the gene modifies — phosphorylation, ubiquitination, acetylation, SUMOylation, methylation. Critical for tau kinase/phosphatase networks, alpha-synuclein ubiquitination (PINK1/Parkin), LRRK2 substrates, TDP-43/FUS in ALS.

Usage: 0
Performance: 1.0

openFDA Adverse Events

clinical_pharmacology

Query FDA Adverse Event Reporting System (FAERS) via openFDA API for drug safety signals. Returns ranked adverse reactions with report counts and percentages from post-market surveillance. Useful for identifying neurological adverse events, CNS toxicity, ARIA risk, and safety liabilities for drugs being repurposed in neurodegeneration (donepezil, memantine, lecanemab, aducanumab, levodopa, riluzole).

Usage: 0
Performance: 1.0
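
A ranked list of reactions with counts and percentages can be derived from an openFDA count query on the drug event endpoint. The following is a minimal sketch, assuming the tool searches on `patient.drug.medicinalproduct` and counts `patient.reaction.reactionmeddrapt.exact`; the function names are hypothetical, and the percentage here is an assumed definition (share of the summed counts across the returned terms). The sample rows are placeholders, not real FAERS figures.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlencode

OPENFDA_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_reaction_count_url(drug_name: str, limit: int = 10) -> str:
    """Build a count query for the most frequently reported reaction terms."""
    params = {
        "search": f'patient.drug.medicinalproduct:"{drug_name}"',
        "count": "patient.reaction.reactionmeddrapt.exact",
        "limit": limit,
    }
    return f"{OPENFDA_EVENT_URL}?{urlencode(params)}"


def rank_reactions(results: List[Dict]) -> List[Tuple[str, int, float]]:
    """Sort {'term', 'count'} rows by count and attach a percentage share."""
    total = sum(row["count"] for row in results)
    if total == 0:
        return []
    ranked = sorted(results, key=lambda row: row["count"], reverse=True)
    return [
        (row["term"], row["count"], round(100 * row["count"] / total, 1))
        for row in ranked
    ]


# Placeholder rows in the shape of the API's "results" list; not real data.
sample = [{"term": "REACTION_A", "count": 30}, {"term": "REACTION_B", "count": 70}]
print(build_reaction_count_url("donepezil"))
print(rank_reactions(sample))
```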

OpenGWAS PheWAS Associations

genetic_associations

Query the MRC IEU OpenGWAS platform for phenome-wide association study (PheWAS) data. Returns all GWAS traits where a given variant reaches genome-wide significance across 10K+ studies including UK Biobank, FinnGen, and many GWAS consortia. Complements GWAS Catalog by covering unpublished and OpenGWAS-native summary statistics. Useful for characterizing pleiotropic variants like rs429358 (APOE4).

Usage: 0
Performance: 1.0

open-notebook

scientific_comm

Self-hosted, open-source alternative to Google NotebookLM for AI-powered research and document analysis. Use when organizing research materials into notebooks, ingesting diverse content sources (PDFs, videos, audio, web pages, Office documents), generating AI-powered notes and summaries, creating multi-speaker podcasts from research, chatting with documents using context-aware AI, searching across materials with full-text and vector search, or running custom content transformations. Supports 16+ AI providers including OpenAI, Anthropic, Google, Ollama, Groq, and Mistral with complete data privacy through self-hosting.

Usage: 0
Performance: 1.0

Open Targets Disease Gene Scoring

disease_gene_association

Get top-scored target genes for a disease from Open Targets Platform, integrating 14+ evidence types (GWAS, rare genetics, expression, pathways, animal models, literature). Disease-centric: given a disease name, returns ranked gene list with overall association score and per-evidence-type breakdown. Essential for target prioritization debates. Distinct from open_targets_associations (gene→diseases).

Usage: 0
Performance: 1.0

Open Targets Evidence

data_retrieval

Disease associations and therapeutic evidence for a gene from Open Targets Platform, scored across multiple evidence sources.

Usage: 0
Performance: 0.1

Open Targets Safety Liability

drug_safety

Query Open Targets for target safety liability evidence curated from literature, clinical genetics, and animal toxicology. Returns safety events, affected tissues, inhibition effects, and supporting PMIDs. Critical for neurodegeneration drug discovery: CNS off-target effects, gene essentiality, tissue-specific toxicity, and high-risk indications for targets like LRRK2, BACE1, MAPT, CDK5, GSK3B.

Usage: 0
Performance: 1.0

opentrons-integration

lab_automation

Official Opentrons Protocol API for OT-2 and Flex robots. Use when writing protocols specifically for Opentrons hardware with full access to Protocol API v2 features. Best for production Opentrons protocols, official API compatibility. For multi-vendor automation or broader equipment control use pylabrobot.

Usage: 0
Performance: 1.0

optimize-for-gpu

infrastructure

GPU-accelerate Python code using CuPy, Numba CUDA, Warp, cuDF, cuML, cuGraph, KvikIO, cuCIM, cuxfilter, cuVS, cuSpatial, and RAFT. Use whenever the user mentions GPU/CUDA/NVIDIA acceleration, or wants to speed up NumPy, pandas, scikit-learn, scikit-image, NetworkX, GeoPandas, or Faiss workloads. Covers physics simulation, differentiable rendering, mesh ray casting, particle systems (DEM/SPH/fluids), vector/similarity search, GPUDirect Storage file IO, interactive dashboards, geospatial analysis, medical imaging, and sparse eigensolvers. Also use when you see CPU-bound Python code (loops, large arrays, ML pipelines, graph analytics, image processing) that would benefit from GPU acceleration, even if not explicitly requested.

Usage: 0
Performance: 1.0

PanglaoDB Cell Markers

cell_type_annotation

Get canonical cell type marker genes from PanglaoDB scRNA-seq database. Covers microglia, astrocytes, neurons, OPCs, DAM, oligodendrocytes.

Usage: 0
Performance: 1.0

Paperclip Search

literature_search

Search Paperclip MCP for biomedical papers (8M+ across arXiv, bioRxiv, PMC, OpenAlex, OSF). Uses hybrid BM25+embedding search with TL;DR summaries.

Usage: 0
Performance: 1.0

Paper Corpus Ingest

data_retrieval

Ingest a list of paper dicts into the local PaperCorpus cache for persistent storage. Each paper needs at least one ID (pmid, doi, or paper_id).

Usage: 0
Performance: 0.1
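
The requirement that each paper carry at least one ID can be checked before calling the ingest tool. The following is a minimal sketch of such a pre-check; the `split_ingestable` helper is hypothetical and only the three field names come from the description above.

```python
from typing import Dict, List, Tuple

ID_FIELDS = ("pmid", "doi", "paper_id")


def split_ingestable(papers: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate paper dicts that have at least one non-empty ID from those that do not."""
    valid, rejected = [], []
    for paper in papers:
        if any(paper.get(field) for field in ID_FIELDS):
            valid.append(paper)
        else:
            rejected.append(paper)
    return valid, rejected


# Placeholder records for illustration only.
papers = [
    {"title": "Example A", "pmid": "00000001"},
    {"title": "Example B", "doi": ""},
    {"title": "Example C"},
]
valid, rejected = split_ingestable(papers)
print(len(valid), len(rejected))  # 1 2
```

Treating an empty string as a missing ID is a design choice in this sketch; the tool itself may be stricter or more lenient.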

Paper Corpus Search

data_retrieval

Search across PubMed, Semantic Scholar, OpenAlex, and CrossRef with unified results and local caching. Use providers param to filter to specific sources.

Usage: 0
Performance: 0.1

Paper Corpus Session

data_retrieval

Start a stateful multi-page search session. Call again with incremented page param to fetch subsequent pages.

Usage: 0
Performance: 0.1
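
The paging pattern described above (call again with an incremented page parameter) can be wrapped in a generator. The following is a minimal sketch, assuming pages are numbered from 1 and that an empty page signals the end of results; `fetch_page` stands in for the real tool call, and `fake_fetch` is a hypothetical in-memory backend used only to make the example runnable.

```python
from typing import Callable, Iterator, List


def iterate_pages(
    fetch_page: Callable[[str, int], List[str]],
    query: str,
    max_pages: int = 5,
) -> Iterator[str]:
    """Yield results page by page until a page is empty or max_pages is reached."""
    page = 1
    while page <= max_pages:
        results = fetch_page(query, page)
        if not results:
            break
        yield from results
        page += 1


def fake_fetch(query: str, page: int, page_size: int = 2) -> List[str]:
    """Hypothetical stand-in for the session tool, backed by a fixed list."""
    corpus = [f"{query}-paper-{i}" for i in range(1, 6)]
    start = (page - 1) * page_size
    return corpus[start:start + page_size]


print(list(iterate_pages(fake_fetch, "tau")))
```

The `max_pages` cap keeps an agent from paging indefinitely through a very broad query.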

paper-lookup

scientific_comm

Search 10 academic paper databases via REST APIs for research papers, preprints, and scholarly articles. Covers PubMed, PMC (full text), bioRxiv, medRxiv, arXiv, OpenAlex, Crossref, Semantic Scholar, CORE, Unpaywall. Use when searching for papers, citations, DOI/PMID lookups, abstracts, full text, open access, preprints, citation graphs, author search, or any scholarly literature query. Triggers on mentions of any supported database or requests like "find papers on X" or "look up this DOI".

Usage: 0
Performance: 1.0

paperzilla

general

Chat with your agent about projects, recommendations, and canonical papers in Paperzilla. Use when users ask for recent project recommendations, canonical paper details, markdown-based summaries, recommendation feedback, feed export, or Atom feed URLs.

Usage: 0
Performance: 1.0

parallel-web

scientific_comm

All-in-one web toolkit powered by parallel-cli, with a strong emphasis on academic and scientific sources. Use this skill whenever the user needs to search the web, fetch/extract URL content, enrich data with web-sourced fields, or run deep research reports. Covers: web search (fast lookups, research, current info — prioritizing peer-reviewed papers, preprints, and scholarly databases), URL extraction (fetching pages, articles, academic PDFs), bulk data enrichment (adding fields to CSV/lists from the web), and deep research (exhaustive multi-source reports grounded in academic literature). Also handles setup, status checks, and result retrieval. Use this skill for ANY web-related task — even if the user doesn't mention 'parallel' or 'web' explicitly. If they want to look something up, fetch a page, enrich a dataset, investigate a topic, find academic papers, check citations, or review scientific literature, this is the skill to use.

Usage: 0
Performance: 1.0

pathml

clinical

Full-featured computational pathology toolkit. Use for advanced WSI analysis including multiplexed immunofluorescence (CODEX, Vectra), nucleus segmentation, tissue graph construction, and ML model training on pathology data. Supports 160+ slide formats. For simple tile extraction from H&E slides, histolab may be simpler.

Usage: 0
Performance: 1.0

pdf

data_analysis

Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.

Usage: 0
Performance: 1.0

peer-review

scientific_comm

Structured manuscript/grant review with checklist-based evaluation. Use when writing formal peer reviews with specific criteria: methodology assessment, statistical validity, reporting standards compliance (CONSORT/STROBE), and constructive feedback. Best for actual review writing and manuscript revision. For evaluating claims/evidence quality use scientific-critical-thinking; for quantitative scoring frameworks use scholar-evaluation.

Usage: 0
Performance: 1.0

pennylane

quantum

Hardware-agnostic quantum ML framework with automatic differentiation. Use when training quantum circuits via gradients, building hybrid quantum-classical models, or needing device portability across IBM/Google/Rigetti/IonQ. Best for variational algorithms (VQE, QAOA), quantum neural networks, and integration with PyTorch/JAX/TensorFlow. For hardware-specific optimizations use qiskit (IBM) or cirq (Google); for open quantum systems use qutip.

Usage: 0
Performance: 1.0

PGS Catalog Polygenic Risk Scores

genetic_associations

Search PGS Catalog (EMBL-EBI) for published polygenic risk score (PRS) models for a disease. Returns multi-SNP scoring models with variant counts, effect weight methods, publication DOI/year, and FTP download links. Covers 47+ Alzheimer disease PRS, 11+ Parkinson disease PRS, and hundreds of cognitive/brain trait models. Complements GWAS tools (single variants) with complete polygenic models ready for individual risk stratification. Essential for precision medicine analyses in neurodegeneration.

Usage: 0
Performance: 1.0

PharmGKB Pharmacogenomics

drug_database

Query PharmGKB for pharmacogenomics drug-gene relationships. Returns clinical annotations linking genetic variants to drug response.

Usage: 0
Performance: 1.0

phylogenetics

research_methodology

Build and analyze phylogenetic trees using MAFFT (multiple alignment), IQ-TREE 2 (maximum likelihood), and FastTree (fast NJ/ML). Visualize with ETE3 or FigTree. For evolutionary analysis, microbial genomics, viral phylodynamics, protein family analysis, and molecular clock studies.

Usage: 0
Performance: 1.0

polars

data_analysis

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.

Usage: 0
Performance: 1.0

polars-bio

bioinformatics

High-performance genomic interval operations and bioinformatics file I/O on Polars DataFrames. Overlap, nearest, merge, coverage, complement, subtract for BED/VCF/BAM/GFF intervals. Streaming, cloud-native, faster bioframe alternative.

Usage: 0
Performance: 1.0

pptx

visualization

Use this skill any time a .pptx file is involved in any way, whether as input, output, or both. This includes: creating slide decks, pitch decks, or presentations; reading, parsing, or extracting text from any .pptx file (even if the extracted content will be used elsewhere, like in an email or summary); editing, modifying, or updating existing presentations; combining or splitting slide files; working with templates, layouts, speaker notes, or comments. Trigger whenever the user mentions "deck," "slides," "presentation," or references a .pptx filename, regardless of what they plan to do with the content afterward. If a .pptx file needs to be opened, created, or touched, use this skill.

Usage: 0
Performance: 1.0

pptx-posters

visualization

Create research posters using HTML/CSS that can be exported to PDF or PPTX. Use this skill ONLY when the user explicitly requests PowerPoint/PPTX poster format. For standard research posters, use latex-posters instead. This skill provides modern web-based poster design with responsive layouts and easy visual integration.

Usage: 0
Performance: 1.0

primekg

multi_omics

Query the Precision Medicine Knowledge Graph (PrimeKG) for multiscale biological data including genes, drugs, diseases, phenotypes, and more.

Usage: 0
Performance: 1.0

protocolsio-integration

lab_automation

Integration with protocols.io API for managing scientific protocols. This skill should be used when working with protocols.io to search, create, update, or publish protocols; manage protocol steps and materials; handle discussions and comments; organize workspaces; upload and manage files; or integrate protocols.io functionality into workflows. Applicable for protocol discovery, collaborative protocol development, experiment tracking, lab protocol management, and scientific documentation.

Usage: 0
Performance: 1.0

PubChem Target BioAssays

drug_discovery

Find bioassay-confirmed active compounds against a protein target in PubChem BioAssay. Returns assay IDs linked to the gene and CIDs of active compounds. Useful for identifying chemical probes, drug leads, validated inhibitors, and tool compounds for neurodegeneration target validation (BACE1, CDK5, GSK3B, LRRK2, GBA, HDAC6, PARP1).

Usage: 0
Performance: 1.0

PubMed Search

data_retrieval

Search PubMed for papers by keyword. Returns titles, authors, journals, PMIDs.

Usage: 0
Performance: 0.1
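
A keyword search against PubMed is commonly issued through the NCBI E-utilities `esearch` endpoint. The following is a minimal sketch of building such a request, assuming (this page does not say) that the tool uses E-utilities; the `build_esearch_url` helper is hypothetical. Note that `esearch` returns PMIDs only, so titles, authors, and journals would need a follow-up `esummary` or `efetch` call.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keyword: str, max_results: int = 20) -> str:
    """Build an esearch URL that returns matching PMIDs as JSON."""
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": keyword,         # free-text query
        "retmode": "json",       # ask for a JSON response
        "retmax": max_results,   # cap the number of PMIDs returned
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_esearch_url("tau phosphorylation", max_results=5))
```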

PubTator3 Gene Annotations

literature_annotation

Extract standardized gene, disease, chemical, and variant mentions from PubMed literature using NCBI PubTator3 AI annotation. Resolves synonyms to canonical IDs (NCBI Gene IDs, MeSH disease IDs). Useful for finding papers where specific genes and diseases co-occur, building evidence chains for hypotheses, and disambiguating biomedical entity names.

Usage: 0
Performance: 1.0

pufferlib

ml_ai

High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.

Usage: 0
Performance: 1.0

pydeseq2

general

Differential gene expression analysis (Python DESeq2). Identify DE genes from bulk RNA-seq counts, Wald tests, FDR correction, volcano/MA plots, for RNA-seq analysis.

Usage: 0
Performance: 1.0

pydicom

clinical

Python library for working with DICOM (Digital Imaging and Communications in Medicine) files. Use this skill when reading, writing, or modifying medical imaging data in DICOM format, extracting pixel data from medical images (CT, MRI, X-ray, ultrasound), anonymizing DICOM files, working with DICOM metadata and tags, converting DICOM images to other formats, handling compressed DICOM data, or processing medical imaging datasets. Applies to tasks involving medical image analysis, PACS systems, radiology workflows, and healthcare imaging applications.

Usage: 0
Performance: 1.0

pyhealth

healthcare_ai

Comprehensive healthcare AI toolkit for developing, testing, and deploying machine learning models with clinical data. This skill should be used when working with electronic health records (EHR), clinical prediction tasks (mortality, readmission, drug recommendation), medical coding systems (ICD, NDC, ATC), physiological signals (EEG, ECG), healthcare datasets (MIMIC-III/IV, eICU, OMOP), or implementing deep learning models for healthcare applications (RETAIN, SafeDrug, Transformer, GNN).

Usage: 0
Performance: 1.0

pylabrobot

lab_automation

Vendor-agnostic lab automation framework. Use when controlling multiple equipment types (Hamilton, Tecan, Opentrons, plate readers, pumps) or needing unified programming across different vendors. Best for complex workflows, multi-vendor setups, simulation. For Opentrons-only protocols with official API, opentrons-integration may be simpler.

Usage: 0
Performance: 1.0

pymatgen

cheminformatics

Materials science toolkit. Crystal structures (CIF, POSCAR), phase diagrams, band structure, DOS, Materials Project integration, format conversion, for computational materials science.

Usage: 0
Performance: 1.0

pymc

ml_ai

Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.

Usage: 0
Performance: 1.0

pymoo

ml_ai

Multi-objective optimization framework. NSGA-II, NSGA-III, MOEA/D, Pareto fronts, constraint handling, benchmarks (ZDT, DTLZ), for engineering design and optimization problems.

Usage: 0
Performance: 1.0

pyopenms

proteomics

Complete mass spectrometry analysis platform. Use for proteomics workflows: feature detection, peptide identification, protein quantification, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics and comprehensive MS data processing. For simple spectral comparison and metabolite ID use matchms.

Usage: 0
Performance: 1.0

pysam

bioinformatics

Genomic file toolkit. Read/write SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences, extract regions, calculate coverage, for NGS data processing pipelines.

Usage: 0
Performance: 1.0

pytdc

general

Therapeutics Data Commons. AI-ready drug discovery datasets (ADME, toxicity, DTI), benchmarks, scaffold splits, molecular oracles, for therapeutic ML and pharmacological prediction.

Usage: 0
Performance: 1.0

pytorch-lightning

ml_ai

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

Usage: 0
Performance: 1.0

pyzotero

general

Interact with Zotero reference management libraries using the pyzotero Python client. Retrieve, create, update, and delete items, collections, tags, and attachments via the Zotero Web API v3. Use this skill when working with Zotero libraries programmatically, managing bibliographic references, exporting citations, searching library contents, uploading PDF attachments, or building research automation workflows that integrate with Zotero.

Usage: 0
Performance: 1.0

qiskit

quantum

IBM quantum computing framework. Use when targeting IBM Quantum hardware, working with Qiskit Runtime for production workloads, or needing IBM optimization tools. Best for IBM hardware execution, quantum error mitigation, and enterprise quantum computing. For Google hardware use cirq; for gradient-based quantum ML use pennylane; for open quantum system simulations use qutip.

Usage: 0
Performance: 1.0

qutip

quantum

Quantum physics simulation library for open quantum systems. Use when studying master equations, Lindblad dynamics, decoherence, quantum optics, or cavity QED. Best for physics research, open system dynamics, and educational simulations. NOT for circuit-based quantum computing—use qiskit, cirq, or pennylane for quantum algorithms and hardware execution.

Usage: 0
Performance: 1.0

rdkit

cheminformatics

Cheminformatics toolkit for fine-grained molecular control. SMILES/SDF parsing, descriptors (MW, LogP, TPSA), fingerprints, substructure search, 2D/3D generation, similarity, reactions. For standard workflows with simpler interface, use datamol (wrapper around RDKit). Use rdkit for advanced control, custom sanitization, specialized algorithms.

Usage: 0
Performance: 1.0

Reactome Pathways

data_retrieval

Look up biological pathways a gene participates in, from Reactome.

Usage: 0
Performance: 0.1

Reactome Pathway Search

pathway_analysis

Search Reactome for pathways by concept name and return constituent genes for each pathway. Complements the gene→pathway direction by going pathway-name→genes. Essential for building mechanistic gene sets for neurodegeneration analyses: mitophagy, tau clearance, NLRP3 inflammasome, amyloid processing, autophagy, endosomal sorting, complement activation, mTOR signaling.

Usage: 0
Performance: 1.0

research-grants

scientific_comm

Write competitive research proposals for NSF, NIH, DOE, DARPA, and Taiwan NSTC. Agency-specific formatting, review criteria, budget preparation, broader impacts, significance statements, innovation narratives, and compliance with submission requirements.

Usage: 0
Performance: 1.0

research-lookup

scientific_comm

Look up current research information using parallel-cli search (primary, fast web search), the Parallel Chat API (deep research), or Perplexity sonar-pro-search (academic paper searches). Automatically routes queries to the best backend. Use for finding papers, gathering research data, and verifying scientific information.

Usage: 0
Performance: 1.0

rowan

cheminformatics

Rowan is a cloud-native molecular modeling and medicinal-chemistry workflow platform with a Python API. Use for pKa and macropKa prediction, conformer and tautomer ensembles, docking and analogue docking, protein-ligand cofolding, MSA generation, molecular dynamics, permeability, descriptor workflows, and related small-molecule or protein modeling tasks. Ideal for programmatic batch screening, multi-step chemistry pipelines, and workflows that would otherwise require maintaining local HPC/GPU infrastructure.

Usage: 0
Performance: 1.0

scanpy

bioinformatics

Standard single-cell RNA-seq analysis pipeline. Use for QC, normalization, dimensionality reduction (PCA/UMAP/t-SNE), clustering, differential expression, and visualization. Best for exploratory scRNA-seq analysis with established workflows. For deep learning models use scvi-tools; for data format questions use anndata.

Usage: 0
Performance: 1.0

scholar-evaluation

scientific_comm

Systematically evaluate scholarly work using the ScholarEval framework, providing structured assessment across research quality dimensions including problem formulation, methodology, analysis, and writing with quantitative scoring and actionable feedback.

Usage: 0
Performance: 1.0

scientific-brainstorming

scientific_comm

Creative research ideation and exploration. Use for open-ended brainstorming sessions, exploring interdisciplinary connections, challenging assumptions, or identifying research gaps. Best for early-stage research planning when you do not have specific observations yet. For formulating testable hypotheses from data use hypothesis-generation.

Usage: 0
Performance: 1.0

scientific-critical-thinking

scientific_comm

Evaluate scientific claims and evidence quality. Use for assessing experimental design validity, identifying biases and confounders, applying evidence grading frameworks (GRADE, Cochrane Risk of Bias), or teaching critical analysis. Best for understanding evidence quality, identifying flaws. For formal peer review writing use peer-review.

Usage: 0
Performance: 1.0

scientific-schematics

visualization

Create publication-quality scientific diagrams using Nano Banana 2 AI with smart iterative refinement. Uses Gemini 3.1 Pro Preview for quality review. Only regenerates if quality is below threshold for your document type. Specialized in neural network architectures, system diagrams, flowcharts, biological pathways, and complex scientific visualizations.

Usage: 0
Performance: 1.0

scientific-slides

visualization

Build slide decks and presentations for research talks. Use this for making PowerPoint slides, conference presentations, seminar talks, research presentations, thesis defense slides, or any scientific talk. Provides slide structure, design templates, timing guidance, and visual validation. Works with PowerPoint and LaTeX Beamer.

Usage: 0
Performance: 1.0

scientific-visualization

visualization

Meta-skill for publication-ready figures. Use when creating journal submission figures requiring multi-panel layouts, significance annotations, error bars, colorblind-safe palettes, and specific journal formatting (Nature, Science, Cell). Orchestrates matplotlib/seaborn/plotly with publication styles. For quick exploration use seaborn or plotly directly.

Usage: 0
Performance: 1.0

scientific-writing

scientific_comm

Core skill for the deep research and writing tool. Write scientific manuscripts in full paragraphs (never bullet points). Use a two-stage process: (1) draft section outlines with key points using research-lookup, then (2) convert them to flowing prose. Covers IMRAD structure, citations (APA/AMA/Vancouver), figures/tables, and reporting guidelines (CONSORT/STROBE/PRISMA) for research papers and journal submissions.

Usage: 0
Performance: 1.0

scikit-bio

bioinformatics

Biological data toolkit. Sequence analysis, alignments, phylogenetic trees, diversity metrics (alpha/beta, UniFrac), ordination (PCoA), PERMANOVA, FASTA/Newick I/O, for microbiome analysis.

Usage: 0
Performance: 1.0

scikit-learn

ml_ai

Machine learning in Python with scikit-learn. Use when working with supervised learning (classification, regression), unsupervised learning (clustering, dimensionality reduction), model evaluation, hyperparameter tuning, preprocessing, or building ML pipelines. Provides comprehensive reference documentation for algorithms, preprocessing techniques, pipelines, and best practices.

Usage: 0
Performance: 1.0

scikit-survival

ml_ai

Comprehensive toolkit for survival analysis and time-to-event modeling in Python using scikit-survival. Use this skill when working with censored survival data, performing time-to-event analysis, fitting Cox models, Random Survival Forests, Gradient Boosting models, or Survival SVMs, evaluating survival predictions with concordance index or Brier score, handling competing risks, or implementing any survival analysis workflow with the scikit-survival library.

Usage: 0
Performance: 1.0

scvelo

bioinformatics

RNA velocity analysis with scVelo. Estimate cell state transitions from unspliced/spliced mRNA dynamics, infer trajectory directions, compute latent time, and identify driver genes in single-cell RNA-seq data. Complements Scanpy/scVI-tools for trajectory inference.

Usage: 0
Performance: 1.0

scvi-tools

bioinformatics

Deep generative models for single-cell omics. Use when you need probabilistic batch correction (scVI), transfer learning, differential expression with uncertainty, or multi-modal integration (TOTALVI, MultiVI). Best for advanced modeling, batch effects, multimodal data. For standard analysis pipelines use scanpy.

Usage: 0
Performance: 1.0

seaborn

visualization

Statistical visualization with pandas integration. Use for quick exploration of distributions, relationships, and categorical comparisons with attractive defaults. Best for box plots, violin plots, pair plots, heatmaps. Built on matplotlib. For interactive plots use plotly; for publication styling use scientific-visualization.

Usage: 0
Performance: 1.0

shap

ml_ai

Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model.

Usage: 0
Performance: 1.0
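
The Shapley values behind the shap entry can be computed exactly for a tiny model by enumerating every feature coalition. The sketch below does that in pure Python rather than calling the shap library; the toy model, the all-zero baseline, and every function name are assumptions. Features outside a coalition are replaced by their baseline value, which is one common convention and not the only one.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values via full coalition enumeration (cost grows as 2^n)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [p for p in range(n_features) if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def model(z):
    """Toy model with one interaction term between features 0 and 2."""
    return 2 * z[0] + 3 * z[1] + z[0] * z[2]


x = [1.0, 2.0, 3.0]         # instance to explain
baseline = [0.0, 0.0, 0.0]  # reference values for "absent" features


def value_fn(coalition):
    """Model output when only the features in the coalition take their real value."""
    z = [x[k] if k in coalition else baseline[k] for k in range(len(x))]
    return model(z)


phi = shapley_values(value_fn, len(x))
print(phi)  # approximately [3.5, 6.0, 1.5]
```

The attributions sum to model(x) - model(baseline) = 11, and the interaction term x0*x2 = 3 is split evenly between features 0 and 2.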

simpy

engineering

Process-based discrete-event simulation framework in Python. Use this skill when building simulations of systems with processes, queues, resources, and time-based events such as manufacturing systems, service operations, network traffic, logistics, or any system where entities interact with shared resources over time.

Usage: 0
Performance: 1.0
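
The event-queue mechanism underlying discrete-event simulation, as described in the simpy entry, can be sketched with only the standard library. This is not SimPy's API: it is a hand-rolled loop over a time-ordered heap of arrival and departure events for a single FIFO server, and all names, arrival times, and the fixed service time are assumptions.

```python
import heapq
from collections import deque

ARRIVE, DEPART = 0, 1


def simulate_single_server(arrivals, service_time):
    """Return each customer's waiting time at one FIFO server."""
    # Event tuples are (time, sequence number, kind, customer); the sequence
    # number breaks ties so that heap ordering stays deterministic.
    events = [(t, seq, ARRIVE, seq) for seq, t in enumerate(arrivals)]
    heapq.heapify(events)
    next_seq = len(arrivals)
    waiting = deque()
    arrived_at = {}
    waits = {}
    busy = False
    while events:
        now, _, kind, customer = heapq.heappop(events)  # clock jumps to next event
        if kind == ARRIVE:
            arrived_at[customer] = now
            waiting.append(customer)
        else:
            busy = False  # the server was released by this departure
        if not busy and waiting:
            nxt = waiting.popleft()
            waits[nxt] = now - arrived_at[nxt]
            busy = True
            heapq.heappush(events, (now + service_time, next_seq, DEPART, nxt))
            next_seq += 1
    return [waits[i] for i in range(len(arrivals))]


print(simulate_single_server([0.0, 1.0, 2.0, 10.0], service_time=3.0))
# [0.0, 2.0, 4.0, 0.0]
```

The second and third customers queue behind the first, while the fourth arrives after the server has gone idle and starts immediately.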

stable-baselines3

ml_ai

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

Usage: 0
Performance: 1.0

statistical-analysis

data_analysis

Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.

Usage: 0
Performance: 1.0

statsmodels

ml_ai

Statistical models library for Python. Use when you need specific model classes (OLS, GLM, mixed models, ARIMA) with detailed diagnostics, residuals, and inference. Best for econometrics, time series, rigorous inference with coefficient tables. For guided statistical test selection with APA reporting use statistical-analysis.

Usage: 0
Performance: 1.0

STRING Functional Network

network_analysis

Query STRING DB for a multi-gene functional interaction network with per-channel evidence scores: coexpression (ascore), experimental binding/co-IP (escore), text-mining (tscore), curated database (dscore), neighborhood (nscore), and gene fusion (fscore). Unlike the basic STRING protein interactions tool, this returns the complete evidence breakdown for every pairwise link so you can distinguish mechanistic (experimental) from correlative (coexpression/text-mining) evidence. Essential for assessing gene module cohesion in AD (TREM2/TYROBP/SYK/PLCG2), PD (PINK1/PARK2/LRRK2/SNCA), and ALS/FTD (TDP-43/FUS/C9orf72) networks.

Usage: 0
Performance: 1.0
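
One way to use the per-channel breakdown from STRING Functional Network is to label each link by its strongest kind of evidence. The sketch below assumes the tool's output has already been mapped into dictionaries with descriptive channel keys. Those keys, the 0.4 cutoff, the placeholder gene names, and the scores are all hypothetical and are not taken from the tool's actual response.

```python
# Hypothetical channel names; map them from whatever keys the tool returns.
MECHANISTIC_CHANNELS = ("experimental",)
CORRELATIVE_CHANNELS = ("coexpression", "textmining")


def classify_link(link, threshold=0.4):
    """Label a pairwise link by the strongest evidence type above the threshold."""
    mech = max(link.get(ch, 0.0) for ch in MECHANISTIC_CHANNELS)
    corr = max(link.get(ch, 0.0) for ch in CORRELATIVE_CHANNELS)
    if mech >= threshold:
        return "mechanistic"
    if corr >= threshold:
        return "correlative"
    return "weak"


# Invented scores for placeholder genes, for illustration only.
links = [
    {"pair": ("GENE_A", "GENE_B"), "experimental": 0.82, "coexpression": 0.35, "textmining": 0.90},
    {"pair": ("GENE_A", "GENE_C"), "experimental": 0.05, "coexpression": 0.61, "textmining": 0.48},
    {"pair": ("GENE_B", "GENE_C"), "experimental": 0.00, "coexpression": 0.12, "textmining": 0.20},
]

labels = {link["pair"]: classify_link(link) for link in links}
for pair, label in labels.items():
    print(pair, label)
```

Checking the experimental channel first means a link with both binding evidence and strong text-mining support is still reported as mechanistic.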

STRING Protein Interactions

data_retrieval

Find physical protein-protein interactions from the STRING database. Enter 2+ gene symbols.

Usage: 0
Performance: 0.1

sympy

engineering

Use this skill when working with symbolic mathematics in Python. This skill should be used for symbolic computation tasks including solving equations algebraically, performing calculus operations (derivatives, integrals, limits), manipulating algebraic expressions, working with matrices symbolically, physics calculations, number theory problems, geometry computations, and generating executable code from mathematical expressions. Apply this skill when the user needs exact symbolic results rather than numerical approximations, or when working with mathematical formulas that contain variables and parameters.

Usage: 0
Performance: 1.0

tiledbvcf

bioinformatics

Efficient storage and retrieval of genomic variant data using TileDB. Scalable VCF/BCF ingestion, incremental sample addition, compressed storage, parallel queries, and export capabilities for population genomics.

Usage: 0
Performance: 1.0

timesfm-forecasting

ml_ai

Zero-shot time series forecasting with Google's TimesFM foundation model. Use for any univariate time series (sales, sensors, energy, vitals, weather) without training a custom model. Supports CSV/DataFrame/array inputs with point forecasts and prediction intervals. Includes a preflight system checker script to verify RAM/GPU before first use.

Usage: 0
Performance: 1.0

torchdrug

cheminformatics

PyTorch-native graph neural networks for molecules and proteins. Use when building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, retrosynthesis. For pre-trained models and diverse featurizers use deepchem; for benchmark datasets use pytdc.

Usage: 0
Performance: 1.0

torch-geometric

ml_ai

Guide for building Graph Neural Networks with PyTorch Geometric (PyG). Use this skill whenever the user asks about graph neural networks, GNNs, node classification, link prediction, graph classification, message passing networks, heterogeneous graphs, neighbor sampling, or any task involving torch_geometric / PyG. Also trigger when you see imports from torch_geometric, or the user mentions graph convolutions (GCN, GAT, GraphSAGE, GIN), graph data structures, or working with relational/network data. Even if the user just says 'graph learning' or 'geometric deep learning', use this skill.

Usage: 0
Performance: 1.0

Train Model Version

model_training

Fine-tune a model artifact on a dataset artifact in a GPU sandbox. Produces a new model version that records parent lineage, code commit SHA, and eval metrics. Use dry_run=True to validate inputs and estimate cost before launching.

Usage: 0
Performance: 1.0

transformers

ml_ai

This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.

Usage: 0
Performance: 1.0

treatment-plans

clinical

Generate concise (3-4 page), focused medical treatment plans in LaTeX/PDF format for all clinical specialties. Supports general medical treatment, rehabilitation therapy, mental health care, chronic disease management, perioperative care, and pain management. Includes SMART goal frameworks, evidence-based interventions with minimal text citations, regulatory compliance (HIPAA), and professional formatting. Prioritizes brevity and clinical actionability.

Usage: 0
Performance: 1.0

umap-learn

ml_ai

UMAP dimensionality reduction. Fast nonlinear manifold learning for 2D/3D visualization, clustering preprocessing (HDBSCAN), supervised/parametric UMAP, for high-dimensional data.

Usage: 0
Performance: 1.0

UniChem Compound Xrefs

compound_annotation

Cross-reference a compound across 40+ chemistry and pharmacology databases via EBI UniChem. Given a compound name or InChI key, returns the compound's IDs in ChEMBL, DrugBank, PubChem, BindingDB, DrugCentral, KEGG, ChEBI, and more. Essential for integrating multi-database drug evidence and resolving compound identity across sources.

Usage: 0
Performance: 1.0

UniProt Protein Info

data_retrieval

Comprehensive protein annotation from UniProt/Swiss-Prot: function, domains, subcellular location, disease associations.

Usage: 0
Performance: 0.1

usfiscaldata

database_access

Query the U.S. Treasury Fiscal Data API for federal financial data including national debt, government spending, revenue, interest rates, exchange rates, and savings bonds. Access 54 datasets and 182 data tables with no API key required. Use when working with U.S. federal fiscal data, national debt tracking (Debt to the Penny), Daily Treasury Statements, Monthly Treasury Statements, Treasury securities auctions, interest rates on Treasury securities, foreign exchange rates, savings bonds, or any U.S. government financial statistics.

Usage: 0
Performance: 1.0
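
A request to the Fiscal Data API described in the usfiscaldata entry is an HTTP GET with query parameters. The sketch below only builds the URL and does not send it. The base URL, the Debt to the Penny endpoint path, the field name, and the parameter names (fields, filter, sort, page[size]) reflect my understanding of the public API and should be checked against the official documentation; the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the Fiscal Data API; verify against the official docs.
BASE_URL = "https://api.fiscaldata.treasury.gov/services/api/fiscal_service"


def build_query_url(endpoint, fields=None, filters=None, sort=None, page_size=None):
    """Assemble a query URL; no network request is made."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if filters:
        params["filter"] = ",".join(filters)
    if sort:
        params["sort"] = sort
    if page_size:
        params["page[size]"] = str(page_size)
    # Keep the API's own delimiters readable instead of percent-encoding them.
    query = urlencode(params, safe=",:[]")
    url = f"{BASE_URL}/{endpoint.strip('/')}"
    return f"{url}?{query}" if query else url


url = build_query_url(
    "v2/accounting/od/debt_to_penny",              # assumed endpoint path
    fields=["record_date", "tot_pub_debt_out_amt"],  # assumed field names
    filters=["record_date:gte:2024-01-01"],
    sort="-record_date",
    page_size=5,
)
print(url)
```

The resulting string can be passed to any HTTP client. Because no API key is required, the URL carries no credentials.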

vaex

data_analysis

Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that do not fit in memory.

Usage: 0
Performance: 1.0

venue-templates

scientific_comm

Access comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). Use this skill when preparing journal manuscripts, conference papers, research posters, or grant proposals that need venue-specific formatting requirements and templates.

Usage: 0
Performance: 1.0

what-if-oracle

scientific_comm

Run structured What-If scenario analysis with multi-branch possibility exploration. Use this skill when the user asks speculative questions like "what if...", "what would happen if...", "what are the possibilities", "explore scenarios", "scenario analysis", "possibility space", "what could go wrong", "best case / worst case", "risk analysis", "contingency planning", "strategic options", or any question about uncertain futures. Also trigger when the user faces a fork-in-the-road decision, wants to stress-test an idea, or needs to think through consequences before committing.

Usage: 0
Performance: 1.0

xlsx

data_analysis

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path, even casually (like "the xlsx in my downloads"), and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

Usage: 0
Performance: 1.0

zarr-python

bioinformatics

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

Usage: 0
Performance: 1.0