[Forge] Implement conda environment activation for analysis tool execution

Quest: Analysis Sandboxing | Priority: P3 | Status: open

Goal

Ensure the conda environments defined in docker/*/environment.yml are actually created and used when tools execute. Currently, engine.py maps tools to environments, but nothing guarantees that those environments exist. Verify and fix the full tool→environment→execution chain.

This should culminate in real analysis entrypoints using those environments,
not only a mapping table. The critical proof is that a representative
mechanistic analysis can relaunch into the correct runtime and persist
reproducible outputs against the live SciDEX DB.
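The tool→environment→execution chain described above can be sketched roughly as follows. This is a minimal illustration, not the code in engine.py or forge/runtime.py: the tool names, the contents of RUNTIME_MAP, the forge-base fallback, and the helper names resolve_env and build_conda_command are all assumptions made for the example. The only external interface it relies on is the `conda run -n <env> <command>` CLI form.

```python
from typing import Dict, List, Sequence

# Hypothetical tool-to-environment table. The real RUNTIME_MAP in engine.py
# may use different tool names and a different structure.
RUNTIME_MAP: Dict[str, str] = {
    "scanpy_de": "forge-singlecell",
    "pysam_pileup": "forge-genomics",
    "rdkit_descriptors": "forge-cheminformatics",
}

# Assumption: tools without an explicit mapping fall back to forge-base.
DEFAULT_ENV = "forge-base"


def resolve_env(tool: str) -> str:
    """Return the conda environment name a tool should run in."""
    return RUNTIME_MAP.get(tool, DEFAULT_ENV)


def build_conda_command(tool: str, script: str, args: Sequence[str] = ()) -> List[str]:
    """Build an argv list that runs `script` inside the tool's environment."""
    env = resolve_env(tool)
    # `conda run -n ENV CMD...` executes CMD with ENV activated.
    return ["conda", "run", "-n", env, "python", script, *args]
```

A caller would hand the resulting list to subprocess.run(cmd, check=True). Keeping command construction separate from execution makes the routing testable on machines where conda is not installed.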

Acceptance Criteria

☑ All 5 specialized conda environments created (singlecell, genomics, proteomics, cheminformatics, dataml)
☑ engine.py correctly activates the right conda env for each tool
☐ Tool execution in conda env verified with a test suite
☑ Missing packages installed in correct environments
☑ Environment creation script is idempotent and documented
☑ At least one real-data analysis runner uses the runtime path successfully
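For the idempotency criterion, one way to reason about it is to compute a plan from the environments that already exist and act only on the difference. The sketch below is an illustration in Python, not the contents of setup_conda_envs.sh. The function names and the create/update/skip vocabulary are assumptions. It relies on `conda env list --json`, which reports environment prefixes under an "envs" key.

```python
import json
import subprocess
from pathlib import Path
from typing import Dict, Iterable


def conda_env_list_json() -> str:
    """Return the raw JSON printed by `conda env list --json`."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def plan_env_actions(
    expected: Iterable[str],
    env_list_json: str,
    force_update: bool = False,
) -> Dict[str, str]:
    """Map each expected environment name to 'create', 'update', or 'skip'.

    Running the plan twice is safe: once an environment exists, the second
    run yields 'skip' for it unless force_update is set.
    """
    # Each entry in "envs" is a prefix path; its last component is the name.
    existing = {Path(p).name for p in json.loads(env_list_json)["envs"]}
    plan = {}
    for name in expected:
        if name not in existing:
            plan[name] = "create"
        elif force_update:
            plan[name] = "update"
        else:
            plan[name] = "skip"
    return plan
```

In a check-only mode, a script could print this plan and exit non-zero if any entry is "create", without modifying anything.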

Approach

  • Check which conda environments exist: conda env list ✅
  • Create missing environments from docker/*/environment.yml ✅ (all existed; setup_conda_envs.sh created for future use)
  • Test engine.py tool routing: does each tool land in the right env? ✅ (verified via engine.py list)
  • Fix any broken imports (e.g., scanpy in forge-singlecell) ✅ (scanpy 1.11.5, pysam 0.23.3, rdkit 2026.03.1 verified)
  • Write setup_conda_envs.sh for reproducible environment creation ✅
  • Add to AGENTS.md as setup requirement ✅
Dependencies

  • forge/runtime.py: already implemented by prior agents
  • engine.py: already implemented by prior agents

Dependents

  • forge/computational_analysis.py: uses run_python_script for mechanistic DE analysis
  • scripts/archive/oneoff_scripts/run_mechanistic_de_analysis.py: analysis runner via run_python_script

Work Log

  • 2026-04-20: Verified that all 6 Forge conda environments exist and are available: the 5 specialized environments (forge-singlecell, forge-genomics, forge-proteomics, forge-cheminformatics, forge-dataml) plus forge-base. Verified that engine.py correctly routes tools to environments via RUNTIME_MAP and forge.runtime. Verified that key packages (scanpy 1.11.5, pysam 0.23.3, rdkit 2026.03.1) are installed and importable. Created setup_conda_envs.sh (idempotent, with --check and --force-update modes). Added setup_conda_envs.sh to AGENTS.md under Development Standards. Remaining gap: a tool execution test suite for the conda runtime (not the HTTP API tools).
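The remaining test-suite gap could start from an import smoke check along these lines. It is only a sketch: the helper names are hypothetical, and the pairing of pysam with forge-genomics and rdkit with forge-cheminformatics is an assumption (the spec only states scanpy in forge-singlecell explicitly). The runner is injectable so the logic can be exercised without conda installed.

```python
import subprocess
from typing import Callable, List, Optional

# (environment, module) pairs to probe. Only the scanpy pairing comes from
# the spec; the other two are assumed for illustration.
IMPORT_CHECKS = [
    ("forge-singlecell", "scanpy"),
    ("forge-genomics", "pysam"),
    ("forge-cheminformatics", "rdkit"),
]


def import_check_command(env: str, module: str) -> List[str]:
    """Build a command that imports `module` in `env` and prints its version."""
    code = f"import {module}; print({module}.__version__)"
    return ["conda", "run", "-n", env, "python", "-c", code]


def check_import(
    env: str,
    module: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Optional[str]:
    """Return the reported version string, or None if the import failed."""
    proc = run(import_check_command(env, module), capture_output=True, text=True)
    if proc.returncode != 0:
        return None
    # Take the last non-empty line in case the runner prints extra output.
    lines = [line for line in proc.stdout.splitlines() if line.strip()]
    return lines[-1].strip() if lines else None
```

A test suite could loop over IMPORT_CHECKS and assert that check_import returns a non-None value for each pair, then extend the same pattern to run a small script per tool.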

File: 5e9e7f2e18dc_forge_implement_conda_environment_activ_spec.md
Modified: 2026-04-25 23:40
Size: 2.9 KB