Quest: Analysis Sandboxing
Priority: P4
Status: completed
Create an abstract Executor interface that the orchestrator uses to run analyses. Implement LocalExecutor (cgroup-based, current) and stub out ContainerExecutor (Docker) and CloudExecutor (AWS Batch). This ensures the sandboxing design supports future scaling without rewriting the orchestrator.
The interface should also cover the intermediate step SciDEX needs now:
explicit named local runtimes backed by existing conda environments, with
relaunch, tempdir isolation, and execution logging. Container and cloud
executors should extend that contract rather than bypass it.
- scidex/senate/cgroup_isolation.py: cgroup-based isolation (Senate layer)
- docker Python package (for ContainerExecutor)
- boto3 AWS SDK (for CloudExecutor)
- scidex_orchestrator.py: uses Executor interface for analysis execution

The Executor interface provides a pluggable abstraction for running scientific analyses with isolation and resource controls. It allows the orchestrator to switch between execution backends (local cgroup, Docker containers, AWS Batch) without changing analysis logic.
from __future__ import annotations  # AnalysisSpec and RunStatus are defined elsewhere in executor.py
from abc import ABC, abstractmethod


class Executor(ABC):
    @abstractmethod
    def run_analysis(self, spec: AnalysisSpec) -> str:
        """Launch analysis, return run_id."""
        ...

    @abstractmethod
    def check_status(self, run_id: str) -> RunStatus:
        """Check status of a run."""
        ...

    @abstractmethod
    def kill(self, run_id: str) -> bool:
        """Terminate a running analysis."""
        ...

    @abstractmethod
    def get_logs(self, run_id: str) -> str:
        """Retrieve execution logs."""
        ...

- AnalysisSpec defines the analysis (command, runtime, resource limits, timeout)
- run_analysis() launches the analysis and returns a run_id
- check_status() polls for completion
- get_logs() fetches stdout/stderr
- kill() terminates runaway analyses

All executors support named runtimes backed by conda environments:
- python3: default Python runtime
- r-base: R statistical computing
- julia: Julia numerical computing

The runtime is specified via AnalysisSpec.runtime and the executor maps it to the appropriate environment activation.

AnalysisSpec defines resource constraints:
- memory_limit_mb: max RAM in MB
- cpu_quota: CPU time quota (fraction of a core)
- pids_limit: max number of processes
- timeout_seconds: wall-clock timeout

LocalExecutor writes execution logs to log_path (if provided).
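The runtime and resource fields described above can be sketched as a small dataclass plus a helper that turns a named runtime into a conda invocation. This is a minimal illustration, not the contents of scidex/forge/executor.py: the field names come from the text, but the types, default values, the `RUNTIME_ENVS` mapping, and the `build_command()` helper are assumptions made for the example. It also assumes each runtime name doubles as the conda environment name and that activation is done with `conda run -n <env>`.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mapping from runtime name to conda environment name.
# The runtime names come from the text; reusing each name as the
# environment name is an assumption.
RUNTIME_ENVS = {
    "python3": "python3",
    "r-base": "r-base",
    "julia": "julia",
}


@dataclass
class AnalysisSpec:
    """Illustrative spec: field names follow the text, defaults are guesses."""
    command: List[str]
    runtime: str = "python3"
    memory_limit_mb: int = 2048       # max RAM in MB
    cpu_quota: float = 1.0            # fraction of one core
    pids_limit: int = 128             # max number of processes
    timeout_seconds: int = 3600       # wall-clock timeout
    log_path: Optional[str] = None    # where execution logs are written, if set


def build_command(spec: AnalysisSpec) -> List[str]:
    """Prefix the analysis command with `conda run -n <env>` for its runtime."""
    try:
        env = RUNTIME_ENVS[spec.runtime]
    except KeyError:
        raise ValueError(f"unknown runtime: {spec.runtime!r}") from None
    return ["conda", "run", "-n", env, *spec.command]
```

With these assumptions, `build_command(AnalysisSpec(command=["Rscript", "fit.R"], runtime="r-base"))` yields a command list starting with `conda run -n r-base`. Rejecting unknown runtime names up front keeps a typo from silently falling back to a default environment.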
The executor backend is set via the EXECUTOR_TYPE environment variable:
EXECUTOR_TYPE=local # Default, cgroup-based
EXECUTOR_TYPE=container # Docker-based (requires docker package)
EXECUTOR_TYPE=cloud # AWS Batch (requires boto3)

The pluggable interface ensures no changes to orchestrator logic when migrating between phases.
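A factory along the following lines is one way the EXECUTOR_TYPE setting could be turned into an executor instance. It is a sketch under stated assumptions, not the real get_executor(): the three executor classes are empty stand-ins, the enum values are assumed to match the strings in the configuration above, and both the case-insensitive matching and the ValueError on an unknown value are choices made for this example.

```python
import os
from enum import Enum


class ExecutorType(Enum):
    # Values assumed to match the EXECUTOR_TYPE strings shown above.
    LOCAL = "local"
    CONTAINER = "container"
    CLOUD = "cloud"


class LocalExecutor:
    """Stand-in for the cgroup-based executor."""


class ContainerExecutor:
    """Stand-in for the Docker-based stub."""


class CloudExecutor:
    """Stand-in for the AWS Batch stub."""


_EXECUTORS = {
    ExecutorType.LOCAL: LocalExecutor,
    ExecutorType.CONTAINER: ContainerExecutor,
    ExecutorType.CLOUD: CloudExecutor,
}


def get_executor():
    """Instantiate the executor named by EXECUTOR_TYPE, defaulting to local."""
    raw = os.environ.get("EXECUTOR_TYPE", "local").strip().lower()
    try:
        kind = ExecutorType(raw)
    except ValueError:
        valid = ", ".join(t.value for t in ExecutorType)
        raise ValueError(
            f"EXECUTOR_TYPE must be one of: {valid}; got {raw!r}"
        ) from None
    return _EXECUTORS[kind]()
```

Because the orchestrator would only ever call get_executor() and the interface methods, switching backends in this sketch is a matter of changing one environment variable.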
- scidex/forge/executor.py with:
  - Executor base class with run_analysis(), check_status(), kill(), get_logs() methods
  - AnalysisSpec dataclass for passing run parameters
  - RunResult dataclass for returning results
  - RunStatus enum (PENDING, RUNNING, COMPLETED, FAILED, KILLED, UNKNOWN)
  - ExecutorType enum (LOCAL, CONTAINER, CLOUD)
- LocalExecutor wrapping cgroup_isolation.isolated_run(), with named runtime (conda env) support and audit logging
- ContainerExecutor stub using Docker SDK patterns (requires docker package)
- CloudExecutor stub using boto3 Batch patterns (requires boto3 package)
- get_executor() factory function reading EXECUTOR_TYPE from env
- .env.example documenting configuration options

- scidex/forge/executor.py contains full implementation: Executor ABC, LocalExecutor, ContainerExecutor, CloudExecutor, get_executor() factory
- scidex/agora/scidex_orchestrator.py imports and uses Executor interface (lines 82-88, 554-562, 661-694)
- get_executor() with EXECUTOR_TYPE env var for pluggable backends
- git ls-tree origin/main scidex/forge/executor.py → blob 6eca5773394832d7a6022752594bddfbedcf2173

{
  "requirements": {
    "coding": 8,
    "safety": 9
  }
}