Add model_building as a first-class analysis step type so that analyses can construct, train, and evaluate scientific models as part of their execution pipeline. Each model-building step produces a versioned model artifact.

{
  "step_type": "model_building",
  "model_spec": {
    "model_family": "biophysical",
    "template": "ode_system",
    "input_datasets": ["dataset-allen_brain-SEA-AD"],
    "input_parameters_from_kg": true,
    "hyperparameters": {
      "solver_method": "RK45",
      "t_span": [0, 100],
      "fitting_method": "least_squares"
    },
    "evaluation_metrics": ["rmse", "aic", "parameter_sensitivity"],
    "output_artifact_type": "model"
  }
}

- input_parameters_from_kg: query the KG for relevant parameters and rate constants
- Register the resulting model via register_model()
- Add model_building to the valid step_type enum
- Implement a ModelBuildingStep handler in the analysis execution engine

{
  "analysis_id": "SDA-2026-04-05-xxx",
  "steps": [
    {"step_type": "data_retrieval", "status": "completed", "output": "..."},
    {
      "step_type": "model_building",
      "status": "completed",
      "output": {
        "model_artifact_id": "model-biophys-microglia-v1",
        "evaluation": {"rmse": 0.023, "aic": -450},
        "parameter_sensitivity": {"k_phago": 0.85, "k_clear": 0.42}
      }
    },
    {"step_type": "visualization", "status": "completed", "output": "figure-tc-001"}
  ]
}

model_building is recognized as a valid analysis step type.

Changes made:
scidex/forge/analysis_steps.py (new file):
- StepType enum with values: data_retrieval, statistical_test, visualization, model_building
- AnalysisModelSpec dataclass with fields: model_family, template, input_datasets, input_parameters_from_kg, hyperparameters, evaluation_metrics, output_artifact_type, tests_hypothesis_id, analysis_id
- AnalysisModelSpec.validate(): validates model_family against {biophysical, deep_learning, statistical}
- AnalysisModelSpec.to_dict() / from_dict(): serialization
- StepResult dataclass: result of executing a step
- ModelBuildingStepHandler: executes model_building steps, registers models via register_model()
- execute_model_building_step(): top-level function for running model_building steps
- validate_analysis_steps(): validates a list of analysis steps

scidex/forge/__init__.py: exports all analysis_steps public API

tests/test_analysis_steps.py (19 tests):
- TestStepType: enum values and validity checks
- TestAnalysisModelSpec: validation, roundtrip, all three model families
- TestStepResult: serialization
- TestModelBuildingStepHandler: initialization, framework mapping
- TestValidateAnalysisSteps: step validation
- TestExecuteModelBuildingStep: execution flow

Integration points:
- scidex.atlas.artifact_registry.register_model() (from MODL0001 dependency)
- scidex.atlas.artifact_registry.record_processing_step() for provenance
- scidex.atlas.artifact_registry.get_artifact() to fetch input datasets

Note: KG parameter extraction (input_parameters_from_kg=true) is stubbed; the actual KG integration is left to dependent tasks (frg-mb-02-BIOP, frg-mb-03-DL01).
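
The StepType enum, the AnalysisModelSpec dataclass, and validate_analysis_steps() described under "Changes made" can be pictured with the minimal sketch below. It is not the contents of scidex/forge/analysis_steps.py. Several things are assumptions made for illustration: the field defaults, the is_valid helper, returning a list of error strings instead of raising, and checking a step's model_spec only when one is present.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional

# Model families named in the spec's validation rule.
VALID_MODEL_FAMILIES = {"biophysical", "deep_learning", "statistical"}


class StepType(str, Enum):
    DATA_RETRIEVAL = "data_retrieval"
    STATISTICAL_TEST = "statistical_test"
    VISUALIZATION = "visualization"
    MODEL_BUILDING = "model_building"

    @classmethod
    def is_valid(cls, value: Any) -> bool:
        # Hypothetical helper: True only for strings matching an enum value.
        return isinstance(value, str) and value in {m.value for m in cls}


@dataclass
class AnalysisModelSpec:
    # Field names follow the spec; the defaults are assumptions.
    model_family: str
    template: str
    input_datasets: List[str] = field(default_factory=list)
    input_parameters_from_kg: bool = False
    hyperparameters: Dict[str, Any] = field(default_factory=dict)
    evaluation_metrics: List[str] = field(default_factory=list)
    output_artifact_type: str = "model"
    tests_hypothesis_id: Optional[str] = None
    analysis_id: Optional[str] = None

    def validate(self) -> List[str]:
        # Assumption: report problems as strings rather than raising.
        errors = []
        if self.model_family not in VALID_MODEL_FAMILIES:
            errors.append(f"unknown model_family: {self.model_family!r}")
        return errors

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "AnalysisModelSpec":
        return cls(**data)


def validate_analysis_steps(steps: List[Dict[str, Any]]) -> List[str]:
    """Collect one error message per problem found in a list of steps."""
    errors = []
    for index, step in enumerate(steps):
        step_type = step.get("step_type")
        if not StepType.is_valid(step_type):
            errors.append(f"step {index}: invalid step_type {step_type!r}")
            continue
        # Assumption: a model_spec is validated only when the step carries one.
        if step_type == StepType.MODEL_BUILDING.value and "model_spec" in step:
            spec = AnalysisModelSpec.from_dict(step["model_spec"])
            errors.extend(f"step {index}: {e}" for e in spec.validate())
    return errors
```

Under these assumptions, passing the steps list from the example analysis record to validate_analysis_steps() yields an empty error list, and a step whose model_spec names a family outside the three listed yields one error.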
{
  "requirements": {
    "coding": 7,
    "reasoning": 6
  }
}