[Forge] Deep learning model template: training pipeline, evaluation, checkpointing (done)

Create a DL model template for analyses: define architecture spec (layers, activations), training loop with checkpointing, evaluation metrics (loss, accuracy, domain-specific), and artifact registration on completion. Each checkpoint is a versioned model artifact. Support common frameworks (PyTorch preferred). Template handles: data loading from registered datasets, train/val split, early stopping, metric logging. Depends on: frg-mb-01-ANLX.

Completion Notes

Auto-completed by supervisor after successful deploy to main

Git Commits (12)

2026-04-25  Squash merge: orchestra/task/frg-mb-0-extend-analysis-framework-to-include-mod (1 commit)
2026-04-25  [Forge] Add model_building step type and AnalysisModelSpec
2026-04-25  [Forge] api.py model registry catalog and compare view [task:frg-mb-04-MREG]
2026-04-25  [Forge] Biophysical model template: remove retired sqlite3 import; update work log [task:frg-mb-02-BIOP]
2026-04-25  Squash merge: orchestra/task/frg-mb-0-deep-learning-model-template-training-pi (1 commit)
2026-04-25  [Forge] api.py model registry catalog and compare view [task:frg-mb-04-MREG]
2026-04-25  [Forge] Biophysical model template: remove retired sqlite3 import; update work log [task:frg-mb-02-BIOP]
2026-04-25  [Forge] Biophysical model template: remove retired sqlite3 import; update work log [task:frg-mb-02-BIOP]
2026-04-25  Squash merge: orchestra/task/frg-mb-0-deep-learning-model-template-training-pi (1 commit)
2026-04-25  Squash merge: orchestra/task/frg-mb-0-deep-learning-model-template-training-pi (1 commit)
2026-04-25  [Forge] Add deep learning model template pipeline [task:frg-mb-03-DL01]
2026-04-25  [Forge] Add deep learning model template pipeline [task:frg-mb-03-DL01]
Spec File

[Forge] Deep learning model template: training pipeline, evaluation, checkpointing

Goal

Create a reusable template for building deep learning models within SciDEX analyses. The template standardizes the training pipeline so that any analysis can define a model architecture, point it at a registered dataset, train it, and produce versioned model artifacts with full provenance.

Template Components

1. Architecture Specification

Declarative model definition (not arbitrary code — keeps it reproducible):

from dataclasses import dataclass

@dataclass
class DLModelSpec:
    architecture: str       # "mlp", "cnn", "transformer", "gnn"
    input_dim: int
    output_dim: int
    hidden_dims: list[int]  # e.g. [512, 256, 128]
    activation: str         # "relu", "gelu", "silu"
    dropout: float          # e.g. 0.1
    task_type: str          # "classification", "regression", "embedding"
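
The training pipeline below turns this declaration into a network via build_model(spec). As a point of reference, a minimal sketch of what the MLP branch of that factory could look like, assuming PyTorch; the CNN and transformer branches are omitted, and the dispatch shape is illustrative rather than the shipped code:

import torch.nn as nn

# Hypothetical sketch of build_model's MLP branch only; activation names
# mirror the DLModelSpec comments above.
_ACTIVATIONS = {"relu": nn.ReLU, "gelu": nn.GELU, "silu": nn.SiLU}

def build_model(spec: DLModelSpec) -> nn.Module:
    if spec.architecture != "mlp":
        raise NotImplementedError("this sketch covers the mlp branch only")
    layers: list[nn.Module] = []
    dims = [spec.input_dim, *spec.hidden_dims]
    for d_in, d_out in zip(dims, dims[1:]):
        # Linear -> activation -> dropout for each hidden layer.
        layers += [nn.Linear(d_in, d_out),
                   _ACTIVATIONS[spec.activation](),
                   nn.Dropout(spec.dropout)]
    layers.append(nn.Linear(dims[-1], spec.output_dim))
    return nn.Sequential(*layers)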

2. Training Pipeline

class DLTrainingPipeline:
    def __init__(self, model_spec, training_config):
        self.model = build_model(model_spec)  # PyTorch model
        self.config = training_config
    
    def load_data(self, dataset_artifact_id):
        """Load data from registered tabular dataset artifact."""
        ...
    
    def train(self, train_loader, val_loader):
        """Training loop with:
        - Configurable optimizer (Adam, AdamW, SGD)
        - Learning rate scheduling
        - Early stopping (patience-based)
        - Periodic checkpointing → versioned model artifacts
        - Metric logging per epoch
        """
        ...
    
    def evaluate(self, test_loader):
        """Compute evaluation metrics based on task_type:
        - Classification: accuracy, precision, recall, F1, AUC-ROC
        - Regression: MSE, MAE, R-squared, Spearman correlation
        - All: confusion matrix, prediction distribution
        """
        ...
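
How the loop ties early stopping to checkpointing is the core of the contract above. A minimal standalone sketch, assuming PyTorch and a caller-supplied save_checkpoint hook; the config keys (lr, patience, max_epochs) are illustrative names, not the template's actual schema:

import torch

def train_with_early_stopping(model, loss_fn, train_loader, val_loader,
                              config, save_checkpoint):
    # Patience-based early stopping on validation loss, with one
    # checkpoint (and therefore one artifact version) per epoch.
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])
    best_val, patience_left = float("inf"), config["patience"]
    for epoch in range(config["max_epochs"]):
        model.train()
        for features, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(features), targets)
            loss.backward()
            optimizer.step()
        # Mean validation loss for this epoch.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(f), t).item()
                           for f, t in val_loader) / len(val_loader)
        save_checkpoint(epoch, val_loss)  # -> versioned model artifact
        if val_loss < best_val:
            best_val, patience_left = val_loss, config["patience"]
        else:
            patience_left -= 1
            if patience_left == 0:
                break  # early stop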

3. Checkpointing as Versioning

Each checkpoint during training creates a new version of the model artifact (see the sketch after this list):
  • Checkpoint at epoch N → model artifact version N
  • Best model (by val loss) gets version_tag="best"
  • Final model gets version_tag="final"
  • Training can be resumed from any checkpoint version
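
A sketch of the save-and-register step under those rules. register_model_version is an assumed registry helper here (the spec file only confirms register_model; see frg-mb-04-MREG for the catalog side), so treat the call as illustrative:

import torch

def save_checkpoint_as_version(model, epoch, val_loss, checkpoint_dir,
                               artifact_id, best_val_loss):
    # Persist weights, then register the file as version `epoch` of the
    # model artifact; tag it "best" when validation loss improves.
    path = f"{checkpoint_dir}/epoch_{epoch:04d}.pt"
    torch.save({"epoch": epoch, "val_loss": val_loss,
                "model_state": model.state_dict()}, path)
    register_model_version(  # assumed registry API, not confirmed by the spec
        artifact_id=artifact_id,
        version=epoch,
        path=path,
        version_tag="best" if val_loss < best_val_loss else None,
    )
    return path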

4. Artifact Registration

def register_training_run(pipeline, dataset_id, analysis_id):
    """Register completed training as model artifact."""
    model_id = register_model(
        model_family="deep_learning",
        title=f"DL-{pipeline.spec.architecture}-{dataset_id}",
        metadata={
            "architecture": pipeline.spec.architecture,
            "parameter_count": count_parameters(pipeline.model),
            "training_config": pipeline.config,
            "training_metrics": pipeline.metrics_history,
            "evaluation_metrics": pipeline.eval_results,
            "framework": "pytorch",
            "checkpoint_path": pipeline.best_checkpoint_path
        },
        trained_on_dataset_id=dataset_id,
        produced_by_analysis_id=analysis_id
    )
    return model_id
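
Putting the pieces together, a hypothetical end-to-end invocation. The artifact IDs are placeholders, and load_data returning three loaders is an assumption based on the work log's train/val/test split description:

spec = DLModelSpec(architecture="mlp", input_dim=64, output_dim=2,
                   hidden_dims=[512, 256, 128], activation="relu",
                   dropout=0.1, task_type="classification")
pipeline = DLTrainingPipeline(spec, training_config={"lr": 1e-3,
                                                     "patience": 5,
                                                     "max_epochs": 50})
train_loader, val_loader, test_loader = pipeline.load_data("dataset-artifact-123")
pipeline.train(train_loader, val_loader)
eval_results = pipeline.evaluate(test_loader)
model_id = register_training_run(pipeline,
                                 dataset_id="dataset-artifact-123",
                                 analysis_id="analysis-456")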

Scope Boundaries

  • Supports MLP, CNN, and simple transformer architectures initially
  • PyTorch only (most flexible, good for scientific computing)
  • CPU training is fine for demo scale; GPU support is future work
  • No distributed training — single-machine only
  • No hyperparameter search (that's a separate future task)

Acceptance Criteria

☐ DLModelSpec supports MLP, CNN, and transformer architectures
☐ Training pipeline handles data loading from registered datasets
☐ Configurable optimizer, LR schedule, and early stopping
☐ Checkpoints saved as versioned model artifacts
☐ Best and final models tagged
☐ Evaluation metrics computed based on task type
☐ All artifacts registered with provenance links
☐ Template invocable from analysis model_building step
☐ Work log updated with timestamped entry

Dependencies

  • frg-mb-01-ANLX (model-building step type in analysis framework)

Dependents

  • None currently (future demo tasks may exercise this)

Work Log

2026-04-25 20:05 PT — Codex

  • Reviewed the task spec, related Forge specs (frg-mb-01-ANLX, frg-mb-04-MREG), and current model-artifact registry code.
  • Confirmed the task is still relevant: there is an existing biophysical template, but no deep learning model template in scidex/forge/.
  • Implementation approach:
      1. Add a reusable PyTorch-based deep learning template module under scidex/forge/ with declarative architecture specs for MLP, CNN, and transformer models.
      2. Support loading feature/target data from registered dataset artifacts or inline tabular payloads, then build train/val/test splits and data loaders.
      3. Implement training, evaluation, early stopping, and checkpoint persistence with versioned model-artifact registration for each checkpoint plus best/final tagging.
      4. Add focused pytest coverage for architecture construction, dataset loading, metrics, and checkpoint registration behavior.

2026-04-25 20:34 PT — Codex

  • Added scidex/forge/deep_learning_model_template.py and a legacy import shim for a reusable PyTorch-based deep learning template.
  • Implemented declarative MLP/CNN/transformer model specs, registered-dataset loading, train/val/test splitting, optimizer and scheduler configuration, early stopping, checkpoint persistence, checkpoint resume, evaluation metrics, and versioned model-artifact registration.
  • Added tests/test_deep_learning_model_template.py covering architecture construction plus an end-to-end registered-dataset training run with checkpoint artifact registration.
  • Verification: python3 -m py_compile scidex/forge/deep_learning_model_template.py tests/test_deep_learning_model_template.py and pytest -q tests/test_deep_learning_model_template.py (4 passed).

2026-04-25 20:52 PT — Codex

  • Tightened validation so unsupported task_type="embedding" fails fast during spec validation instead of later during training/evaluation (an illustrative guard is sketched after this entry).
  • Added pytest coverage for the unsupported embedding guard to prevent silent regressions while only classification/regression remain implemented.
  • Re-ran verification: python3 -m py_compile scidex/forge/deep_learning_model_template.py tests/test_deep_learning_model_template.py deep_learning_model_template.py and pytest -q tests/test_deep_learning_model_template.py (5 passed).
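
A minimal sketch of that guard, assuming a validate() hook on DLModelSpec (the hook name is illustrative, not necessarily the shipped one):

_SUPPORTED_TASK_TYPES = {"classification", "regression"}

def validate(self) -> None:
    # Fail fast at spec validation: "embedding" is declared in the spec
    # but not implemented yet, so reject it before any training starts.
    if self.task_type not in _SUPPORTED_TASK_TYPES:
        raise ValueError(
            f"task_type {self.task_type!r} is not supported yet; "
            f"expected one of {sorted(_SUPPORTED_TASK_TYPES)}"
        )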

2026-04-25 21:18 PT — Codex

  • Added a dedicated pytest covering run_deep_learning_model_step(...) so the template's analysis-facing entrypoint is exercised end-to-end, not just the lower-level pipeline methods (an illustrative call shape follows this entry).
  • Re-verified the task boundary against frg-mb-01-ANLX: the upstream analysis-engine dispatcher is still separate, but this task now ships the callable deep-learning template entrypoint that the model-building step can invoke once wired.
  • Verification: python3 -m py_compile scidex/forge/deep_learning_model_template.py tests/test_deep_learning_model_template.py deep_learning_model_template.py and pytest -q tests/test_deep_learning_model_template.py (6 passed).
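
For reference, a hypothetical call shape for that entrypoint; the spec pins only the function name, so every parameter name here is an assumption:

# Illustrative only: parameter names are guesses, not the shipped signature.
result = run_deep_learning_model_step(
    model_spec=spec,  # a DLModelSpec as defined in the spec file above
    training_config={"lr": 1e-3, "patience": 5, "max_epochs": 50},
    dataset_artifact_id="dataset-artifact-123",
    analysis_id="analysis-456",
)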

2026-04-25 21:06 PDT — Codex

  • Performed a fresh staleness review against current origin/main and frg-mb-01-ANLX; confirmed the deep-learning template is still absent on main and this task remains valid.
  • Re-read the implementation against current artifact-registry patterns and verified the template still fits the upstream model-building-step contract as a callable deep-learning entrypoint.
  • Verification repeated on the current worktree state: python3 -m py_compile scidex/forge/deep_learning_model_template.py tests/test_deep_learning_model_template.py deep_learning_model_template.py and pytest -q tests/test_deep_learning_model_template.py (6 passed).
