Models API — CV Training, Optuna Tuning, and AutoML

The Models API manages ML model definitions and exposes the full training pipeline. A model definition records the model_type, family, and parameters. The training endpoints run time-series cross-validation with rolling or expanding window splits, Optuna hyperparameter search with a configurable trial budget, and multi-candidate AutoML evaluation that produces a ranked leaderboard. Trained artifacts are stored in S3/MinIO, and experiments are tracked in MLflow via mlflow_run_id. Model families span statistical (arima, garch), machine learning (xgboost, lightgbm, catboost, random_forest), deep learning (lstm), and ensemble methods.

Model CRUD

Create a Model Definition

POST /api/v1/models

Registers a new model definition. The model is not trained at this point — use the training endpoints to fit it against a feature dataset.

Request Body

name

string

required

Human-readable name, e.g. "XGBoost v1". Maximum 255 characters.

model_type

string

required

Algorithm identifier, e.g. "xgboost", "lstm", "arima". This is the sub-type within the family.

family

string

required

Model family. One of: "statistical", "machine_learning", "deep_learning", "ensemble".

parameters

object

Initial hyperparameters passed to the model plugin constructor, e.g. {"n_estimators": 100, "max_depth": 5}. Defaults to {}.

Response — `ModelRead`

string (UUID)

Unique model identifier.

name

string

Model name.

model_type

string

Algorithm sub-type.

family

string

Model family.

parameters

object

Stored hyperparameters.

version

integer

Optimistic-lock version counter, incremented on each update.

mlflow_run_id

string | null

MLflow run ID linked after training. null before first train.

artifact_uri

string | null

S3 URI to the serialized model artifact. null before first train.

metrics

object

Latest CV metric summary dict stored after training. Empty before first train.

created_at

string (datetime)

ISO 8601 creation timestamp.

updated_at

string (datetime)

ISO 8601 last-updated timestamp.

curl -X POST http://localhost:8000/api/v1/models \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "XGBoost v1",
    "model_type": "xgboost",
    "family": "machine_learning",
    "parameters": {"n_estimators": 100, "max_depth": 5}
  }'
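The constraints on the request body can be checked on the client before the call is made. The following is a minimal sketch in Python, not the server's validation logic: the helper name `build_model_payload` and its error messages are hypothetical, and only the 255-character limit, the four family values, and the `{}` default come from the field descriptions above.

```python
# Hypothetical client-side helper; the server remains the source of truth.
ALLOWED_FAMILIES = {"statistical", "machine_learning", "deep_learning", "ensemble"}
MAX_NAME_LENGTH = 255


def build_model_payload(name, model_type, family, parameters=None):
    """Build a request body for POST /api/v1/models, checking documented limits."""
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name must be 1 to {MAX_NAME_LENGTH} characters")
    if family not in ALLOWED_FAMILIES:
        raise ValueError(f"family must be one of {sorted(ALLOWED_FAMILIES)}")
    return {
        "name": name,
        "model_type": model_type,
        "family": family,
        # parameters defaults to an empty object, as documented
        "parameters": dict(parameters or {}),
    }


payload = build_model_payload(
    "XGBoost v1", "xgboost", "machine_learning",
    {"n_estimators": 100, "max_depth": 5},
)
```

The resulting dict can be serialized with `json.dumps` and sent as the body of the curl call shown above.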

List Model Definitions

GET /api/v1/models

Returns a paginated list of all model definitions.

Query Parameters

skip

integer

Number of records to skip. Default 0.

limit

integer

Maximum records to return. Default 100.

Response: Array of ModelRead objects.

curl "http://localhost:8000/api/v1/models?skip=0&limit=50"
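To walk the whole collection, a client can advance skip by limit until a page comes back short. The sketch below assumes that a page with fewer than limit records marks the end of the list, which this page does not state explicitly. `fetch_page` is a hypothetical callable that performs the GET request and returns the decoded array.

```python
from typing import Callable, Iterator, List


def iter_models(fetch_page: Callable[[int, int], List[dict]],
                limit: int = 100) -> Iterator[dict]:
    """Yield every model definition by paging with skip/limit."""
    skip = 0
    while True:
        page = fetch_page(skip, limit)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left.
        if len(page) < limit:
            return
        skip += limit


# Usage with an in-memory stand-in for the HTTP call:
_fake_db = [{"name": f"model-{i}"} for i in range(5)]
models = list(iter_models(lambda skip, limit: _fake_db[skip:skip + limit], limit=2))
```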

Get a Model Definition

GET /api/v1/models/{model_id}

Fetches a single model definition by UUID.

model_id

string (UUID)

required

UUID of the model to retrieve.

Response: ModelRead object. Errors: 404 Model not found.

curl http://localhost:8000/api/v1/models/7c9e6679-7425-40de-944b-e07fc1f90ae7

Update a Model Definition

PATCH /api/v1/models/{model_id}

Partially updates a model definition. All fields are optional; only supplied fields are changed.

Request Body (all optional)

name

string

Updated name.

model_type

string

Updated algorithm type.

family

string

Updated model family.

parameters

object

Replacement hyperparameter dict (full replace, not merge).

mlflow_run_id

string

Link or update the MLflow run ID.

artifact_uri

string

Override the S3 artifact URI.

metrics

object

Override the stored metrics dict.

Response: Updated ModelRead object. Errors: 404 Model not found.
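Because parameters is replaced in full rather than merged, changing a single hyperparameter requires sending the complete dict. The sketch below simulates that behavior locally to show the difference. `apply_patch` and `merged_parameters` are hypothetical helpers, not part of the API, and the version increment mirrors the description of the version field.

```python
def apply_patch(stored: dict, patch: dict) -> dict:
    """Local simulation of PATCH: each supplied top-level field replaces the stored one."""
    updated = dict(stored)
    updated.update(patch)
    updated["version"] = stored["version"] + 1  # incremented on each update
    return updated


def merged_parameters(current: dict, changes: dict) -> dict:
    """Merge on the client so the full-replace PATCH keeps untouched hyperparameters."""
    return {**current, **changes}


stored = {"name": "XGBoost v1", "version": 1,
          "parameters": {"n_estimators": 100, "max_depth": 5}}

# Sending only the changed key drops n_estimators:
naive = apply_patch(stored, {"parameters": {"max_depth": 7}})

# Merging first preserves it:
safe = apply_patch(
    stored, {"parameters": merged_parameters(stored["parameters"], {"max_depth": 7})}
)
```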

Delete a Model Definition

DELETE /api/v1/models/{model_id}

Permanently removes a model definition from the database.

Deleting a model definition does not remove the MLflow run or the S3 artifact. Remove those separately through MLflow or your object-store management tooling if you need to reclaim storage.

Response: 204 No Content.

Errors: 404 Model not found.

curl -X DELETE http://localhost:8000/api/v1/models/7c9e6679-7425-40de-944b-e07fc1f90ae7

Training

Train with Time-Series Cross-Validation

POST /api/v1/models/{model_id}/train

Assembles training data from one or more FeatureDataset records, aligns them to a forward-return target at target_horizon bars, runs time-series cross-validation, serializes the fitted model to S3, and records the MLflow run. The call is synchronous and blocks until training completes.
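The alignment of features to a forward-return target can be illustrated with a small sketch. It assumes simple returns computed from closing prices, `close[t + h] / close[t] - 1`. The platform's exact return definition and price column are not specified here, and both function names are hypothetical.

```python
def forward_returns(closes, horizon=1):
    """Return close[t + horizon] / close[t] - 1 for every t that has a future bar."""
    return [closes[t + horizon] / closes[t] - 1.0
            for t in range(len(closes) - horizon)]


def align_features_and_target(features, closes, horizon=1):
    """Pair each feature row with its forward return; the last `horizon` rows are dropped."""
    if len(features) != len(closes):
        raise ValueError("features and closes must cover the same bars")
    y = forward_returns(closes, horizon)
    X = features[:len(y)]  # rows without a future price have no label
    return X, y


X, y = align_features_and_target([[0.1], [0.2], [0.3]], [100.0, 110.0, 121.0], horizon=1)
```

With three bars and a horizon of 1, two aligned rows remain. This shrinkage is why n_train_rows in the response can be smaller than the number of bars in the requested window.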

Path Parameter

model_id

string (UUID)

required

UUID of the model definition to train.

Request Body — `TrainRequest`

dataset

DatasetSpec

required

DatasetSpec fields:

feature_ids

array of UUID

required

List of feature definition UUIDs whose latest datasets will be joined to form the feature matrix X.

symbol

string

required

Ticker symbol, e.g. "AAPL".

timeframe

string

Bar size. Default "1d".

start_date

string

required

ISO 8601 start of the training window.

end_date

string

required

ISO 8601 end of the training window.

target_horizon

integer

Number of bars ahead for the forward-return target label. Default 1.

cv

CVConfig

CVConfig fields:

method

string

Cross-validation strategy. "rolling" (fixed-size train window slides forward) or "expanding" (train window grows from the start). Default "rolling".

n_splits

integer

Number of CV folds. Default 5.

test_size

float

Fraction of the total dataset assigned to each fold’s test split. Default 0.15.

min_train_size

float

Minimum fraction of data required in the training portion of the first fold. Default 0.2.
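The difference between the two methods is easiest to see as index ranges. The sketch below is one plausible reading of the four CVConfig fields, not the platform's splitter: it assumes contiguous, non-overlapping test blocks placed at the end of the data and a rolling window equal in length to the first fold's training set. The function name `time_series_splits` is hypothetical.

```python
def time_series_splits(n_rows, method="rolling", n_splits=5,
                       test_size=0.15, min_train_size=0.2):
    """Return (train_range, test_range) pairs in chronological order."""
    if method not in ("rolling", "expanding"):
        raise ValueError("method must be 'rolling' or 'expanding'")
    test_len = int(round(n_rows * test_size))
    first_train_len = n_rows - n_splits * test_len
    if test_len < 1 or first_train_len < int(round(n_rows * min_train_size)):
        raise ValueError("not enough rows for the requested folds")
    splits = []
    for k in range(n_splits):
        test_start = first_train_len + k * test_len
        test_end = test_start + test_len
        if method == "expanding":
            train_start = 0  # window grows from the first row
        else:
            train_start = test_start - first_train_len  # fixed-size window slides
        splits.append((range(train_start, test_start), range(test_start, test_end)))
    return splits


rolling = time_series_splits(100, method="rolling")
expanding = time_series_splits(100, method="expanding")
```

Under these assumptions and the default settings, 100 rows give a first fold that trains on rows 0 to 24 and tests on rows 25 to 39. The last fold tests on rows 85 to 99, training on rows 60 to 84 (rolling) or 0 to 84 (expanding). Training rows always precede test rows, so no future data leaks into a fold.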

Response — `TrainResponse`

model_id

string (UUID)

UUID of the trained model (same as path parameter).

artifact_uri

string | null

S3 path to the serialized model artifact written after training.

cv_metrics

object

Fold-level and aggregate cross-validation metrics, including per-fold MSE, MAE, and directional accuracy, plus summary statistics (mean, std).

n_train_rows

integer

Total number of aligned rows in the assembled training frame (after feature warm-up period and target shift).

feature_columns

array of strings

Ordered list of column names in the feature matrix X, useful for reproducing the exact input schema.
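The fold-level and aggregate values in cv_metrics can be reproduced offline from predictions. The sketch below makes three assumptions that the response description leaves open: directional accuracy counts predictions whose sign matches the realized return, the standard deviation is the population form, and the `std_` key prefix is invented here. The `mean_` prefix matches the metric names used elsewhere on this page.

```python
from statistics import mean, pstdev


def fold_metrics(y_true, y_pred):
    """MSE, MAE, and sign-match directional accuracy for one fold."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    hits = sum(1 for t, p in zip(y_true, y_pred) if (t > 0) == (p > 0))
    return {
        "mse": sum(e * e for e in errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "directional_accuracy": hits / n,
    }


def summarize(folds):
    """Aggregate per-fold metrics into mean_* and std_* entries."""
    summary = {}
    for key in ("mse", "mae", "directional_accuracy"):
        values = [fold[key] for fold in folds]
        summary[f"mean_{key}"] = mean(values)
        summary[f"std_{key}"] = pstdev(values)  # assumption: population std
    return summary


folds = [fold_metrics([0.1, -0.2], [0.2, 0.1]),
         fold_metrics([0.05, 0.02], [0.04, 0.03])]
summary = summarize(folds)
```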

Errors:

404 Model not found or 404 Unknown feature id(s) — if the model or any feature ID is missing.
422 Unprocessable Entity — if fewer than 30 aligned rows remain after feature warm-up and target alignment.

curl -X POST http://localhost:8000/api/v1/models/MODEL_UUID/train \
  -H 'Content-Type: application/json' \
  -d '{
    "dataset": {
      "feature_ids": ["FEATURE_UUID_1", "FEATURE_UUID_2"],
      "symbol": "AAPL",
      "timeframe": "1d",
      "start_date": "2021-01-01T00:00:00",
      "end_date": "2024-01-01T00:00:00",
      "target_horizon": 1
    },
    "cv": {
      "method": "rolling",
      "n_splits": 5,
      "test_size": 0.15,
      "min_train_size": 0.2
    }
  }'

Train Asynchronously

POST /api/v1/models/{model_id}/train/async

Dispatches the same training job as a Celery background task and returns immediately. Poll the task status via GET /api/v1/tasks/{task_id}.

Request Body: Identical to POST /api/v1/models/{model_id}/train.

Response:

{
  "task_id": "8c78e9d3-cf2e-4a65-a9ea-19a456c2abfe",
  "status": "PENDING"
}

Errors: 404 Model not found.
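A client typically polls the task endpoint until the job reaches a terminal state. The sketch below assumes the task resource returns a JSON object with a status field that uses Celery's standard state names (SUCCESS, FAILURE, REVOKED). The response shape of GET /api/v1/tasks/{task_id} is not documented on this page. `get_status` is a hypothetical callable that performs the request.

```python
import time
from typing import Callable

# Celery's built-in terminal states; assumed to be passed through unchanged.
TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def wait_for_task(get_status: Callable[[], dict], interval_s: float = 2.0,
                  timeout_s: float = 600.0, sleep=time.sleep) -> dict:
    """Poll until the task reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = get_status()
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)


# Usage with a scripted sequence instead of a live HTTP call:
_states = iter([{"status": "PENDING"}, {"status": "STARTED"}, {"status": "SUCCESS"}])
final = wait_for_task(lambda: next(_states), interval_s=0.0)
```

Injecting `sleep` keeps the loop easy to test. In production the default `time.sleep` and a modest interval avoid hammering the API.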

Hyperparameter Tuning

Run an Optuna Tuning Study

POST /api/v1/models/tune

Runs a synchronous Optuna hyperparameter search over the specified param_space. Each trial trains the plugin with a sampled configuration and evaluates it under the same CV protocol as the training endpoint. Returns the best parameters found within n_trials.

Request Body — `TuneRequest`

plugin_key

string

required

Plugin to tune, e.g. "ml.xgboost". Does not need to be associated with an existing model definition.

dataset

DatasetSpec

required

Same DatasetSpec structure as the training endpoint.

param_space

object

required

Dictionary mapping parameter names to ParamSpec objects.

ParamSpec fields:

type

string

required

Sampling strategy: "float", "int", or "categorical".

low

number

Lower bound (inclusive) for float and int types.

high

number

Upper bound (inclusive) for float and int types.

log

boolean

If true, sample on a log scale. Applicable to float and int. Default false.

choices

array of strings

Discrete options for categorical type.

n_trials

integer

Number of Optuna trials to run. Default 30.

cv

CVConfig

Cross-validation configuration (same structure as training). Default: rolling, 5 splits, test_size 0.15.

direction

string

Optimization direction: "minimize" or "maximize". Default "minimize". Use "maximize" when the chosen metric is one where higher is better, such as directional accuracy.

metric

string

Metric to optimize. Common values: "mean_mse", "mean_mae", "directional_accuracy". Default "mean_mse".

Response — `TuneResponse`

best_params

object

Hyperparameter dict from the best-scoring trial.

best_score

float

Metric value achieved by the best trial.

n_trials

integer

Number of trials actually completed (may be less than requested if early stopping occurs).

Example param_space payload for XGBoost:

{
  "n_estimators": {"type": "int", "low": 50, "high": 500},
  "max_depth": {"type": "int", "low": 3, "high": 10},
  "learning_rate": {"type": "float", "low": 0.01, "high": 0.3, "log": true},
  "subsample": {"type": "float", "low": 0.5, "high": 1.0},
  "colsample_bytree": {"type": "float", "low": 0.5, "high": 1.0}
}
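Each ParamSpec maps naturally onto one of Optuna's `Trial.suggest_*` calls. The sketch below shows one way such a translation could look. It is not the platform's tuner, and the function name `sample_params` is hypothetical. It relies only on `suggest_float`, `suggest_int`, and `suggest_categorical`, which exist on Optuna's `Trial` object with these arguments.

```python
def sample_params(trial, param_space):
    """Draw one configuration from a param_space dict using an Optuna-style trial."""
    params = {}
    for name, spec in param_space.items():
        kind = spec["type"]
        if kind == "float":
            params[name] = trial.suggest_float(
                name, spec["low"], spec["high"], log=spec.get("log", False))
        elif kind == "int":
            params[name] = trial.suggest_int(
                name, spec["low"], spec["high"], log=spec.get("log", False))
        elif kind == "categorical":
            params[name] = trial.suggest_categorical(name, spec["choices"])
        else:
            raise ValueError(f"unsupported ParamSpec type: {kind!r}")
    return params
```

Inside an Optuna objective this would be called as `params = sample_params(trial, param_space)` before fitting the plugin and returning the CV metric. Because choices is an array of strings, a categorical value such as "64" may need casting to a number before it reaches the model. Whether the platform does this cast for you is not stated here.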

curl -X POST http://localhost:8000/api/v1/models/tune \
  -H 'Content-Type: application/json' \
  -d '{
    "plugin_key": "ml.xgboost",
    "dataset": {
      "feature_ids": ["FEATURE_UUID"],
      "symbol": "AAPL",
      "timeframe": "1d",
      "start_date": "2021-01-01T00:00:00",
      "end_date": "2024-01-01T00:00:00",
      "target_horizon": 1
    },
    "param_space": {
      "n_estimators": {"type": "int", "low": 50, "high": 500},
      "max_depth": {"type": "int", "low": 3, "high": 10},
      "learning_rate": {"type": "float", "low": 0.01, "high": 0.3, "log": true}
    },
    "n_trials": 50,
    "cv": {"method": "rolling", "n_splits": 5, "test_size": 0.15, "min_train_size": 0.2},
    "direction": "minimize",
    "metric": "mean_mse"
  }'

Run Tuning Asynchronously

POST /api/v1/models/tune/async

Dispatches the Optuna study as a Celery background task. Returns immediately with a task_id.

Request Body: Identical to POST /api/v1/models/tune.

Response:

{
  "task_id": "3f91c8b2-12de-4a3c-b55f-7e0291d3b4f1",
  "status": "PENDING"
}

AutoML

Run AutoML Leaderboard

POST /api/v1/models/automl

Evaluates a set of candidate plugin configurations under the same CV protocol and ranks them by the chosen metric. This is the fastest path to identifying the best algorithm family and coarse hyperparameter settings before fine-tuning with Optuna.

Request Body — `AutoMLRequest`

dataset

DatasetSpec

required

Same DatasetSpec structure as the training endpoint.

candidates

object

required

Mapping of plugin_key → fixed params dict. Each key–value pair is one candidate to evaluate, e.g. {"ml.xgboost": {"max_depth": 6}, "ml.lightgbm": {"num_leaves": 64}}.

cv

CVConfig

Cross-validation configuration shared across all candidates.

metric

string

Ranking metric. Default "mean_mse".

Response — `AutoMLResponse`

leaderboard

array of AutoMLCandidateResponse

AutoMLCandidateResponse fields:

plugin_key

string

The plugin evaluated.

params

object

The fixed parameters used for this candidate.

score

float

Aggregate metric value on the ranking metric.

metrics

object

Full metrics dict (MSE, MAE, directional accuracy, etc.) for this candidate.

The leaderboard is sorted by score according to the chosen metric direction (ascending for error metrics).

curl -X POST http://localhost:8000/api/v1/models/automl \
  -H 'Content-Type: application/json' \
  -d '{
    "dataset": {
      "feature_ids": ["FEATURE_UUID"],
      "symbol": "AAPL",
      "timeframe": "1d",
      "start_date": "2022-01-01T00:00:00",
      "end_date": "2024-01-01T00:00:00",
      "target_horizon": 1
    },
    "candidates": {
      "ml.xgboost": {"n_estimators": 200, "max_depth": 6},
      "ml.lightgbm": {"n_estimators": 200, "num_leaves": 64},
      "ml.catboost": {"iterations": 200, "depth": 6}
    },
    "cv": {
      "method": "rolling",
      "n_splits": 5,
      "test_size": 0.15,
      "min_train_size": 0.2
    },
    "metric": "mean_mse"
  }'

Example response:

{
  "leaderboard": [
    {
      "plugin_key": "ml.lightgbm",
      "params": {"n_estimators": 200, "num_leaves": 64},
      "score": 0.00187,
      "metrics": {"mean_mse": 0.00187, "mean_mae": 0.0312, "mean_directional_accuracy": 0.561}
    },
    {
      "plugin_key": "ml.xgboost",
      "params": {"n_estimators": 200, "max_depth": 6},
      "score": 0.00214,
      "metrics": {"mean_mse": 0.00214, "mean_mae": 0.0341, "mean_directional_accuracy": 0.548}
    },
    {
      "plugin_key": "ml.catboost",
      "params": {"iterations": 200, "depth": 6},
      "score": 0.00231,
      "metrics": {"mean_mse": 0.00231, "mean_mae": 0.0358, "mean_directional_accuracy": 0.537}
    }
  ]
}
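The ranking step amounts to a sort on the chosen metric. The sketch below reproduces the ordering of the example response from its metrics dicts. The `higher_is_better` flag is an assumption about how a non-error metric would be handled, and `rank_leaderboard` is a hypothetical name.

```python
def rank_leaderboard(candidates, metric="mean_mse", higher_is_better=False):
    """Set each candidate's score from the ranking metric and sort best-first."""
    scored = [{**c, "score": c["metrics"][metric]} for c in candidates]
    # Ascending for error metrics; descending when higher values are better.
    return sorted(scored, key=lambda c: c["score"], reverse=higher_is_better)


candidates = [
    {"plugin_key": "ml.xgboost",
     "metrics": {"mean_mse": 0.00214, "mean_directional_accuracy": 0.548}},
    {"plugin_key": "ml.catboost",
     "metrics": {"mean_mse": 0.00231, "mean_directional_accuracy": 0.537}},
    {"plugin_key": "ml.lightgbm",
     "metrics": {"mean_mse": 0.00187, "mean_directional_accuracy": 0.561}},
]

by_mse = rank_leaderboard(candidates)
by_accuracy = rank_leaderboard(
    candidates, metric="mean_directional_accuracy", higher_is_better=True)
```

With these figures both rankings place ml.lightgbm first. In general the two metrics can disagree, so it is worth checking more than one column of the metrics dict before committing to a plugin.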

Plugin Discovery

List Available Model Plugins

GET /api/v1/models/plugins/available

Returns all plugin keys registered in the Model Plugin Registry.

Response:

{
  "plugins": [
    "ml.xgboost",
    "ml.lightgbm",
    "ml.catboost",
    "ml.random_forest",
    "dl.lstm"
  ]
}

curl http://localhost:8000/api/v1/models/plugins/available
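The plugin list is useful as a pre-flight check before a tuning or AutoML request. This small sketch flags candidate keys that are not registered. How the server responds to an unknown plugin_key is not documented here, so this check is a client-side convenience only, and `unknown_plugins` is a hypothetical helper.

```python
def unknown_plugins(candidates: dict, available: list) -> list:
    """Return candidate plugin keys that are missing from the registry listing."""
    return sorted(set(candidates) - set(available))


available = ["ml.xgboost", "ml.lightgbm", "ml.catboost", "ml.random_forest", "dl.lstm"]
candidates = {"ml.xgboost": {"max_depth": 6}, "ml.extra_trees": {}}

missing = unknown_plugins(candidates, available)  # ["ml.extra_trees"]
```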

Get Default Hyperparameter Search Spaces

GET /api/v1/models/plugins/search-spaces

Returns the default Optuna search spaces for each registered plugin, as defined in the platform’s search_spaces.py. The Model Builder UI reads these to pre-populate the tuning configuration form, ensuring the frontend and tuner always use the same bounds.

These are the platform-default search spaces. You can override any parameter range or add new parameters in your TuneRequest.param_space — the tuner uses your supplied spec in full.

Response:

{
  "ml.xgboost": {
    "max_depth":        {"type": "int",   "low": 3,     "high": 10},
    "learning_rate":    {"type": "float", "low": 0.005, "high": 0.3,  "log": true},
    "n_estimators":     {"type": "int",   "low": 100,   "high": 800},
    "subsample":        {"type": "float", "low": 0.5,   "high": 1.0},
    "colsample_bytree": {"type": "float", "low": 0.5,   "high": 1.0},
    "min_child_weight": {"type": "int",   "low": 1,     "high": 10}
  },
  "ml.lightgbm": {
    "num_leaves":         {"type": "int",   "low": 16,    "high": 256, "log": true},
    "learning_rate":      {"type": "float", "low": 0.005, "high": 0.3, "log": true},
    "n_estimators":       {"type": "int",   "low": 100,   "high": 800},
    "min_child_samples":  {"type": "int",   "low": 5,     "high": 100},
    "subsample":          {"type": "float", "low": 0.5,   "high": 1.0}
  },
  "ml.catboost": {
    "depth":         {"type": "int",   "low": 3,     "high": 10},
    "learning_rate": {"type": "float", "low": 0.005, "high": 0.3,  "log": true},
    "iterations":    {"type": "int",   "low": 100,   "high": 800},
    "l2_leaf_reg":   {"type": "float", "low": 1.0,   "high": 10.0, "log": true}
  },
  "ml.random_forest": {
    "n_estimators":      {"type": "int", "low": 100, "high": 600},
    "max_depth":         {"type": "int", "low": 3,   "high": 20},
    "min_samples_leaf":  {"type": "int", "low": 1,   "high": 20}
  },
  "dl.lstm": {
    "hidden_size": {"type": "categorical", "choices": ["32", "64", "128", "256"]},
    "num_layers":  {"type": "int",   "low": 1,    "high": 3},
    "dropout":     {"type": "float", "low": 0.0,  "high": 0.5},
    "lr":          {"type": "float", "low": 1e-4, "high": 1e-2, "log": true},
    "seq_len":     {"type": "int",   "low": 10,   "high": 60},
    "epochs":      {"type": "int",   "low": 10,   "high": 50}
  }
}

curl http://localhost:8000/api/v1/models/plugins/search-spaces
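Since the tuner uses the supplied param_space in full, a client that wants to adjust only a few ranges can start from the defaults and override them before sending the request. The sketch below shows that merge. `build_param_space` and its `drop` argument are hypothetical, and the default values are copied from the ml.xgboost entry in the response above.

```python
import copy


def build_param_space(defaults: dict, overrides: dict = None, drop=()) -> dict:
    """Copy the default search space, apply overrides, and remove unwanted parameters."""
    space = copy.deepcopy(defaults)
    space.update(copy.deepcopy(overrides or {}))
    for name in drop:
        space.pop(name, None)
    return space


xgb_defaults = {
    "max_depth": {"type": "int", "low": 3, "high": 10},
    "learning_rate": {"type": "float", "low": 0.005, "high": 0.3, "log": True},
    "n_estimators": {"type": "int", "low": 100, "high": 800},
}

param_space = build_param_space(
    xgb_defaults,
    overrides={"max_depth": {"type": "int", "low": 2, "high": 6}},
    drop=["n_estimators"],
)
```

The result is a complete param_space for the TuneRequest body. Any default left out of it is not searched.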
