Class Signature
The TimeMoEModel class implements Time-MoE, a time series forecasting model that uses a Mixture of Experts (MoE) architecture for efficient and accurate predictions.
Initialization Parameters
Model configuration dictionary. Used when initializing a new model without pre-trained weights. Must contain TimeMoeConfig parameters.
Hugging Face model repository ID for loading pre-trained models.
Methods
finetune()
Dataset for finetuning. The dataset’s horizon_len is automatically set in the model config.
Optional keyword arguments. The current implementation uses the default hyperparameters lr=1e-4 and epochs=5.
The model is finetuned in-place and set to evaluation mode after training.
evaluate()
Dataset for evaluation.
If True, return only metrics.
When metric_only=True: Dictionary containing:
- mse: Mean Squared Error
- mae: Mean Absolute Error
- mase: Mean Absolute Scaled Error
- mape: Mean Absolute Percentage Error
- rmse: Root Mean Squared Error
- nrmse: Normalized RMSE
- smape: Symmetric Mean Absolute Percentage Error
- msis: Mean Scaled Interval Score
- nd: Normalized Deviation
When metric_only=False: Tuple of (metrics, trues, preds, histories):
- metrics: Dictionary of metrics (as above)
- trues: True values, shape (batch_size, n_channels, horizon_len)
- preds: Predicted values, shape (batch_size, n_channels, horizon_len)
- histories: Historical context, shape (batch_size, n_channels, context_len)
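To make the returned values concrete, the sketch below computes a few of the listed metrics with NumPy from arrays shaped like trues and preds. The formulas are the standard textbook definitions and are an assumption here. The library's own implementation may differ, for example in its epsilon handling or in how nrmse, mase and msis are scaled, which is why those three are left out. The helper name basic_metrics is hypothetical and not part of the TimeMoEModel API.

```python
import numpy as np


def basic_metrics(trues, preds, eps=1e-8):
    """Standard point-forecast metrics over two arrays of identical shape.

    Hypothetical helper for illustration only; not the library's code.
    """
    trues = np.asarray(trues, dtype=float)
    preds = np.asarray(preds, dtype=float)
    err = preds - trues
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    return {
        "mse": mse,
        "mae": mae,
        "rmse": float(np.sqrt(mse)),
        # eps guards against division by zero (an assumption of this sketch)
        "mape": float(np.mean(np.abs(err) / (np.abs(trues) + eps))),
        "smape": float(
            np.mean(2 * np.abs(err) / (np.abs(trues) + np.abs(preds) + eps))
        ),
    }


# Toy data shaped (batch_size, n_channels, horizon_len) = (2, 1, 3)
trues = np.array([[[1.0, 2.0, 3.0]], [[4.0, 5.0, 6.0]]])
preds = trues + 1.0  # every forecast is off by exactly one unit
metrics = basic_metrics(trues, preds)
```

Because every error in the toy data equals one, mse, mae and rmse all come out as 1.0, which makes the sketch easy to sanity-check by hand.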
plot()
Dataset for plotting.
Additional keyword arguments forwarded to visualization.
This method does not return a value. It displays visualizations.
Usage Example
Notes
- Time-MoE uses a Mixture of Experts architecture to efficiently handle diverse time series patterns
- The model uses autoregressive generation via the generate() method for inference
- During finetuning, the model’s horizon_lengths config is automatically updated to match the dataset
- Data is automatically denormalized during evaluation if the dataset was normalized
- The model reshapes predictions to (n_channels, batch, horizon) format internally
- Loss masks can be provided during finetuning to handle variable-length sequences
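The last two notes can be illustrated with a small NumPy sketch. It shows one common way a loss mask excludes padded positions from a squared-error loss, and how a batch-first array is rearranged into (n_channels, batch, horizon) order. Both are assumptions about the general technique, not the model's actual code. The function name masked_mse and the convention that 1 marks a valid position are hypothetical.

```python
import numpy as np


def masked_mse(preds, targets, loss_mask):
    """Mean squared error over positions where loss_mask is 1.

    Hypothetical helper; assumes 1 marks real data and 0 marks padding.
    """
    preds = np.asarray(preds, dtype=float)
    targets = np.asarray(targets, dtype=float)
    mask = np.asarray(loss_mask, dtype=float)
    sq_err = (preds - targets) ** 2 * mask
    # Average over valid positions only, so padding does not dilute the loss
    return float(sq_err.sum() / max(mask.sum(), 1.0))


# Two sequences padded to length 4; the second has only two real values
targets = np.array([[1.0, 2.0, 3.0, 0.0], [4.0, 5.0, 0.0, 0.0]])
preds = np.array([[1.0, 2.0, 5.0, 9.0], [4.0, 6.0, 9.0, 9.0]])
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
loss = masked_mse(preds, targets, mask)

# Rearranging (batch, n_channels, horizon) into (n_channels, batch, horizon)
batch_first = np.zeros((8, 3, 16))
channel_first = batch_first.transpose(1, 0, 2)
```

Here the large errors at padded positions are ignored: only five positions are valid, their squared errors sum to 5, and the masked loss is 1.0.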