Overview
Dataset wrapper for the MOMENT (Multi-task and Multi-domain) model, supporting forecasting, imputation, anomaly detection, and classification tasks.

Class signature
Parameters
- Name of the dataset.
- Name of the datetime column.
- Path to the CSV file.
- Batch size for the DataLoader.
- Mode of operation: 'train' or 'test'.
- Train/val/test split boundaries. Defaults to 50% train, 20% val, 30% test.
- Forecast horizon length.
- Task type: 'forecasting', 'imputation', 'detection', or 'classification'.
- Column name for labels in classification. Defaults to 'label' if not provided.
- Stride for windowing. In test mode with horizon_len > 0, the stride is set to horizon_len.
- Extra options forwarded to the DataLoader.
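The stride rule above can be illustrated with a short sketch: in test mode with horizon_len > 0 the stride equals horizon_len, so successive forecast windows do not overlap. The function below is illustrative only, not part of the library's API.

```python
# Hypothetical illustration of the documented stride rule.
def window_starts(series_len, seq_len, horizon_len, stride, mode):
    # In test mode with a positive horizon, stride is forced to horizon_len,
    # giving non-overlapping forecast windows.
    if mode == "test" and horizon_len > 0:
        stride = horizon_len
    last = series_len - seq_len - horizon_len
    return list(range(0, last + 1, stride))

# 100 timesteps, 24-step input windows, 8-step horizon:
train_starts = window_starts(100, 24, 8, 1, "train")  # 69 overlapping windows
test_starts = window_starts(100, 24, 8, 1, "test")    # starts 0, 8, ..., 64
```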
Methods
__len__()
Get the total number of data samples.
int - Number of samples available for iteration.
__getitem__(index)
Get a data sample by index.
index (int): Index of the data sample.
get_data_loader()
Get a DataLoader for the dataset.
DataLoader - PyTorch DataLoader for the dataset.
_denormalize_data(data)
Denormalize the input data.
data (np.ndarray): Input data array.
np.ndarray - Denormalized data array.
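Since the wrapper scales data with StandardScaler (see Features below), denormalization reverses the zero-mean/unit-variance transform using the stored per-channel statistics. The snippet below sketches that math with plain NumPy; the names are illustrative, not the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.normal(loc=50.0, scale=5.0, size=(200, 3))  # (time, channels)

# What StandardScaler computes at load time:
mean = raw.mean(axis=0)
std = raw.std(axis=0)
scaled = (raw - mean) / std  # zero mean, unit variance per channel

def denormalize(data, mean, std):
    # Inverse of the standardization above.
    return data * std + mean

restored = denormalize(scaled, mean, std)
print(np.allclose(restored, raw))  # True
```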
Example usage
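No runnable snippet survives in this section, so the sketch below fabricates one forecasting sample by hand to show the tuple shapes a forecasting `__getitem__` call returns: (input_seq, input_mask, forecast_seq). Shapes follow the "Task-specific outputs" section; all variable names are illustrative.

```python
import numpy as np

seq_len, horizon_len, n_channels = 16, 4, 2
# A short 10-step multivariate series, shaped (n_channels, time):
series = np.arange(10 * n_channels, dtype=np.float32).reshape(n_channels, 10)

# A series shorter than seq_len is padded with zeros; the mask marks
# valid (1) vs padded (0) positions.
pad = seq_len - series.shape[1]
input_seq = np.pad(series, ((0, 0), (pad, 0)))
input_mask = np.concatenate([np.zeros(pad), np.ones(series.shape[1])])

# Placeholder forecast target of shape (n_channels, horizon_len):
forecast_seq = np.zeros((n_channels, horizon_len), dtype=np.float32)

print(input_seq.shape, input_mask.shape, forecast_seq.shape)
# (2, 16) (16,) (2, 4)
```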
Task-specific outputs
Forecasting
Returns: (input_seq, input_mask, forecast_seq)
- input_seq: Input sequence of shape (n_channels, seq_len)
- input_mask: Mask indicating valid values (1) vs padded values (0)
- forecast_seq: Target forecast sequence
Imputation
Returns: (input_seq, input_mask)
Detection
Returns: (input_seq, input_mask, labels)
- labels: Binary labels for anomaly detection
Classification
Returns: (input_seq, input_mask, labels)
- labels: Class labels for classification
Features
- Automatic data scaling using StandardScaler
- Support for multivariate time series (max 64 channels per chunk)
- Automatic padding for short sequences
- Chunking for datasets with many channels
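The last two features interact: a series with more than 64 channels is split into chunks of at most 64 channels each. A minimal sketch of that rule, with an illustrative function name:

```python
import numpy as np

def chunk_channels(data, max_channels=64):
    # Split a (n_channels, time) array into chunks of at most
    # max_channels channels each.
    n = data.shape[0]
    return [data[i:i + max_channels] for i in range(0, n, max_channels)]

wide = np.zeros((150, 512))            # 150 channels, 512 timesteps
chunks = chunk_channels(wide)
print([c.shape[0] for c in chunks])    # [64, 64, 22]
```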