Overview
Dataset class for Moirai, a universal time series forecasting model. Handles data preprocessing, transformation, and windowing for training and testing.
Class signature
Parameters
Dataset name.
Column containing datetimes.
Path to CSV file.
Train/val/test split boundaries. By default the data is split 80% train / 20% test, or 60% train / 20% val / 20% test if htune=True.
Historical context length.
Forecast horizon length.
Size of patches for patching mechanism.
Batch size for DataLoader.
Target frequency for resampling (e.g., "h" for hourly, "d" for daily). If None, uses inferred frequency.
Start date for data subset (format: YYYY-MM-DD).
End date for data subset (format: YYYY-MM-DD).
Resampling operation: "mean", "sum", "pad", "ffill", or "bfill".
Whether to normalize the data using StandardScaler.
Mode: "train", "val", or "test".
Hyperparameter tuning mode. If True, uses 60/20/20 split instead of 80/20.
Configuration dict with keys:
- target_dim (int): Target dimension (default: 1)
- feat_dynamic_real_dim (int): Dynamic real features dimension (default: 0)
- past_feat_dynamic_real_dim (int): Past dynamic real features dimension (default: 0)
Extra options for DataLoader (e.g., num_workers, pin_memory, persistent_workers).
Methods
__len__()
Return the number of items in the dataset.
int - Number of samples in the dataset.
__getitem__(idx)
Get a data sample by index.
idx (int): Index of the data sample.
get_dataloader()
Returns a DataLoader that iterates over the dataset in batches.
DataLoader - PyTorch DataLoader for the dataset.
_denormalize_data(data)
Reverses the normalization, mapping the data back to its original scale.
data (np.ndarray): Normalized data.
np.ndarray - Denormalized data.
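The relationship between normalization and denormalization can be illustrated with a small round trip. This is a minimal sketch, assuming mean/standard-deviation scaling equivalent to what StandardScaler computes; the helper names `fit_scaler`, `normalize`, and `denormalize` are hypothetical and are not part of the class.

```python
import numpy as np

def fit_scaler(train):
    """Compute per-column mean and population std from the training rows only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return mean, std

def normalize(data, mean, std):
    return (data - mean) / std

def denormalize(data, mean, std):
    # Inverse of normalize: undo the scaling, then undo the shift.
    return data * std + mean

train = np.array([[1.0], [2.0], [3.0], [4.0]])
test = np.array([[5.0], [6.0]])

mean, std = fit_scaler(train)  # statistics come from the training split
restored = denormalize(normalize(test, mean, std), mean, std)
```

Because the statistics are taken from the training rows, forecasts produced in normalized space can be mapped back to the original units with the same mean and standard deviation.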
Example usage
Advanced usage with resampling
Features
- Automatic frequency inference from datetime index
- Support for data resampling with multiple operations
- Forward and backward fill for missing values
- StandardScaler normalization fitted on training data
- Automatic windowing for test data
- Support for multivariate time series with dynamic features
- Patch-based processing for efficient computation
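The default split and the test windowing listed above can be pictured with plain index arithmetic. The following is a rough sketch under stated assumptions, not the class's actual code: the helper names `default_boundaries` and `window_starts` are hypothetical, and stepping windows by the horizon length is an assumption made for illustration.

```python
def default_boundaries(n_rows, htune=False):
    """Return (train_end, val_end) row indices for the default split.

    Without htune: 80% train, 20% test (no validation rows).
    With htune: 60% train, 20% val, 20% test.
    """
    train_end = n_rows * 6 // 10 if htune else n_rows * 8 // 10
    val_end = n_rows * 8 // 10  # equals train_end when htune is False
    return train_end, val_end

def window_starts(n_points, context_length, horizon_length):
    """Start offsets of windows holding context_length + horizon_length points."""
    window = context_length + horizon_length
    # Assumption: consecutive windows advance by one horizon.
    return list(range(0, n_points - window + 1, horizon_length))

n_rows = 100
train_end, val_end = default_boundaries(n_rows)
test = list(range(val_end, n_rows))  # stand-in for the test rows

starts = window_starts(len(test), context_length=8, horizon_length=4)
first_context = test[starts[0]:starts[0] + 8]
first_target = test[starts[0] + 8:starts[0] + 12]
```

Each window pairs a block of historical context with the horizon that follows it, so a forecast can be compared against the held-out target values.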
Data transformations
The dataset applies the following transformations:
- Convert target data to numpy array
- Add observed values indicator for handling missing data
- Expand dimensions if needed (for univariate series)
- Add past target, observed target, and padding indicators
- Handle dynamic real features if specified
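The first few transformations can be sketched with NumPy alone. This is an illustrative approximation rather than the dataset's real pipeline: the function name `prepare_target`, the (time, variates) layout, and the choice to fill missing entries with zeros are all assumptions.

```python
import numpy as np

def prepare_target(values):
    """Convert a series to an array, add a variate axis, and mark observed values."""
    target = np.asarray(values, dtype=np.float32)
    if target.ndim == 1:
        target = target[:, None]  # univariate: (time,) -> (time, 1)
    observed = ~np.isnan(target)  # True where a value is present
    # Assumption: missing entries are zero-filled and flagged via the mask.
    filled = np.where(observed, target, 0.0).astype(np.float32)
    return filled, observed

target, observed = prepare_target([1.0, float("nan"), 3.0])
```

Keeping the mask alongside the filled array lets downstream code distinguish a true zero from a missing value.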