Overview
Dataset class for TimeMoE (Time Series Mixture of Experts), supporting both evaluation and fine-tuning tasks.
Class signature
Parameters
Dataset name.
Name of the datetime column.
Path to CSV file.
Batch size for DataLoader.
Mode of operation: "train" or "test".
Train/val/test split boundaries. Default split: 50% train, 20% val, 30% test.
Task type: "evaluation" or "finetune".
Stride for windowing time series data.
Historical context length.
Forecast horizon length.
Extra backend-specific options.
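To make the interplay of the stride, context length, and horizon length concrete, here is a minimal sketch of sliding-window extraction over a single channel. The helper name `sliding_windows` and its argument names are hypothetical and chosen for illustration only; the class's actual windowing logic may differ in its details.

```python
import numpy as np

def sliding_windows(series, context_len, horizon_len, stride):
    """Hypothetical helper: split a 1-D series into (context, horizon) pairs."""
    window = context_len + horizon_len
    pairs = []
    # Advance the window start by `stride` until a full window no longer fits.
    for start in range(0, len(series) - window + 1, stride):
        context = series[start : start + context_len]
        horizon = series[start + context_len : start + window]
        pairs.append((context, horizon))
    return pairs

series = np.arange(10, dtype=np.float32)
pairs = sliding_windows(series, context_len=4, horizon_len=2, stride=2)
print(len(pairs))  # number of (context, horizon) pairs extracted
```

A smaller stride yields more, heavily overlapping windows; a stride equal to the full window length yields non-overlapping ones.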
Methods
__len__()
Get the length of the dataset.
int - Number of samples available for iteration.
__getitem__(index)
Get a data sample for the given index.
index (int): Index of the data sample.
- For evaluation task: (input_seq, forecast_seq)
- For finetune task: (input_seq, forecast_seq, loss_mask)
get_data_loader()
Get a data loader for the dataset.
DataLoader - PyTorch DataLoader object for the dataset.
_denormalize_data(data)
Denormalizes the data.
data (np.ndarray): Normalized data.
np.ndarray - Denormalized data.
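As a rough illustration of what normalization and denormalization involve, the sketch below standardizes each channel with statistics computed on the training portion only, then inverts the transform. It uses plain NumPy to mimic the arithmetic of a standard scaler (subtract the mean, divide by the population standard deviation). The variable names and the `denormalize` helper are assumptions for illustration, not the class's internals.

```python
import numpy as np

# Toy multivariate series with shape (time, channels).
data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])

# Fit statistics on the training portion only (first 50%, as in the default split).
train = data[: len(data) // 2]
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0)

normalized = (data - mean) / std

def denormalize(x):
    """Hypothetical inverse transform: map normalized values back to the original scale."""
    return x * std + mean

restored = denormalize(normalized)
```

Fitting on the training portion alone keeps validation and test statistics from leaking into the scaler.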
Example usage
Evaluation task
Fine-tuning task
Task-specific outputs
Evaluation task
Returns: (input_seq, forecast_seq)
- input_seq: Historical context of shape (context_len,) per channel
- forecast_seq: Target forecast of shape (horizon_len,) per channel
Finetune task
Returns: (input_seq, forecast_seq, loss_mask)
- input_seq: Historical context of shape (context_len,) per channel
- forecast_seq: Next time step prediction of shape (1,) per channel
- loss_mask: Mask of ones with shape (context_len,)
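The difference between the two return signatures can be sketched as follows. The function `make_sample` is a hypothetical stand-in that produces the shapes described above from one channel; it is not the class's `__getitem__` implementation.

```python
import numpy as np

def make_sample(channel, start, context_len, horizon_len, task):
    """Hypothetical helper: build one sample from a single 1-D channel."""
    input_seq = channel[start : start + context_len]
    target_start = start + context_len
    if task == "evaluation":
        # Evaluation: the target covers the full forecast horizon.
        forecast_seq = channel[target_start : target_start + horizon_len]
        return input_seq, forecast_seq
    # Finetune: the target is the next step only, plus a mask of ones.
    forecast_seq = channel[target_start : target_start + 1]
    loss_mask = np.ones(context_len, dtype=np.float32)
    return input_seq, forecast_seq, loss_mask

channel = np.arange(20, dtype=np.float32)
eval_sample = make_sample(channel, 0, context_len=8, horizon_len=4, task="evaluation")
tune_sample = make_sample(channel, 0, context_len=8, horizon_len=4, task="finetune")
```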
Features
- Automatic StandardScaler normalization fitted on training data
- Per-channel processing for multivariate time series
- Automatic padding for short sequences
- Support for both zero-shot evaluation and fine-tuning
- Automatic horizon length adjustment (max 30% of data length)
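Two of the features above, the horizon cap and padding, can be sketched in a few lines. Both helpers are hypothetical: the 30% cap comes from the description above, while the padding side (left) and the padding value (zero) are assumptions made only for this illustration.

```python
import numpy as np

def cap_horizon(horizon_len, data_len):
    """Hypothetical helper: limit the horizon to at most 30% of the data length."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return min(horizon_len, data_len * 3 // 10)

def pad_left(seq, target_len, pad_value=0.0):
    """Hypothetical helper: left-pad a short 1-D sequence to target_len (assumed scheme)."""
    if len(seq) >= target_len:
        return seq
    pad = np.full(target_len - len(seq), pad_value, dtype=seq.dtype)
    return np.concatenate([pad, seq])

capped = cap_horizon(96, 200)    # requested horizon exceeds 30% of 200
unchanged = cap_horizon(10, 200)  # requested horizon is already within the cap
padded = pad_left(np.array([1.0, 2.0]), 4)
```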
Notes
- The dataset processes each channel independently
- Supports special boundary values: [-1, -1, -1] uses all data for training
- For fine-tuning, horizon is set to 1 (next step prediction)
- Data is automatically normalized using StandardScaler
- Output shape: (batch_size, seq_len), where each sample is a single channel
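The per-channel output shape can be illustrated with a small NumPy sketch: a multivariate series is treated as a collection of independent univariate series, and windows from all channels are stacked into one batch. The variable names and the non-overlapping windowing used here are assumptions for illustration only.

```python
import numpy as np

# Toy multivariate series with shape (time, channels) = (8, 3).
data = np.arange(24, dtype=np.float32).reshape(8, 3)
seq_len = 4

samples = []
for ch in range(data.shape[1]):
    channel = data[:, ch]  # each channel is handled as its own 1-D series
    for start in range(0, len(channel) - seq_len + 1, seq_len):
        samples.append(channel[start : start + seq_len])

# Stack the first four single-channel samples into a batch.
batch = np.stack(samples[:4])
print(batch.shape)  # (batch_size, seq_len)
```

Because every sample holds a single channel, the batch carries no separate channel dimension.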