Overview

Dataset class compatible with TimesFM (Time Series Foundation Model) for time series forecasting.

Class signature

class TimesfmDataset(BaseDataset):
    def __init__(
        self,
        name: str = None,
        datetime_col: str = "ds",
        path: str = None,
        batchsize: int = 4,
        mode: str = "train",
        boundaries: tuple = (0, 0, 0),
        context_len: int = 128,
        horizon_len: int = 32,
        freq: str = "h",
        normalize: bool = False,
        stride: int = 10,
        **kwargs,
    )

Parameters

name
str
default:"None"
Dataset name used to locate data.
datetime_col
str
default:"ds"
Datetime column name in the CSV.
path
str
default:"None"
Path to a CSV file. If None, the default loader from BaseDataset is used.
batchsize
int
default:"4"
Batch size for dataloaders.
mode
str
default:"train"
Mode of use: "train" or "test".
boundaries
tuple
default:"(0, 0, 0)"
Train/val/test split boundaries. With the default (0, 0, 0), the data is split 50% train, 20% val, 30% test.
context_len
int
default:"128"
Historical context length.
horizon_len
int
default:"32"
Forecast horizon length.
freq
str
default:"h"
Data frequency code (e.g., "h" for hourly, "d" for daily, "m" for monthly).
normalize
bool
default:"False"
Whether to normalize input features.
stride
int
default:"10"
Stride used when creating windows from the time series.
kwargs
dict
Extra backend-specific options.
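
The context_len, horizon_len, and stride parameters together determine how many training windows a series yields. The helper below is a hypothetical illustration of that interaction, not the library's internal implementation:

```python
# Hypothetical sketch: how context_len, horizon_len, and stride
# determine the number of windows cut from one series.
def count_windows(series_len: int, context_len: int,
                  horizon_len: int, stride: int) -> int:
    window = context_len + horizon_len   # total points per window
    if series_len < window:
        return 0                         # series too short for one window
    return (series_len - window) // stride + 1

# With the defaults (context_len=128, horizon_len=32, stride=10),
# a 1000-point series yields 85 windows.
print(count_windows(1000, 128, 32, 10))  # 85
```

A smaller stride produces more (overlapping) windows at the cost of more correlated training samples.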

Methods

get_data_loader()

Get a DataLoader for the dataset.
def get_data_loader(self)
Returns: DataLoader - DataLoader for the dataset.

preprocess_train_batch(data)

Preprocess a training batch.
def preprocess_train_batch(self, data: tuple)
Parameters:
  • data (tuple): Input data tuple.
Returns: dict - Preprocessed data dictionary with keys 'input_ts' and 'actual_ts'.

preprocess_eval_batch(data)

Preprocess an evaluation batch.
def preprocess_eval_batch(self, data: tuple)
Parameters:
  • data (tuple): Input data tuple.
Returns: dict - Preprocessed data dictionary with keys 'input_ts' and 'actual_ts'.

preprocess(data)

Preprocess the input data.
def preprocess(self, data: tuple)
Parameters:
  • data (tuple): Input data tuple.
Returns: dict - Preprocessed data dictionary.
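
The preprocess methods return a dict keyed by 'input_ts' (historical context) and 'actual_ts' (forecast targets). The split below is a minimal sketch of that contract under the assumption that each window is context followed by horizon; the function name and shapes are illustrative, not the library's API:

```python
import numpy as np

# Hypothetical sketch of the context/horizon split performed during
# preprocessing; key names follow the documented 'input_ts'/'actual_ts'
# contract, everything else is an assumption.
def split_window(window: np.ndarray, context_len: int, horizon_len: int) -> dict:
    assert window.shape[-1] >= context_len + horizon_len
    return {
        "input_ts": window[..., :context_len],
        "actual_ts": window[..., context_len:context_len + horizon_len],
    }

batch = np.random.rand(4, 160)  # batchsize=4, context 128 + horizon 32
out = split_window(batch, 128, 32)
print(out["input_ts"].shape, out["actual_ts"].shape)  # (4, 128) (4, 32)
```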

_denormalize_data(data)

Denormalize the input data.
def _denormalize_data(self, data: np.ndarray)
Parameters:
  • data (np.ndarray): Input data array.
Returns: np.ndarray - Denormalized data array.
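
Per the Notes below, normalization uses a StandardScaler fitted on the training data, so denormalization amounts to inverting the z-score transform. A minimal sketch, with hypothetical mean/std arguments standing in for whatever statistics the dataset stores internally:

```python
import numpy as np

# Minimal sketch of denormalization when normalize=True: invert the
# z-score transform using the training-set mean and std.
# (Passing mean/std explicitly is an assumption; the dataset keeps
# its own fitted scaler internally.)
def denormalize(data: np.ndarray, mean: float, std: float) -> np.ndarray:
    return data * std + mean

x = np.array([1.0, 2.0, 3.0])
normed = (x - x.mean()) / x.std()
restored = denormalize(normed, x.mean(), x.std())
print(np.allclose(restored, x))  # True
```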

Example usage

from samay.dataset import TimesfmDataset

dataset = TimesfmDataset(
    path="data/hourly_data.csv",
    datetime_col="timestamp",
    context_len=256,
    horizon_len=64,
    freq="h",
    normalize=True,
    mode="train"
)

loader = dataset.get_data_loader()
for batch in loader:
    input_ts = batch['input_ts']
    actual_ts = batch['actual_ts']
    # Training logic here

Notes

  • The dataset automatically caps the horizon length at 30% of the data length.
  • Supports a special boundary value: (-1, -1, -1) uses all data for training.
  • When normalize=True, a StandardScaler is fitted on the training data.
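
The horizon cap in the first note can be sketched as follows; the exact rounding behavior is an assumption:

```python
# Sketch of the documented horizon cap: horizon_len is clipped to at
# most 30% of the data length (rounding here is an assumption).
def effective_horizon(horizon_len: int, data_len: int) -> int:
    return min(horizon_len, int(0.3 * data_len))

print(effective_horizon(64, 100))   # 30 (capped)
print(effective_horizon(32, 1000))  # 32 (unchanged)
```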
