The samay.utils module provides utility functions for working with time series datasets and data formats.
Dataset Management
get_gifteval_datasets
Get hierarchical and direct datasets from the GIFT-Eval benchmark.
Takes the path to the directory containing GIFT-Eval datasets.
Returns a dictionary mapping dataset paths to tuples of (frequency, size_in_MB).
Example
get_monash_datasets
Get datasets from the Monash Time Series Forecasting Archive.
Takes the path to the directory containing Monash datasets.
Returns a dictionary mapping dataset paths to tuples of (inferred_frequency, size_in_MB), sorted by file size.
Example
get_tsb_ad_datasets
Read TSB-AD (Time Series Benchmark for Anomaly Detection) datasets stored as CSV files.
Takes the path to the directory containing TSB-AD CSV files.
Returns a dictionary mapping absolute file paths to tuples of (inferred_freq_or_None, size_in_MB).
Example
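As an illustration of the return shape described here, the following is a minimal sketch of a directory scan that maps absolute CSV paths to a (step, size) tuple. It is not the library's implementation: the names `infer_step` and `scan_csv_dir` are hypothetical, and the step string (for example `3600s`) is a simple stand-in for whatever frequency inference the library performs. It assumes timestamps, when present, sit in the first column in ISO format.

```python
import csv
import os
from datetime import datetime
from typing import Optional


def infer_step(csv_path: str, max_rows: int = 50) -> Optional[str]:
    """Return a constant timestamp step such as '3600s', or None if unknown.

    Hypothetical helper: assumes ISO timestamps in the first column.
    """
    stamps = []
    with open(csv_path, newline="") as fh:
        reader = csv.reader(fh)
        next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue
            try:
                stamps.append(datetime.fromisoformat(row[0]))
            except ValueError:
                return None  # first column is not a timestamp
            if len(stamps) >= max_rows:
                break
    steps = {(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])}
    if len(steps) != 1:
        return None  # too few rows, or irregular spacing
    return f"{int(steps.pop())}s"


def scan_csv_dir(data_dir: str) -> dict:
    """Map absolute CSV paths to (step_or_None, size_in_MB)."""
    result = {}
    for name in sorted(os.listdir(data_dir)):
        if name.lower().endswith(".csv"):
            path = os.path.abspath(os.path.join(data_dir, name))
            size_mb = os.path.getsize(path) / (1024 * 1024)
            result[path] = (infer_step(path), size_mb)
    return result
```

Files without a timestamp column yield `None` in the first tuple slot, which mirrors the `inferred_freq_or_None` element in the documented return value.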
Data Conversion
ts_to_csv
Convert a .ts file (time series format) to a .csv file.
Takes the path to the input .ts file, the path to the output .csv file, and the value used to replace missing values in the .ts file.
Example
arrow_to_csv
Convert Arrow format datasets to CSV.
Takes the path to the directory containing Arrow format data and a frequency string for the time series (e.g., "1H", "1D", "1M").
Example
get_multivariate_data
Extract multivariate time series data from a DataFrame.
Takes a DataFrame containing the multivariate time series data and the name of the column containing labels.
Returns a multivariate data array of shape (num_samples, num_channels, num_timesteps) and an array of labels.
Example
Configuration Management
load_args
Load arguments from a JSON file.
Takes the path to the JSON file containing arguments.
Returns a dictionary of loaded arguments.
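The behavior described for load_args can be approximated with the standard library alone. The sketch below is a stand-in, not the library's code: `load_json_args` is a hypothetical name, and the argument keys in the round trip are made up for illustration.

```python
import json
import os
import tempfile


def load_json_args(file_path: str) -> dict:
    """Read a JSON file and return its top-level object as a dictionary."""
    with open(file_path, "r", encoding="utf-8") as fh:
        args = json.load(fh)
    if not isinstance(args, dict):
        # A list or scalar at the top level cannot serve as named arguments.
        raise ValueError(f"Expected a JSON object in {file_path}")
    return args


# Round-trip a small, made-up argument file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "args.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"context_len": 512, "horizon_len": 96}, fh)
    loaded = load_json_args(path)
```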
read_yaml
Read a YAML configuration file.
Takes the path to the YAML file.
Returns a dictionary containing the YAML configuration.
prep_finetune_config
Prepare fine-tuning configuration from a YAML file or dictionary.
Takes the path to the YAML configuration file (file_path) or, as an alternative, a configuration dictionary (config).
Returns a processed configuration dictionary with keys:
batch_size: Batch size for training
max_epochs: Maximum number of epochs
seed: Random seed
tf32: TF32 setting
mod_torch: PyTorch trainer modifications
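Since prep_finetune_config accepts exactly one of file_path or config, a function with this contract has to reject calls that pass neither or both. The sketch below shows one way to enforce that and to produce the five documented keys. Everything in it is an assumption for illustration: the name `prepare_config`, the default values, the fallback to defaults for missing keys, and the use of JSON in place of YAML (so the sketch needs only the standard library).

```python
import json
from typing import Optional

# Hypothetical defaults; the library's real defaults are not stated here.
DEFAULTS = {
    "batch_size": 32,
    "max_epochs": 10,
    "seed": 42,
    "tf32": False,
    "mod_torch": None,
}


def prepare_config(file_path: Optional[str] = None,
                   config: Optional[dict] = None) -> dict:
    """Return a dict with the five documented keys from exactly one source."""
    # True when both are missing or both are given; either case is an error.
    if (file_path is None) == (config is None):
        raise ValueError("Provide exactly one of file_path or config")
    if file_path is not None:
        # JSON stands in for YAML in this sketch.
        with open(file_path, "r", encoding="utf-8") as fh:
            config = json.load(fh)
    return {key: config.get(key, default) for key, default in DEFAULTS.items()}
```

Comparing the two `is None` checks for equality catches both invalid cases in a single condition.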
Either file_path or config must be provided, but not both.
GPU Utilities
get_least_used_gpu
Get the GPU device with the least memory usage.
Returns the index of the least used GPU device, or -1 if no GPU is available or if an error occurs.
Example
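The selection logic can be sketched without any deep learning framework by querying `nvidia-smi` for used memory per device. This is an assumption about one possible approach, not necessarily how the library does it; `pick_least_used` and `least_used_gpu` are hypothetical names. Any failure, including a missing `nvidia-smi` binary, maps to -1 to match the documented return value.

```python
import subprocess


def pick_least_used(memory_used_lines: str) -> int:
    """Return the index of the smallest value, given one integer (MiB) per line."""
    values = [int(line) for line in memory_used_lines.split()]
    if not values:
        return -1  # no devices reported
    return min(range(len(values)), key=values.__getitem__)


def least_used_gpu() -> int:
    """Return the index of the GPU with the least used memory, or -1 on failure."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=memory.used",
        "--format=csv,noheader,nounits",
    ]
    try:
        out = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=10
        ).stdout
        return pick_least_used(out)
    except (OSError, subprocess.SubprocessError, ValueError):
        # Missing binary, non-zero exit, timeout, or unparsable output.
        return -1
```

Keeping the parsing in `pick_least_used` separate from the subprocess call makes the selection rule easy to check on machines without a GPU.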
DataLoader Utilities
cleanup_dataloader
Best-effort shutdown for PyTorch DataLoader workers to prevent resource leaks.
Takes the PyTorch DataLoader to clean up.
Example
This function stops worker processes and queues, preventing semaphore leaks in multi-process DataLoaders.