When to Fine-Tune vs Zero-Shot

Use zero-shot when:
  • You have limited or no training data
  • Your data distribution is similar to pre-training data
  • You need quick results without training time
  • You’re prototyping or exploring model capabilities
Use fine-tuning when:
  • You have domain-specific data with unique patterns
  • Zero-shot performance is insufficient
  • You can afford training time (typically 5-10 epochs)
  • Your data distribution differs from general time-series
Fine-tuning typically improves performance by 10-30% compared to zero-shot inference, depending on the dataset and model.
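To see where you land in that range, you can quantify the gap yourself by evaluating the same test split zero-shot and after fine-tuning. A minimal sketch of the comparison (the MSE values here are placeholders, not measured results):

```python
def improvement_pct(zero_shot_mse: float, finetuned_mse: float) -> float:
    """Relative error reduction of fine-tuning over zero-shot inference, in percent."""
    return 100.0 * (zero_shot_mse - finetuned_mse) / zero_shot_mse

# Hypothetical MSEs from evaluating the same test split both ways:
print(round(improvement_pct(0.520, 0.441), 1))  # 15.2 -- inside the typical 10-30% band
```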

Fine-Tuning Workflow

All Samay models follow a consistent fine-tuning pattern:
Step 1: Prepare training data

Create a dataset instance for training:
from samay.dataset import LPTMDataset

train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="data/ETTh1.csv",
    mode="train",  # Important: use 'train' mode
    horizon=192,
)
Step 2: Configure model for fine-tuning

Set freeze parameters to control what layers are trainable:
from samay.model import LPTMModel

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "head_dropout": 0,
    "weight_decay": 0,
    "max_patch": 16,
    "freeze_encoder": True,    # Freeze patch embedding
    "freeze_embedder": True,   # Freeze transformer
    "freeze_head": False,      # Train the forecasting head
    "freeze_segment": True,    # Freeze segmentation
}
model = LPTMModel(config)
For faster training with good results, freeze the encoder and only train the head. For maximum performance, unfreeze all layers (but expect longer training).
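Conceptually, each freeze flag just excludes one parameter group from gradient updates. A plain-Python sketch of that mapping (the layer names mirror the config keys; this is not the actual Samay internals):

```python
def trainable_layers(config: dict) -> list[str]:
    """Map freeze_* flags to the layer groups that will receive gradient updates."""
    layers = ["encoder", "embedder", "head", "segment"]
    return [name for name in layers if not config.get(f"freeze_{name}", False)]

config = {"freeze_encoder": True, "freeze_embedder": True,
          "freeze_head": False, "freeze_segment": True}
print(trainable_layers(config))  # ['head'] -- head-only fine-tuning
```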
Step 3: Run fine-tuning

Call the finetune() method:
finetuned_model = model.finetune(train_dataset)
# Output:
# Epoch 0: Train loss: 0.594
# Epoch 1: Train loss: 0.504
# Epoch 2: Train loss: 0.479
# ...
Step 4: Evaluate fine-tuned model

Test on validation data:
val_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="data/ETTh1.csv",
    mode="test",
    horizon=192,
)

metrics, trues, preds, histories = model.evaluate(
    val_dataset, task_name="forecasting"
)
print(f"MSE: {metrics}")

Real Code Examples

LPTM Fine-Tuning

Complete example from lptm.ipynb:
from samay.model import LPTMModel
from samay.dataset import LPTMDataset

# Configure model
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "head_dropout": 0,
    "weight_decay": 0,
    "max_patch": 16,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
    "freeze_segment": True,
}
model = LPTMModel(config)

# Prepare training data
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="data/ETTh1.csv",
    mode="train",
    horizon=192,
)

# Fine-tune
finetuned_model = model.finetune(train_dataset)
# Epoch 0: Train loss: 0.594
# Epoch 1: Train loss: 0.504
# Epoch 2: Train loss: 0.479
# Epoch 3: Train loss: 0.465
# Epoch 4: Train loss: 0.454

# Evaluate
val_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="data/ETTh1.csv",
    mode="test",
    horizon=192,
)

metrics, trues, preds, histories = model.evaluate(
    val_dataset, task_name="forecasting"
)
print(f"Final Loss: {metrics}")  # ~0.441

MOMENT Fine-Tuning for Classification

Fine-tuning for time-series classification:
from samay.model import MomentModel
from samay.dataset import MomentDataset

repo = "AutonLab/MOMENT-1-large"
config = {
    "task_name": "classification",
    "n_channels": 1,
    "num_class": 5
}
mmt = MomentModel(config=config, repo=repo)

train_dataset = MomentDataset(
    name="ecg5000",
    path="data/ECG5000_TRAIN.csv",
    batchsize=64,
    mode="train",
    task_name="classification",
)

# Fine-tune with custom learning rate
finetuned_model = mmt.finetune(
    train_dataset,
    task_name="classification",
    epoch=10,
    lr=0.1
)
# Epoch 0: Train loss: 1.200
# Epoch 1: Train loss: 0.856
# ...
# Epoch 9: Train loss: 0.451

# Evaluate
test_dataset = MomentDataset(
    name="ecg5000",
    path="data/ECG5000_TEST.csv",
    batchsize=64,
    mode="test",
    task_name="classification",
)

accuracy, embeddings, labels = mmt.evaluate(
    test_dataset, task_name="classification"
)
print(f"Accuracy: {accuracy}")  # ~0.857

Chronos Fine-Tuning

from samay.model import ChronosModel
from samay.dataset import ChronosDataset

repo = "amazon/chronos-t5-small"
chronos_model = ChronosModel(repo=repo)

train_dataset = ChronosDataset(
    name="ett",
    datetime_col="date",
    path="data/ETTh1.csv",
    mode="train",
    batch_size=8
)

chronos_model.finetune(train_dataset)
# Epoch 0, Loss: 3.874
# Epoch 1, Loss: 3.848
# Epoch 2, Loss: 3.808
# Epoch 3, Loss: 3.775
# Epoch 4, Loss: 3.743

Training Parameters and Best Practices

Key Hyperparameters

  • epoch (int, default: 5): Number of training epochs. Start with 5 and increase if the loss is still decreasing.
  • lr (float, default: 0.001): Learning rate. Try 0.001 for full fine-tuning and 0.01-0.1 for head-only training.
  • head_dropout (float, default: 0.1): Dropout rate for the forecasting head. Helps prevent overfitting.
  • weight_decay (float, default: 0): L2 regularization strength. Use 0.001-0.01 for small datasets.
  • freeze_encoder (bool, default: true): Whether to freeze the patch embedding layer. Set to false for full fine-tuning.
  • freeze_embedder (bool, default: true): Whether to freeze the transformer encoder. Set to false for full fine-tuning.
  • freeze_head (bool, default: false): Whether to freeze the task-specific head. Almost always keep this false during fine-tuning.
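Keeping these defaults in one place makes it harder to typo a key when overriding them. A small hypothetical helper (the defaults mirror the list above; nothing Samay-specific):

```python
DEFAULTS = {
    "epoch": 5, "lr": 0.001, "head_dropout": 0.1, "weight_decay": 0,
    "freeze_encoder": True, "freeze_embedder": True, "freeze_head": False,
}

def make_config(**overrides) -> dict:
    """Merge user overrides over the defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}

cfg = make_config(lr=0.01, freeze_encoder=False)
print(cfg["lr"], cfg["epoch"])  # 0.01 5
```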

Fine-Tuning Strategies

Head-only fine-tuning
When to use: limited data (<1,000 samples), quick experiments
config = {
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
    "lr": 0.1,  # Higher learning rate is fine for head-only training
}
Training time: 1-2 minutes for 5 epochs
Performance: 70-80% of full fine-tuning

Partial fine-tuning (encoder + head)
When to use: moderate data (1,000-10,000 samples), domain adaptation
config = {
    "freeze_encoder": False,  # Unfreeze patch embedding
    "freeze_embedder": True,  # Keep transformer frozen
    "freeze_head": False,
    "lr": 0.01,
}
Training time: 5-10 minutes for 5 epochs
Performance: 85-95% of full fine-tuning

Full fine-tuning
When to use: large datasets (>10,000 samples), maximum accuracy needed
config = {
    "freeze_encoder": False,
    "freeze_embedder": False,  # Unfreeze transformer
    "freeze_head": False,
    "lr": 0.001,  # Lower learning rate to avoid catastrophic forgetting
    "weight_decay": 0.01,  # Add regularization
}
Training time: 15-30 minutes for 5 epochs
Performance: best possible accuracy
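The three strategies above reduce to a rule of thumb keyed on dataset size. A hypothetical helper encoding those thresholds (illustrative only; tune them for your own data):

```python
def pick_strategy(n_samples: int) -> dict:
    """Suggest freeze flags from the sample-count thresholds above."""
    if n_samples < 1_000:    # head-only fine-tuning
        return {"freeze_encoder": True, "freeze_embedder": True, "freeze_head": False}
    if n_samples <= 10_000:  # partial: encoder + head
        return {"freeze_encoder": False, "freeze_embedder": True, "freeze_head": False}
    # full fine-tuning for large datasets
    return {"freeze_encoder": False, "freeze_embedder": False, "freeze_head": False}

print(pick_strategy(500)["freeze_encoder"])      # True  -> head-only
print(pick_strategy(50_000)["freeze_embedder"])  # False -> full fine-tuning
```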

Common Issues and Solutions

Loss not decreasing?
  • Try unfreezing more layers
  • Increase learning rate (0.01 → 0.1)
  • Train for more epochs (5 → 10)
  • Check if data is properly normalized
Overfitting (train loss low, val loss high)?
  • Increase head_dropout (0.1 → 0.3)
  • Add weight_decay (0 → 0.01)
  • Freeze more layers
  • Reduce number of epochs
Training too slow?
  • Use head-only fine-tuning
  • Reduce batch size in dataset config
  • Reduce context length
  • Use a smaller model variant
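Several of these symptoms can be caught programmatically from the per-epoch losses printed during finetune(). A rough sketch of such checks (the thresholds are illustrative, not tuned values):

```python
def diagnose(train_losses: list[float], val_losses: list[float]) -> str:
    """Very rough heuristics over per-epoch loss curves."""
    if len(train_losses) >= 2 and train_losses[-1] >= train_losses[0] * 0.99:
        return "loss not decreasing: unfreeze layers or raise lr"
    if val_losses and val_losses[-1] > train_losses[-1] * 1.5:
        return "overfitting: raise head_dropout / weight_decay, freeze more layers"
    return "looks healthy"

print(diagnose([0.594, 0.504, 0.454], [0.48, 0.47, 0.46]))  # looks healthy
print(diagnose([0.594, 0.50, 0.30], [0.55, 0.60, 0.70]))    # overfitting: ...
```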

Next Steps

For more fine-tuning examples, explore the example notebooks.
