PEFT Training

🤗 PEFT (Parameter-Efficient Fine-Tuning) enables efficient adaptation of large pre-trained models by training only a small number of additional parameters. This is especially useful for fine-tuning large Vision-Language-Action (VLA) models like SmolVLA, π₀, and GR00T.

What is PEFT?

PEFT methods add trainable adapter modules to a frozen pre-trained model. Instead of fine-tuning all billions of parameters, you train only millions of adapter parameters:

Full Fine-tuning: Update all 7B parameters
LoRA (rank=64): Update only ~100M adapter parameters (1.4% of total)
Result: Similar performance with much less compute and memory
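
To see where the savings come from: a LoRA adapter of rank r on a weight matrix of shape d_out × d_in adds r × (d_in + d_out) trainable parameters. The sketch below works through that arithmetic. The layer shapes and layer count are hypothetical and are not taken from SmolVLA, π₀, or GR00T.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical model: 32 blocks, 4 adapted square projections per block.
hidden_size = 4096
num_adapted_layers = 32 * 4
rank = 64

per_layer = lora_param_count(hidden_size, hidden_size, rank)
adapter_params = num_adapted_layers * per_layer
frozen_per_layer = hidden_size * hidden_size

print(f"adapter params per layer: {per_layer:,}")
print(f"total adapter params: {adapter_params / 1e6:.1f}M")
print(f"adapter / frozen weights per layer: {per_layer / frozen_per_layer:.2%}")
```

Because the count is linear in r, halving the rank halves the adapter size.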

Installation

Install LeRobot with PEFT support:

pip install lerobot[peft]

Or install PEFT separately:

pip install peft

Quick Start

Fine-tune SmolVLA with LoRA:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --policy.repo_id=your_username/smolvla_pickplace \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --env.type=aloha \
  --env.task=AlohaInsertion-v0 \
  --steps=50000 \
  --batch_size=32 \
  --peft.method_type=LORA \
  --peft.r=64 \
  --policy.optimizer_lr=1e-3 \
  --policy.scheduler_decay_lr=1e-4

Key differences from full fine-tuning:

--policy.path: Load pre-trained model
--peft.method_type=LORA: Use LoRA adapters
--peft.r=64: LoRA rank (higher = more parameters)
Higher learning rate (1e-3 vs 1e-4 for full fine-tuning)

Supported Methods

LoRA (Low-Rank Adaptation)

LoRA is the most popular PEFT method. It freezes the pre-trained weights and adds trainable low-rank matrices alongside selected layers, typically the attention projections:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=16 \
  --peft.lora_dropout=0.1

Parameters:

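r: Rank of adapter matrices (higher = more capacity). Adapter size grows linearly with r.
- r=8: Very lightweight (~12.5M params)
- r=32: Balanced (~50M params)
- r=64: High capacity (~100M params)
lora_alpha: Scaling factor. The adapter output is multiplied by lora_alpha / r (typically set to r/2 or r/4 here)
lora_dropout: Dropout rate applied to the adapter input during training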

When to use: General purpose fine-tuning, good balance of efficiency and performance.
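
As a minimal sketch of what these parameters control, a LoRA-adapted layer computes h = W x + (lora_alpha / r) · B A x, where W is frozen and only A and B are trained. The pure-Python example below uses tiny made-up matrices and omits dropout. It illustrates the formula and is not LeRobot's or PEFT's implementation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, lora_alpha):
    """Frozen path W x plus the scaled low-rank update (lora_alpha / r) * B (A x)."""
    scaling = lora_alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scaling * u for b, u in zip(base, update)]


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 weight
A = [[1.0, 1.0]]              # rank-1 down-projection, shape (1 x 2)
B_init = [[0.0], [0.0]]       # up-projection starts at zero, shape (2 x 1)
x = [2.0, 3.0]

# With B = 0 the adapted layer reproduces the frozen layer exactly.
out_init = lora_forward(W, A, B_init, x, r=1, lora_alpha=2)

# After training, B is non-zero and the update shifts the output.
B_trained = [[0.5], [-0.5]]
out_trained = lora_forward(W, A, B_trained, x, r=1, lora_alpha=2)

print(out_init, out_trained)
```

Starting B at zero is the usual LoRA initialization, so fine-tuning begins from the unchanged pre-trained behavior.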

IA³ (Infused Adapter by Inhibiting and Amplifying Inner Activations)

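IA³ uses even fewer parameters than LoRA: instead of low-rank matrices, it learns vectors that rescale existing activations (keys, values, and feed-forward activations in the original formulation):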

lerobot-train \
  --policy.path=lerobot/pi0_base \
  --peft.method_type=IA3

When to use: When you have very limited compute or want the smallest possible adapter.
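
For intuition, IA³ multiplies a layer's activations elementwise by one learned vector, so an adapted layer of width d adds only d parameters. The sketch below uses made-up sizes and values. The vector starts at 1, so the adapted model initially matches the base model.

```python
def ia3_scale(activations, scale):
    """Rescale activations elementwise with a learned IA3 vector."""
    return [a * s for a, s in zip(activations, scale)]


acts = [0.5, -1.0, 2.0, 0.0]

# Initial state: a vector of ones leaves the activations unchanged.
initial_scale = [1.0] * len(acts)
unchanged = ia3_scale(acts, initial_scale)

# After training: some features are inhibited, others amplified.
trained_scale = [1.0, 0.5, 2.0, 1.0]
rescaled = ia3_scale(acts, trained_scale)

# Parameter comparison for one hypothetical 4096-wide square projection.
width, rank = 4096, 64
ia3_params = width
lora_params = rank * (width + width)

print(unchanged, rescaled)
print(f"IA3: {ia3_params:,} params, LoRA r={rank}: {lora_params:,} params")
```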

AdaLoRA (Adaptive LoRA)

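AdaLoRA adaptively allocates the rank budget across layers: each adapter starts at rank init_r, and less important components are pruned during training until the average rank reaches target_r: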

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=ADALORA \
  --peft.target_r=8 \
  --peft.init_r=12

When to use: When you want to automatically find the optimal rank distribution.
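
The allocation can be pictured as a budgeting problem: every adapted layer starts with init_r rank components, each component receives an importance score, and only the highest-scoring components are kept so that the average rank per layer equals target_r. The example below is a deliberately simplified illustration with made-up scores and hypothetical layer names. The real algorithm uses an SVD-style parameterization and a sensitivity-based importance estimate, and it prunes gradually on a schedule.

```python
def allocate_ranks(scores, target_r):
    """Keep the top (target_r * num_layers) components across all layers.

    scores maps a layer name to the importance of each of its rank components.
    Returns the number of components kept per layer.
    """
    budget = target_r * len(scores)
    ranked = sorted(
        ((score, layer) for layer, comps in scores.items() for score in comps),
        key=lambda item: item[0],
        reverse=True,
    )
    kept = {layer: 0 for layer in scores}
    for _, layer in ranked[:budget]:
        kept[layer] += 1
    return kept


# Three hypothetical layers, each starting at init_r = 3.
scores = {
    "layer0.q_proj": [0.9, 0.8, 0.7],
    "layer0.v_proj": [0.6, 0.2, 0.1],
    "layer1.q_proj": [0.5, 0.4, 0.05],
}
final_ranks = allocate_ranks(scores, target_r=2)
print(final_ranks)
```

Layers whose components score higher keep more rank, while the total stays within the budget.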

Targeting Modules

Default Targets

By default, LoRA targets attention projection layers and task-specific heads:

# For SmolVLA
default_targets = [
    "q_proj",  # Query projection
    "v_proj",  # Value projection  
    "state_proj",  # State encoder
    "action_in_proj",  # Action encoder
    "action_out_proj",  # Action decoder
]

Custom Targets

Specify custom modules to adapt:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.target_modules='["q_proj","v_proj","k_proj","o_proj"]'
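
When target modules are given as a list, the PEFT library selects a module if its full name equals an entry or ends with "." followed by that entry. The sketch below reproduces that rule in plain Python, assuming LeRobot forwards the list to PEFT unchanged. The module names are hypothetical.

```python
def matches_target(module_name, target_modules):
    """Suffix rule: exact match, or the name ends with '.<target>'."""
    return any(
        module_name == target or module_name.endswith("." + target)
        for target in target_modules
    )


targets = ["q_proj", "v_proj", "k_proj", "o_proj"]

# Hypothetical module names, for illustration only.
names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.self_attn.fused_q_proj",  # no '.' before q_proj: not matched
    "model.layers.0.mlp.up_proj",
    "model.state_proj",
]

selected = [name for name in names if matches_target(name, targets)]
print(selected)
```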

Using Regex

Target modules with regex patterns:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.target_modules='(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj|.*\.(state_proj|action_in_proj|action_out_proj))'

This targets:

All MLP layers in the language model expert
State and action projection layers
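
Before launching a long run, it can help to check a pattern against a few module names. When target modules are given as a single string, PEFT applies it as a full-match regular expression, so the sketch below uses re.fullmatch. The module names are hypothetical stand-ins for what policy.named_modules() would report.

```python
import re

pattern = re.compile(
    r"(model\.vlm_with_expert\.lm_expert\..*\.(down|gate|up)_proj"
    r"|.*\.(state_proj|action_in_proj|action_out_proj))"
)

# Hypothetical module names, for illustration only.
names = [
    "model.vlm_with_expert.lm_expert.layers.0.mlp.down_proj",
    "model.vlm_with_expert.lm_expert.layers.0.self_attn.q_proj",
    "model.vlm_with_expert.vlm.layers.0.mlp.up_proj",
    "model.state_proj",
    "model.action_out_proj",
]

# fullmatch requires the whole name to match, not just a substring.
selected = [name for name in names if pattern.fullmatch(name)]
for name in selected:
    print(name)
```

The attention projection and the MLP layer outside lm_expert are skipped, which matches the intent described above.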

Finding Module Names

Print model architecture to find module names:

from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")

# Print all module names
for name, module in policy.named_modules():
    print(name)
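
The full listing can run to thousands of lines. A small helper that counts modules by their last name component makes candidate targets easier to spot. This is a sketch: leaf_name_counts is a hypothetical helper, and the sample names are made up so the snippet runs without downloading a model.

```python
from collections import Counter


def leaf_name_counts(module_names):
    """Count modules by the last dotted component of their name."""
    return Counter(name.rsplit(".", 1)[-1] for name in module_names if name)


# Made-up names; the empty string mimics the root module entry.
sample_names = [
    "",
    "model",
    "model.layers.0.self_attn.q_proj",
    "model.layers.1.self_attn.q_proj",
    "model.layers.0.self_attn.v_proj",
    "model.state_proj",
]

counts = leaf_name_counts(sample_names)
for leaf, count in counts.most_common():
    print(f"{leaf}: {count}")
```

With a loaded policy, the same helper can be called as leaf_name_counts(name for name, _ in policy.named_modules()).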

Full Fine-tuning Specific Modules

For some modules, you may want full fine-tuning instead of adapters:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.target_modules='["q_proj","v_proj"]' \
  --peft.full_training_modules='["state_proj","action_out_proj"]'

This:

Adds LoRA adapters to attention layers
Fully fine-tunes state and action projections

Fine-tuning SmolVLA

Complete example for fine-tuning SmolVLA on a manipulation task:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --policy.repo_id=your_username/smolvla_libero_spatial \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --policy.output_features=null \
  --policy.input_features=null \
  --policy.optimizer_lr=1e-3 \
  --policy.scheduler_decay_lr=1e-4 \
  --env.type=libero \
  --env.task=libero_spatial \
  --steps=100000 \
  --batch_size=32 \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=16 \
  --peft.lora_dropout=0.1 \
  --eval_freq=10000 \
  --save_freq=10000 \
  --log_freq=100

Key settings:

output_features=null, input_features=null: Auto-infer from dataset
Learning rate 10x higher than full fine-tuning
Batch size 32 (adjust based on GPU memory)
Evaluate every 10k steps

Fine-tuning π₀

Fine-tune Physical Intelligence’s π₀ policy:

lerobot-train \
  --policy.path=lerobot/pi0_base \
  --policy.repo_id=your_username/pi0_aloha_insertion \
  --dataset.repo_id=lerobot/aloha_sim_insertion_human \
  --env.type=aloha \
  --env.task=AlohaInsertion-v0 \
  --steps=50000 \
  --batch_size=16 \
  --peft.method_type=LORA \
  --peft.r=32 \
  --policy.optimizer_lr=5e-4

Memory and Speed Benefits

Memory Usage

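PEFT reduces memory requirements because gradients and optimizer state are kept only for the adapter parameters. Approximate figures for the 7B-parameter example above: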

Method	Trainable Params	Memory (fp16)	Speedup
Full Fine-tuning	7B	~28 GB	1.0x
LoRA (r=64)	100M	~16 GB	1.8x
LoRA (r=32)	50M	~14 GB	2.0x
LoRA (r=8)	12.5M	~12 GB	2.2x
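
A back-of-the-envelope estimate shows where the difference comes from. The sketch below counts only weights, gradients, and (optionally) optimizer state. It assumes 2 bytes per value for fp16 and ignores activations, buffers, and framework overhead, so measured usage will be higher. The function name and defaults are illustrative.

```python
def training_memory_gb(total_params, trainable_params,
                       weight_bytes=2, grad_bytes=2, optimizer_bytes=0):
    """Rough memory for weights plus per-trainable-parameter gradient/optimizer state."""
    total_bytes = (
        total_params * weight_bytes
        + trainable_params * (grad_bytes + optimizer_bytes)
    )
    return total_bytes / 1e9


# Full fine-tuning of a 7B model: every weight also needs a gradient.
full_gb = training_memory_gb(7e9, 7e9)

# LoRA with ~100M adapter parameters: gradients only for the adapters.
lora_gb = training_memory_gb(7e9, 100e6)

# Adding Adam-style state (two fp32 values per trainable parameter).
full_adam_gb = training_memory_gb(7e9, 7e9, optimizer_bytes=8)
lora_adam_gb = training_memory_gb(7e9, 100e6, optimizer_bytes=8)

print(f"full: {full_gb:.1f} GB, LoRA: {lora_gb:.1f} GB")
print(f"with Adam state, full: {full_adam_gb:.1f} GB, LoRA: {lora_adam_gb:.1f} GB")
```

Under these assumptions, the frozen fp16 weights alone account for 14 GB, so the savings come almost entirely from gradients and optimizer state.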

Training Speed

PEFT training is faster because:

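Weight gradients are computed only for adapter parameters
Less memory traffic for gradients and optimizer state
Faster optimizer updates, since only adapter parameters are stepped

Typical speedup: roughly 1.5-2x compared to full fine-tuning, depending on rank, batch size, and hardware.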

Hyperparameter Tuning

Learning Rate

PEFT typically uses higher learning rates:

# Full fine-tuning
--policy.optimizer_lr=1e-4

# LoRA fine-tuning
--policy.optimizer_lr=1e-3  # 10x higher

Start with 5-10x the full fine-tuning learning rate.

LoRA Rank

Balance between capacity and efficiency:

# Lightweight (good for small datasets)
--peft.r=8

# Balanced (recommended default)
--peft.r=32

# High capacity (for complex tasks)
--peft.r=64

# Very high capacity (approaching full fine-tuning)
--peft.r=128

LoRA Alpha

Scaling factor for adapter outputs:

# Conservative (less adapter influence)
--peft.lora_alpha=8

# Balanced (recommended: r/2)
--peft.r=64 --peft.lora_alpha=32

# Aggressive (more adapter influence)
--peft.lora_alpha=64

Rule of thumb: Set lora_alpha = r/2 or r/4.
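
What matters in practice is the ratio lora_alpha / r, which multiplies the adapter output. The short sketch below prints that ratio for the values used in this section, assuming standard LoRA scaling (rank-stabilized variants divide by the square root of r instead).

```python
def lora_scaling(r, lora_alpha):
    """Multiplier applied to the adapter output under standard LoRA scaling."""
    return lora_alpha / r


rank = 64
scalings = {alpha: lora_scaling(rank, alpha) for alpha in (8, 16, 32, 64)}

for alpha, scale in scalings.items():
    print(f"r={rank}, lora_alpha={alpha}: adapter output scaled by {scale}")
```

Changing r without adjusting lora_alpha therefore also changes the effective strength of the adapter.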

Loading PEFT Models

Load fine-tuned PEFT models:

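from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load base model with adapters
policy = SmolVLAPolicy.from_pretrained(
    "your_username/smolvla_finetuned",
    use_peft=True
)

# Switch to inference mode (disables dropout, including LoRA dropout)
policy.eval()

# `observation` is a batch dict of input tensors matching the policy's
# input features; it is assumed to be prepared elsewhere.
action = policy.select_action(observation)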

PEFT adapters are stored alongside the base model weights.

Merging Adapters

Merge adapters into base model for deployment:

from peft import PeftModel
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load the base policy without adapters
base_policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")

# Attach the trained adapter to the base policy
peft_policy = PeftModel.from_pretrained(base_policy, "your_username/adapter")

# Fold the adapter weights into the base weights and remove the PEFT wrappers
policy = peft_policy.merge_and_unload()

# Save merged model
policy.save_pretrained("merged_model")

Merged models:

Load faster (no adapter overhead)
Use slightly less memory
Cannot be “un-merged”
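
Merging works because the adapter update is linear: adding (lora_alpha / r) · B A to the frozen weight W once gives the same outputs as computing the adapter path on every forward pass. The pure-Python sketch below checks this on tiny made-up matrices. It illustrates the idea and is not the PEFT implementation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(right), len(right[0])
    return [
        [sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
        for row in left
    ]


def merge_lora(W, A, B, r, lora_alpha):
    """Return W + (lora_alpha / r) * B A as a new matrix."""
    scaling = lora_alpha / r
    delta = matmul(B, A)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


W = [[1.0, 2.0], [3.0, 4.0]]  # frozen weight
A = [[1.0, -1.0]]             # shape (1 x 2), rank 1
B = [[0.5], [2.0]]            # shape (2 x 1)
x = [1.0, 2.0]
r, lora_alpha = 1, 2

# Adapter kept separate: base path plus scaled low-rank path.
unmerged = [
    base + (lora_alpha / r) * update
    for base, update in zip(matvec(W, x), matvec(B, matvec(A, x)))
]

# Adapter merged: a single matrix multiply.
merged = matvec(merge_lora(W, A, B, r, lora_alpha), x)

print(unmerged, merged)
```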

Multi-GPU PEFT Training

Scale PEFT training across GPUs:

accelerate launch --num_processes=4 \
  -m lerobot.scripts.lerobot_train \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --peft.method_type=LORA \
  --peft.r=64 \
  --batch_size=32 \
  --steps=100000

PEFT works well with multi-GPU data-parallel training: memory use per GPU is lower, and only adapter gradients need to be synchronized between processes, which reduces communication overhead.
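
When planning a run, keep the effective batch size in mind. The sketch below assumes that --batch_size is applied per process, as is common for Accelerate-based data-parallel training. Verify this against LeRobot's multi-GPU guide before relying on it.

```python
def effective_batch_size(per_process_batch_size, num_processes):
    """Samples consumed per optimizer step, assuming a per-process batch size."""
    return per_process_batch_size * num_processes


# Values from the command above: 4 processes, batch size 32, 100000 steps.
eff_batch = effective_batch_size(32, 4)
samples_seen = eff_batch * 100_000

print(f"effective batch size: {eff_batch}")
print(f"samples seen over the run: {samples_seen:,}")
```

If that assumption holds, reaching the same number of samples as a single-GPU run takes proportionally fewer steps.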

Best Practices

Start with default settings

Use recommended defaults:

lerobot-train \
  --policy.path=lerobot/smolvla_base \
  --peft.method_type=LORA \
  --peft.r=64 \
  --peft.lora_alpha=32 \
  --policy.optimizer_lr=1e-3

Use higher learning rates

PEFT converges faster with higher learning rates:

# Full fine-tuning: 1e-4
# PEFT: 5e-4 to 1e-3
--policy.optimizer_lr=1e-3

Monitor validation loss

PEFT can still overfit, particularly with high ranks on small datasets, so hold out validation data and evaluate regularly:

lerobot-train \
  --peft.method_type=LORA \
  --dataset.train_fraction=0.9 \
  --eval_freq=5000

Start with smaller rank

Begin with r=32 and increase if needed:

# Try r=32 first
--peft.r=32

# If underfitting, increase
--peft.r=64

Match training data scale to rank

Smaller datasets need smaller ranks:

< 100 episodes: r=8-16

100-500 episodes: r=32

500+ episodes: r=64
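
These thresholds can be written as a small helper for picking a starting rank. suggest_ranks is a hypothetical convenience function, not part of LeRobot. A dataset of exactly 500 episodes falls in the middle bucket here, which is an arbitrary choice.

```python
def suggest_ranks(num_episodes):
    """Return candidate LoRA ranks for a dataset size, following the guideline above."""
    if num_episodes < 100:
        return (8, 16)
    if num_episodes <= 500:
        return (32,)
    return (64,)


for episodes in (50, 200, 1000):
    print(f"{episodes} episodes: try r in {suggest_ranks(episodes)}")
```

Treat the result as a starting point, then adjust based on validation performance.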

Troubleshooting

Model not learning

Increase learning rate or rank:

--policy.optimizer_lr=5e-3 --peft.r=128

Out of memory

Reduce rank or batch size:

--peft.r=16 --batch_size=16

Overfitting

Reduce rank or add dropout:

--peft.r=32 --peft.lora_dropout=0.2

Next Steps

Multi-GPU Training - Scale PEFT across GPUs
Evaluate Policies - Test your fine-tuned model
SmolVLA Guide - Learn about SmolVLA architecture
PEFT Documentation - Deep dive into PEFT methods
