Troubleshooting Guide

This guide provides solutions to common problems you might encounter while using LeRobot.

Installation Issues

Python version incompatibility

Problem: Import errors or syntax errors after installation.

Solution: LeRobot requires Python ≥3.12. Check your version:

python --version
# Should show Python 3.12.x or higher
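
The same check can be run from inside the interpreter that will actually import LeRobot, which helps when several Python installations are on the PATH. This is a minimal sketch; the helper name `meets_minimum` is hypothetical and the threshold is simply the version stated above.

```python
import sys

MIN_PYTHON = (3, 12)  # minimum version stated in this guide


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for LeRobot"
    print(f"Python {current}: {status}")
```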

If you have an older version, create a new environment:

conda create -y -n lerobot python=3.12
conda activate lerobot
pip install lerobot

ffmpeg not found or missing libsvtav1

Problem: Errors like "ffmpeg not found" or "Encoder 'libsvtav1' not found".

Solution: Install ffmpeg with libsvtav1 support:

# With conda (recommended)
conda install ffmpeg=7.1.1 -c conda-forge

# Verify installation
ffmpeg -version
ffmpeg -encoders | grep svt

ffmpeg 8.X is not yet supported. Use version 7.X.
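
Both checks (encoder present, major version 7) can be scripted. The sketch below assumes `ffmpeg -encoders` lists one encoder per row with the name in the second column, and that `ffmpeg -version` starts with `ffmpeg version <number>`; the helper names are hypothetical and custom builds may print a different format.

```python
import re
import shutil
import subprocess


def run_ffmpeg(*args):
    """Return ffmpeg's stdout for the given arguments, or None if ffmpeg is not on PATH."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return None
    result = subprocess.run([exe, *args], capture_output=True, text=True, check=False)
    return result.stdout


def has_encoder(encoders_output, name="libsvtav1"):
    """True if `name` appears as an encoder name in `ffmpeg -encoders` output."""
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows look like: " V....D libsvtav1   SVT-AV1 ... encoder"
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def major_version(version_output):
    """Extract the major version from `ffmpeg -version` output, or None if not found."""
    match = re.search(r"ffmpeg version n?(\d+)", version_output)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    version_text = run_ffmpeg("-version")
    if version_text is None:
        print("ffmpeg not found on PATH")
    else:
        print("major version:", major_version(version_text))
        print("libsvtav1 available:", has_encoder(run_ffmpeg("-encoders") or ""))
```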

WSL (Windows) installation issues

Problem: Errors related to evdev or input devices on Windows Subsystem for Linux.

Solution: Install evdev explicitly:

conda install evdev -c conda-forge

Permission denied errors

Problem: Permission errors when installing packages.

Solution: Don’t use sudo with pip. Instead:

# Use virtual environment (recommended)
conda create -n lerobot python=3.12
conda activate lerobot
pip install lerobot

# Or install for user only
pip install --user lerobot

GPU and CUDA Issues

CUDA out of memory errors

Problem: "RuntimeError: CUDA out of memory" during training or inference.

Solutions:

Reduce batch size:

lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht \
  --training.batch_size=8  # Try smaller values

Enable gradient accumulation:

lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht \
  --training.batch_size=4 \
  --training.gradient_accumulation_steps=4

Use mixed precision (AMP):

policy.config.use_amp = True

Clear CUDA cache:

import torch
torch.cuda.empty_cache()

Use a smaller model variant or reduce sequence length
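
To see why gradient accumulation lets a smaller batch stand in for a larger one, here is a toy, framework-free sketch. It assumes a one-parameter linear model with mean-squared-error loss and equally sized micro-batches; under those assumptions, four accumulated steps of 4 samples give the same gradient as one step of 16. All names are illustrative and are not LeRobot APIs.

```python
import math


def mse_gradient(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Average the gradients of equally sized micro-batches."""
    if len(xs) % micro_batch_size != 0:
        raise ValueError("data must split evenly into micro-batches")
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        # Scale each micro-batch gradient by 1/steps before accumulating.
        total += mse_gradient(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


xs = [float(i) for i in range(1, 17)]  # 16 samples
ys = [3.0 * x + 1.0 for x in xs]
full = mse_gradient(0.5, xs, ys)  # one batch of 16
accum = accumulated_gradient(0.5, xs, ys, micro_batch_size=4)  # 4 steps of 4
print("gradients match:", math.isclose(full, accum, rel_tol=1e-9))
```

Only one micro-batch needs to be resident at a time, which is what reduces peak memory.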

CUDA not available

Problem: torch.cuda.is_available() returns False.

Solutions:

Check NVIDIA driver:

nvidia-smi

Reinstall PyTorch with CUDA:

# For CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

Verify installation:

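import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
# get_device_name raises if no CUDA device is visible, so guard the call
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")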

Multi-GPU training issues

Problem: Errors when using multiple GPUs with DDP.

Solution: Use torchrun with correct configuration:

# For 4 GPUs
torchrun --nproc_per_node=4 -m lerobot.scripts.train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht

# Ensure consistent batch size across GPUs
# Total batch size = batch_size * num_gpus
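
The relationship in the comment above can be turned around to pick a per-GPU batch size for a target total. A small sketch, with a hypothetical helper name, that rejects totals that do not divide evenly across GPUs:

```python
def per_gpu_batch_size(total_batch_size, num_gpus):
    """Per-GPU batch size such that per_gpu * num_gpus == total_batch_size."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")
    return total_batch_size // num_gpus


# Example: keep a total batch of 32 when moving from 1 GPU to 4 GPUs.
print(per_gpu_batch_size(32, 4))
```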

Dataset Issues

Dataset not found on Hub

Problem: FileNotFoundError or DatasetNotFoundError when loading dataset.

Solutions:

Verify dataset exists:

from huggingface_hub import list_datasets

datasets = [d.id for d in list_datasets(task_categories="robotics", tags=["LeRobot"])]
print("Available datasets:", datasets)

Check authentication (for private datasets):

huggingface-cli login

Use correct repo_id format:

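# Import path for recent LeRobot releases; adjust if your version differs
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Correct
dataset = LeRobotDataset("lerobot/pusht")

# Incorrect
dataset = LeRobotDataset("pusht")  # Missing namespace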
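
A quick format check can catch a missing namespace before any network request is made. This sketch only tests for the "namespace/name" shape; it is not the Hub's own validation, and the helper name is hypothetical.

```python
def split_repo_id(repo_id):
    """Split 'namespace/name' into its two parts; raise ValueError otherwise."""
    parts = repo_id.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'namespace/name', got {repo_id!r}")
    return parts[0], parts[1]


namespace, name = split_repo_id("lerobot/pusht")
print(namespace, name)
```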

Video decoding errors

Problem: Errors when loading video frames from dataset.

Solutions:

Verify ffmpeg installation:

ffmpeg -version
ffmpeg -decoders | grep h264

Clear dataset cache and re-download:

rm -rf ~/.cache/huggingface/lerobot/<dataset-name>

Check disk space:

df -h ~/.cache/huggingface
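
The disk-space check can also be done from Python, for example at the start of a download script. A minimal sketch using only the standard library; if the cache directory does not exist yet, it falls back to the nearest existing parent. The helper name is hypothetical.

```python
import shutil
from pathlib import Path


def free_gib(path):
    """Free space in GiB on the filesystem holding `path` (or its nearest existing parent)."""
    p = Path(path).expanduser().resolve()
    while not p.exists() and p != p.parent:
        p = p.parent
    return shutil.disk_usage(p).free / 2**30


print(f"Free space for the cache: {free_gib('~/.cache/huggingface'):.1f} GiB")
```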

Slow dataset loading

Problem: Dataset loading takes too long.

Solutions:

Use streaming for large datasets:

dataset = LeRobotDataset(
    "lerobot/aloha_mobile_cabinet",
    streaming=True  # Don't download entire dataset
)

Increase number of workers:

from torch.utils.data import DataLoader

dataloader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=8  # Increase for faster loading
)

Cache dataset locally for repeated use

Corrupted dataset cache

Problem: Inconsistent data or errors after dataset updates.

Solution: Clear the dataset cache:

# Remove specific dataset
rm -rf ~/.cache/huggingface/lerobot/<dataset-name>

# Or clear all cached datasets
rm -rf ~/.cache/huggingface/lerobot/*
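
Because `rm -rf` with a mistyped path is unforgiving, a guarded Python equivalent may be preferable in scripts. The sketch below assumes the cache root shown above and refuses any name that resolves to the root itself or outside it; `clear_dataset_cache` is a hypothetical helper, not part of LeRobot.

```python
import shutil
from pathlib import Path

DEFAULT_CACHE_ROOT = Path("~/.cache/huggingface/lerobot").expanduser()


def clear_dataset_cache(dataset_name, cache_root=DEFAULT_CACHE_ROOT):
    """Delete one dataset's cache directory. Returns True if something was removed."""
    root = Path(cache_root).resolve()
    target = (root / dataset_name).resolve()
    # Refuse names such as "" or "../.." that resolve to the root or escape it.
    if target == root or root not in target.parents:
        raise ValueError(f"{dataset_name!r} is not inside {root}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Calling `clear_dataset_cache("lerobot/pusht")` would remove only that dataset's directory and leave the rest of the cache in place.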

Training Issues

Training loss not decreasing

Problem: Loss plateaus or doesn’t decrease during training.

Solutions:

Check learning rate:

# Try different learning rates, one at a time (a later assignment overrides an earlier one)
config.training.lr = 1e-4  # Default
# config.training.lr = 1e-3  # Higher for faster learning
# config.training.lr = 1e-5  # Lower for stability

Verify data normalization:

# Check dataset statistics
print(dataset.meta.stats)

Increase training steps:

lerobot-train \
  --policy=act \
  --dataset.repo_id=lerobot/pusht \
  --training.num_steps=200000  # More steps

Check for data issues (e.g., all actions similar)

NaN or Inf in loss

Problem: Loss becomes NaN or Inf during training.

Solutions:

Reduce learning rate:

config.training.lr = 1e-5  # Lower LR

Enable gradient clipping:

config.training.grad_clip_norm = 1.0

Check for numerical instability in custom code

Verify dataset doesn’t contain NaN values:

import torch
batch = next(iter(dataloader))
print("NaN in batch:", torch.isnan(batch['action']).any())
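
For intuition about what a clip norm of 1.0 does, here is a framework-free sketch of clipping by global L2 norm, together with a finite-value check. It illustrates the idea only; in PyTorch the corresponding utility is `torch.nn.utils.clip_grad_norm_`, and the helper names below are hypothetical.

```python
import math


def has_non_finite(values):
    """True if any value is NaN or infinite."""
    return any(not math.isfinite(v) for v in values)


def clip_by_global_norm(grads, max_norm):
    """Rescale `grads` so their L2 norm is at most `max_norm`.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads), norm
    scale = max_norm / norm
    return [g * scale for g in grads], norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print("norm before:", norm_before, "clipped:", clipped)
print("contains NaN/Inf:", has_non_finite([1.0, float("nan")]))
```

Clipping bounds the size of each update but does not remove NaN values that are already present, which is why the data check above is worth running first.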

Checkpoint loading errors

Problem: Cannot resume training from checkpoint.

Solutions:

Verify checkpoint path:

ls outputs/train/my_checkpoint/
# Should contain: config.yaml, checkpoint_*.pth

Check version compatibility:

# Model from old version may not be compatible
# Try loading with strict=False
policy.load_state_dict(checkpoint, strict=False)

Ensure config matches: The checkpoint config must match your current training config
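
The path check can be automated before a resume attempt. This sketch assumes the layout from the listing above (a `config.yaml` plus at least one `checkpoint_*.pth` file); the helper name is hypothetical, and your checkpoint layout may differ between versions.

```python
from pathlib import Path


def missing_checkpoint_files(checkpoint_dir):
    """Return a list of expected items that are absent from `checkpoint_dir`."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    missing = []
    if not (root / "config.yaml").is_file():
        missing.append("config.yaml")
    if not any(root.glob("checkpoint_*.pth")):
        missing.append("checkpoint_*.pth")
    return missing


problems = missing_checkpoint_files("outputs/train/my_checkpoint")
print("checkpoint looks complete" if not problems else f"missing: {problems}")
```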

Robot Hardware Issues

Robot connection failed

Problem: Cannot connect to robot.

Solutions:

Check device permissions:

# For USB devices
sudo chmod 666 /dev/ttyUSB0  # Or your device

# Add user to dialout group (permanent)
sudo usermod -a -G dialout $USER
# Log out and back in for changes to take effect

Verify device path:

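# List serial devices (USB adapters usually appear as /dev/ttyUSB* or /dev/ttyACM*)
ls /dev/tty*

# Then use the matching path in your robot config (Python), for example:
# robot = Robot(port="/dev/ttyUSB0")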

Check cable connections and power supply
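
The device-path and permission checks can be combined in a short script. This is a sketch for Linux; the `/dev/ttyACM*` pattern is an assumption added here (some USB serial adapters enumerate under that name), and the helper name is hypothetical.

```python
import glob
import os


def find_serial_ports(patterns=("/dev/ttyUSB*", "/dev/ttyACM*")):
    """Return (path, accessible) pairs for serial devices matching `patterns`.

    `accessible` is True when the current user can both read and write the device.
    """
    ports = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            accessible = os.access(path, os.R_OK | os.W_OK)
            ports.append((path, accessible))
    return ports


for path, accessible in find_serial_ports():
    note = "ok" if accessible else "no read/write permission (see the dialout step above)"
    print(f"{path}: {note}")
```

If the loop prints nothing, no matching device was found, which points to a cable, power, or driver problem rather than a permission problem.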

Latency in real-time control

Problem: High latency causes jerky or delayed robot motion.

Solutions:

Use GPU inference:

policy = policy.to("cuda")

Enable async inference: See examples/tutorial/async-inf/ for policy server/client pattern

Optimize observation processing:

Reduce image resolution
Use hardware video encoding
Minimize preprocessing steps

Use action chunking (ACT-style policies reduce inference frequency)
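
Before optimizing, it helps to compare measured inference latency with the control period. The sketch below uses hypothetical numbers (a 30 Hz control loop and 10-action chunks) and assumes every action in a chunk is executed before the next inference call; `step_fn` stands in for whatever function runs your policy.

```python
import time


def control_period_s(control_hz):
    """Time budget per control step, in seconds."""
    return 1.0 / control_hz


def mean_latency_s(step_fn, repeats=20):
    """Average wall-clock time of `step_fn()` over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        step_fn()
    return (time.perf_counter() - start) / repeats


def inference_calls_per_second(control_hz, chunk_size):
    """With action chunking, one inference call yields `chunk_size` actions."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return control_hz / chunk_size


budget = control_period_s(30)
latency = mean_latency_s(lambda: sum(range(1000)))  # placeholder workload
print(f"budget {budget * 1e3:.1f} ms, measured {latency * 1e3:.3f} ms")
print("inference calls per second with chunks of 10:", inference_calls_per_second(30, 10))
```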

Calibration issues

Problem: Robot movements are offset or incorrect.

Solutions:

Re-run calibration: Follow your robot’s specific calibration procedure
Check for breaking changes: See Backward Compatibility for migration guides
Verify joint limits in robot config
Test with known-good trajectory to isolate issue

Performance Optimization

Slow training

Solutions:

Use GPU acceleration

Increase batch size (if memory allows)

Use more DataLoader workers:

dataloader = DataLoader(dataset, num_workers=8)

Enable AMP (automatic mixed precision):

config.use_amp = True

Use multi-GPU training with DDP

High memory usage

Solutions:

Reduce batch size

Use gradient checkpointing:

config.use_gradient_checkpointing = True

Clear unused tensors:

import torch

del large_tensor  # drop the last reference so the memory can be freed
torch.cuda.empty_cache()

Use streaming datasets for large data

Error Messages Reference

'normalize_inputs' not found in state_dict

Cause: Loading a model trained before PR #1452 with new code.

Solution: Migrate the model using the normalization migration script:

python src/lerobot/processor/migrate_policy_normalization.py \
    --pretrained-path your/model/path

See Backward Compatibility for details.

'Encoder libsvtav1 not found'

Cause: ffmpeg doesn’t have libsvtav1 encoder compiled.

Solution: Install correct ffmpeg version:

conda install ffmpeg=7.1.1 -c conda-forge
ffmpeg -encoders | grep svt  # Verify

'ImportError: cannot import name X'

Cause: Version mismatch between the installed LeRobot package and the code you are running.

Solution:

# Reinstall LeRobot
pip uninstall lerobot
pip install lerobot --upgrade

# Or reinstall from source
cd lerobot
pip install -e . --force-reinstall
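
To confirm which version is actually installed in the active environment, the standard library's `importlib.metadata` can be queried without importing the package itself. A minimal sketch; the helper name is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("lerobot")
print("lerobot is not installed" if version is None else f"lerobot {version}")
```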

Getting Help

If you can’t find a solution here:

Search GitHub Issues: Check if your issue has been reported

Ask on Discord: Get help from the community

Open an Issue: Report a new bug with details

Discussions: Ask questions and share ideas

Reporting Bugs

When reporting an issue, please include:

Environment Information

lerobot-info
python --version
nvidia-smi  # If using GPU

Minimal Reproduction

Provide the smallest code snippet that reproduces the issue

Error Traceback

Include the full error message and stack trace

Expected vs Actual

Describe what you expected to happen and what actually happened

The more details you provide, the faster we can help you!
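
Part of the environment information listed above can be gathered with a short standard-library script and pasted into a report alongside the `lerobot-info` output. This is a sketch, not a replacement for that command; the keys in the returned dictionary are arbitrary.

```python
import platform
import shutil
import sys


def collect_env():
    """Collect basic environment details useful in a bug report."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```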
