Fine-tuning techniques
We provide tutorials for three main fine-tuning approaches:

Supervised fine-tuning (SFT)
Supervised fine-tuning trains models on labeled instruction-response pairs. This is ideal for:
- Teaching models new task-specific behaviors
- Adapting models to follow specific instruction formats
- Improving response quality on domain-specific questions
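To make the data format concrete, here is a minimal sketch of an instruction-response pair being rendered into training text. The `<|user|>`/`<|assistant|>` tags and the `format_pair` helper are illustrative assumptions, not LFM-specific; the tutorials apply each model's own chat template via its tokenizer.

```python
# Illustrative sketch: turning labeled instruction-response pairs into
# training strings for SFT. Tag names are placeholders, not a real template.
def format_pair(instruction: str, response: str) -> str:
    return (
        "<|user|>\n" + instruction + "\n"
        "<|assistant|>\n" + response
    )

pairs = [
    {"instruction": "Translate 'bonjour' to English.", "response": "Hello."},
]
dataset = [format_pair(p["instruction"], p["response"]) for p in pairs]
```

In practice the SFT trainer masks the loss on the user turn so the model only learns to produce the response.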
Group relative policy optimization (GRPO)
GRPO is a reinforcement learning technique for tasks with verifiable outputs. It is best suited for:
- Mathematical problem solving with numeric verification
- Code generation with unit test validation
- Structured output tasks (JSON, SQL) with schema validation
- Question answering with ground truth answers
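The "verifiable outputs" requirement usually comes down to a programmatic reward function. A minimal sketch for the numeric-verification case (the `numeric_reward` name and the 0/1 reward values are assumptions for illustration, not part of any tutorial):

```python
import re

# Sketch of a verifiable reward for GRPO-style training: extract the final
# number from a model completion and compare it to the ground-truth answer.
def numeric_reward(completion: str, answer: float) -> float:
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not matches:
        return 0.0  # no number produced, no reward
    return 1.0 if abs(float(matches[-1]) - answer) < 1e-6 else 0.0
```

The same pattern generalizes to the other use cases: run unit tests for code generation, or validate against a schema for JSON/SQL outputs, and return a scalar reward.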
Continued pre-training (CPT)
Continued pre-training adapts models to new languages, domains, or writing styles. Use CPT for:
- Language adaptation and translation tasks
- Domain-specific knowledge injection
- Creative text generation and novel writing
- Adapting to specific narrative styles
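Unlike SFT, CPT trains on raw unlabeled text. A minimal sketch of the data-prep side, packing documents into fixed-size chunks (the `pack_text` helper is illustrative, and it chunks by characters for simplicity; real pipelines chunk by tokenizer tokens):

```python
from typing import List

# Illustrative CPT data prep: concatenate raw domain documents and split
# the result into fixed-length training chunks.
def pack_text(documents: List[str], chunk_size: int = 2048) -> List[str]:
    corpus = "\n\n".join(documents)
    return [corpus[i:i + chunk_size] for i in range(0, len(corpus), chunk_size)]
```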
Tutorials
SFT with Unsloth
Memory-optimized fine-tuning with Unsloth for 2x faster training
SFT with TRL
Fine-tune using Hugging Face TRL library with LoRA support
GRPO with Unsloth
Reinforcement learning fine-tuning for mathematical reasoning tasks
GRPO for verifiable tasks
Apply GRPO to tasks with programmatic verification
CPT for translation
Adapt models to new languages with continued pre-training
CPT for text completion
Train models for creative text generation and completion
Vision language model SFT
Fine-tune vision-language models for OCR and visual tasks
Prerequisites
Before starting with fine-tuning, ensure you have:
- GPU access: Fine-tuning requires a GPU. You can use:
  - Local GPU (NVIDIA recommended)
  - Google Colab with free T4 GPU
  - Cloud GPU instances (AWS, GCP, Azure)
- Python environment: Python 3.8 or higher
- Required libraries: Each tutorial includes specific installation instructions, but common dependencies include:
  - `transformers` - Hugging Face Transformers library
  - `torch` - PyTorch deep learning framework
  - `trl` or `unsloth` - Training libraries (depending on approach)
  - `peft` - Parameter-efficient fine-tuning
- Hugging Face account: For accessing models and datasets (optional but recommended)
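The common dependencies above can typically be installed in one step; exact packages and versions vary per tutorial, so treat this as a starting point rather than a pinned environment:

```shell
# Illustrative install of the common dependencies; each tutorial's own
# installation instructions take precedence.
pip install transformers torch peft trl
```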
Hardware requirements
Minimum requirements vary by model size and technique:

| Model | Method | Minimum GPU | Recommended GPU |
|---|---|---|---|
| LFM2.5-1.2B | SFT with LoRA | 8GB (T4) | 16GB (V100) |
| LFM2.5-1.2B | GRPO | 16GB (T4) | 24GB (L4) |
| LFM2-2.6B | SFT with LoRA | 16GB (T4) | 24GB (A10) |
| LFM2.5-VL-1.6B | VLM SFT | 16GB (T4) | 24GB (L4) |
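A quick way to check your hardware against the table is to query the GPU's total memory, for example with PyTorch. The `has_enough_vram` helper below is an illustrative sketch, not part of any tutorial:

```python
# Illustrative check: does the local GPU meet a VRAM minimum (in GB)?
# Returns False when PyTorch or a CUDA device is unavailable.
def has_enough_vram(required_gb: float) -> bool:
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    return total_bytes / 1e9 >= required_gb
```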
Deployment after fine-tuning
LFM2.5 models are small and efficient, enabling deployment across a wide range of platforms:

| Deployment target | Use case | Documentation |
|---|---|---|
| Android | Mobile apps on Android devices | Android guide |
| iOS | Mobile apps on iPhone/iPad | iOS guide |
| Apple Silicon Mac | Local inference on Mac with MLX | MLX guide |
| llama.cpp | Local deployments on any hardware | llama.cpp guide |
| Ollama | Local inference with easy setup | Ollama guide |
| LM Studio | Desktop app for local inference | LM Studio guide |
| vLLM | Cloud deployments with high throughput | vLLM guide |
| Modal | Serverless cloud deployment | Modal guide |
| Baseten | Production ML infrastructure | Baseten guide |
| Fal | Fast inference API | Fal guide |