Overview
LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique that significantly reduces the number of trainable parameters while maintaining model performance. Instead of fine-tuning all model parameters, LoRA adds small trainable rank decomposition matrices to specific layers.
Benefits of LoRA
- Reduced Memory Usage: Train large models with less GPU memory
- Faster Training: Fewer parameters to update means faster iterations
- Smaller Checkpoints: LoRA adapters are typically 1-2% the size of full model weights
- Easy Deployment: Swap different LoRA adapters for different tasks without loading multiple full models
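To make the parameter savings concrete, the following minimal sketch counts the trainable parameters LoRA adds to a single weight matrix and compares them with fully fine-tuning that matrix. The 4096 x 4096 shape is an illustrative assumption, not a value taken from any Qwen model, and the overall adapter size also depends on which layers receive adapters.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A is r x d_in and B is d_out x r; the frozen weight W itself is not trained.
    return r * d_in + d_out * r


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters updated when the same matrix is fully fine-tuned."""
    return d_in * d_out


# Hypothetical square projection; 4096 is an illustrative size only.
d_in = d_out = 4096
for r in (4, 8, 16, 32, 64):
    lora = lora_param_count(d_in, d_out, r)
    full = full_param_count(d_in, d_out)
    print(f"r={r:>2}: {lora:>7,} trainable vs {full:,} ({100 * lora / full:.2f}%)")
```

The count grows linearly with the rank, which is why doubling the rank roughly doubles the adapter size.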
Enabling LoRA
Add the LoRA parameters described below to your training script.
LoRA Parameters
Core Parameters
Enable LoRA training. Set to True to use parameter-efficient fine-tuning.
LoRA rank dimension (lora_r). Controls the bottleneck dimension of the low-rank matrices. Common values:
- 4: Minimal parameters, fastest training
- 8: Balanced performance (recommended)
- 16: Higher capacity
- 32-64: Approaching full fine-tuning performance
LoRA scaling parameter. Controls the magnitude of LoRA updates. Typical configuration:
- Set to 2 × lora_r for balanced scaling
- Higher values = stronger LoRA influence
Dropout probability for LoRA layers. Recommendations:
- 0.0: No dropout (default, usually sufficient)
- 0.05-0.1: For larger datasets prone to overfitting
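The way rank and scaling interact can be illustrated with a small pure-Python sketch of a LoRA layer. It assumes the usual LoRA formulation, in which the low-rank update is multiplied by alpha / r and the second matrix starts at zero. The class name `LoRALinear` and the helper `matvec` are hypothetical and are not part of the training code.

```python
import random


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in m]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scaled by alpha / r."""

    def __init__(self, weight, r, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen, d_out x d_in
        self.scaling = alpha / r      # alpha = 2 * r gives a scaling of 2.0
        # Assumed standard init: A is small random, B is zero, so the update starts at 0.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scaling * u for b, u in zip(base, update)]


layer = LoRALinear(weight=[[1.0, 2.0], [3.0, 4.0]], r=1, alpha=2)
print(layer.scaling)               # 2.0
print(layer.forward([1.0, 1.0]))   # matches the frozen layer until B is trained
```

Because the update is divided by the rank, keeping the scaling parameter at twice the rank holds the effective multiplier at 2.0 when the rank changes.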
Configuration Examples
Minimal LoRA (Fastest)
Best for: Quick experimentation, limited compute
- Smallest adapter size (~0.5-1% of model size)
- Fastest training
- May have slightly lower performance on complex tasks
Balanced LoRA (Recommended)
Best for: Most use cases, good performance-efficiency tradeoff
- Moderate adapter size (~1-2% of model size)
- Good training speed
- Strong performance across most tasks
High-Capacity LoRA
Best for: Complex tasks, large datasets, when approaching full fine-tuning performance
- Larger adapter size (~4-8% of model size)
- Slower training than lower ranks
- Performance closer to full fine-tuning
- Consider dropout for regularization
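The three profiles above can be summarized as presets. This is a sketch under stated assumptions: only lora_r is named on this page, so the flag names `--lora_enable`, `--lora_alpha`, and `--lora_dropout` are assumed and should be checked against your training script. The high-capacity rank of 32 and dropout of 0.05 are illustrative choices within the ranges given earlier, and `LoRAPreset` and `PRESETS` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoRAPreset:
    r: int
    dropout: float = 0.0

    @property
    def alpha(self) -> int:
        # Follows the "2 x lora_r" rule of thumb from the parameter notes.
        return 2 * self.r

    def to_flags(self) -> list[str]:
        # Flag names other than lora_r are assumptions; verify them before use.
        return [
            "--lora_enable", "True",
            "--lora_r", str(self.r),
            "--lora_alpha", str(self.alpha),
            "--lora_dropout", str(self.dropout),
        ]


PRESETS = {
    "minimal": LoRAPreset(r=4),
    "balanced": LoRAPreset(r=8),
    "high_capacity": LoRAPreset(r=32, dropout=0.05),  # illustrative values
}

print(" ".join(PRESETS["balanced"].to_flags()))
```

Deriving the scaling value from the rank keeps the two in step when you move between presets.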
Complete LoRA Training Script
train_lora.sh
With LoRA enabled, you can often increase the batch size, because gradients and optimizer state are kept only for the adapter weights. Try doubling per_device_train_batch_size compared to full fine-tuning.
LoRA with Component Training
LoRA adapters are applied only to the components you choose to train.
Learning Rate Adjustment
LoRA training typically benefits from higher learning rates than full fine-tuning:
| Training Type | Recommended LR Range |
|---|---|
| Full fine-tuning | 1e-7 to 2e-7 |
| LoRA (r=8) | 1e-4 to 2e-4 |
| LoRA (r=32) | 5e-5 to 1e-4 |
Saving and Loading LoRA Adapters
Checkpoint Structure
LoRA training saves both the adapter weights and the configuration.
Loading for Inference
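Before loading an adapter, it can help to confirm that the checkpoint directory contains the saved configuration. The sketch below assumes the adapter was saved in the Hugging Face PEFT layout, where the configuration lives in `adapter_config.json`, and that the base model is a Qwen2.5-VL checkpoint loadable through `transformers`. The function names are hypothetical and any directory you pass is your own.

```python
import json
from pathlib import Path


def read_adapter_config(adapter_dir):
    """Return the adapter configuration as a dict, or raise if the file is missing."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    if not config_path.is_file():
        raise FileNotFoundError(f"no adapter_config.json in {adapter_dir}")
    return json.loads(config_path.read_text())


def load_lora_model(base_model_id, adapter_dir):
    """Attach a saved adapter to a freshly loaded base model (needs transformers and peft)."""
    # Imported lazily so the config helper above works without the heavy dependencies.
    from peft import PeftModel
    from transformers import Qwen2_5_VLForConditionalGeneration

    base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        base_model_id, torch_dtype="auto"
    )
    # Wraps the frozen base model with the adapter weights stored in adapter_dir.
    return PeftModel.from_pretrained(base, adapter_dir)
```

Reading the configuration first is a cheap way to check the rank and scaling an adapter was trained with before spending time loading the base model.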
Memory Requirements
Approximate GPU memory usage for different configurations:
Qwen2.5-VL-3B
| Configuration | GPUs Required | Memory per GPU |
|---|---|---|
| Full Fine-tuning | 4x | 24GB |
| LoRA (r=8) | 2x | 24GB |
| LoRA (r=8) | 1x | 40GB |
Qwen2.5-VL-7B
| Configuration | GPUs Required | Memory per GPU |
|---|---|---|
| Full Fine-tuning | 8x | 40GB |
| LoRA (r=8) | 4x | 24GB |
| LoRA (r=16) | 4x | 40GB |
Best Practices
Choosing LoRA Rank
Start with r=8 and adjust based on results.
Increase rank if:
- Model is underfitting
- Task is complex with many nuances
- Dataset is large and diverse
Decrease rank if:
- Memory is limited
- Training time is critical
- Task is relatively simple
Combining with Other Optimizations
LoRA works well with other efficiency techniques:
Multi-task LoRA Adapters
Train separate LoRA adapters for different tasks, then swap adapters at inference time without reloading the base model.
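One way to swap adapters is through the PEFT library, which can attach several named adapters to a single base model. The sketch below assumes the adapters were saved in PEFT format. The helper `load_task_adapters`, the task names, and the paths in the comments are hypothetical.

```python
def load_task_adapters(base_model, adapters):
    """Attach several LoRA adapters to one base model and return the PeftModel.

    `adapters` maps a task name to an adapter directory, for example
    {"ocr": "./adapters/ocr", "caption": "./adapters/caption"} (hypothetical paths).
    """
    if not adapters:
        raise ValueError("expected at least one adapter")

    from peft import PeftModel  # lazy import: requires the peft package

    items = iter(adapters.items())
    first_name, first_dir = next(items)
    # The first adapter creates the PeftModel wrapper around the base model.
    model = PeftModel.from_pretrained(base_model, first_dir, adapter_name=first_name)
    # Remaining adapters are loaded into the same wrapper under their own names.
    for name, adapter_dir in items:
        model.load_adapter(adapter_dir, adapter_name=name)
    return model


# Switching tasks later only changes which adapter is active:
#   model = load_task_adapters(base_model, {"ocr": "./adapters/ocr", "caption": "./adapters/caption"})
#   model.set_adapter("caption")
```

Since every adapter shares the same frozen base weights, switching tasks changes only which small set of adapter weights is active.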
Troubleshooting
Further Reading
- LoRA: Low-Rank Adaptation of Large Language Models (Paper)
- PEFT Documentation
- Training Configuration for dataset setup
- Training Script Reference for full parameter list