verl supports a flexible set of training and inference backends, letting you pick the right combination for your workload, from rapid prototyping on a single node to scaled multi-node production runs. This page covers system requirements, the recommended Docker-based setup, pip installation from source, and a guide to choosing the right backends.
Requirements
verl requires Python >= 3.10 and CUDA >= 12.8. Older CUDA versions are not supported by the pre-built images or the stable install path.
| Dependency | Minimum version |
|---|---|
| Python | 3.10 |
| CUDA | 12.8 |
| cuDNN | 9.10.0 |
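
As a quick pre-flight check, the minimums in the table can be compared against what is installed. The sketch below is not part of verl: `version_ge` is a hypothetical helper, and it assumes a `sort` that supports `-V` (GNU coreutils). The Python and CUDA checks only run when `python3` and `nvcc` are on the `PATH`.

```shell
# Hypothetical helper (not part of verl): succeeds when version $1 >= version $2.
# Assumes a `sort` that supports -V (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed Python against the 3.10 minimum.
if command -v python3 >/dev/null 2>&1; then
    py_version="$(python3 -c 'import platform; print(platform.python_version())')"
    if version_ge "$py_version" "3.10"; then
        echo "Python $py_version: OK"
    else
        echo "Python $py_version: too old, need >= 3.10"
    fi
fi

# Compare the CUDA toolkit release reported by nvcc against the 12.8 minimum.
if command -v nvcc >/dev/null 2>&1; then
    cuda_version="$(nvcc --version | sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p')"
    if version_ge "$cuda_version" "12.8"; then
        echo "CUDA $cuda_version: OK"
    else
        echo "CUDA $cuda_version: too old, need >= 12.8"
    fi
fi
```

The comparison uses version sorting rather than string comparison so that, for example, 3.9 is correctly treated as older than 3.10.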
Choosing Your Backends
Before installing, decide which training and inference backends you need. The choice affects which Docker image or pip extras you select.
Training backends
- FSDP / FSDP2: The recommended backend for research and prototyping. Works with any model supported by Hugging Face Transformers. To use FSDP2, set `strategy=fsdp2` in your Hydra config.
- Megatron-LM: Recommended when you need maximum scalability across many nodes and GPUs. verl currently supports Megatron-LM v0.13.1.

Both backends share the same unified worker layer.
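
As an illustration of the FSDP2 setting, Hydra options can be passed as command-line overrides. The source only specifies `strategy=fsdp2`; the fully qualified key paths (`actor_rollout_ref.actor.strategy`, `critic.strategy`) and the `verl.trainer.main_ppo` entry point below are assumptions and may differ in your verl version, so check the config files shipped with your checkout.

```shell
# Assumed key paths; the source only states `strategy=fsdp2`.
# Check the config files shipped with your verl version for the exact names.
FSDP2_OVERRIDES="actor_rollout_ref.actor.strategy=fsdp2 critic.strategy=fsdp2"

# Print the command instead of running it, so it can be inspected first.
echo "python3 -m verl.trainer.main_ppo $FSDP2_OVERRIDES"
```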
Inference backends
- vLLM: Stable and well-tested (vLLM 0.8.3 and later). Set `VLLM_USE_V1=1` for optimal performance.
- SGLang: Under extensive development; recommended for advanced multi-turn and agentic features. Refer to the SGLang Backend documentation for detailed setup steps.
- HuggingFace TGI: Suitable for debugging and single-GPU exploration only.
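
For the vLLM setting, `VLLM_USE_V1` is an environment variable, so one way to apply it is to export it in the shell that launches training. This is a minimal sketch; where you set it (shell profile, job script, container environment) is up to your setup.

```shell
# Enable the vLLM V1 engine for this shell session and any processes started from it.
export VLLM_USE_V1=1

# Confirm the variable is visible to child processes.
sh -c 'echo "VLLM_USE_V1=$VLLM_USE_V1"'
```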
Installation
- Docker (Recommended)
- pip from source
Docker is the fastest and most reliable way to get a fully working verl environment. Starting from v0.6.0, verl publishes application images on top of the official vLLM and SGLang base images. The application images add:
- flash_attn
- Megatron-LM
- Apex
- TransformerEngine
- DeepEP

The latest images used in CI are tracked in the GitHub workflows; see also the Dockerfiles.
| Tag | Inference backend |
|---|---|
| `verlai/verl:vllm011.latest` | vLLM |
| `verlai/verl:sgl055.latest` | SGLang |
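
Pulling one of the listed images could look like the sketch below. The `pull_verl_image` wrapper and the `VERL_IMAGE` variable are hypothetical names introduced here for illustration; only `docker pull` and the two tags come from the page.

```shell
# Pick one of the tags listed in the table.
VERL_IMAGE="verlai/verl:vllm011.latest"   # or verlai/verl:sgl055.latest for SGLang

# Hypothetical wrapper: pulls the image given as $1, or $VERL_IMAGE by default.
pull_verl_image() {
    docker pull "${1:-$VERL_IMAGE}"
}

# Usage:
#   pull_verl_image                              # pulls the vLLM image
#   pull_verl_image verlai/verl:sgl055.latest    # pulls the SGLang image
```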
Create and start the container
When creating the container, replace <image:tag> with the image you pulled. The flags do the following:
- `--runtime=nvidia --gpus all`: expose all NVIDIA GPUs to the container
- `--net=host`: use host networking (needed for multi-node Ray communication)
- `--shm-size="10g"`: increase shared memory for distributed workloads
- `--cap-add=SYS_ADMIN`: grant elevated privileges needed by some GPU drivers and profiling tools inside the container
- `-v .:/workspace/verl`: mount your local checkout into the container
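
Put together, the flags described here might be combined as in the following sketch. The helper name `create_verl_container`, the default container name `verl`, and the `sleep infinity` keep-alive command are illustrative assumptions, not taken from this page. The mount uses `"$PWD"` in place of `.`, which resolves to the same directory when run from your checkout and also works on Docker versions that reject relative bind-mount paths.

```shell
# Hypothetical helper: create a container from the image given as $1.
# The container name (default "verl") and `sleep infinity` are assumptions.
create_verl_container() {
    image="$1"
    name="${2:-verl}"
    docker create \
        --runtime=nvidia --gpus all \
        --net=host \
        --shm-size="10g" \
        --cap-add=SYS_ADMIN \
        -v "$PWD":/workspace/verl \
        --name "$name" \
        "$image" sleep infinity
}

# Usage (run from your verl checkout; replace <image:tag> with the image you pulled):
#   create_verl_container "<image:tag>"
#   docker start verl
#   docker exec -it verl bash
```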
AMD GPU Support (ROCm)
For AMD MI300 GPUs with the ROCm platform, use the dedicated ROCm Dockerfile to build the image. To pass your host user and group IDs into the container, add `-e HOST_UID=$(id -u)` and `-e HOST_GID=$(id -g)` to the launch command.
AMD GPU support currently covers FSDP as the training engine, with vLLM and SGLang as inference engines. Megatron-LM support for AMD is planned for a future release.