slime supports multiple installation methods to accommodate different deployment scenarios. We strongly recommend using Docker for the easiest setup experience.

Hardware Requirements

slime supports multiple NVIDIA GPU platforms:
  • H-Series (H100/H200): Officially supported, with comprehensive CI testing and stable performance - recommended for production
  • B200 Series: Fully supported, with setup steps identical to the H-series; basic functionality is stable and suitable for development and testing, but B-series currently lacks CI protection

The latest Docker images are compatible with both B-series and H-series GPUs without additional configuration.
Docker Installation (Recommended)

# Pull the latest image
docker pull slimerl/slime:latest

# Start the container
docker run --rm --gpus all --ipc=host --shm-size=16g \
  --ulimit memlock=-1 --ulimit stack=67108864 \
  -it slimerl/slime:latest /bin/bash

Docker Benefits

  • Pre-configured Environment: All dependencies, including SGLang and Megatron patches, come pre-installed
  • No Version Conflicts: The isolated environment prevents conflicts with system packages
  • Quick Setup: Get started in minutes without manual dependency installation
  • Consistent Behavior: Compatibility is guaranteed across different host systems

Conda Installation

For scenarios where Docker is not convenient, you can install with conda. Note that this path requires manual setup of SGLang and Megatron; use Docker if possible to avoid configuration issues.
1. Review the build script

The official build script provides a reference for conda installation:
cat /path/to/slime/build_conda.sh
This script includes all necessary steps to set up SGLang, Megatron, and other dependencies. You may need to adjust paths and versions for your environment.
2. Create conda environment

Create a new conda environment with Python 3.10+:
conda create -n slime python=3.10
conda activate slime
3. Install PyTorch

Install PyTorch with CUDA support:
conda install pytorch pytorch-cuda=12.1 -c pytorch -c nvidia
4. Install slime

Clone and install slime:
git clone https://github.com/THUDM/slime.git
cd slime
pip install -e .
5. Set up Megatron

Clone Megatron-LM and add to PYTHONPATH:
git clone https://github.com/NVIDIA/Megatron-LM.git
export PYTHONPATH=/path/to/Megatron-LM:$PYTHONPATH
Add the PYTHONPATH export to your .bashrc or .zshrc for persistence.
6. Install SGLang

Install SGLang according to your requirements:
pip install "sglang[all]"

Multi-Node Setup

For large-scale training with multiple nodes, you need to set up a Ray cluster.
1. Start the Ray head node

On the first node (node 0), start the Ray head:
ray start --head --node-ip-address ${MASTER_ADDR} \
  --num-gpus 8 --disable-usage-stats
Replace ${MASTER_ADDR} with the IP address of node 0.
2. Connect worker nodes

On all other nodes, connect to the head:
ray start --address=${MASTER_ADDR}:6379 --num-gpus 8
3. Submit the training job

From node 0, submit your training job:
ray job submit --address="http://127.0.0.1:8265" \
   --runtime-env-json='{
     "env_vars": {
        "PYTHONPATH": "/root/Megatron-LM/"
     }
   }' \
   -- python3 train.py \
   --actor-num-nodes 2 \
   --actor-num-gpus-per-node 8 \
   --rollout-num-gpus 16 \
   # ... other arguments
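The resource flags above determine how many GPUs the job will claim from the Ray cluster. As a quick sanity check (a sketch; it assumes, as the flag names suggest, that actor and rollout GPUs are disaggregated rather than shared):

```python
# GPU budget implied by the submit command above (illustrative sketch;
# assumes actor and rollout GPUs are separate pools, as the flags suggest).
actor_num_nodes = 2
actor_num_gpus_per_node = 8
rollout_num_gpus = 16

actor_gpus = actor_num_nodes * actor_num_gpus_per_node  # 16 training GPUs
total_gpus = actor_gpus + rollout_num_gpus              # 32 GPUs overall

print(actor_gpus, total_gpus)  # → 16 32
```

With 8 GPUs per node, a configuration like this needs a 4-node Ray cluster: 2 nodes for the actor (training) and the remaining capacity for rollouts.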

Network Configuration

In complex network environments (e.g., inside Docker or under SLURM), you may need to pin the network interfaces explicitly. The commands below select the interface whose IPv4 address is in the 10.0.0.0/8 range; adjust the prefix to match your cluster's subnet:
export SLIME_HOST_IP=$(hostname -I | awk '{print $1}')
export GLOO_SOCKET_IFNAME=$(ip -o -4 addr show | awk '$4 ~ /^10\./ {print $2}')
export NCCL_SOCKET_IFNAME=$(ip -o -4 addr show | awk '$4 ~ /^10\./ {print $2}')
export NVSHMEM_BOOTSTRAP_UID_SOCK_IFNAME=$(ip -o -4 addr show | awk '$4 ~ /^10\./ {print $2}')

AMD GPU Support

slime also supports AMD GPUs. For installation instructions specific to AMD hardware:

See the AMD Usage Tutorial for the complete AMD setup guide.

Verify Installation

After installation, verify that slime is working correctly:
python -c "import slime; print(slime.__version__)"

Development Setup

If you plan to contribute to slime, set up pre-commit hooks:
apt install pre-commit -y
pre-commit install

# Run pre-commit to ensure code style consistency
pre-commit run --all-files --show-diff-on-failure --color=always

Troubleshooting

GPU out-of-memory when colocating training and inference
When running training and inference on the same GPUs, reduce SGLang's memory usage:
--sglang-mem-fraction-static 0.8
Megatron will offload after initialization to free memory for SGLang.
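As a rough illustration of what the fraction means (a back-of-envelope sketch; the precise accounting is internal to SGLang):

```python
# Back-of-envelope for --sglang-mem-fraction-static (illustrative only;
# the precise accounting is internal to SGLang). With 0.8 on an 80 GB GPU,
# SGLang reserves roughly 64 GB, leaving about 16 GB of headroom for the
# training side's remaining buffers.
def sglang_reserved_gb(gpu_mem_gb: float, mem_fraction_static: float) -> float:
    return gpu_mem_gb * mem_fraction_static

print(sglang_reserved_gb(80, 0.8))  # → 64.0
```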
NCCL or Gloo communication errors in multi-node runs
Explicitly set the network interfaces using environment variables:
export GLOO_SOCKET_IFNAME=eth0
export NCCL_SOCKET_IFNAME=eth0

Model conversion or loading errors
Ensure you're using the correct model configuration:
source scripts/models/your-model.sh
# Verify MODEL_ARGS are loaded
echo ${MODEL_ARGS[@]}
For models with custom vocab sizes, manually set --vocab-size during conversion.

Worker nodes cannot join the Ray cluster
Verify that the head node is accessible:
ray status --address=${MASTER_ADDR}:6379
Check that firewall rules allow ports 6379 (GCS) and 8265 (Dashboard).
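When diagnosing firewall issues, a small probe can confirm a port is reachable before starting workers (a sketch, not part of slime; the host address is a hypothetical example):

```python
# TCP reachability probe (a sketch, not part of slime). Useful for checking
# that the head node's 6379 and 8265 ports are open from a worker node.
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical head-node address):
# port_reachable("10.0.0.1", 6379)
```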

Next Steps

  • Quick Start: Run your first training job
  • Usage Guide: Learn about configuration and parameters
