Rollout API

Overview

The rollout API manages inference engines, generates model responses, computes rewards, and provides data to the training loop.

Rollout Manager

create_rollout_manager()

Create and initialize the rollout manager with SGLang inference engines.

from slime.ray.placement_group import create_rollout_manager

rollout_manager, num_rollout_per_epoch = create_rollout_manager(args, pg)

args

Namespace

required

Training arguments with rollout configuration

pg

tuple

required

Placement group tuple: (placement_group, bundle_indices, gpu_ids)

Returns

tuple[RolloutManager, int | None]

rollout_manager: Ray actor managing inference engines
num_rollout_per_epoch: Number of rollout steps per epoch (None if --num-rollout is set)

Source: slime/ray/placement_group.py:181
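To make the second return value concrete, here is a minimal sketch of how such a value could be derived. The helper name rollouts_per_epoch, the dataset_size argument, and the floor-division rule are assumptions for illustration; only the "None if --num-rollout is set" behavior comes from the description above.

```python
from __future__ import annotations

from argparse import Namespace


def rollouts_per_epoch(args: Namespace, dataset_size: int) -> int | None:
    """Hypothetical helper, not part of slime."""
    # When --num-rollout is given, the epoch-based count is not needed.
    if args.num_rollout is not None:
        return None
    # Assumption: one rollout step consumes rollout_batch_size prompts.
    return dataset_size // args.rollout_batch_size
```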

RolloutManager

Ray actor that manages multiple SGLang inference engines for parallel generation.

@ray.remote
class RolloutManager:
    def generate(self, rollout_id: int) -> ray.ObjectRef:
        """Generate rollout data for training"""
    
    def eval(self, rollout_id: int) -> ray.ObjectRef:
        """Run evaluation on eval datasets"""
    
    def onload_weights(self) -> None:
        """Load weights back to GPU from CPU"""
    
    def offload(self) -> None:
        """Offload weights to CPU to free GPU memory"""
    
    def dispose(self) -> None:
        """Cleanup and shutdown inference engines"""

Key Methods:

generate

method

Generate rollout data by sampling prompts and running inference.

Parameters:

rollout_id (int): Current rollout step ID

Returns: Ray ObjectRef containing RolloutFnTrainOutput with samples and metrics

Behavior:

Fetches prompts from data source
Generates responses via inference engines
Computes rewards using reward function
Applies dynamic filters (if configured)
Returns rollout_batch_size valid sample groups
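The steps above can be sketched as a simple loop. This is a single-process illustration, not slime's code: the Sample stand-in, run_rollout_step, and the three callables are hypothetical, and the real manager dispatches generation to remote engines.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Stand-in for slime's Sample; only the fields this sketch needs."""
    prompt: str
    response: str = ""
    reward: float = 0.0


def run_rollout_step(prompts, rollout_batch_size, n_samples_per_prompt,
                     generate_fn, reward_fn, keep_group):
    """Collect rollout_batch_size groups that pass the filter."""
    valid_groups = []
    prompt_iter = iter(prompts)
    while len(valid_groups) < rollout_batch_size:
        prompt = next(prompt_iter)  # 1. fetch a prompt
        group = [Sample(prompt=prompt) for _ in range(n_samples_per_prompt)]
        for i, sample in enumerate(group):
            sample.response = generate_fn(prompt, i)  # 2. generate a response
            sample.reward = reward_fn(sample)         # 3. compute its reward
        if keep_group(group):                         # 4. apply the filter
            valid_groups.append(group)
    return valid_groups                               # 5. return valid groups


# Toy run: the "skip" prompt yields identical rewards and is filtered out.
groups = run_rollout_step(
    prompts=["a", "skip", "b", "c"],
    rollout_batch_size=2,
    n_samples_per_prompt=2,
    generate_fn=lambda prompt, i: f"{prompt}:{i}",
    reward_fn=lambda s: 0.0 if s.prompt == "skip" else float(s.response.endswith("0")),
    keep_group=lambda g: len({s.reward for s in g}) > 1,
)
```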

eval

method

Run evaluation on configured eval datasets.

Parameters:

rollout_id (int): Current rollout step for logging

Returns: Ray ObjectRef containing RolloutFnEvalOutput with evaluation results

Source: Referenced in train.py:17, implementation in slime/ray/rollout.py

Rollout Functions

generate_rollout()

Main rollout generation function that produces training samples.

from slime.rollout.sglang_rollout import generate_rollout

output = generate_rollout(args, rollout_id, data_source, evaluation=False)

args

Namespace

required

Training arguments with rollout parameters

rollout_id

int

required

Current rollout step ID for deterministic sampling

data_source

DataSource

required

Data source for fetching prompts (e.g., RolloutDataSourceWithBuffer)

evaluation

bool

default: False

Whether this is an evaluation rollout

Returns

RolloutFnTrainOutput | RolloutFnEvalOutput

For training: RolloutFnTrainOutput containing:

samples: List of sample groups (list[list[Sample]])
metrics: Optional metrics dictionary

For evaluation: RolloutFnEvalOutput containing:

data: Dict mapping dataset names to results
metrics: Optional metrics dictionary

Source: slime/rollout/sglang_rollout.py:563
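Because the return type depends on the evaluation flag, callers may want to branch on it. The sketch below uses stand-in dataclasses that mirror only the documented fields; summarize is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RolloutFnTrainOutput:
    """Stand-in mirroring the documented fields."""
    samples: list
    metrics: Optional[dict] = None


@dataclass
class RolloutFnEvalOutput:
    """Stand-in mirroring the documented fields."""
    data: dict
    metrics: Optional[dict] = None


def summarize(output: Any) -> str:
    """Describe a rollout output based on its type."""
    if isinstance(output, RolloutFnTrainOutput):
        total = sum(len(group) for group in output.samples)
        return f"train: {len(output.samples)} groups, {total} samples"
    if isinstance(output, RolloutFnEvalOutput):
        return f"eval: datasets={sorted(output.data)}"
    raise TypeError(f"unexpected rollout output: {type(output).__name__}")
```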

generate()

Generate a single sample using SGLang inference engine.

async def generate(args: Namespace, sample: Sample, sampling_params: dict) -> Sample:
    """Generate response for a single sample"""

args

Namespace

required

Training arguments

sample

Sample

required

Sample object with prompt and metadata

sampling_params

dict

required

Sampling parameters:

{
    "temperature": 1.0,
    "top_p": 1.0,
    "top_k": -1,
    "max_new_tokens": 512,
    "stop": ["<|endoftext|>"],
    "stop_token_ids": [128001],
    "skip_special_tokens": False
}
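A dictionary of this shape could be assembled from the rollout arguments roughly as follows. The mapping from each rollout_* attribute to a sampling key is an assumption based on the flag names under Key Arguments, and build_sampling_params is a hypothetical helper.

```python
from argparse import Namespace


def build_sampling_params(args: Namespace) -> dict:
    """Hypothetical helper: map rollout arguments to sampling parameters."""
    return {
        "temperature": args.rollout_temperature,
        "top_p": args.rollout_top_p,
        "top_k": args.rollout_top_k,
        "max_new_tokens": args.rollout_max_response_len,
        "stop": args.rollout_stop,
        "stop_token_ids": args.rollout_stop_token_ids,
        # Assumed fixed here so special tokens stay in the decoded text.
        "skip_special_tokens": False,
    }
```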

Returns

Sample

Updated sample with:

response: Generated text
tokens: Full token sequence (prompt + response)
response_length: Number of generated tokens
rollout_log_probs: Log probabilities for each token
status: Sample.Status enum (COMPLETED, TRUNCATED, ABORTED)
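The field updates listed above can be illustrated with a stand-in Sample. Everything here is a sketch: the finish-reason strings and their mapping to statuses are assumptions, as is storing one log probability per generated token.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    COMPLETED = "completed"
    TRUNCATED = "truncated"
    ABORTED = "aborted"


@dataclass
class Sample:
    """Stand-in for slime's Sample with only the fields listed above."""
    tokens: list = field(default_factory=list)
    response: str = ""
    response_length: int = 0
    rollout_log_probs: list = field(default_factory=list)
    status: Optional[Status] = None


# Assumed mapping from an engine finish reason to a sample status.
FINISH_TO_STATUS = {
    "stop": Status.COMPLETED,
    "length": Status.TRUNCATED,
    "abort": Status.ABORTED,
}


def record_generation(sample, prompt_tokens, new_tokens, log_probs, text, finish_reason):
    """Hypothetical helper: write one generation result back onto a sample."""
    if len(log_probs) != len(new_tokens):
        raise ValueError("expected one log probability per generated token")
    sample.tokens = list(prompt_tokens) + list(new_tokens)  # prompt + response
    sample.response = text
    sample.response_length = len(new_tokens)
    sample.rollout_log_probs = list(log_probs)
    sample.status = FINISH_TO_STATUS[finish_reason]
    return sample
```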

Features:

Supports multi-turn generation via token continuation
Handles multimodal inputs (images, video)
Integrates with RadixTree middleware for prefix caching
Supports partial rollout with loss masking

Source: slime/rollout/sglang_rollout.py:108

Data Structures

RolloutFnTrainOutput

Output type for training rollout functions.

from slime.rollout.base_types import RolloutFnTrainOutput

@dataclass
class RolloutFnTrainOutput:
    samples: list[list[Sample]]  # rollout_batch_size groups of n_samples_per_prompt
    metrics: dict[str, Any] = None  # Optional metrics (e.g., filter stats)

Source: slime/rollout/base_types.py:8
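Since samples is nested (one inner list per prompt), downstream code often needs a flat list. A minimal sketch, with flatten_groups as a hypothetical helper that also checks the group size:

```python
from itertools import chain


def flatten_groups(groups, n_samples_per_prompt):
    """Flatten list[list[Sample]] into one list, validating group sizes."""
    for index, group in enumerate(groups):
        if len(group) != n_samples_per_prompt:
            raise ValueError(
                f"group {index} has {len(group)} samples, expected {n_samples_per_prompt}"
            )
    return list(chain.from_iterable(groups))
```

With rollout_batch_size groups of n_samples_per_prompt samples each, the flat list has rollout_batch_size * n_samples_per_prompt entries.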

RolloutFnEvalOutput

Output type for evaluation rollout functions.

from slime.rollout.base_types import RolloutFnEvalOutput

@dataclass
class RolloutFnEvalOutput:
    data: dict[str, dict[str, Any]]  # dataset_name -> {"rewards": [...], "truncated": [...], "samples": [...]}
    metrics: dict[str, Any] = None  # Optional metrics

Example data structure:

{
    "gsm8k": {
        "rewards": [1.0, 0.0, 1.0, ...],
        "truncated": [False, False, True, ...],
        "samples": [sample1, sample2, ...]
    },
    "math": {...}
}

Source: slime/rollout/base_types.py:14
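Given a data dictionary shaped like the example above, per-dataset statistics can be computed as in this sketch. summarize_eval and its output keys are hypothetical, not part of slime.

```python
def summarize_eval(data):
    """Compute mean reward and truncation ratio for each dataset."""
    summary = {}
    for name, result in data.items():
        rewards = result["rewards"]
        truncated = result.get("truncated", [])
        summary[name] = {
            "mean_reward": sum(rewards) / len(rewards) if rewards else 0.0,
            # True counts as 1, so the sum is the number of truncated samples.
            "truncated_ratio": sum(truncated) / len(truncated) if truncated else 0.0,
        }
    return summary
```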

Data Sources

RolloutDataSourceWithBuffer

Data source with buffer support for partial rollout and sample reuse.

from slime.rollout.data_source import RolloutDataSourceWithBuffer

data_source = RolloutDataSourceWithBuffer(args)
samples = data_source.get_samples(num_samples=32)
data_source.add_samples(aborted_samples)  # Add partial samples back to buffer

Key Methods:

get_samples

method

Retrieve sample groups from buffer or dataset.

Parameters:

num_samples (int): Number of sample groups to retrieve

Returns: list[list[Sample]] - Sample groups

Behavior:

First tries to get samples from buffer
If buffer insufficient, samples from dataset
Handles epoch transitions and shuffling
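The buffer-first behavior can be sketched with a small in-memory class. BufferedSource is a hypothetical stand-in: it wraps each dataset item in a one-element group and reshuffles at every epoch boundary, which may differ from what slime does.

```python
import random


class BufferedSource:
    """Toy data source: serve buffered groups first, then the dataset."""

    def __init__(self, dataset, shuffle=True, seed=0):
        if not dataset:
            raise ValueError("dataset must not be empty")
        self.dataset = list(dataset)
        self.shuffle = shuffle
        self.rng = random.Random(seed)
        self.buffer = []
        self.epoch = 0
        self._start_epoch()

    def _start_epoch(self):
        self.order = list(self.dataset)
        if self.shuffle:
            self.rng.shuffle(self.order)
        self.offset = 0

    def add_samples(self, groups):
        """Put groups (e.g. partial rollouts) back for later reuse."""
        self.buffer.extend(groups)

    def get_samples(self, num_samples):
        # 1. Take as many groups as possible from the buffer.
        taken = self.buffer[:num_samples]
        self.buffer = self.buffer[num_samples:]
        # 2. Fill the remainder from the dataset.
        while len(taken) < num_samples:
            if self.offset == len(self.order):
                # 3. Epoch transition: advance the counter and reshuffle.
                self.epoch += 1
                self._start_epoch()
            taken.append([self.order[self.offset]])
            self.offset += 1
        return taken
```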

add_samples

method

Add sample groups back to buffer (e.g., partial rollout samples).

Parameters:

samples (list[list[Sample]]): Sample groups to add

save

method

Save data source state to checkpoint.

Parameters:

rollout_id (int): Current rollout ID

load

method

Load data source state from checkpoint.

Parameters:

rollout_id (int): Rollout ID to load from

Source: slime/rollout/data_source.py:166
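As a rough picture of checkpointing keyed by rollout_id, the sketch below stores a small state dictionary as JSON. The file layout, the state keys, and both function names are assumptions; the format slime actually uses is not described here.

```python
import json
import tempfile
from pathlib import Path


def save_state(state, directory, rollout_id):
    """Write the state for one rollout step (hypothetical file layout)."""
    path = Path(directory) / f"rollout_{rollout_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state))
    return path


def load_state(directory, rollout_id):
    """Read the state for one rollout step, or None if it was never saved."""
    path = Path(directory) / f"rollout_{rollout_id}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text())


# Round trip in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    save_state({"epoch": 1, "offset": 64}, tmp, rollout_id=10)
    restored = load_state(tmp, rollout_id=10)
    missing = load_state(tmp, rollout_id=11)
```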

Rollout Configuration

Key Arguments

Inference Engine:

--rollout-num-gpus: Total GPUs for rollout engines
--rollout-num-gpus-per-engine: GPUs per engine (tensor parallel size)
--hf-checkpoint: HuggingFace checkpoint path
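The two GPU arguments together imply an engine count. A minimal sketch, assuming the total must divide evenly (the check and the helper name are illustrative, not taken from slime):

```python
def num_rollout_engines(rollout_num_gpus, rollout_num_gpus_per_engine):
    """Number of inference engines implied by the two GPU arguments."""
    if rollout_num_gpus_per_engine <= 0:
        raise ValueError("rollout_num_gpus_per_engine must be positive")
    if rollout_num_gpus % rollout_num_gpus_per_engine != 0:
        raise ValueError("rollout_num_gpus must be a multiple of rollout_num_gpus_per_engine")
    return rollout_num_gpus // rollout_num_gpus_per_engine
```

For example, 8 rollout GPUs with 2 GPUs per engine gives 4 engines, each running with tensor parallel size 2.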

Generation Parameters:

--rollout-temperature: Sampling temperature (default: 1.0)
--rollout-top-p: Top-p sampling (default: 1.0)
--rollout-top-k: Top-k sampling (default: -1)
--rollout-max-response-len: Max generation length
--rollout-stop: Stop strings
--rollout-stop-token-ids: Stop token IDs
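For readers unfamiliar with how these flags surface on args, the sketch below declares them with argparse. The types, nargs settings, and the None defaults are assumptions; only the three numeric defaults come from the list above.

```python
import argparse


def build_generation_parser():
    """Illustrative parser covering only the generation flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--rollout-temperature", type=float, default=1.0)
    parser.add_argument("--rollout-top-p", type=float, default=1.0)
    parser.add_argument("--rollout-top-k", type=int, default=-1)
    parser.add_argument("--rollout-max-response-len", type=int, default=None)
    parser.add_argument("--rollout-stop", type=str, nargs="+", default=None)
    parser.add_argument("--rollout-stop-token-ids", type=int, nargs="+", default=None)
    return parser


# argparse turns dashes into underscores: --rollout-top-k -> args.rollout_top_k
args = build_generation_parser().parse_args(
    ["--rollout-temperature", "0.7", "--rollout-stop", "</s>"]
)
```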

Data Configuration:

--rollout-batch-size: Prompts (sample groups) per rollout step
--n-samples-per-prompt: Responses per prompt
--rollout-shuffle: Shuffle prompts
--rollout-seed: Random seed

Dynamic Sampling:

--over-sampling-batch-size: Granularity for sampling
--dynamic-sampling-filter-path: Filter function path
--partial-rollout: Enable partial rollout recycling
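A filter passed via --dynamic-sampling-filter-path decides whether a sample group is kept. The sketch below keeps only groups whose rewards differ, a common choice when identical rewards carry no learning signal. The function name, the (args, samples, **kwargs) signature, and the bool return value are assumptions, so check the interface slime expects before using it.

```python
import statistics
from types import SimpleNamespace


def keep_if_reward_varies(args, samples, **kwargs) -> bool:
    """Hypothetical filter: keep a group only if its rewards are not all equal."""
    rewards = [sample.reward for sample in samples]
    return len(rewards) > 1 and statistics.pstdev(rewards) > 0.0


# Stand-in samples that only carry a reward attribute.
mixed = [SimpleNamespace(reward=1.0), SimpleNamespace(reward=0.0)]
uniform = [SimpleNamespace(reward=1.0), SimpleNamespace(reward=1.0)]
```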

Custom Functions:

--rollout-function-path: Custom rollout function
--custom-generate-function-path: Custom generate function
--custom-rollout-log-function-path: Custom logging function
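A custom generate function would presumably follow the async signature documented for generate() above. The sketch below is a toy that echoes prompt words instead of calling an engine; the Sample stand-in and echo_generate are hypothetical.

```python
import asyncio
from argparse import Namespace
from dataclasses import dataclass


@dataclass
class Sample:
    """Stand-in for slime's Sample; only the fields this sketch needs."""
    prompt: str
    response: str = ""
    response_length: int = 0


async def echo_generate(args: Namespace, sample: Sample, sampling_params: dict) -> Sample:
    """Toy generate function: echo the first max_new_tokens prompt words."""
    await asyncio.sleep(0)  # placeholder for an awaited engine request
    words = sample.prompt.split()[: sampling_params["max_new_tokens"]]
    sample.response = " ".join(words)
    sample.response_length = len(words)
    return sample


result = asyncio.run(
    echo_generate(Namespace(), Sample(prompt="a b c d"), {"max_new_tokens": 2})
)
```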

See Also:

Data Structures API - Sample and batch types
Router API - Inference routing
Arguments API - Complete configuration
