Overview
The rollout API manages inference engines, generates model responses, computes rewards, and provides data to the training loop.
Rollout Manager
create_rollout_manager()
Create and initialize the rollout manager with SGLang inference engines.
Parameters:
- Training arguments with rollout configuration
- Placement group tuple: (placement_group, bundle_indices, gpu_ids)
Returns:
- rollout_manager: Ray actor managing inference engines
- num_rollout_per_epoch: Number of rollout steps per epoch (None if --num-rollout is set)
slime/ray/placement_group.py:181
RolloutManager
Ray actor that manages multiple SGLang inference engines for parallel generation.
Generate rollout data by sampling prompts and running inference.
Parameters:
- rollout_id (int): Current rollout step ID

- Fetches prompts from data source
- Generates responses via inference engines
- Computes rewards using reward function
- Applies dynamic filters (if configured)
- Returns rollout_batch_size valid samples
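The steps listed above can be sketched as a single loop. This is a minimal illustration, not the RolloutManager implementation: `Sample`, `rollout_step`, `generate_fn`, `reward_fn`, and `keep_group` are hypothetical stand-ins, and it assumes that the filter is applied per sample group and that groups rejected by the filter are replaced by drawing further prompts.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical minimal sample; the real Sample type carries more fields.
    prompt: str
    response: str = ""
    reward: float = 0.0


def rollout_step(prompts, generate_fn, reward_fn, batch_size,
                 n_samples_per_prompt, keep_group=None):
    """Collect `batch_size` sample groups that pass the optional filter."""
    groups = []
    for prompt in prompts:                           # fetch prompts
        group = []
        for _ in range(n_samples_per_prompt):
            sample = Sample(prompt=prompt)
            sample.response = generate_fn(prompt)    # generate a response
            sample.reward = reward_fn(sample)        # compute its reward
            group.append(sample)
        if keep_group is None or keep_group(group):  # apply the dynamic filter
            groups.append(group)
        if len(groups) == batch_size:                # enough valid groups
            return groups
    raise RuntimeError("prompts exhausted before the batch was filled")


# Toy stand-ins for the inference engine, reward function, and filter.
groups = rollout_step(
    prompts=["2+2", "", "3+3", "4+4"],
    generate_fn=lambda prompt: prompt[::-1],
    reward_fn=lambda sample: float(len(sample.response) > 0),
    batch_size=2,
    n_samples_per_prompt=2,
    keep_group=lambda group: any(s.reward > 0 for s in group),
)
print([group[0].prompt for group in groups])
```

In this toy run the empty prompt produces a group with zero reward, so the filter drops it and the loop keeps drawing prompts until two valid groups are collected.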
Run evaluation on configured eval datasets.
Parameters:
- rollout_id (int): Current rollout step for logging
train.py:17, implementation in slime/ray/rollout.py
Rollout Functions
generate_rollout()
Main rollout generation function that produces training samples.
Parameters:
- Training arguments with rollout parameters
- Current rollout step ID for deterministic sampling
- Data source for fetching prompts (e.g., RolloutDataSourceWithBuffer)
- Whether this is an evaluation rollout
Returns:
For training: RolloutFnTrainOutput containing:
- samples: List of sample groups (list[list[Sample]])
- metrics: Optional metrics dictionary
For evaluation: RolloutFnEvalOutput containing:
- data: Dict mapping dataset names to results
- metrics: Optional metrics dictionary
slime/rollout/sglang_rollout.py:563
generate()
Generate a single sample using the SGLang inference engine.
Parameters:
- Training arguments
- Sample object with prompt and metadata
- Sampling parameters
Returns:
Updated sample with:
- response: Generated text
- tokens: Full token sequence (prompt + response)
- response_length: Number of generated tokens
- rollout_log_probs: Log probabilities for each token
- status: Sample.Status enum (COMPLETED, TRUNCATED, ABORTED)

- Supports multi-turn generation via token continuation
- Handles multimodal inputs (images, video)
- Integrates with RadixTree middleware for prefix caching
- Supports partial rollout with loss masking
slime/rollout/sglang_rollout.py:108
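To make the updated fields concrete, the sketch below folds engine output into a sample, including a second call that continues from the existing tokens. It is illustrative only: `Sample` here is a hypothetical stand-in, `apply_generation` is not a slime function, the finish-reason strings are assumed, and it assumes one log probability per generated token.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # Mirrors the status values listed above.
    COMPLETED = "completed"
    TRUNCATED = "truncated"
    ABORTED = "aborted"


@dataclass
class Sample:
    # Hypothetical stand-in for the real Sample type.
    prompt_tokens: List[int]
    response: str = ""
    tokens: List[int] = field(default_factory=list)
    response_length: int = 0
    rollout_log_probs: List[float] = field(default_factory=list)
    status: Optional[Status] = None


def apply_generation(sample, text, new_tokens, log_probs, finish_reason):
    """Append one engine response to the sample and update its status."""
    if not sample.tokens:
        sample.tokens = list(sample.prompt_tokens)  # start from the prompt
    sample.tokens += new_tokens                     # prompt + response so far
    sample.response += text
    sample.response_length += len(new_tokens)
    sample.rollout_log_probs += log_probs
    # Assumed mapping from finish reason to status.
    mapping = {"stop": Status.COMPLETED, "length": Status.TRUNCATED}
    sample.status = mapping.get(finish_reason, Status.ABORTED)
    return sample


sample = Sample(prompt_tokens=[1, 2, 3])
apply_generation(sample, "Hi", [10, 11], [-0.1, -0.2], "length")
apply_generation(sample, " there", [12], [-0.3], "stop")  # continue from tokens
print(sample.tokens, sample.response_length, sample.status)
```

The second call appends to the existing token sequence rather than re-encoding the text, which is the idea behind continuing generation at the token level.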
Data Structures
RolloutFnTrainOutput
Output type for training rollout functions.
slime/rollout/base_types.py:8
RolloutFnEvalOutput
Output type for evaluation rollout functions.
slime/rollout/base_types.py:14
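As a rough picture of these two types, the sketch below models them as dataclasses with the fields named in this reference (samples and metrics for training, data and metrics for evaluation). Whether the real types are dataclasses, and their exact annotations, are assumptions; `toy_rollout` is a hypothetical function showing how a rollout function might return one type or the other.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class RolloutFnTrainOutput:
    # Sample groups; list[list[Sample]] in the real type.
    samples: List[List[Any]]
    metrics: Optional[Dict[str, Any]] = None


@dataclass
class RolloutFnEvalOutput:
    # Maps dataset names to results.
    data: Dict[str, Any]
    metrics: Optional[Dict[str, Any]] = None


def toy_rollout(rollout_id: int, evaluation: bool = False):
    """Hypothetical rollout function returning the matching output type."""
    if evaluation:
        return RolloutFnEvalOutput(data={"toy_eval": {"rollout_id": rollout_id}})
    return RolloutFnTrainOutput(samples=[["sample-a", "sample-b"]],
                                metrics={"rollout_id": rollout_id})


train_out = toy_rollout(0)
eval_out = toy_rollout(0, evaluation=True)
print(type(train_out).__name__, type(eval_out).__name__)
```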
Data Sources
RolloutDataSourceWithBuffer
Data source with buffer support for partial rollout and sample reuse.
Retrieve sample groups from buffer or dataset.
Parameters:
- num_samples (int): Number of sample groups to retrieve

- First tries to get samples from buffer
- If the buffer is insufficient, samples from the dataset
- Handles epoch transitions and shuffling
Add sample groups back to buffer (e.g., partial rollout samples).
Parameters:
- samples (list[list[Sample]]): Sample groups to add
Save data source state to checkpoint.
Parameters:
- rollout_id (int): Current rollout ID
Load data source state from checkpoint.
Parameters:
- rollout_id (int): Rollout ID to load from
slime/rollout/data_source.py:166
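The buffer-first retrieval described for this data source can be sketched as follows. `BufferedDataSource` and its attributes are hypothetical, each group is reduced to a single prompt string for brevity, and checkpoint save/load is left out. The method names `get_samples` and `add_samples` are also assumptions.

```python
import random


class BufferedDataSource:
    """Hypothetical sketch: serve groups from a buffer first, then the dataset."""

    def __init__(self, prompts, shuffle=False, seed=0):
        if not prompts:
            raise ValueError("dataset must not be empty")
        self.prompts = list(prompts)
        self.shuffle = shuffle
        self.rng = random.Random(seed)
        self.buffer = []   # groups handed back for reuse
        self.epoch = 0
        self.offset = 0    # position within the current epoch
        if self.shuffle:
            self.rng.shuffle(self.prompts)

    def get_samples(self, num_samples):
        # Take as many groups as possible from the buffer.
        taken = self.buffer[:num_samples]
        self.buffer = self.buffer[num_samples:]
        # Fill the remainder from the dataset, wrapping at epoch boundaries.
        while len(taken) < num_samples:
            if self.offset == len(self.prompts):
                self.epoch += 1
                self.offset = 0
                if self.shuffle:
                    self.rng.shuffle(self.prompts)
            taken.append([self.prompts[self.offset]])
            self.offset += 1
        return taken

    def add_samples(self, samples):
        # Return groups (e.g. partial rollouts) to the buffer.
        self.buffer.extend(samples)


source = BufferedDataSource(["a", "b", "c"])
first = source.get_samples(2)
source.add_samples([["partial"]])
second = source.get_samples(3)
print(first, second, source.epoch)
```

In the second call the buffered group is served first, the last dataset item follows, and the request then rolls over into the next epoch to supply the third group.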
Rollout Configuration
Key Arguments
Inference Engine:
- --rollout-num-gpus: Total GPUs for rollout engines
- --rollout-num-gpus-per-engine: GPUs per engine (tensor parallel size)
- --hf-checkpoint: HuggingFace checkpoint path

- --rollout-temperature: Sampling temperature (default: 1.0)
- --rollout-top-p: Top-p sampling (default: 1.0)
- --rollout-top-k: Top-k sampling (default: -1)
- --rollout-max-response-len: Max generation length
- --rollout-stop: Stop strings
- --rollout-stop-token-ids: Stop token IDs

- --rollout-batch-size: Samples per rollout step
- --n-samples-per-prompt: Responses per prompt
- --rollout-shuffle: Shuffle prompts
- --rollout-seed: Random seed

- --over-sampling-batch-size: Granularity for sampling
- --dynamic-sampling-filter-path: Filter function path
- --partial-rollout: Enable partial rollout recycling

- --rollout-function-path: Custom rollout function
- --custom-generate-function-path: Custom generate function
- --custom-rollout-log-function-path: Custom logging function
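To show how a few of these arguments relate, the sketch below re-declares a subset with argparse and derives two values from them. It is not slime's argument parser: the default for --rollout-num-gpus-per-engine is an assumption, the engine count (total GPUs divided by GPUs per engine) is inferred from the descriptions above, and the sampling-parameter key names are assumed.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--rollout-num-gpus", type=int, required=True)
# Default of 1 is an assumption made for this sketch.
parser.add_argument("--rollout-num-gpus-per-engine", type=int, default=1)
# Defaults below are the ones documented above.
parser.add_argument("--rollout-temperature", type=float, default=1.0)
parser.add_argument("--rollout-top-p", type=float, default=1.0)
parser.add_argument("--rollout-top-k", type=int, default=-1)

args = parser.parse_args(
    ["--rollout-num-gpus", "8", "--rollout-num-gpus-per-engine", "2"]
)

# Each engine spans `rollout_num_gpus_per_engine` GPUs (tensor parallel size).
if args.rollout_num_gpus % args.rollout_num_gpus_per_engine != 0:
    raise ValueError("total rollout GPUs must be a multiple of GPUs per engine")
num_engines = args.rollout_num_gpus // args.rollout_num_gpus_per_engine

# Sampling parameters gathered into one dict (key names are assumed).
sampling_params = {
    "temperature": args.rollout_temperature,
    "top_p": args.rollout_top_p,
    "top_k": args.rollout_top_k,
}
print(num_engines, sampling_params)
```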
- Data Structures API - Sample and batch types
- Router API - Inference routing
- Arguments API - Complete configuration