
Overview

Slime provides comprehensive logging support through wandb, TensorBoard, and structured console logging.

Logger Configuration

configure_logger()

Configure the Python logging system with slime formatting.
from slime.utils.logging_utils import configure_logger

configure_logger(prefix="")
prefix (str, default ""): Optional prefix for log messages
Log Format:
[2024-03-15 14:23:45] train.py:10 - Training started
Behavior:
  • Sets logging level to INFO
  • Configures timestamp and file location formatting
  • First call uses force=True to override any existing configuration
  • Subsequent calls are no-ops
Source: slime/utils/logging_utils.py:12

Tracking Initialization

init_tracking()

Initialize experiment tracking (wandb/TensorBoard).
from slime.utils.logging_utils import init_tracking

init_tracking(args, primary=True)
args (Namespace, required): Arguments containing tracking configuration:
  • use_wandb: Enable wandb
  • use_tensorboard: Enable TensorBoard
  • wandb_project, wandb_team, wandb_group, etc.
primary (bool, default True): Whether this is the primary tracking process:
  • True: Main training process (initializes wandb run)
  • False: Secondary process (joins existing run)
**kwargs (dict): Additional kwargs passed to wandb.init()
Source: slime/utils/logging_utils.py:27
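One plausible shape for this dispatch, with the actual wandb/TensorBoard initialization stubbed out; the returned list of backend names is purely illustrative so the routing is easy to inspect, and is not slime's real return value:

```python
from argparse import Namespace

def init_tracking(args, primary=True, **kwargs):
    """Sketch: route to the tracking backends enabled in args."""
    backends = []
    if getattr(args, "use_wandb", False):
        # The primary process creates the wandb run; secondaries attach to it.
        backends.append("wandb-primary" if primary else "wandb-secondary")
    if getattr(args, "use_tensorboard", False):
        backends.append("tensorboard")
    return backends

args = Namespace(use_wandb=True, use_tensorboard=False)
```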

Logging Metrics

log()

Log metrics to configured tracking backends.
from slime.utils.logging_utils import log

log(
    args,
    metrics={
        "train/loss": 0.234,
        "train/grad_norm": 1.23,
        "rollout_step": 42
    },
    step_key="rollout_step"
)
args (Namespace, required): Arguments with use_wandb and use_tensorboard flags
metrics (dict, required): Dictionary of metric names to values
step_key (str, required): Key in the metrics dict to use as the step counter (e.g., "rollout_step", "train_step")
Behavior:
  • Wandb: Calls wandb.log(metrics) (includes step automatically)
  • TensorBoard: Extracts step from metrics, logs remaining metrics to TensorBoard
Source: slime/utils/logging_utils.py:35

Wandb Configuration

Primary Arguments

--use-wandb (bool, default False): Enable Weights & Biases logging
--wandb-project (str): W&B project name
--wandb-team (str): W&B team/entity name
--wandb-group (str): W&B run group for organizing related runs
--wandb-mode (str): W&B mode: "online", "offline", or "disabled"
--wandb-dir (str): Directory to store wandb logs (default: ./wandb)
--wandb-key (str): W&B API key (alternative to the WANDB_API_KEY env var)
--wandb-host (str): W&B host URL (for self-hosted instances)
--wandb-run-id (str): W&B run ID (for resuming runs)
--wandb-random-suffix (bool, default True): Add a random 6-character suffix to the run name
Source: slime/utils/arguments.py:1044-1113
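The --wandb-random-suffix behavior can be approximated like this. Only the 6-character length comes from the flag description; the separator, alphabet, and helper name are assumptions:

```python
import secrets
import string

def with_random_suffix(run_name: str) -> str:
    """Append a random 6-character suffix to a run name (hypothetical helper)."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(6))
    return f"{run_name}-{suffix}"
```

A suffix like this lets repeated launches with the same configured name coexist as distinct W&B runs.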

Advanced Wandb Arguments

--wandb-always-use-train-step (bool, default False): Always use train step as the x-axis (instead of rollout step for most metrics)
--log-multi-turn (bool, default False): Log statistics for multi-turn rollouts
--log-passrate (bool, default False): Log pass@n statistics for responses
--log-reward-category (str): Log reward category statistics from the specified metadata key
--log-correct-samples (bool, default False): Log correct samples to wandb
Source: slime/utils/arguments.py:1076-1112

TensorBoard Configuration

--use-tensorboard (bool, default False): Enable TensorBoard logging
--tb-project-name (str): TensorBoard log directory (default: $TENSORBOARD_DIR env var)
--tb-experiment-name (str): TensorBoard experiment name (subdirectory)
Source: slime/utils/arguments.py:1115-1127
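For illustration, here is how a few of the flags above map onto argparse attribute names (slime's actual registration in arguments.py may differ in defaults, help text, and grouping):

```python
import argparse

parser = argparse.ArgumentParser()
# Flag names and defaults taken from the tables above; registration details assumed.
parser.add_argument("--use-wandb", action="store_true")
parser.add_argument("--wandb-project", type=str, default=None)
parser.add_argument("--use-tensorboard", action="store_true")
parser.add_argument("--tb-project-name", type=str, default=None)

# argparse converts dashes to underscores: --wandb-project -> args.wandb_project
args = parser.parse_args(
    ["--use-wandb", "--wandb-project", "slime-experiments", "--use-tensorboard"]
)
```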

Logged Metrics

Training Metrics

Logged during training steps:
{
    "train/loss": 0.234,
    "train/pg_loss": 0.123,
    "train/value_loss": 0.089,
    "train/entropy_loss": 0.022,
    "train/grad_norm": 1.23,
    "train/lr": 1e-6,
    "train/kl": 0.045,
    "train/clipfrac": 0.15,
    "train_step": 100,
    "rollout_step": 42
}

Rollout Metrics

Logged after rollout generation:
{
    "rollout/avg_reward": 0.67,
    "rollout/max_reward": 1.0,
    "rollout/min_reward": 0.0,
    "rollout/avg_response_length": 234.5,
    "rollout/truncation_rate": 0.05,
    "rollout/generation_time": 12.3,
    "rollout/samples_per_second": 45.6,
    "rollout_step": 42
}

Evaluation Metrics

Logged during evaluation:
{
    "eval/gsm8k_accuracy": 0.87,
    "eval/gsm8k_avg_reward": 0.87,
    "eval/math_accuracy": 0.52,
    "eval/math_avg_reward": 0.52,
    "rollout_step": 42
}

Performance Metrics

System performance metrics:
{
    "perf/train_time": 8.5,
    "perf/rollout_time": 12.3,
    "perf/total_time": 20.8,
    "perf/gpu_memory_allocated_gb": 45.2,
    "perf/gpu_memory_reserved_gb": 48.0,
    "rollout_step": 42
}

Custom Logging Functions

Custom Rollout Logging

Provide custom logging for rollout data:
# my_module.py
def log_rollout_data(rollout_id, args, samples, rollout_extra_metrics, rollout_time):
    """Custom rollout logging function.
    
    Args:
        rollout_id: Current rollout step
        args: Training arguments
        samples: List of sample groups
        rollout_extra_metrics: Extra metrics from rollout function
        rollout_time: Time taken for rollout
    
    Returns:
        bool: Whether to skip default logging (True = skip, False = run default)
    """
    import wandb
    
    # Example custom metric: average reward of the first sample in each group
    avg_reward = sum(s[0].reward for s in samples) / len(samples)
    wandb.log({"custom/avg_group_reward": avg_reward})
    
    # Return False to also run default logging
    return False
Configuration:
python train.py \
  --custom-rollout-log-function-path my_module:log_rollout_data
Source: slime/utils/arguments.py:397-404
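Path strings of the module:function form shown above are typically resolved with importlib. A minimal sketch of such a loader (the function name is hypothetical; slime's actual resolution logic may differ):

```python
import importlib

def load_function(path: str):
    """Resolve a "module:function" string, e.g. "my_module:log_rollout_data"."""
    module_name, func_name = path.split(":")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)
```

Any importable module works, so custom logging hooks just need to be on PYTHONPATH when training starts.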

Custom Eval Logging

Provide custom logging for evaluation:
# my_module.py
def log_eval_rollout_data(rollout_id, args, data, extra_metrics):
    """Custom eval logging function.
    
    Args:
        rollout_id: Current rollout step
        args: Training arguments
        data: Dict mapping dataset names to eval results
        extra_metrics: Extra metrics from eval function
    
    Returns:
        bool: Whether to skip default logging
    """
    import wandb
    
    # Log per-dataset custom metrics
    for dataset_name, results in data.items():
        samples = results["samples"]
        # Custom analysis
        wandb.log({f"custom/{dataset_name}_metric": compute_metric(samples)})
    
    return False
Configuration:
python train.py \
  --custom-eval-rollout-log-function-path my_module:log_eval_rollout_data
Source: slime/utils/arguments.py:406-414

Example Usage

Basic Logging Setup

from slime.utils.logging_utils import configure_logger, init_tracking, log
from slime.utils.arguments import parse_args

# Parse arguments
args = parse_args()

# Configure logging
configure_logger()

# Initialize tracking
init_tracking(args, primary=True)

# Training loop
for rollout_id in range(args.num_rollout):
    # ... training code ...
    
    # Log metrics
    log(args, {
        "train/loss": loss,
        "train/grad_norm": grad_norm,
        "rollout_step": rollout_id
    }, step_key="rollout_step")

Wandb with Grouping

python train.py \
  --use-wandb \
  --wandb-project slime-experiments \
  --wandb-team my-team \
  --wandb-group qwen-32b-experiment \
  --wandb-mode online
Runs will be grouped under “qwen-32b-experiment” in the W&B UI.

Offline Logging

# Log offline (sync later)
python train.py \
  --use-wandb \
  --wandb-mode offline \
  --wandb-dir ./wandb_logs

# Later, sync to cloud
wandb sync ./wandb_logs

Combined Wandb + TensorBoard

python train.py \
  --use-wandb \
  --wandb-project my-project \
  --use-tensorboard \
  --tb-project-name ./tb_logs \
  --tb-experiment-name run-001

Implementation Details

Wandb Utils

Wandb initialization is handled by wandb_utils.py:
# Primary process (main training)
def init_wandb_primary(args, **kwargs):
    wandb.init(
        project=args.wandb_project,
        entity=args.wandb_team,
        group=args.wandb_group,
        config=vars(args),
        **kwargs
    )

# Secondary process (distributed workers)
def init_wandb_secondary(args, **kwargs):
    # Join existing run without creating new
    ...
Source: slime/utils/logging_utils.py:5

TensorBoard Adapter

TensorBoard logging is handled by _TensorboardAdapter:
class _TensorboardAdapter:
    def __init__(self, args):
        self.writer = SummaryWriter(log_dir=...)
    
    def log(self, data, step):
        for key, value in data.items():
            self.writer.add_scalar(key, value, step)
Source: slime/utils/logging_utils.py:6

Best Practices

  1. Always configure logger first:
    configure_logger()  # Before any logging calls
    
  2. Use consistent step keys:
    • Use "rollout_step" for rollout/eval metrics
    • Use "train_step" for training metrics
  3. Group related runs:
    --wandb-group hyperparameter-sweep-001
    
  4. Use offline mode for unstable networks:
    --wandb-mode offline --wandb-dir ./persistent_storage/wandb
    
  5. Log hierarchical metrics:
    {
        "train/loss": ...,
        "train/pg_loss": ...,
        "eval/gsm8k_accuracy": ...,
        "rollout/avg_reward": ...
    }
    