When standard environment types don't fit your use case, MultiTurnEnv provides full control over the rollout loop. This guide covers advanced patterns for building custom environments with complex interaction logic.
When to Customize
Use custom multi-turn environments when you need:
- Complex game logic: board games, simulations, strategy games
- Non-linear conversations: state-dependent message assembly
- Custom feedback loops: environment responses based on intermediate state
- Specialized stop conditions: domain-specific termination logic
- Advanced state management: complex per-rollout initialization and cleanup
The Rollout Loop
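As a rough mental model, a rollout alternates model turns and environment responses until a stop condition fires. The sketch below is library-free and is not the actual MultiTurnEnv implementation; run_rollout, model, env_response, and is_done are hypothetical stand-ins chosen for illustration.

```python
import asyncio


async def run_rollout(prompt, model, env_response, is_done, max_turns=10):
    """Simplified rollout loop: model turn, then environment turn, until done."""
    state = {"turn": 0}
    messages = list(prompt)
    while state["turn"] < max_turns and not is_done(messages, state):
        # 1. The model produces the next assistant message.
        reply = await model(messages)
        messages.append({"role": "assistant", "content": reply})
        state["turn"] += 1
        # 2. Stop before the environment responds if the episode just ended.
        if is_done(messages, state):
            break
        # 3. The environment reacts and may update state.
        messages.extend(await env_response(messages, state))
    return messages, state


async def demo():
    async def model(messages):  # stand-in for an LLM call
        return "done" if len(messages) >= 3 else "thinking"

    async def env_response(messages, state):
        return [{"role": "user", "content": "continue"}]

    def is_done(messages, state):
        return messages[-1]["content"] == "done"

    return await run_rollout(
        [{"role": "user", "content": "start"}], model, env_response, is_done
    )


messages, state = asyncio.run(demo())
```

Each customization point in this guide corresponds to one step of a loop like this: what the environment says, how state is prepared, and when the loop ends.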
Understanding the rollout loop is essential for customization.
Core Methods to Override
env_response(): Required
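A minimal sketch of an env_response implementation for a number-guessing game. The signature mirrors the env.env_response(messages, state) call used in the testing section of this guide. The GuessEnv class and the state["secret"] and state["guesses"] keys are hypothetical, and in a real environment the method would live on a MultiTurnEnv subclass rather than a plain class.

```python
import asyncio


class GuessEnv:
    """Stand-in for a MultiTurnEnv subclass (base class omitted so this runs alone)."""

    async def env_response(self, messages, state):
        # Read the model's latest turn and compare it with the hidden number.
        guess_text = messages[-1]["content"].strip()
        if not guess_text.isdigit():
            feedback = "Please answer with a whole number."
        else:
            guess = int(guess_text)
            state["guesses"] = state.get("guesses", 0) + 1
            if guess < state["secret"]:
                feedback = "Too low."
            elif guess > state["secret"]:
                feedback = "Too high."
            else:
                feedback = "Correct!"
        # Return the environment's reply as a list of new messages.
        return [{"role": "user", "content": feedback}]


state = {"secret": 7}
reply = asyncio.run(
    GuessEnv().env_response([{"role": "assistant", "content": "5"}], state)
)
```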
This method defines how the environment responds after each model turn.
setup_state(): Optional
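A sketch of per-rollout initialization. The signature shown (an async method that receives the state dict and returns it) is an assumption about this hook, and the BoardEnv class and state keys are hypothetical. The point is that every rollout gets fresh mutable objects rather than sharing them.

```python
import asyncio


class BoardEnv:
    """Stand-in for a MultiTurnEnv subclass; the base class is omitted here."""

    async def setup_state(self, state):
        # Build new objects on every call so rollouts never share mutable state.
        state["board"] = [[0] * 3 for _ in range(3)]
        state["current_player"] = 1
        state["move_history"] = []
        return state


env = BoardEnv()
first = asyncio.run(env.setup_state({}))
second = asyncio.run(env.setup_state({}))
first["move_history"].append((0, 0))  # changes only the first rollout's state
```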
Use it to initialize per-rollout resources.
get_prompt_messages(): Optional
Use it to customize how messages are assembled for each turn.
render_completion(): Optional
Use it to customize how the final conversation is assembled.
add_trajectory_step(): Optional
Use it to add metadata to each turn.
Stop Conditions
Define when rollouts should terminate using the @vf.stop decorator.
Basic Stop Conditions
- has_error: stops if state["error"] is set
- max_turns_reached: stops after max_turns iterations
- prompt_too_long: stops if the prompt exceeds the model context
- has_final_env_response: stops if early termination is signaled
Priority-Based Execution
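The ordering rule (higher priority is evaluated first) can be illustrated without the library. In this sketch, stop is a local stand-in decorator, not vf.stop itself, and game_won and first_triggered are hypothetical names.

```python
def stop(priority=0):
    """Local stand-in for a stop-condition decorator; records a priority."""
    def mark(fn):
        fn.stop_priority = priority
        return fn
    return mark


@stop(priority=10)
def has_error(state):
    return state.get("error") is not None


@stop(priority=0)
def max_turns_reached(state):
    return state["turn"] >= state["max_turns"]


@stop(priority=5)
def game_won(state):  # hypothetical domain-specific condition
    return state.get("winner") is not None


def first_triggered(conditions, state):
    """Check conditions from highest to lowest priority; return the first that fires."""
    ordered = sorted(conditions, key=lambda fn: fn.stop_priority, reverse=True)
    for condition in ordered:
        if condition(state):
            return condition.__name__
    return None


conditions = [max_turns_reached, game_won, has_error]
state = {"turn": 9, "max_turns": 9, "winner": "X", "error": None}
result = first_triggered(conditions, state)
```

Here both game_won and max_turns_reached would fire, but game_won is reported because it has the higher priority. Ordering matters when the reason for termination affects scoring or logging.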
Control evaluation order with priorities (higher runs first).
Early Termination from env_response
Signal completion directly from the environment: setting state["final_env_response"] triggers the has_final_env_response stop condition.
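A library-free sketch of that handshake. The has_final_env_response function here is a stand-in for the built-in condition of the same name, the "quit" rule is hypothetical, and storing the final reply messages under the key is an assumption about what value the key should hold.

```python
import asyncio


async def env_response(messages, state):
    """Hypothetical environment turn that ends the rollout on a 'quit' message."""
    if messages[-1]["content"].strip().lower() == "quit":
        reply = [{"role": "user", "content": "Session closed."}]
        # Mark this reply as the last one so no further model turn is requested.
        state["final_env_response"] = reply
        return reply
    return [{"role": "user", "content": "Go on."}]


def has_final_env_response(state):
    """Stand-in for the built-in stop condition of the same name."""
    return state.get("final_env_response") is not None


state = {}
asyncio.run(env_response([{"role": "assistant", "content": "hello"}], state))
stopped_early = has_final_env_response(state)
asyncio.run(env_response([{"role": "assistant", "content": "quit"}], state))
stopped_after_quit = has_final_env_response(state)
```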
Resource Management
Cleanup: Per-Rollout
Use @vf.cleanup for per-rollout resource cleanup.
Teardown: Environment Shutdown
Use @vf.teardown for environment-level cleanup.
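The difference between the two lifecycles (cleanup once per rollout, teardown once per environment) can be sketched with plain try/finally blocks. The handlers are called by hand here instead of being registered with the decorators, and SandboxEnv, cleanup_sandbox, and teardown_pool are hypothetical names.

```python
import asyncio

log = []


class SandboxEnv:
    """Stand-in environment; handlers are invoked manually, not via decorators."""

    async def rollout(self, name):
        state = {"name": name, "sandbox": f"sandbox-for-{name}"}
        try:
            log.append(f"run {name}")
        finally:
            # Per-rollout cleanup runs once for every rollout, even on failure.
            await self.cleanup_sandbox(state)
        return state

    async def cleanup_sandbox(self, state):
        log.append(f"cleanup {state['name']}")
        state["sandbox"] = None

    async def teardown_pool(self):
        # Environment-level teardown runs once, when the environment shuts down.
        log.append("teardown")


async def main():
    env = SandboxEnv()
    try:
        await env.rollout("a")
        await env.rollout("b")
    finally:
        await env.teardown_pool()


asyncio.run(main())
```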
Error Handling
Verifiers provides structured error handling.
Error Hierarchy
Raising Errors
When a vf.Error is raised:
- It is automatically caught by the rollout loop
- It is stored in state["error"]
- The built-in has_error stop condition triggers
- The rollout terminates gracefully
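The sequence above can be modeled in a few lines. This is a library-free sketch: EnvError is a local stand-in for a framework error type such as vf.Error, and step and run are hypothetical.

```python
class EnvError(Exception):
    """Local stand-in for a framework error type such as vf.Error."""


def step(state):
    # Hypothetical environment step that fails on the third turn.
    state["turn"] += 1
    if state["turn"] == 3:
        raise EnvError("tool backend unavailable")


def has_error(state):
    return state.get("error") is not None


def run(max_turns=5):
    state = {"turn": 0, "error": None}
    while state["turn"] < max_turns and not has_error(state):
        try:
            step(state)
        except EnvError as exc:
            # Caught by the loop and recorded instead of crashing the rollout.
            state["error"] = exc
    return state


final_state = run()
```

The rollout ends on turn 3 rather than running to max_turns, and the stored exception remains available for later inspection.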
Complete Example: Tic-Tac-Toe
Here's a complete custom environment:
Testing Custom Environments
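```python
import numpy as np
import pytest
import verifiers as vf


@pytest.mark.asyncio
async def test_env_response():
    # TicTacToeEnv, dataset, and rubric are assumed to be defined or imported elsewhere.
    env = TicTacToeEnv(dataset=dataset, rubric=rubric)
    state = {"board": np.zeros((3, 3)), "current_player": 1}
    messages = [{"role": "assistant", "content": "<row>0</row><col>0</col>"}]

    response = await env.env_response(messages, state)

    # The environment replies with one message and records the move on the board.
    assert len(response) == 1
    assert state["board"][0, 0] == 1
```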
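The test above expects env_response to parse the row and col tags from the assistant message and record the move in state["board"]. A library-free sketch of that logic follows, under stated assumptions: the board is a nested list rather than a NumPy array, players are numbered 1 and 2, and parse_move and apply_move are hypothetical helpers, not part of verifiers.

```python
import re

MOVE_PATTERN = re.compile(r"<row>\s*(\d)\s*</row>\s*<col>\s*(\d)\s*</col>")


def parse_move(text):
    """Return (row, col) from text such as '<row>0</row><col>0</col>', or None."""
    match = MOVE_PATTERN.search(text)
    if match is None:
        return None
    row, col = int(match.group(1)), int(match.group(2))
    return (row, col) if row < 3 and col < 3 else None


def apply_move(state, text):
    """Write the current player's move to the board and return a feedback message."""
    move = parse_move(text)
    if move is None:
        return [{"role": "user", "content": "Invalid move format."}]
    row, col = move
    if state["board"][row][col] != 0:
        return [{"role": "user", "content": "That square is taken."}]
    state["board"][row][col] = state["current_player"]
    # Hand the turn to the other player (assumed numbering: 1 and 2).
    state["current_player"] = 2 if state["current_player"] == 1 else 1
    return [{"role": "user", "content": f"Placed at ({row}, {col})."}]


state = {"board": [[0] * 3 for _ in range(3)], "current_player": 1}
response = apply_move(state, "<row>0</row><col>0</col>")
```

Keeping parsing and board updates in plain functions like these makes them testable without constructing an environment, a dataset, or a rubric.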
Best Practices
Next Steps
- Evaluation: Comprehensive testing strategies → Evaluation Guide
- Training: Use custom environments for RL → Training Guide
- Integration: Connect to external systems → Tool Environments Guide