
Configuration Variables

All configuration is done by editing variables at the top of main.py (lines 14-23). There is no separate configuration file.

API Configuration

Base URL

baseurl (string, default: "http://127.0.0.1:1234/v1")

The API endpoint URL. The default is configured for LM Studio running locally. Examples:
  • LM Studio: "http://127.0.0.1:1234/v1"
  • OpenAI-compatible API: "https://api.example.com/v1"
baseurl = "http://127.0.0.1:1234/v1"

Model Selection

llm (string, default: "")

The model identifier to use for benchmarks. Special behavior:
  • Leave empty ("") to automatically use the currently loaded model in LM Studio
  • Set to a specific model name for other OpenAI-compatible APIs
# Leave empty to select currently loaded model (works with LM-Studio)
llm = ""

# Or specify a model explicitly:
# llm = "gpt-4"
# llm = "qwen2.5-32b-instruct"
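The empty-string fallback can be sketched roughly as follows. `resolve_model` and `available_models` are hypothetical names for illustration; the actual logic in main.py may differ (for instance, it may query the server's /v1/models endpoint to find the loaded model):

```python
def resolve_model(llm: str, available_models: list[str]) -> str:
    """Return the configured model, or the first server-reported one when llm is ''."""
    if llm:
        return llm
    if not available_models:
        raise RuntimeError("No model is loaded on the server")
    return available_models[0]
```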

Reasoning Settings

reasoning_effort (string, default: "low")

Controls the reasoning effort level for models that support reasoning. Valid values:
  • "low": Minimal reasoning, faster responses
  • "medium": Balanced reasoning depth
  • "high": Maximum reasoning effort, slower but more thoughtful
reasoning_effort = "low"
This setting is passed to the API’s reasoning.effort parameter and recorded in the result file header.
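The mapping from the config value onto the request can be sketched as below. `build_reasoning_params` is a hypothetical helper, and the exact request shape depends on your API; this just illustrates validating the value and nesting it under reasoning.effort:

```python
VALID_EFFORTS = ("low", "medium", "high")

def build_reasoning_params(reasoning_effort: str) -> dict:
    """Validate the config value and map it onto the reasoning.effort request field."""
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {VALID_EFFORTS}")
    return {"reasoning": {"effort": reasoning_effort}}
```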

Benchmark Parameters

Number of Tries

tries (int, default: 100)

Number of test attempts to run for each benchmark. Higher values provide more statistical confidence but take longer to complete.
tries = 100
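To make the tries/confidence trade-off concrete, here is a small illustrative calculation (not part of main.py) using the normal approximation for the margin of error of a measured pass rate:

```python
import math

def margin_of_error(successes: int, tries: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a pass rate (normal approximation)."""
    p = successes / tries
    return z * math.sqrt(p * (1 - p) / tries)
```

At a 70% pass rate, 100 tries give a margin of roughly ±9 percentage points; quadrupling tries to 400 halves it.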

Timeout

timeout_time (int, default: 400)

Maximum time to wait for a response, in seconds. Intended to prevent the benchmark from hanging if the model enters a “death spiral” and generates thousands of tokens.

Note: this feature is not yet implemented (see the TODO in the source code).
timeout_time = 400  # in seconds

Maximum Tokens

max_tokens (int, default: 512)

Maximum number of output tokens the model can generate per response. Recommendations:
  • Default (512): Suitable for most models
  • Higher values: May be needed for reasoning models that produce longer traces
  • Lower values: Can prevent verbose or runaway responses
max_tokens = 512 * 1  # might want to increase this with reasoning llms

Environment Setup

LM Studio Configuration

If using LM Studio:
  1. Start LM Studio and load your desired model
  2. Enable the local API server (usually runs on http://127.0.0.1:1234)
  3. Leave llm = "" in the configuration
  4. The currently loaded model will be automatically selected

OpenAI-Compatible API Setup

For other OpenAI-compatible APIs:
  1. Set baseurl to your API endpoint
  2. Set llm to the specific model identifier
  3. Ensure your API supports the extended reasoning format if using reasoning features
baseurl = "https://api.openai.com/v1"
llm = "gpt-4"
reasoning_effort = "medium"
You may need to set API keys via environment variables or modify the OpenAI client initialization in the code, depending on your API provider’s requirements.
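One common pattern for this (an assumption, not what main.py currently does; the `OPENAI_API_KEY` variable name is the conventional choice, not something this project mandates) is to read the key from the environment so it never lands in source control:

```python
import os

# Hypothetical pattern: take the key from the environment; local servers
# such as LM Studio typically ignore the value, so a placeholder suffices.
api_key = os.environ.get("OPENAI_API_KEY", "not-needed")
```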

Directory Configuration

logs_directory (string, default: "logs")

Directory where log files are stored. Created automatically if it doesn’t exist.

results_directory (string, default: "results")

Directory where result JSON files are stored. Created automatically if it doesn’t exist.
logs_directory = "logs"
results_directory = "results"
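The automatic creation amounts to something like the following sketch (the exact code in main.py may differ):

```python
import os

logs_directory = "logs"
results_directory = "results"

# Create both directories on startup; exist_ok=True makes reruns a no-op.
for directory in (logs_directory, results_directory):
    os.makedirs(directory, exist_ok=True)
```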

Example Configurations

High-Effort Reasoning with More Tries

llm = ""
baseurl = "http://127.0.0.1:1234/v1"
reasoning_effort = "high"
tries = 200
max_tokens = 1024

Quick Testing Configuration

llm = ""
baseurl = "http://127.0.0.1:1234/v1"
reasoning_effort = "low"
tries = 10
max_tokens = 256

Production Benchmarking

llm = "qwen2.5-32b-instruct"
baseurl = "https://api.example.com/v1"
reasoning_effort = "medium"
tries = 500
max_tokens = 768
