OpenAI provides its language models through an API. Koog supports OpenAI's chat models (including GPT-4o and GPT-5), the o-series reasoning models, audio-enabled models, and the embedding and moderation models listed below.
Installation
The OpenAI client is included in the core Koog library. No additional dependencies are required.
Quick Start
import ai.koog.prompt.executor.clients.openai.*
import ai.koog.agents.core.*

val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT4o
)

val agent = AIAgent(
    executor = executor,
    tools = toolRegistry {
        // Your tools here
    }
) {
    // Define your agent strategy
}

val result = agent.execute("Analyze this data...")
Authentication
API Key Setup
export OPENAI_API_KEY=sk-...
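A minimal sketch of reading the key at startup and failing fast when it is missing or blank. The helper name `requireApiKey` and its `lookup` parameter are assumptions made for illustration; they are not part of Koog.

```kotlin
// Hypothetical helper (not part of Koog): resolve an API key through a lookup
// function and fail with a clear message when it is absent or blank.
fun requireApiKey(
    name: String = "OPENAI_API_KEY",
    lookup: (String) -> String? = { System.getenv(it) }
): String {
    val value = lookup(name)?.trim().orEmpty()
    require(value.isNotEmpty()) { "Environment variable $name is not set" }
    return value
}

fun main() {
    // A fake lookup keeps the example runnable without a real key.
    val key = requireApiKey(lookup = { "sk-test" })
    println(key.take(3)) // prints "sk-"
}
```

Passing the lookup as a parameter keeps the check testable without touching the real environment.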
Programmatic Configuration
val client = OpenAILLMClient(
    apiKey = "sk-...",
    settings = OpenAIClientSettings(
        baseUrl = "https://api.openai.com",
        timeoutConfig = ConnectionTimeoutConfig(
            requestTimeoutMillis = 120_000,
            connectTimeoutMillis = 30_000
        )
    )
)
Available Models
GPT-4o Series (Recommended)
Best all-around models with multimodal capabilities.
OpenAIModels.Chat.GPT4o // 128K context, vision, tools
OpenAIModels.Chat.GPT4oMini // Faster, cheaper variant
Capabilities:
- Text generation
- Vision (images and PDFs)
- Function calling
- Structured outputs
- Multimodal inputs
GPT-5 Series (Latest)
Next-generation models with enhanced reasoning.
OpenAIModels.Chat.GPT5 // 400K context, most capable
OpenAIModels.Chat.GPT5Mini // Faster, cost-effective
OpenAIModels.Chat.GPT5Nano // Ultra-fast, budget-friendly
OpenAIModels.Chat.GPT5Codex // Optimized for code (Responses API only)
OpenAIModels.Chat.GPT5Pro // Most advanced reasoning
o-Series (Reasoning Models)
Models with extended thinking capability.
OpenAIModels.Chat.O1 // Strong reasoning, 200K context
OpenAIModels.Chat.O3 // Balanced reasoning model
OpenAIModels.Chat.O3Mini // Faster reasoning
OpenAIModels.Chat.O4Mini // Vision + reasoning
Use cases:
- Complex problem solving
- Mathematical reasoning
- Code generation and debugging
- Multi-step analysis
Audio Models
Models with audio input/output capabilities.
OpenAIModels.Audio.GptAudio // Audio I/O
OpenAIModels.Audio.GPT4oAudio // GPT-4o with audio
Embedding Models
OpenAIModels.Embeddings.TextEmbedding3Small // 1536 dims, fast
OpenAIModels.Embeddings.TextEmbedding3Large // 3072 dims, best quality
Moderation
OpenAIModels.Moderation.Omni // Text + image moderation
Code Examples
Basic Chat Completion
val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT4o
)

val result = executor.execute(
    prompt = prompt {
        user("What is the capital of France?")
    }
)

// Model output varies between runs, e.g. "Paris is the capital of France."
println(result.first().content)
Function Calling
data class WeatherArgs(val location: String, val unit: String = "celsius")

val weatherTool = tool<WeatherArgs, String>(
    name = "get_weather",
    description = "Get current weather for a location"
) { args ->
    "Sunny, 22°C in ${args.location}"
}

val agent = AIAgent(
    executor = simpleOpenAIExecutor(
        apiKey = System.getenv("OPENAI_API_KEY"),
        model = OpenAIModels.Chat.GPT4o
    ),
    tools = toolRegistry { tool(weatherTool) }
) {
    defineGraph<String, String>("weather-agent") {
        val response = callLLM()
        finish(response)
    }
}

val result = agent.execute("What's the weather in Paris?")
Structured Output
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.json.Json

@Serializable
data class Person(
    val name: String,
    val age: Int,
    val email: String
)

val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT4o,
    params = OpenAIChatParams(
        schema = LLMParams.Schema.JSON.Standard(
            name = "Person",
            schema = /* JSON schema for Person */
        )
    )
)

val result = executor.execute(
    prompt = prompt {
        user("Extract person info: John Doe, 30 years old, john@example.com")
    }
)

// Parse the JSON returned by the model into the Person class
val person = Json.decodeFromString<Person>(result.first().content)
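The schema value is elided in the example above. As a rough sketch, a JSON Schema describing `Person` could look like the text below. The name `personSchema` is hypothetical, and how the text is converted into the type that the `schema` parameter expects is not shown here and would need to be checked against the Koog API.

```kotlin
// Assumption: a plain JSON Schema document matching the Person data class.
// This is only the schema text; wiring it into Koog is not covered.
val personSchema = """
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" },
    "email": { "type": "string" }
  },
  "required": ["name", "age", "email"],
  "additionalProperties": false
}
""".trimIndent()

fun main() {
    println(personSchema)
}
```

Listing every field under `required` and disallowing additional properties keeps the response shape aligned with the non-nullable fields of the data class.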
Vision (Image Analysis)
val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT4o
)

val result = executor.execute(
    prompt = prompt {
        user {
            text("What's in this image?")
            image(
                url = "https://example.com/image.jpg"
                // or: bytes = imageBytes
            )
        }
    }
)
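When the image is only available as raw bytes, one common approach is to encode it as a base64 data URL. The sketch below shows just the encoding step; `toDataUrl` is a hypothetical helper, and whether Koog's `image(url = ...)` accepts data URLs is an assumption that should be verified.

```kotlin
import java.util.Base64

// Hypothetical helper (not part of Koog): encode raw image bytes
// as a base64 data URL with the given MIME type.
fun toDataUrl(bytes: ByteArray, mimeType: String = "image/jpeg"): String {
    val encoded = Base64.getEncoder().encodeToString(bytes)
    return "data:$mimeType;base64,$encoded"
}

fun main() {
    val url = toDataUrl(byteArrayOf(1, 2, 3), "image/png")
    println(url) // prints "data:image/png;base64,AQID"
}
```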
Streaming Responses
val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT4o
)

executor.executeStreaming(
    prompt = prompt { user("Write a story about AI") }
).collect { frame ->
    when (frame) {
        is StreamFrame.TextDelta -> print(frame.text)
        is StreamFrame.End -> println("\nDone!")
        else -> {}
    }
}
Embeddings
val client = OpenAILLMClient(
    apiKey = System.getenv("OPENAI_API_KEY")
)

val embedding = client.embed(
    text = "The quick brown fox jumps over the lazy dog",
    model = OpenAIModels.Embeddings.TextEmbedding3Small
)

println("Embedding dimensions: ${embedding.size}") // 1536
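Embeddings are usually compared with cosine similarity. The sketch below assumes an embedding can be treated as a `List<Double>`, which is an assumption about the return type of `embed`; the function itself uses only the Kotlin standard library.

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two equal-length, non-zero vectors.
// The result lies in [-1, 1]; higher means more similar.
fun cosineSimilarity(a: List<Double>, b: List<Double>): Double {
    require(a.size == b.size) { "Vectors must have the same length" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    require(normA > 0.0 && normB > 0.0) { "Vectors must be non-zero" }
    return dot / (sqrt(normA) * sqrt(normB))
}

fun main() {
    println(cosineSimilarity(listOf(1.0, 0.0), listOf(1.0, 0.0))) // prints 1.0
    println(cosineSimilarity(listOf(1.0, 0.0), listOf(0.0, 1.0))) // prints 0.0
}
```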
Content Moderation
val client = OpenAILLMClient(
    apiKey = System.getenv("OPENAI_API_KEY")
)

val result = client.moderate(
    prompt = prompt { user("Some potentially harmful content") },
    model = OpenAIModels.Moderation.Omni
)

if (result.isHarmful) {
    println("Content flagged: ${result.categories}")
}
Advanced Configuration
Custom Parameters
val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT4o,
    params = OpenAIChatParams(
        temperature = 0.7,
        maxTokens = 2000,
        topP = 0.9,
        frequencyPenalty = 0.5,
        presencePenalty = 0.5,
        stop = listOf("\n\n"),
        user = "user-123" // For abuse monitoring
    )
)
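The OpenAI API documents ranges for the sampling parameters: temperature from 0 to 2, top_p from 0 to 1, and both penalties from -2 to 2. A small sketch of validating values before they are sent follows; `SamplingParams` is a hypothetical class for illustration and is not Koog's `OpenAIChatParams`.

```kotlin
// Hypothetical value class (not part of Koog) that rejects out-of-range
// sampling parameters at construction time.
data class SamplingParams(
    val temperature: Double = 1.0,
    val topP: Double = 1.0,
    val frequencyPenalty: Double = 0.0,
    val presencePenalty: Double = 0.0
) {
    init {
        require(temperature in 0.0..2.0) { "temperature must be in 0..2" }
        require(topP in 0.0..1.0) { "topP must be in 0..1" }
        require(frequencyPenalty in -2.0..2.0) { "frequencyPenalty must be in -2..2" }
        require(presencePenalty in -2.0..2.0) { "presencePenalty must be in -2..2" }
    }
}

fun main() {
    // Same values as the example above: all within range.
    println(SamplingParams(temperature = 0.7, topP = 0.9, frequencyPenalty = 0.5, presencePenalty = 0.5))
}
```

Validating locally turns a remote request error into an immediate `IllegalArgumentException` with a readable message.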
Responses API
For models that support the Responses API (GPT-5 Codex, GPT-5 Pro):
val executor = simpleOpenAIExecutor(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Chat.GPT5Codex,
    params = OpenAIResponsesParams(
        maxTokens = 4000,
        reasoning = OpenAIResponsesParams.Reasoning(
            effort = "high"
        )
    )
)
Azure OpenAI
val client = OpenAILLMClient(
    apiKey = System.getenv("AZURE_OPENAI_KEY"),
    settings = OpenAIClientSettings(
        baseUrl = "https://YOUR-RESOURCE.openai.azure.com",
        chatCompletionsPath = "openai/deployments/YOUR-DEPLOYMENT/chat/completions"
    )
)
Model Capabilities
| Model | Context | Output | Vision | Audio | Tools | Reasoning |
|---|---|---|---|---|---|---|
| GPT-4o | 128K | 16K | ✅ | ❌ | ✅ | ❌ |
| GPT-4o Mini | 128K | 16K | ✅ | ❌ | ✅ | ❌ |
| GPT-5 | 400K | 128K | ✅ | ❌ | ✅ | ✅ |
| GPT-5 Pro | 400K | 272K | ✅ | ❌ | ✅ | ✅ |
| O3 | 200K | 100K | ✅ | ❌ | ✅ | ✅ |
| GPT Audio | 128K | 16K | ❌ | ✅ | ✅ | ❌ |
Pricing
Pricing varies by model. See OpenAI Pricing for current rates.
Example costs (per 1M tokens):
- GPT-4o: $2.50 (input) / $10 (output)
- GPT-4o Mini: $0.15 (input) / $0.60 (output)
- GPT-5: $1.25 (input) / $10 (output)
- O3: $15 (input) / $60 (output)
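To turn the example prices into a per-request estimate, multiply each token count by its per-million rate. The sketch below copies the figures from the list above, which may be outdated; `Price`, `examplePrices`, and `estimateCostUsd` are hypothetical names.

```kotlin
// USD per 1M tokens, copied from the example list above (may be outdated).
data class Price(val inputPerMillion: Double, val outputPerMillion: Double)

val examplePrices = mapOf(
    "GPT-4o" to Price(2.50, 10.0),
    "GPT-4o Mini" to Price(0.15, 0.60),
    "GPT-5" to Price(1.25, 10.0),
    "O3" to Price(15.0, 60.0)
)

// Estimated cost in USD for one request with the given token counts.
fun estimateCostUsd(model: String, inputTokens: Long, outputTokens: Long): Double {
    val price = examplePrices[model] ?: error("No example price for $model")
    return inputTokens / 1_000_000.0 * price.inputPerMillion +
        outputTokens / 1_000_000.0 * price.outputPerMillion
}

fun main() {
    // 200K input and 50K output tokens on GPT-4o:
    // 0.2 * 2.50 + 0.05 * 10 = about 1.00 USD
    println(estimateCostUsd("GPT-4o", 200_000, 50_000))
}
```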
Best Practices
- Use GPT-4o Mini for most tasks - it’s fast and cost-effective
- Reserve GPT-5/O3 for complex reasoning tasks
- Enable streaming for better user experience
- Set maxTokens (max_tokens in the OpenAI API) to cap output length and control costs
- Use structured outputs for reliable parsing
- Implement retries for production reliability
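The retry advice above can be sketched as a small exponential-backoff helper. This is a generic illustration, not a Koog feature: `retryWithBackoff` and all of its parameters are hypothetical, and the blocking `Thread.sleep` default is a simplification.

```kotlin
// Hypothetical helper (not part of Koog): run a block, retrying on failure
// with exponentially growing delays between attempts.
fun <T> retryWithBackoff(
    maxAttempts: Int = 3,
    initialDelayMillis: Long = 500,
    factor: Double = 2.0,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    shouldRetry: (Exception) -> Boolean = { true },
    block: () -> T
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMillis = initialDelayMillis
    for (attempt in 1..maxAttempts) {
        try {
            return block()
        } catch (e: Exception) {
            // Give up on the last attempt or on non-retryable errors.
            if (attempt == maxAttempts || !shouldRetry(e)) throw e
            sleep(delayMillis)
            delayMillis = (delayMillis * factor).toLong()
        }
    }
    error("unreachable")
}

fun main() {
    var calls = 0
    val result = retryWithBackoff(sleep = { /* skip real waiting in the demo */ }) {
        calls++
        if (calls < 3) throw RuntimeException("rate_limit") else "ok"
    }
    println("$result after $calls calls") // prints "ok after 3 calls"
}
```

If the call site is a coroutine, a suspending variant built on `delay` from kotlinx.coroutines would avoid blocking the thread.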
Troubleshooting
Rate Limits
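// A longer request timeout gives slow responses time to complete.
// It does not retry requests that are rejected with HTTP 429 (rate limit).
val client = OpenAILLMClient(
    apiKey = System.getenv("OPENAI_API_KEY"),
    settings = OpenAIClientSettings(
        timeoutConfig = ConnectionTimeoutConfig(
            requestTimeoutMillis = 180_000 // 3 minutes
        )
    )
)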
Error Handling
try {
    val result = executor.execute(prompt { user("Hello") })
} catch (e: LLMClientException) {
    when {
        e.message?.contains("rate_limit") == true -> {
            // Handle rate limiting
        }
        e.message?.contains("insufficient_quota") == true -> {
            // Handle quota issues
        }
        else -> throw e
    }
}
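The message checks above can be pulled into a small pure function so the decision is easy to test. This is a sketch under the same assumption as the example, namely that the error message contains the OpenAI error code; `OpenAIErrorKind` and `classifyError` are hypothetical names.

```kotlin
// Hypothetical classification of OpenAI error messages (not part of Koog).
enum class OpenAIErrorKind { RATE_LIMIT, INSUFFICIENT_QUOTA, OTHER }

// Assumes the exception message contains the OpenAI error code as text.
fun classifyError(message: String?): OpenAIErrorKind {
    val text = message.orEmpty()
    return when {
        // Checked first: quota errors will not succeed on retry.
        "insufficient_quota" in text -> OpenAIErrorKind.INSUFFICIENT_QUOTA
        "rate_limit" in text -> OpenAIErrorKind.RATE_LIMIT
        else -> OpenAIErrorKind.OTHER
    }
}

fun main() {
    println(classifyError("rate_limit_exceeded")) // prints RATE_LIMIT
    println(classifyError(null))                  // prints OTHER
}
```

Rate-limit errors are usually worth retrying after a delay, while quota errors need a billing or plan change, so keeping the two apart avoids pointless retries.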