DashScope provides access to Alibaba Cloud’s Qwen family of models, with context windows of up to 1M tokens, multimodal support (text, image, video, audio), and specialized coding models.
Installation
The DashScope client is included in the core Koog library. No additional dependencies required.
Quick Start
import ai.koog.prompt.executor.clients.dashscope.*
import ai.koog.agents.core.*

val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
val agent = AIAgent(
    executor = executor,
    llmModel = DashscopeModels.QWEN_PLUS,
    tools = toolRegistry {
        // Your tools here
    }
) {
    // Define your agent strategy
}
val result = agent.execute("Analyze this document...")
Authentication
API Key Setup
Get your API key from Alibaba Cloud DashScope.
export DASHSCOPE_API_KEY=sk-...
Programmatic Configuration
// International endpoint (default)
val client = DashscopeLLMClient(
    apiKey = "sk-...",
    settings = DashscopeClientSettings(
        baseUrl = "https://dashscope-intl.aliyuncs.com/",
        timeoutConfig = ConnectionTimeoutConfig(
            requestTimeoutMillis = 120_000
        )
    )
)

// China mainland endpoint
val clientChina = DashscopeLLMClient(
    apiKey = "sk-...",
    settings = DashscopeClientSettings(
        baseUrl = "https://dashscope.aliyuncs.com/"
    )
)
Available Models
General Chat Models
High-performance models for general tasks.
DashscopeModels.QWEN_FLASH // 1M context, high-speed
DashscopeModels.QWEN_PLUS // 1M context, balanced
DashscopeModels.QWEN_PLUS_LATEST // 1M context, auto-updated
DashscopeModels.QWEN3_MAX // 262K context, most capable
Qwen Flash
1,000,000 token context
32,768 max output tokens
Optimized for speed
Tools and temperature control
Qwen Plus (Qwen3 series)
1,000,000 token context
32,768 max output tokens
Balanced performance and capabilities
Tools, speculation, structured JSON
Multiple choice generation
Qwen Plus Latest (Auto-updating)
Always points to newest Qwen Plus
Same capabilities as Qwen Plus
Automatic updates to latest version
Qwen3 Max (Most Capable)
262,144 token context
65,536 max output tokens
Advanced reasoning
Tools, speculation, structured JSON
Multimodal Models
Models with vision, audio, and video support.
DashscopeModels.QWEN3_OMNI_FLASH // Low-latency omni model
Qwen3 Omni Flash
65,536 token context
16,384 max output tokens
Text, image, video, and audio I/O
Audio/video chat
Visual recognition
Multilingual speech interactions
Coding Models
Specialized models for code generation and software engineering.
DashscopeModels.QWEN3_CODER_PLUS // 1M context, coding agent
DashscopeModels.QWEN3_CODER_FLASH // 1M context, fast coding
Qwen3 Coder Plus
1,000,000 token context
65,536 max output tokens
Coding agent capabilities
Tool use and environment interaction
Retains general abilities
Structured JSON outputs
Qwen3 Coder Flash
1,000,000 token context
32,768 max output tokens
High-speed code generation
Low-latency responses
Tool calling
Code Examples
Basic Chat Completion
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
val result = executor.execute(
    model = DashscopeModels.QWEN_PLUS,
    prompt = prompt {
        system("You are a helpful AI assistant.")
        user("Explain machine learning in simple terms.")
    }
)
println(result.first().content)
Long Context Processing
Leverage the 1M token context for processing large documents:
import java.io.File

val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
// Read the whole document into memory; it can be very large.
val largeDocument = File("large_codebase.txt").readText()
val result = executor.execute(
    model = DashscopeModels.QWEN_PLUS,
    prompt = prompt {
        system("You are analyzing a large codebase.")
        user("Document: $largeDocument\n\nQuestion: What are the main components?")
    }
)
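Before sending a very large document, it can help to estimate whether it is likely to fit in the context window. The sketch below is only a rough check: the ratio of about 4 characters per token is an assumption (not a real tokenizer, and it varies by language and content), and the helper names `estimateTokens` and `fitsInContext` are hypothetical and not part of Koog. It also assumes, conservatively, that room for the output should be reserved out of the context size.

```kotlin
import kotlin.math.ceil

// Context and output sizes as listed for Qwen Plus above.
val QWEN_PLUS_CONTEXT_TOKENS = 1_000_000
val QWEN_PLUS_MAX_OUTPUT_TOKENS = 32_768

// Hypothetical helper: rough token estimate.
// Assumption: about 4 characters per token; a real tokenizer will differ.
fun estimateTokens(text: String, charsPerToken: Double = 4.0): Int =
    ceil(text.length / charsPerToken).toInt()

// Hypothetical helper: true if the estimated input size leaves room
// for the reserved output tokens inside the context window.
fun fitsInContext(
    text: String,
    contextTokens: Int = QWEN_PLUS_CONTEXT_TOKENS,
    reservedForOutput: Int = QWEN_PLUS_MAX_OUTPUT_TOKENS
): Boolean = estimateTokens(text) <= contextTokens - reservedForOutput
```

If `fitsInContext(largeDocument)` returns false, splitting the document or summarizing parts of it first is probably safer than relying on the request succeeding.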
Function Calling
data class SearchArgs(val query: String, val scope: String)

val searchTool = tool<SearchArgs, String>(
    name = "web_search",
    description = "Search the web for information"
) { args ->
    "Search results for ${args.query} in ${args.scope}"
}

val agent = AIAgent(
    executor = SingleLLMPromptExecutor(
        DashscopeLLMClient(System.getenv("DASHSCOPE_API_KEY"))
    ),
    llmModel = DashscopeModels.QWEN_PLUS,
    tools = toolRegistry { tool(searchTool) }
) {
    defineGraph<String, String>("search-agent") {
        val response = callLLM()
        finish(response)
    }
}
val result = agent.execute("Find recent AI research papers")
Code Generation
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
val result = executor.execute(
    model = DashscopeModels.QWEN3_CODER_PLUS,
    prompt = prompt {
        user("Write a Kotlin function to calculate fibonacci numbers recursively")
    }
)
println(result.first().content)
Structured Output
@Serializable
data class Analysis(
    val summary: String,
    val keyPoints: List<String>,
    val sentiment: String
)

val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
val result = executor.execute(
    model = DashscopeModels.QWEN_PLUS,
    params = DashscopeParams(
        schema = LLMParams.Schema.JSON.Standard(
            name = "Analysis",
            schema = /* JSON schema */
        )
    ),
    prompt = prompt {
        user("Analyze this text: The AI revolution is transforming industries...")
    }
)
val analysis = Json.decodeFromString<Analysis>(result.first().content)
Vision - Image Analysis
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
val result = executor.execute(
    model = DashscopeModels.QWEN3_OMNI_FLASH,
    prompt = prompt {
        user {
            text("What's in this image?")
            image(
                url = "https://example.com/photo.jpg"
                // or: bytes = imageBytes
            )
        }
    }
)
Video Processing
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
val result = executor.execute(
    model = DashscopeModels.QWEN3_OMNI_FLASH,
    prompt = prompt {
        user {
            text("Describe what happens in this video")
            video(
                url = "https://example.com/video.mp4"
            )
        }
    }
)
Streaming Responses
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
executor.executeStreaming(
    model = DashscopeModels.QWEN_PLUS,
    prompt = prompt { user("Write a detailed essay...") }
).collect { frame ->
    when (frame) {
        is StreamFrame.TextDelta -> print(frame.text)
        is StreamFrame.End -> println("\nComplete")
        else -> {}
    }
}
Advanced Configuration
Custom Parameters
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY")
)
val executor = SingleLLMPromptExecutor(client)
executor.execute(
    model = DashscopeModels.QWEN_PLUS,
    params = DashscopeParams(
        temperature = 0.7,
        maxTokens = 8192,
        topP = 0.9,
        presencePenalty = 0.5,
        frequencyPenalty = 0.5,
        enableSearch = true, // Enable web search
        enableThinking = true, // Enable reasoning display
        parallelToolCalls = true
    ),
    prompt = prompt { user("Research recent developments in AI") }
)
Web Search Integration
Enable real-time web search for up-to-date information:
executor.execute(
    model = DashscopeModels.QWEN_PLUS,
    params = DashscopeParams(
        enableSearch = true
    ),
    prompt = prompt { user("What are today's news headlines?") }
)
Reasoning Display
Show the model’s thinking process:
executor.execute(
    model = DashscopeModels.QWEN3_MAX,
    params = DashscopeParams(
        enableThinking = true
    ),
    prompt = prompt { user("Solve this complex problem step by step...") }
)
Tool Choice
Control whether the model may, must, or must not call tools:
executor.execute(
    model = DashscopeModels.QWEN_PLUS,
    params = DashscopeParams(
        toolChoice = LLMParams.ToolChoice.Required // Auto, None, Required, Named
    ),
    prompt = prompt { user("Search for information") }
)
Model Capabilities
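| Model | Context | Output | Vision | Audio/Video | Tools | Structured JSON |
|---|---|---|---|---|---|---|
| Qwen Flash | 1M | 32K | ❌ | ❌ | ✅ | ❌ |
| Qwen Plus | 1M | 32K | ❌ | ❌ | ✅ | ✅ |
| Qwen Plus Latest | 1M | 32K | ❌ | ❌ | ✅ | ✅ |
| Qwen3 Max | 262K | 65K | ❌ | ❌ | ✅ | ✅ |
| Qwen3 Omni Flash | 65K | 16K | ✅ | ✅ | ✅ | ❌ |
| Qwen3 Coder Plus | 1M | 65K | ❌ | ❌ | ✅ | ✅ |
| Qwen3 Coder Flash | 1M | 32K | ❌ | ❌ | ✅ | ❌ |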
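The capabilities listed above can also be kept as data, so that an application picks a model by requirement instead of hard-coding one. The following is a minimal sketch: `ModelCapabilities`, `dashscopeCapabilities`, and `modelsSupporting` are hypothetical names introduced for illustration (not part of Koog), and the values are copied from this page's listings, so they may lag behind the provider's current limits.

```kotlin
// Hypothetical record of one row of the capabilities table.
data class ModelCapabilities(
    val name: String,
    val contextTokens: Int,
    val maxOutputTokens: Int,
    val vision: Boolean,
    val audioVideo: Boolean,
    val tools: Boolean,
    val structuredJson: Boolean
)

// Values transcribed from the model listings on this page.
val dashscopeCapabilities = listOf(
    ModelCapabilities("Qwen Flash", 1_000_000, 32_768, false, false, true, false),
    ModelCapabilities("Qwen Plus", 1_000_000, 32_768, false, false, true, true),
    ModelCapabilities("Qwen Plus Latest", 1_000_000, 32_768, false, false, true, true),
    ModelCapabilities("Qwen3 Max", 262_144, 65_536, false, false, true, true),
    ModelCapabilities("Qwen3 Omni Flash", 65_536, 16_384, true, true, true, false),
    ModelCapabilities("Qwen3 Coder Plus", 1_000_000, 65_536, false, false, true, true),
    ModelCapabilities("Qwen3 Coder Flash", 1_000_000, 32_768, false, false, true, false)
)

// Returns the names of models that satisfy every stated requirement.
fun modelsSupporting(
    minContextTokens: Int = 0,
    needsVision: Boolean = false,
    needsStructuredJson: Boolean = false
): List<String> =
    dashscopeCapabilities
        .filter { it.contextTokens >= minContextTokens }
        .filter { !needsVision || it.vision }
        .filter { !needsStructuredJson || it.structuredJson }
        .map { it.name }
```

For example, `modelsSupporting(needsVision = true)` narrows the list to the single omni model, while asking for a 1M context together with structured JSON leaves the two Qwen Plus variants and Qwen3 Coder Plus.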
Pricing
Pricing varies by model and region. See Alibaba Cloud Pricing for current rates.
Best Practices
Use Qwen Plus for most tasks - excellent balance of capability and performance
Use Qwen Flash for high-throughput, latency-sensitive applications
Use Qwen3 Max for complex reasoning requiring advanced capabilities
Use Qwen3 Coder Plus for software engineering and coding agents
Leverage 1M context for processing entire codebases or large documents
Use Qwen3 Omni Flash for multimodal applications with audio/video
Enable search for real-time information retrieval
Use Qwen Plus Latest to automatically benefit from model improvements
Use Cases
Use Qwen Plus or Qwen3 Coder Plus with 1M token context to process entire books, codebases, or datasets in a single request.
Use Qwen3 Coder Plus for coding agents that can see and understand entire projects, perform multi-file edits, and assist with complex refactoring.
Use Qwen3 Omni Flash for applications requiring text, image, video, and audio understanding, such as content analysis and interactive chat.
Use Qwen Plus with enableSearch = true for applications requiring up-to-date information from the web.
Use Qwen Flash or Qwen3 Coder Flash for applications requiring fast responses with minimal latency.
Use Qwen3 Max with enableThinking = true for problems requiring advanced reasoning and step-by-step analysis.
Limitations
No embeddings API: Use OpenAI or other providers for embeddings
No moderation API: Implement custom content filtering
Regional availability: Some features may vary between international and China endpoints
Model availability: Some models may require specific API access levels
Troubleshooting
Rate Limits
val client = DashscopeLLMClient(
    apiKey = System.getenv("DASHSCOPE_API_KEY"),
    settings = DashscopeClientSettings(
        timeoutConfig = ConnectionTimeoutConfig(
            requestTimeoutMillis = 300_000 // 5 minutes for large context
        )
    )
)
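The snippet above raises the request timeout. A common complementary approach to rate limiting is to retry with exponential backoff. The sketch below is a generic, blocking helper and not a Koog API: `retryWithBackoff` is a hypothetical name, and the default retry condition (an exception message containing "rate_limit") is an assumption borrowed from the error-handling example on this page. In coroutine code, a suspending variant that uses `delay` instead of `Thread.sleep` would likely be preferable.

```kotlin
// Hypothetical helper (not part of Koog): runs `block`, retrying on
// retryable failures and multiplying the wait by `factor` after each one.
fun <T> retryWithBackoff(
    maxAttempts: Int = 5,
    initialDelayMillis: Long = 1_000,
    factor: Double = 2.0,
    // Assumption: rate-limit errors carry "rate_limit" in their message.
    isRetryable: (Exception) -> Boolean = { it.message?.contains("rate_limit") == true },
    // Injectable so the wait can be replaced, for example in tests.
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    block: () -> T
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMillis = initialDelayMillis
    var attempt = 1
    while (true) {
        try {
            return block()
        } catch (e: Exception) {
            // Give up on non-retryable errors or once attempts are exhausted.
            if (attempt >= maxAttempts || !isRetryable(e)) throw e
            sleep(delayMillis)
            delayMillis = (delayMillis * factor).toLong()
            attempt++
        }
    }
}
```

Wrapping a request in `retryWithBackoff { ... }` keeps the retry policy in one place; adding random jitter to the delay is a frequent refinement when many clients retry at once.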
Endpoint Selection
// If experiencing connectivity issues, try the appropriate endpoint:

// International users
val clientIntl = DashscopeLLMClient(
    apiKey = apiKey,
    settings = DashscopeClientSettings(
        baseUrl = "https://dashscope-intl.aliyuncs.com/"
    )
)

// China mainland users
val clientChina = DashscopeLLMClient(
    apiKey = apiKey,
    settings = DashscopeClientSettings(
        baseUrl = "https://dashscope.aliyuncs.com/"
    )
)
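If the same application is deployed in both regions, the endpoint choice can be driven by configuration instead of code changes. This is a small sketch under stated assumptions: `dashscopeBaseUrl` is a hypothetical helper, and the region labels "cn" and "china" are illustrative values chosen here, not identifiers defined by DashScope. The two URLs are the ones shown above.

```kotlin
// Base URLs taken from the endpoint examples on this page.
val DASHSCOPE_INTL_URL = "https://dashscope-intl.aliyuncs.com/"
val DASHSCOPE_CHINA_URL = "https://dashscope.aliyuncs.com/"

// Hypothetical helper: maps a region label to a base URL.
// Unknown or missing labels fall back to the international endpoint.
fun dashscopeBaseUrl(region: String?): String =
    when (region?.trim()?.lowercase()) {
        "cn", "china" -> DASHSCOPE_CHINA_URL
        else -> DASHSCOPE_INTL_URL
    }
```

The result could then be passed as `baseUrl`, for example with the label read from an environment variable of your choosing.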
Error Handling
try {
    val result = executor.execute(
        model = DashscopeModels.QWEN_PLUS,
        prompt = prompt { user("Hello") }
    )
} catch (e: LLMClientException) {
    when {
        e.message?.contains("rate_limit") == true -> {
            // Handle rate limiting
        }
        e.message?.contains("invalid_api_key") == true -> {
            // Check API key configuration
        }
        else -> throw e
    }
}
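The message checks above can be pulled into one function so that the matching logic is testable on its own. The sketch below is illustrative only: `DashscopeErrorKind` and `classifyError` are hypothetical names, and the substrings "rate_limit" and "invalid_api_key" are taken from the example above, not from a verified list of DashScope error codes.

```kotlin
// Hypothetical categories for the cases handled in the example above.
enum class DashscopeErrorKind { RATE_LIMIT, INVALID_API_KEY, OTHER }

// Hypothetical helper: classifies an exception message by substring.
// Assumption: the marker strings appear verbatim in the message.
fun classifyError(message: String?): DashscopeErrorKind {
    val text = message ?: return DashscopeErrorKind.OTHER
    return when {
        "rate_limit" in text -> DashscopeErrorKind.RATE_LIMIT
        "invalid_api_key" in text -> DashscopeErrorKind.INVALID_API_KEY
        else -> DashscopeErrorKind.OTHER
    }
}
```

A catch block can then switch on `classifyError(e.message)` and rethrow for `OTHER`, which keeps the string matching out of the call sites.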
Resources