Sampling allows MCP servers to request LLM completions from the client. This enables agentic behaviors where server-side tools delegate reasoning back to the client’s language model.
How Sampling Works
1. **Server requests completion**: the server calls `SampleAsync()` during tool execution.
2. **Request sent to client**: the sampling request is sent over the MCP protocol (see the example request after this list).
3. **Client handler invoked**: the client's `SamplingHandler` processes the request.
4. **LLM generates response**: the handler forwards the request to an LLM and returns the result.
5. **Server continues**: the server receives the completion and continues execution.
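On the wire, step 2 is an ordinary JSON-RPC request. Roughly, an abridged `sampling/createMessage` request might look like this (field shapes follow the MCP specification; the values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": { "type": "text", "text": "Summarize this document..." }
      }
    ],
    "maxTokens": 256
  }
}
```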
Configuring the Client
Set up a sampling handler when creating the client:
Using IChatClient (Recommended)
The simplest approach uses CreateSamplingHandler() with any IChatClient implementation:
```csharp
using Microsoft.Extensions.AI;
using ModelContextProtocol.Client;
using OpenAI;

// Create an LLM client for sampling
IChatClient samplingClient = new OpenAIClient(apiKey)
    .AsChatClient("gpt-4o-mini")
    .AsBuilder()
    .Build();

// Configure the sampling handler
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = samplingClient.CreateSamplingHandler()
    }
};

await using var client = await McpClient.CreateAsync(transport, options);
```
Multiple Providers
The same pattern works with any provider that exposes an IChatClient; only the client construction changes. For example, with OpenAI:
```csharp
using OpenAI;

var samplingClient = new OpenAIClient(apiKey)
    .AsChatClient("gpt-4o-mini");

var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = samplingClient.CreateSamplingHandler()
    }
};
```
Custom Sampling Handler
For full control, implement a custom handler:
```csharp
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = async (request, progress, cancellationToken) =>
        {
            // Extract the prompt from the last message
            string prompt = request?.Messages?.LastOrDefault()?.Content
                .OfType<TextContentBlock>()
                .FirstOrDefault()?.Text ?? string.Empty;

            // Call your LLM service
            string response = await MyLLMService.GenerateAsync(
                prompt,
                maxTokens: request?.MaxTokens ?? 1000,
                temperature: request?.Temperature ?? 1.0f,
                cancellationToken
            );

            // Return the MCP result
            return new CreateMessageResult
            {
                Model = "my-model",
                Role = Role.Assistant,
                Content = [new TextContentBlock { Text = response }],
                StopReason = CreateMessageResult.StopReasonEndTurn
            };
        }
    }
};

await using var client = await McpClient.CreateAsync(transport, options);
```
Complete Example
Here’s a complete example using sampling with an MCP server:
```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Logging;
using ModelContextProtocol.Client;
using OpenAI;

using var loggerFactory = LoggerFactory.Create(builder =>
    builder.AddConsole()
);

// Create the sampling client (used to serve the server's sampling requests)
using IChatClient samplingClient = new OpenAIClient(
    Environment.GetEnvironmentVariable("OPENAI_API_KEY")
)
    .AsChatClient("gpt-4o-mini")
    .AsBuilder()
    .Build();

// Connect to an MCP server with the sampling capability
var mcpClient = await McpClient.CreateAsync(
    new StdioClientTransport(new()
    {
        Command = "npx",
        Arguments = ["-y", "@modelcontextprotocol/server-everything"],
        Name = "Everything"
    }),
    clientOptions: new()
    {
        Handlers = new()
        {
            SamplingHandler = samplingClient.CreateSamplingHandler()
        }
    },
    loggerFactory: loggerFactory
);

// Get available tools
var tools = await mcpClient.ListToolsAsync();
Console.WriteLine($"Tools: {string.Join(", ", tools.Select(t => t.Name))}");

// Create the main chat client (for user interaction)
using IChatClient chatClient = new OpenAIClient(
    Environment.GetEnvironmentVariable("OPENAI_API_KEY")
)
    .AsChatClient("gpt-4o")
    .AsBuilder()
    .UseFunctionInvocation()
    .Build();

// Chat loop
List<ChatMessage> messages = [];
while (true)
{
    Console.Write("Q: ");
    var input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input)) break;

    messages.Add(new(ChatRole.User, input));

    // The chat client calls tools, which may trigger sampling
    var response = await chatClient.GetResponseAsync(
        messages,
        new ChatOptions { Tools = [.. tools] }
    );

    Console.WriteLine($"A: {response.Message}");
    messages.Add(response.Message);
}
```
Sampling with Progress
Monitor sampling progress using the progress parameter:
```csharp
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = async (request, progress, cancellationToken) =>
        {
            var chatOptions = new ChatOptions
            {
                MaxOutputTokens = request?.MaxTokens,
                Temperature = request?.Temperature
            };

            var updates = new List<ChatResponseUpdate>();

            // ConvertToMessages is a user-defined helper (sketched below)
            await foreach (var update in chatClient.GetStreamingResponseAsync(
                ConvertToMessages(request),
                chatOptions,
                cancellationToken
            ))
            {
                updates.Add(update);

                // Report progress if the server requested it
                if (request?.ProgressToken is not null)
                {
                    progress.Report(new ProgressNotificationValue
                    {
                        Progress = updates.Count
                    });
                }
            }

            // ConvertToResult is a user-defined helper (sketched below)
            return ConvertToResult(updates);
        }
    }
};
```
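`ConvertToMessages` and `ConvertToResult` are user-supplied helpers. One possible sketch, assuming the message shapes used elsewhere on this page and the `ToChatResponse()` aggregation helper from Microsoft.Extensions.AI:

```csharp
// Hypothetical helpers for the snippet above; adapt to your SDK version.
static List<ChatMessage> ConvertToMessages(CreateMessageRequestParams? request) =>
    request?.Messages?
        .Select(m => new ChatMessage(
            m.Role == Role.User ? ChatRole.User : ChatRole.Assistant,
            string.Concat(m.Content.OfType<TextContentBlock>().Select(t => t.Text))))
        .ToList() ?? [];

static CreateMessageResult ConvertToResult(List<ChatResponseUpdate> updates)
{
    // Aggregate the streamed updates into a single response
    var response = updates.ToChatResponse();

    return new CreateMessageResult
    {
        Model = response.ModelId ?? "unknown",
        Role = Role.Assistant,
        Content = [new TextContentBlock { Text = response.Text }],
        StopReason = CreateMessageResult.StopReasonEndTurn
    };
}
```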
Server-Side Usage
Servers request sampling during tool execution by calling SampleAsync() on the McpServer instance.
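As a sketch, a server-side tool might delegate summarization back to the client's LLM like this (assuming the SDK's attribute-based tool model, where the `McpServer` instance is injected as a tool parameter; adapt to your SDK version):

```csharp
using System.ComponentModel;
using Microsoft.Extensions.AI;
using ModelContextProtocol.Server;

[McpServerToolType]
public static class SummarizeTool
{
    // Hypothetical tool: delegates reasoning back to the client's LLM.
    [McpServerTool, Description("Summarizes text using the client's LLM.")]
    public static async Task<string> Summarize(
        McpServer server,          // injected by the SDK
        string text,
        CancellationToken cancellationToken)
    {
        ChatMessage[] messages =
        [
            new(ChatRole.User, $"Summarize the following text:\n{text}")
        ];

        // SampleAsync sends a sampling/createMessage request to the client
        var response = await server.SampleAsync(
            messages,
            new ChatOptions { MaxOutputTokens = 256 },
            cancellationToken);

        return response.Text;
    }
}
```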
Capability Negotiation
When a SamplingHandler is configured, the client automatically advertises the sampling capability during initialization:
```csharp
// The capability is advertised automatically
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = chatClient.CreateSamplingHandler()
    }
    // No need to manually set Capabilities.Sampling
};
```
Servers can check whether the client supports sampling before calling SampleAsync(); if the client did not advertise the capability, the method throws an InvalidOperationException.
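For example (a fragment, assuming the negotiated capabilities are exposed on the server as `ClientCapabilities`):

```csharp
// Sketch: guard a sampling call on the negotiated capability.
if (server.ClientCapabilities?.Sampling is not null)
{
    var response = await server.SampleAsync(messages, chatOptions, cancellationToken);
    // ... use response ...
}
else
{
    // Fall back to behavior that doesn't require the client's LLM
}
```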
Use Cases
Sampling enables powerful agentic behaviors:
- **Content Summarization**: server tools can ask the LLM to summarize large documents or API responses.
- **Decision Making**: tools can delegate complex decisions to the LLM during execution.
- **Code Generation**: the server can request code snippets or transformations from the LLM.
- **Data Analysis**: tools can ask the LLM to analyze and interpret data.
Best Practices
- **Use appropriate models**: use faster, cheaper models (like GPT-4o-mini or Claude Haiku) for sampling to keep costs down.
- **Avoid infinite loops**: be careful not to create circular dependencies where the sampling LLM calls tools that trigger more sampling.
- **Report progress**: implement progress reporting for long-running sampling requests to give users feedback.
Next Steps
- **Roots**: provide filesystem roots to servers.
- **Elicitation**: handle URL elicitation requests.