Sampling allows MCP servers to request LLM completions from the client. This enables agentic behaviors where server-side tools delegate reasoning back to the client’s language model.
How Sampling Works
1. **Server requests completion**: the server calls `SampleAsync()` during tool execution.
2. **Request sent to client**: the sampling request is sent over the MCP protocol (see the example request after this list).
3. **Client handler invoked**: the client's `SamplingHandler` processes the request.
4. **LLM generates response**: the handler forwards the request to an LLM and returns the result.
5. **Server continues**: the server receives the completion and continues execution.
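On the wire, step 2 is an ordinary JSON-RPC request. Roughly, an abridged `sampling/createMessage` request might look like this (field shapes follow the MCP specification; the values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": { "type": "text", "text": "Summarize this document..." }
      }
    ],
    "maxTokens": 256
  }
}
```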
Configuring the Client
Set up a sampling handler when creating the client:
Using IChatClient (Recommended)
The simplest approach uses CreateSamplingHandler() with any IChatClient implementation:
```csharp
using Microsoft.Extensions.AI;
using ModelContextProtocol.Client;
using OpenAI;

// Create an LLM client for sampling
IChatClient samplingClient = new OpenAIClient(apiKey)
    .AsChatClient("gpt-4o-mini")
    .AsBuilder()
    .Build();

// Configure the sampling handler
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = samplingClient.CreateSamplingHandler()
    }
};

await using var client = await McpClient.CreateAsync(transport, options);
```
Multiple Providers
The same pattern works with any provider that exposes an IChatClient; only the client construction changes. For example, with OpenAI:
```csharp
using OpenAI;

var samplingClient = new OpenAIClient(apiKey)
    .AsChatClient("gpt-4o-mini");

var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = samplingClient.CreateSamplingHandler()
    }
};
```
Custom Sampling Handler
For full control, implement a custom handler:
```csharp
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = async (request, progress, cancellationToken) =>
        {
            // Extract the prompt from the last message
            string prompt = request?.Messages?.LastOrDefault()?.Content
                .OfType<TextContentBlock>()
                .FirstOrDefault()?.Text ?? string.Empty;

            // Call your LLM service
            string response = await MyLLMService.GenerateAsync(
                prompt,
                maxTokens: request?.MaxTokens ?? 1000,
                temperature: request?.Temperature ?? 1.0f,
                cancellationToken
            );

            // Return the MCP result
            return new CreateMessageResult
            {
                Model = "my-model",
                Role = Role.Assistant,
                Content = [new TextContentBlock { Text = response }],
                StopReason = CreateMessageResult.StopReasonEndTurn
            };
        }
    }
};

await using var client = await McpClient.CreateAsync(transport, options);
```
Complete Example
Here’s a complete example using sampling with an MCP server:
```csharp
using Microsoft.Extensions.AI;
using Microsoft.Extensions.Logging;
using ModelContextProtocol.Client;
using OpenAI;

using var loggerFactory = LoggerFactory.Create(builder =>
    builder.AddConsole()
);

// Create the sampling client (used to serve the server's sampling requests)
using IChatClient samplingClient = new OpenAIClient(
    Environment.GetEnvironmentVariable("OPENAI_API_KEY")
)
    .AsChatClient("gpt-4o-mini")
    .AsBuilder()
    .Build();

// Connect to an MCP server with the sampling capability
var mcpClient = await McpClient.CreateAsync(
    new StdioClientTransport(new()
    {
        Command = "npx",
        Arguments = ["-y", "@modelcontextprotocol/server-everything"],
        Name = "Everything"
    }),
    clientOptions: new()
    {
        Handlers = new()
        {
            SamplingHandler = samplingClient.CreateSamplingHandler()
        }
    },
    loggerFactory: loggerFactory
);

// Get available tools
var tools = await mcpClient.ListToolsAsync();
Console.WriteLine($"Tools: {string.Join(", ", tools.Select(t => t.Name))}");

// Create the main chat client (for user interaction)
using IChatClient chatClient = new OpenAIClient(
    Environment.GetEnvironmentVariable("OPENAI_API_KEY")
)
    .AsChatClient("gpt-4o")
    .AsBuilder()
    .UseFunctionInvocation()
    .Build();

// Chat loop
List<ChatMessage> messages = [];
while (true)
{
    Console.Write("Q: ");
    var input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input)) break;

    messages.Add(new(ChatRole.User, input));

    // The chat client calls tools, which may trigger sampling
    var response = await chatClient.GetResponseAsync(
        messages,
        new ChatOptions { Tools = [.. tools] }
    );

    Console.WriteLine($"A: {response.Message}");
    messages.Add(response.Message);
}
```
Sampling with Progress
Monitor sampling progress using the progress parameter:
```csharp
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = async (request, progress, cancellationToken) =>
        {
            var chatOptions = new ChatOptions
            {
                MaxOutputTokens = request?.MaxTokens,
                Temperature = request?.Temperature
            };

            var updates = new List<ChatResponseUpdate>();

            // ConvertToMessages is a user-defined helper (sketched below)
            await foreach (var update in chatClient.GetStreamingResponseAsync(
                ConvertToMessages(request),
                chatOptions,
                cancellationToken
            ))
            {
                updates.Add(update);

                // Report progress if the server requested it
                if (request?.ProgressToken is not null)
                {
                    progress.Report(new ProgressNotificationValue
                    {
                        Progress = updates.Count
                    });
                }
            }

            // ConvertToResult is a user-defined helper (sketched below)
            return ConvertToResult(updates);
        }
    }
};
```
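`ConvertToMessages` and `ConvertToResult` are user-supplied helpers. One possible sketch, assuming the message shapes used elsewhere on this page and the `ToChatResponse()` aggregation helper from Microsoft.Extensions.AI:

```csharp
// Hypothetical helpers for the snippet above; adapt to your SDK version.
static List<ChatMessage> ConvertToMessages(CreateMessageRequestParams? request) =>
    request?.Messages?
        .Select(m => new ChatMessage(
            m.Role == Role.User ? ChatRole.User : ChatRole.Assistant,
            string.Concat(m.Content.OfType<TextContentBlock>().Select(t => t.Text))))
        .ToList() ?? [];

static CreateMessageResult ConvertToResult(List<ChatResponseUpdate> updates)
{
    // Aggregate the streamed updates into a single response
    var response = updates.ToChatResponse();

    return new CreateMessageResult
    {
        Model = response.ModelId ?? "unknown",
        Role = Role.Assistant,
        Content = [new TextContentBlock { Text = response.Text }],
        StopReason = CreateMessageResult.StopReasonEndTurn
    };
}
```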
Server-Side Usage
Servers request sampling during tool execution by calling SampleAsync() on the McpServer instance.
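As a sketch, a server-side tool might delegate summarization back to the client's LLM like this (assuming the SDK's attribute-based tool model, where the `McpServer` instance is injected as a tool parameter; adapt to your SDK version):

```csharp
using System.ComponentModel;
using Microsoft.Extensions.AI;
using ModelContextProtocol.Server;

[McpServerToolType]
public static class SummarizeTool
{
    // Hypothetical tool: delegates reasoning back to the client's LLM.
    [McpServerTool, Description("Summarizes text using the client's LLM.")]
    public static async Task<string> Summarize(
        McpServer server,          // injected by the SDK
        string text,
        CancellationToken cancellationToken)
    {
        ChatMessage[] messages =
        [
            new(ChatRole.User, $"Summarize the following text:\n{text}")
        ];

        // SampleAsync sends a sampling/createMessage request to the client
        var response = await server.SampleAsync(
            messages,
            new ChatOptions { MaxOutputTokens = 256 },
            cancellationToken);

        return response.Text;
    }
}
```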
Capability Negotiation
When a SamplingHandler is configured, the client automatically advertises the sampling capability during initialization:
```csharp
// The capability is advertised automatically
var options = new McpClientOptions
{
    Handlers = new()
    {
        SamplingHandler = chatClient.CreateSamplingHandler()
    }
    // No need to manually set Capabilities.Sampling
};
```
Servers can check whether the client supports sampling before calling SampleAsync(); if the client did not advertise the capability, the method throws an InvalidOperationException.
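For example (a fragment, assuming the negotiated capabilities are exposed on the server as `ClientCapabilities`):

```csharp
// Sketch: guard a sampling call on the negotiated capability.
if (server.ClientCapabilities?.Sampling is not null)
{
    var response = await server.SampleAsync(messages, chatOptions, cancellationToken);
    // ... use response ...
}
else
{
    // Fall back to behavior that doesn't require the client's LLM
}
```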
Use Cases
Sampling enables powerful agentic behaviors:
- **Content Summarization**: server tools can ask the LLM to summarize large documents or API responses.
- **Decision Making**: tools can delegate complex decisions to the LLM during execution.
- **Code Generation**: the server can request code snippets or transformations from the LLM.
- **Data Analysis**: tools can ask the LLM to analyze and interpret data.
Best Practices
- **Use appropriate models**: use faster, cheaper models (like GPT-4o-mini or Claude Haiku) for sampling to keep costs down.
- **Avoid infinite loops**: be careful not to create circular dependencies where the sampling LLM calls tools that trigger more sampling.
- **Report progress**: implement progress reporting for long-running sampling requests to give users feedback.
Next Steps
- **Roots**: provide filesystem roots to servers.
- **Elicitation**: handle URL elicitation requests.