Multi-Turn Conversations

Chats enable multi-turn conversations in which the model can draw on earlier messages and build on the conversation context. This makes them a good fit for chatbots, assistants, and other interactive applications.

Creating a Chat Session

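Create a chat session with client.chats.create() to start a conversation. The examples on this page assume that client is an already initialized google.genai Client instance: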

chat = client.chats.create(model='gemini-2.5-flash')

Synchronous Non-Streaming

Send messages and receive complete responses:

chat = client.chats.create(model='gemini-2.5-flash')
response = chat.send_message('tell me a story')
print(response.text)

response = chat.send_message('summarize the story you told me in 1 sentence')
print(response.text)

Each call to send_message() retains the conversation context, so the model can reference previous messages.

Synchronous Streaming

Stream responses as they’re generated for real-time user experiences:

chat = client.chats.create(model='gemini-2.5-flash')
for chunk in chat.send_message_stream('tell me a story'):
    print(chunk.text, end='')

This is ideal for displaying responses progressively in chat interfaces.
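Chat interfaces often need the complete reply as well, for example to store it once the stream ends. The helper below is a minimal sketch of that idea: stream_reply is a hypothetical name, and the check for chunks without text is an assumption rather than documented behavior.

```python
def stream_reply(chat, prompt: str) -> str:
    """Print a streamed reply as it arrives and return the full text."""
    pieces = []
    for chunk in chat.send_message_stream(prompt):
        # Assumption: a chunk may carry no text, so skip empty ones.
        if chunk.text:
            print(chunk.text, end='', flush=True)
            pieces.append(chunk.text)
    print()  # finish the line once the stream ends
    return ''.join(pieces)
```

Calling stream_reply(chat, 'tell me a story') with a chat session created as above would print the story progressively and return it as one string.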

Asynchronous Non-Streaming

Use async/await for non-blocking chat operations:

import asyncio

async def chat_example():
    chat = client.aio.chats.create(model='gemini-2.5-flash')
    response = await chat.send_message('tell me a story')
    print(response.text)
    
    response = await chat.send_message('summarize the story you told me in 1 sentence')
    print(response.text)

asyncio.run(chat_example())
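Non-blocking calls pay off when several conversations run at once. The sketch below sends each prompt in its own session and waits for all replies together. ask_in_parallel is a hypothetical helper, and it assumes the client passed in exposes client.aio.chats.create() as in the example above.

```python
import asyncio


async def ask_in_parallel(client, prompts, model='gemini-2.5-flash'):
    """Send each prompt in its own chat session; return reply texts in input order."""

    async def ask(prompt):
        chat = client.aio.chats.create(model=model)
        response = await chat.send_message(prompt)
        return response.text

    # gather() runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(ask(p) for p in prompts))
```

A call such as asyncio.run(ask_in_parallel(client, ['tell me a story', 'tell me a joke'])) would return one reply per prompt. Each session keeps its own history, so the conversations do not see each other's messages.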

Asynchronous Streaming

Combine async operations with streaming to display output progressively without blocking the event loop:

import asyncio

async def chat_stream_example():
    chat = client.aio.chats.create(model='gemini-2.5-flash')
    async for chunk in await chat.send_message_stream('tell me a story'):
        print(chunk.text, end='')

asyncio.run(chat_stream_example())
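The call pattern above has two steps: the call is awaited to obtain the stream, and the stream is then consumed with async for. The sketch below separates the two steps and collects the chunks into one string. stream_reply_async is a hypothetical name, and skipping chunks without text is an assumption.

```python
import asyncio


async def stream_reply_async(chat, prompt: str) -> str:
    """Collect an asynchronously streamed reply into one string."""
    pieces = []
    # Step 1: await the call to obtain the async stream.
    stream = await chat.send_message_stream(prompt)
    # Step 2: iterate over the stream as chunks arrive.
    async for chunk in stream:
        if chunk.text:  # assumption: some chunks may have no text
            pieces.append(chunk.text)
    return ''.join(pieces)
```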

Context Retention

Chat sessions automatically maintain conversation history. The model can:

Reference previous messages
Build on earlier responses
Maintain consistency across turns
Use pronouns and context from prior exchanges

chat = client.chats.create(model='gemini-2.5-flash')

# First turn
response = chat.send_message('My name is Alice and I love hiking.')
print(response.text)  # Model acknowledges the information

# Second turn - model remembers your name
response = chat.send_message('What was my name again?')
print(response.text)  # e.g. "Your name is Alice"

# Third turn - model remembers your interests
response = chat.send_message('Recommend an activity for me.')
print(response.text)  # Model suggests hiking-related activities
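One way to picture context retention is a session object that appends every turn to a list and hands the whole list to the model on each call. The toy below illustrates only that idea. ToyChat and count_user_turns are invented for this sketch and are not the SDK's implementation.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (role, text)


class ToyChat:
    """Illustrative stand-in for a chat session; not the SDK's implementation."""

    def __init__(self, model_fn: Callable[[List[Turn]], str]):
        self.model_fn = model_fn  # hypothetical: maps the full history to a reply
        self.history: List[Turn] = []

    def send_message(self, text: str) -> str:
        self.history.append(('user', text))
        # The whole history is passed on every turn, which is what lets
        # later replies depend on earlier messages.
        reply = self.model_fn(list(self.history))
        self.history.append(('model', reply))
        return reply


def count_user_turns(history: List[Turn]) -> str:
    """Toy 'model' that reports how many user messages it has seen."""
    seen = sum(1 for role, _ in history if role == 'user')
    return f'I have seen {seen} user message(s).'


chat = ToyChat(count_user_turns)
chat.send_message('My name is Alice.')
print(chat.send_message('What was my name again?'))
```

Because the history grows with every turn, the amount of text sent per request grows too, which is why long conversations eventually run into context limits.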

Configuration Options

You can configure chat sessions with the same options as generate_content:

from google.genai import types

chat = client.chats.create(
    model='gemini-2.5-flash',
    config=types.GenerateContentConfig(
        temperature=0.7,
        max_output_tokens=1024,
        system_instruction='You are a helpful assistant that speaks concisely.',
    )
)

response = chat.send_message('tell me about Python')
print(response.text)

Accessing Chat History

You can access the full conversation history from a chat session:

chat = client.chats.create(model='gemini-2.5-flash')
chat.send_message('Hello!')
chat.send_message('How are you?')

# Access the conversation history
for message in chat.get_history():
    print(f"{message.role}: {message.parts[0].text}")
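Indexing parts[0].text works for plain text turns but can break when a message has no text part. The helper below is a more defensive sketch. format_history is a hypothetical name, and the possibility of parts without text (for example, function calls) is an assumption.

```python
def format_history(history) -> str:
    """Render messages as 'role: text' lines, tolerating parts without text."""
    lines = []
    for message in history:
        # Assumption: a part may lack text, so join only the parts that
        # have some instead of indexing parts[0].
        texts = [
            part.text
            for part in (message.parts or [])
            if getattr(part, 'text', None)
        ]
        body = ' '.join(texts) if texts else '[no text]'
        lines.append(f'{message.role}: {body}')
    return '\n'.join(lines)
```

Passing the session's history to format_history returns a transcript that can be printed or logged.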

Best Practices

Use streaming for better user experience in chat interfaces
Use async for handling multiple concurrent chat sessions
Set system instructions to define the assistant’s personality and behavior
Monitor context length, since long conversations may exceed the model's token limit
Create new sessions when starting a new topic or conversation
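The last two practices can be combined into a simple guard that suggests a new session once the conversation grows large. The sketch below uses character count as a rough stand-in for tokens. Both function names and the default budget of 200,000 characters are placeholders, not values from the SDK.

```python
def approx_history_chars(history) -> int:
    """Rough size of a conversation: total characters across all text parts."""
    total = 0
    for message in history:
        for part in (message.parts or []):
            total += len(getattr(part, 'text', None) or '')
    return total


def should_start_new_session(history, max_chars: int = 200_000) -> bool:
    """Return True once the rough size passes max_chars (a hypothetical budget)."""
    return approx_history_chars(history) > max_chars
```

Character counts only approximate token usage, so a budget chosen this way should leave generous headroom.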

Common Patterns

Chat with History Clearing

chat = client.chats.create(model='gemini-2.5-flash')

# Have a conversation
chat.send_message('Remember this number: 42')
chat.send_message('What number did I tell you?')  # Reply should mention 42

# Start fresh by creating a new chat session
chat = client.chats.create(model='gemini-2.5-flash')
chat.send_message('What number did I tell you?')  # No memory of 42
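When resets happen often, the same pattern can be wrapped in a small class so callers do not need to keep track of the model name. ResettableChat is a hypothetical wrapper. It relies only on client.chats.create() and send_message() as used above.

```python
class ResettableChat:
    """Thin wrapper that swaps in a fresh chat session on reset()."""

    def __init__(self, client, model: str = 'gemini-2.5-flash'):
        self._client = client
        self._model = model
        self._chat = client.chats.create(model=model)

    def send(self, text: str) -> str:
        """Send one message and return the reply text."""
        return self._chat.send_message(text).text

    def reset(self) -> None:
        # Same approach as above: a new session starts with empty history.
        self._chat = self._client.chats.create(model=self._model)
```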

Chat with Function Calling

from google.genai import types

def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return "sunny"

chat = client.chats.create(
    model='gemini-2.5-flash',
    config=types.GenerateContentConfig(
        tools=[get_weather],
    )
)

response = chat.send_message('What is the weather in Boston?')
print(response.text)

response = chat.send_message('How about in New York?')
print(response.text)

The chat session maintains context while supporting function calling across multiple turns.
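The stub above returns "sunny" for every location, so the Boston and New York turns cannot be told apart. For local experiments, a tool that varies by input makes the multi-turn behavior easier to observe. The lookup table below is invented data, and a real tool would query a weather service instead.

```python
# Hypothetical data for illustration only.
_FAKE_WEATHER = {'boston': 'sunny', 'new york': 'cloudy'}


def get_weather(location: str) -> str:
    """Get the current weather for a location.

    Args:
        location: City name, for example "Boston".
    """
    # Normalize the input so "Boston" and " boston " hit the same entry.
    return _FAKE_WEATHER.get(location.strip().lower(), 'unknown')
```

The function keeps its type hints and docstring, as in the example above, and can be passed in the tools list the same way.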
