
Overview

The GeminiClient is a robust API client for Google’s Gemini 3 model, featuring retry logic with exponential backoff, multi-turn conversation support, and configurable thinking levels for Thought Signatures.

Key Features

  • Temperature fixed at 1.0 (required for Thought Signatures)
  • Retry logic with exponential backoff for rate limit handling
  • Multi-turn conversation history management
  • JSON response parsing with markdown cleanup
  • Configurable thinking levels (low, medium, high)
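
The retry behavior can be pictured with a minimal sketch. The `max_retries` and `retry_delay` names mirror the `GeminiConfig` fields shown in the Custom Configuration example below; the helper itself is hypothetical, not the client's actual implementation:

```python
import time

def call_with_backoff(fn, max_retries: int = 5, retry_delay: float = 1.0):
    """Retry fn with exponential backoff (sketch; the real client retries
    only on rate-limit errors, not all exceptions)."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: propagate as-is
            # Wait retry_delay, 2*retry_delay, 4*retry_delay, ...
            time.sleep(retry_delay * (2 ** attempt))
```

With `retry_delay=2.0` and `max_retries=5`, this waits 2 s, 4 s, 8 s, and 16 s between attempts before giving up.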

Class Definition

class GeminiClient:
    """Client for interacting with Gemini 3 API with Thought Signatures."""

    def __init__(self, config: Optional[GeminiConfig] = None):
        """Initialize the Gemini client.

        Args:
            config: Optional GeminiConfig. If not provided, loads from environment.
        """

Constructor

config
Optional[GeminiConfig]
default:"None"
Configuration object for Gemini API. If not provided, configuration is loaded from environment variables using get_config().gemini.

Methods

generate

Generate a response from Gemini with optional system instructions and conversation history.
def generate(
    self,
    prompt: str,
    system_instruction: Optional[str] = None,
    thinking_level: ThinkingLevel = "high",
    use_history: bool = False,
) -> GeminiResponse:
    """Generate a response from Gemini.

    Args:
        prompt: The user prompt.
        system_instruction: Optional system-level instruction.
        thinking_level: Level of thinking (low/medium/high).
        use_history: Whether to include conversation history.

    Returns:
        GeminiResponse with text and optional thinking.

    Raises:
        GeminiError: If all retries fail.
    """

Parameters

prompt
str
required
The user prompt to send to Gemini.
system_instruction
Optional[str]
default:"None"
Optional system-level instruction to guide the model’s behavior.
thinking_level
ThinkingLevel
default:"high"
Level of thinking/reasoning to use. Must be one of: "low", "medium", or "high".
use_history
bool
default:"False"
Whether to include conversation history in the request.
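
The `ThinkingLevel` type itself is not shown in these docs; a plausible definition, consistent with the three allowed values, would be a `Literal` alias:

```python
from typing import Literal

# Assumed definition of ThinkingLevel; matches the documented allowed values.
ThinkingLevel = Literal["low", "medium", "high"]
```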

Returns

GeminiResponse
GeminiResponse
Response object containing:
  • text (str): The generated text response
  • thinking (Optional[str]): Optional thinking process output
  • raw_response (Optional[object]): Raw API response object

generate_json

Generate and parse a JSON response from Gemini, with automatic cleanup of markdown code blocks.
def generate_json(
    self,
    prompt: str,
    system_instruction: Optional[str] = None,
    thinking_level: ThinkingLevel = "high",
) -> dict:
    """Generate and parse a JSON response from Gemini.

    Args:
        prompt: The user prompt (should request JSON output).
        system_instruction: Optional system-level instruction.
        thinking_level: Level of thinking (low/medium/high).

    Returns:
        Parsed JSON dict.

    Raises:
        GeminiInvalidResponseError: If response cannot be parsed as JSON.
    """

Parameters

prompt
str
required
The user prompt. A JSON instruction is automatically appended to ensure JSON output.
system_instruction
Optional[str]
default:"None"
Optional system-level instruction.
thinking_level
ThinkingLevel
default:"high"
Level of thinking/reasoning to use.

Returns

result
dict
Parsed JSON response as a Python dictionary. Markdown code fences (e.g. an opening ```json line and a closing ``` line) are automatically stripped if present.
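
The cleanup step can be sketched as follows; this is an illustration of the described behavior, not the client's actual code:

```python
import json

def parse_json_response(text: str) -> dict:
    """Strip surrounding markdown code fences, then parse as JSON (sketch)."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (with its optional "json" tag) ...
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # ... and the closing fence, if present.
        if cleaned.rstrip().endswith("```"):
            cleaned = cleaned.rstrip()[:-3]
    return json.loads(cleaned)
```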

add_message

Manually add a message to the conversation history.
def add_message(self, role: Literal["user", "model"], content: str):
    """Add a message to the conversation history.

    Args:
        role: The role (user or model).
        content: The message content.
    """

Parameters

role
Literal['user', 'model']
required
The role of the message sender. Must be either "user" or "model".
content
str
required
The message content to add to history.

clear_history

Clear the conversation history.
def clear_history(self):
    """Clear the conversation history."""

get_history_length

Get the number of messages in the conversation history.
def get_history_length(self) -> int:
    """Get the number of messages in conversation history."""

Returns

length
int
The number of messages currently in the conversation history.

Data Classes

GeminiResponse

@dataclass
class GeminiResponse:
    """Response from Gemini API."""

    text: str
    thinking: Optional[str] = None
    raw_response: Optional[object] = None

ConversationMessage

@dataclass
class ConversationMessage:
    """A message in the conversation history."""

    role: Literal["user", "model"]
    content: str
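
When the history is sent with a request, each `ConversationMessage` has to be serialized into the API's request shape. The role/parts structure below follows the general Gemini API convention; the client's actual wire format is an assumption:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class ConversationMessage:
    """A message in the conversation history."""
    role: Literal["user", "model"]
    content: str

def history_to_contents(history: list[ConversationMessage]) -> list[dict]:
    """Hypothetical sketch: map history entries to role/parts dicts."""
    return [{"role": m.role, "parts": [{"text": m.content}]} for m in history]
```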

Exceptions

GeminiError

Base exception for all Gemini-related errors.
class GeminiError(Exception):
    """Base exception for Gemini-related errors."""
    pass

GeminiRateLimitError

Raised when API rate limit is exceeded. The client automatically retries with exponential backoff.
class GeminiRateLimitError(GeminiError):
    """Rate limit exceeded."""
    pass

GeminiInvalidResponseError

Raised when the response from Gemini cannot be parsed (e.g., invalid JSON).
class GeminiInvalidResponseError(GeminiError):
    """Invalid or unparseable response from Gemini."""
    pass

Usage Examples

Basic Usage

from src.cognitive.gemini_client import GeminiClient

# Initialize client (loads config from environment)
client = GeminiClient()

# Generate a simple response
response = client.generate("Design an ML experiment for predicting house prices")
print(response.text)

Multi-turn Conversation

# Start a conversation
client = GeminiClient()

# First turn
response1 = client.generate(
    "What preprocessing should I use for a dataset with missing values?",
    use_history=True
)

# Add follow-up manually
client.add_message("user", "What about categorical features?")
response2 = client.generate(
    "How should I encode them?",
    use_history=True
)

# Check history length
print(f"Conversation has {client.get_history_length()} messages")

JSON Generation

# Request structured JSON output
prompt = """
Design an experiment with:
- experiment_name
- model_type
- model_params
"""

try:
    result = client.generate_json(
        prompt=prompt,
        system_instruction="You are an ML experiment designer.",
        thinking_level="high"
    )
    print(f"Model: {result['model_type']}")
    print(f"Params: {result['model_params']}")
except GeminiInvalidResponseError as e:
    print(f"Failed to parse JSON: {e}")

Custom Configuration

from src.config import GeminiConfig

# Create custom config
config = GeminiConfig(
    api_key="your-api-key",
    model="gemini-3-flash-preview",
    temperature=1.0,
    max_retries=5,
    retry_delay=2.0
)

client = GeminiClient(config=config)

Error Handling

from src.cognitive.gemini_client import (
    GeminiClient,
    GeminiError,
    GeminiRateLimitError,
    GeminiInvalidResponseError
)

client = GeminiClient()

try:
    response = client.generate_json("Design an experiment")
except GeminiRateLimitError:
    print("Rate limit exceeded after retries")
except GeminiInvalidResponseError as e:
    print(f"Invalid JSON response: {e}")
except GeminiError as e:
    print(f"Gemini API error: {e}")

Helper Function

create_experiment_designer_prompt

Utility function to create a formatted prompt for experiment design.
def create_experiment_designer_prompt(
    data_profile: dict,
    previous_results: list[dict],
    constraints: Optional[str] = None,
    task_type: str = "regression",
) -> str:
    """Create a prompt for the experiment designer.

    Args:
        data_profile: Dataset profile from DataProfiler.
        previous_results: List of previous experiment results.
        constraints: Optional user constraints.
        task_type: 'classification' or 'regression'.

    Returns:
        Formatted prompt string.
    """
