Genkit provides a unified API for working with AI models from different providers. Whether you’re using Gemini, Claude, GPT, Llama, or any other model, the interface is the same.
Model Abstraction
Genkit abstracts away provider-specific APIs into a single, consistent interface:
// Same API works for any model
const response = await ai.generate({
  model: 'googleai/gemini-2.0-flash', // or 'anthropic/claude-3-5-sonnet'
  prompt: 'Explain quantum computing',
});
This abstraction means:
- Switch providers easily: Change one line to try different models
- Multi-model workflows: Use different models for different tasks
- Consistent error handling: Same error types across providers
- Unified tracing: All model calls appear the same in traces
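To make the multi-model point concrete, here is a minimal sketch of routing tasks to different models. The task names and the `requestFor` helper are hypothetical and not part of Genkit; only the model strings come from this page.

```typescript
// Hypothetical task names; the model strings reuse the references shown on this page.
type Task = 'summarize' | 'draft' | 'classify';

const MODEL_FOR_TASK: Record<Task, string> = {
  summarize: 'googleai/gemini-2.0-flash',
  draft: 'anthropic/claude-3-5-sonnet',
  classify: 'ollama/llama2',
};

// Builds the options object that would be passed to ai.generate().
// Only the model string changes between providers.
function requestFor(task: Task, prompt: string): { model: string; prompt: string } {
  return { model: MODEL_FOR_TASK[task], prompt };
}

const draftRequest = requestFor('draft', 'Write a haiku');
// draftRequest.model is 'anthropic/claude-3-5-sonnet'
```

Keeping the mapping in one table means that switching a task to another provider is a one-line change.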
Model References
Models are referenced by a namespace/name format:
[plugin-namespace]/[model-name]
Examples:
googleai/gemini-2.0-flash
anthropic/claude-3-5-sonnet
ollama/llama2
vertexai/gemini-1.5-pro
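The format above can be illustrated with a small parser. This is a sketch only: `parseModelRef` is a hypothetical helper, not a Genkit API, and it assumes the namespace ends at the first slash so that model names containing further slashes stay intact.

```typescript
interface ModelRef {
  namespace: string;
  name: string;
}

// Hypothetical helper: splits "namespace/name" at the FIRST slash.
function parseModelRef(ref: string): ModelRef {
  const slash = ref.indexOf('/');
  // Reject strings with no slash, an empty namespace, or an empty name.
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`Expected "namespace/name", got "${ref}"`);
  }
  return { namespace: ref.slice(0, slash), name: ref.slice(slash + 1) };
}

const parsedRef = parseModelRef('googleai/gemini-2.0-flash');
// parsedRef is { namespace: 'googleai', name: 'gemini-2.0-flash' }
```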
You can also build a reference with the plugin's model() helper instead of a plain string:
import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/google-genai';
import { anthropic } from '@genkit-ai/anthropic';
const ai = genkit({
  plugins: [googleAI(), anthropic()],
});
// Use Gemini
const geminiResponse = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a haiku',
});
// Use Claude
const claudeResponse = await ai.generate({
  model: anthropic.model('claude-3-5-sonnet'),
  prompt: 'Write a haiku',
});
Generating Content
Basic Text Generation
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Explain REST APIs in simple terms',
});
console.log(text);
Structured Output
Request JSON output that matches a schema:
import { z } from 'genkit';
const RecipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
  steps: z.array(z.string()),
  prepTime: z.string(),
});
const { output } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Create a recipe for chocolate chip cookies',
  output: { schema: RecipeSchema },
});
// output is typed from RecipeSchema; it may be null if the model returned nothing that parses
console.log(output?.name); // Typed!
console.log(output?.ingredients); // Typed!
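To show what conforming to the schema means at runtime, the sketch below hand-writes the same check as a TypeScript type guard. It is illustrative only: the `Recipe` interface and `isRecipe` function are hypothetical stand-ins that mirror `RecipeSchema`, and in practice the schema itself does this work.

```typescript
// Mirrors the fields of RecipeSchema above.
interface Recipe {
  name: string;
  ingredients: string[];
  steps: string[];
  prepTime: string;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

// Returns true only when every field is present with the expected type.
function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === 'string' &&
    isStringArray(candidate.ingredients) &&
    isStringArray(candidate.steps) &&
    typeof candidate.prepTime === 'string'
  );
}

// Simulated raw model output, parsed from JSON text.
const rawRecipe: unknown = JSON.parse(
  '{"name":"Cookies","ingredients":["flour","chocolate"],"steps":["mix","bake"],"prepTime":"30 min"}'
);
const recipeName = isRecipe(rawRecipe) ? rawRecipe.name : 'invalid';
// recipeName is 'Cookies'
```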
Multimodal Input
Send images, audio, and video to multimodal models:
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: [
    { text: 'What is in this image?' },
    { media: { url: 'https://example.com/image.jpg' } },
  ],
});
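When a prompt mixes one question with several media items, a small builder keeps the parts array tidy. The sketch below is hypothetical: `imagePrompt` is not a Genkit function, and the optional `contentType` field on the part type is an assumption about the part shape.

```typescript
// Part shapes mirror the prompt array above; contentType is an assumed optional hint.
type Part = { text: string } | { media: { url: string; contentType?: string } };

// Hypothetical helper: one text question followed by any number of images.
function imagePrompt(question: string, imageUrls: string[]): Part[] {
  return [{ text: question }, ...imageUrls.map((url): Part => ({ media: { url } }))];
}

const comparePrompt = imagePrompt('What differs between these images?', [
  'https://example.com/before.jpg',
  'https://example.com/after.jpg',
]);
// comparePrompt has three parts: the question, then the two images in order.
```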
Model Configuration
Configure model behavior with parameters:
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a creative story',
  config: {
    temperature: 1.2,      // Higher = more creative
    topK: 40,              // Consider top 40 tokens
    topP: 0.95,            // Nucleus sampling threshold
    maxOutputTokens: 1000, // Limit response length
  },
});
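To give a feel for how these three sampling parameters interact, here is a textbook-style sketch over a toy four-token vocabulary. It is not how any particular provider implements sampling: the order of the steps is an assumption (providers may apply them differently), and `eligibleTokens` is a hypothetical function that assumes `temperature` is greater than zero.

```typescript
interface SamplingConfig {
  temperature: number; // must be > 0 in this sketch
  topK: number;
  topP: number;
}

// Returns the tokens that remain eligible, with renormalized probabilities,
// sorted from most to least likely.
function eligibleTokens(
  logits: Record<string, number>,
  cfg: SamplingConfig
): Array<[string, number]> {
  // 1. Temperature: divide logits before softmax. Higher values flatten the distribution.
  const scaled = Object.entries(logits).map(
    ([token, logit]) => [token, logit / cfg.temperature] as [string, number]
  );
  const maxLogit = Math.max(...scaled.map(([, logit]) => logit));
  const exps = scaled.map(
    ([token, logit]) => [token, Math.exp(logit - maxLogit)] as [string, number]
  );
  const total = exps.reduce((sum, [, e]) => sum + e, 0);
  const probs = exps.map(([token, e]) => [token, e / total] as [string, number]);

  // 2. topK: keep only the K most likely tokens.
  probs.sort((a, b) => b[1] - a[1]);
  const topK = probs.slice(0, cfg.topK);

  // 3. topP: keep the smallest prefix whose cumulative probability reaches topP.
  const kept: Array<[string, number]> = [];
  let cumulative = 0;
  for (const entry of topK) {
    kept.push(entry);
    cumulative += entry[1];
    if (cumulative >= cfg.topP) break;
  }

  // Renormalize so the surviving probabilities sum to 1.
  const keptTotal = kept.reduce((sum, [, p]) => sum + p, 0);
  return kept.map(([token, p]) => [token, p / keptTotal] as [string, number]);
}

const toyLogits = { a: 2, b: 1, c: 0, d: -1 };
const survivors = eligibleTokens(toyLogits, { temperature: 1, topK: 3, topP: 0.9 });
// survivors keeps tokens a, b and c; d is cut by topK.
```

Lowering `topP` to 0.5 in this example leaves only token `a`, since it alone already carries more than half of the probability mass.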
Default Configuration
Set defaults at the Genkit level:
const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.0-flash', {
    temperature: 0.7,
    topK: 40,
  }),
});
// Uses the default model and config
const response = await ai.generate({
  prompt: 'Hello!',
});
Tool Calling
Models can call functions (tools) to extend their capabilities:
const getWeatherTool = ai.defineTool(
  {
    name: 'getWeather',
    description: 'Get current weather for a city',
    inputSchema: z.object({ city: z.string() }),
    outputSchema: z.string(),
  },
  async ({ city }) => {
    // Call weather API...
    return `Weather in ${city}: Sunny, 72°F`;
  }
);
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'What is the weather in Paris?',
  tools: [getWeatherTool],
});
// Model decides to call getWeather, gets result, and responds:
// "The weather in Paris is currently sunny with a temperature of 72°F."
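The request, execute, respond cycle described in the comments above can be sketched with a stubbed model. Everything here is hypothetical scaffolding for illustration (the `ModelTurn` shape, `runWithTools`, and the string-based history); Genkit runs this loop for you, and its internal message format is not shown on this page.

```typescript
type ToolHandler = (input: Record<string, string>) => Promise<string>;

// A stubbed model turn: either a final answer or a request to call a tool.
type ModelTurn =
  | { kind: 'answer'; text: string }
  | { kind: 'toolRequest'; tool: string; input: Record<string, string> };

type Model = (history: string[]) => Promise<ModelTurn>;

// Calls the model repeatedly, executing requested tools and feeding results back,
// until the model returns a final answer or maxTurns is exhausted.
async function runWithTools(
  model: Model,
  tools: Record<string, ToolHandler>,
  prompt: string,
  maxTurns = 5
): Promise<string> {
  const history = [`user: ${prompt}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(history);
    if (reply.kind === 'answer') return reply.text;
    const handler = tools[reply.tool];
    if (!handler) throw new Error(`Unknown tool: ${reply.tool}`);
    const result = await handler(reply.input);
    history.push(`tool(${reply.tool}): ${result}`);
  }
  throw new Error('Tool loop did not finish within maxTurns');
}

// Stub model: asks for the weather once, then answers using the tool result.
const stubModel: Model = async (history) => {
  const prefix = 'tool(getWeather): ';
  const toolLine = history.find((line) => line.startsWith(prefix));
  return toolLine
    ? { kind: 'answer', text: `Forecast: ${toolLine.slice(prefix.length)}` }
    : { kind: 'toolRequest', tool: 'getWeather', input: { city: 'Paris' } };
};

const stubTools: Record<string, ToolHandler> = {
  getWeather: async ({ city }) => `Weather in ${city}: Sunny, 72°F`,
};

const weatherAnswer = runWithTools(stubModel, stubTools, 'What is the weather in Paris?');
// weatherAnswer resolves to 'Forecast: Weather in Paris: Sunny, 72°F'
```

The `maxTurns` guard is a design choice in this sketch: it stops a model that keeps requesting tools from looping forever.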
Streaming Responses
Stream responses as they’re generated:
const { stream, response } = ai.generateStream({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a long story about space exploration',
});
// Stream chunks as they arrive
for await (const chunk of stream) {
  console.log(chunk.text);
}
// Or wait for the complete response
const final = await response;
console.log(final.text);
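A common pattern is to forward each chunk to the UI while also accumulating the full text. The sketch below uses a fake async generator in place of a real model stream; `fakeStream` and `collect` are hypothetical helpers, and the only assumption about chunks is the `text` field used in the example above.

```typescript
// Stand-in for a streaming response: yields text chunks one at a time.
async function* fakeStream(chunks: string[]): AsyncGenerator<{ text: string }> {
  for (const text of chunks) {
    yield { text };
  }
}

// Forwards each chunk to onChunk as it arrives and returns the concatenated text.
async function collect(
  stream: AsyncIterable<{ text: string }>,
  onChunk: (text: string) => void
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    onChunk(chunk.text);
    full += chunk.text;
  }
  return full;
}

const seenChunks: string[] = [];
const fullStory = collect(fakeStream(['Once ', 'upon ', 'a time']), (t) => seenChunks.push(t));
// fullStory resolves to 'Once upon a time'; seenChunks then holds the three chunks in order.
```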
Available Model Providers
Official Providers
| Provider | Plugin (JS) | Plugin (Python) | Models |
|---|---|---|---|
| Google AI | @genkit-ai/google-genai | genkit.plugins.google_genai | Gemini 2.0 Flash, Gemini 1.5 Pro, Imagen, Veo |
| Anthropic | @genkit-ai/anthropic | genkit.plugins.anthropic | Claude 3.5 Sonnet, Claude 3 Opus |
| Vertex AI | @genkit-ai/vertexai | genkit.plugins.vertex_ai | Model Garden (1000+ models) |
| Ollama | @genkit-ai/ollama | genkit.plugins.ollama | Llama, Mistral, CodeLlama (local) |
| OpenAI-compatible | @genkit-ai/compat-oai | genkit.plugins.compat_oai | Any OpenAI-compatible API |
Community Providers
- Amazon Bedrock: Claude, Llama, Titan models
- Mistral AI: Mistral, Mixtral models
- Cohere: Command models + reranking
- DeepSeek: DeepSeek models
- xAI: Grok models
- HuggingFace: Inference API models
- Cloudflare Workers AI: Edge AI models
- Azure AI Foundry: 11,000+ models
Model Middleware
Add behavior to model calls with middleware:
import { retry } from 'genkit/model/middleware';
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Hello!',
  use: [
    retry({
      maxRetries: 3,
      initialDelayMs: 1000,
      backoffFactor: 2,
    }),
  ],
});
Common middleware:
- Retry: Automatic retry with exponential backoff
- Caching: Cache responses for identical requests
- Safety: Filter harmful content
- Logging: Log all requests/responses
- Custom: Build your own
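To show what retry with exponential backoff does with the options used above, here is a standalone sketch. It is not Genkit's implementation: `backoffDelays` and `withRetry` are hypothetical, the injected `sleep` function is an assumption made so the example stays testable, and a real middleware may add jitter or a maximum delay.

```typescript
interface RetryOptions {
  maxRetries: number;
  initialDelayMs: number;
  backoffFactor: number;
}

// Delay before retry i (0-based) is initialDelayMs * backoffFactor^i.
function backoffDelays(options: RetryOptions): number[] {
  return Array.from(
    { length: options.maxRetries },
    (_, i) => options.initialDelayMs * options.backoffFactor ** i
  );
}

// Calls fn, retrying on failure up to maxRetries times with growing delays.
// Rethrows the last error once the retries are used up.
async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  const delays = backoffDelays(options);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= delays.length) throw err;
      await sleep(delays[attempt]);
    }
  }
}

const retrySchedule = backoffDelays({ maxRetries: 3, initialDelayMs: 1000, backoffFactor: 2 });
// retrySchedule is [1000, 2000, 4000]
```

With three retries the wrapped call runs at most four times: the initial attempt plus one attempt after each delay.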
Response Metadata
All responses include metadata:
const response = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Hello!',
});
console.log(response.text);         // Generated text
console.log(response.usage);        // Token usage stats
console.log(response.finishReason); // Why generation stopped
console.log(response.latencyMs);    // Request duration
console.log(response.custom);       // Provider-specific metadata
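Usage stats are handy for tracking cost across several calls. The sketch below sums token counts from a list of usage objects. The field names `inputTokens` and `outputTokens` are an assumption about the shape of `response.usage`, and `sumUsage` is a hypothetical helper.

```typescript
// Assumed shape of response.usage; both fields are treated as optional.
interface UsageLike {
  inputTokens?: number;
  outputTokens?: number;
}

// Adds up token counts, treating missing fields as zero.
function sumUsage(usages: UsageLike[]): {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
} {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const usage of usages) {
    inputTokens += usage.inputTokens ?? 0;
    outputTokens += usage.outputTokens ?? 0;
  }
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens };
}

const sessionUsage = sumUsage([
  { inputTokens: 12, outputTokens: 40 },
  { inputTokens: 8 }, // provider reported no output count
]);
// sessionUsage is { inputTokens: 20, outputTokens: 40, totalTokens: 60 }
```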
Next Steps
- Learn about Prompts - managing prompt templates
- Explore Tools - extending models with functions
- See Flows - building multi-step AI workflows