FastAPI is a high-performance Python web framework built on Starlette and Pydantic. It combines native async/await support with automatic request validation and interactive API documentation out of the box. For AI agents, this means you can expose synchronous responses and streaming token output from the same application with minimal boilerplate.
Sync + streaming: Serve both blocking responses and real-time token streams from the same app.
Automatic docs: FastAPI generates /docs and /redoc pages from your code with no extra configuration.
Type validation: Pydantic models validate every incoming request before it reaches your agent.
Start with any agent implementation. The example below uses a simple class with both a synchronous method and an async generator—the same pattern works with LangChain, LangGraph, or any other framework:
    import asyncio


    class SimpleAgent:
        def __init__(self, name="FastAPI Agent"):
            self.name = name

        def generate_response(self, query: str) -> str:
            """Generate a synchronous response to a user query."""
            return (
                f"Agent {self.name} received: '{query}'\n"
                "Response: This is a simulated agent response."
            )

        async def generate_response_stream(self, query: str):
            """Generate a streaming response token by token."""
            prefix = f"Agent {self.name} thinking about: '{query}'\n"
            response = "This is a simulated agent response that streams token by token."
            yield prefix
            for token in response.split():
                await asyncio.sleep(0.1)  # Simulate generation latency
                yield token + " "
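To see how the two call styles differ outside of a web server, the sketch below drives both from a plain script. `EchoAgent` and `collect_tokens` are hypothetical names introduced only for this illustration; the agent is a stripped-down stand-in with the same two-method shape as the class above, not part of FastAPI or any agent framework.

```python
import asyncio
from typing import AsyncIterator, List


class EchoAgent:
    """Hypothetical stand-in with the same two-method shape as SimpleAgent."""

    def generate_response(self, query: str) -> str:
        return f"echo: {query}"

    async def generate_response_stream(self, query: str) -> AsyncIterator[str]:
        for token in query.split():
            await asyncio.sleep(0)  # Hand control back to the event loop
            yield token + " "


async def collect_tokens(agent: EchoAgent, query: str) -> List[str]:
    """Drain the async generator into a list, one token per item."""
    tokens = []
    async for token in agent.generate_response_stream(query):
        tokens.append(token)
    return tokens


if __name__ == "__main__":
    agent = EchoAgent()
    # Blocking call: the whole answer arrives at once.
    print(agent.generate_response("hello world"))
    # Streaming call: tokens arrive one at a time and are gathered here.
    print(asyncio.run(collect_tokens(agent, "hello world")))
```

The synchronous method can be called directly, while the async generator has to be consumed with `async for` inside a running event loop. A FastAPI streaming endpoint does the same thing, except that it forwards each token to the client instead of collecting it.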
Use Pydantic BaseModel to declare the shape of your API’s inputs and outputs. FastAPI reads these definitions to validate incoming JSON and generate the /docs schema automatically:
    from pydantic import BaseModel
    from typing import Optional


    class QueryRequest(BaseModel):
        query: str
        context: Optional[str] = None

        # Pydantic v1 style. In Pydantic v2 this key is named
        # json_schema_extra and is usually set through model_config.
        class Config:
            schema_extra = {
                "example": {
                    "query": "What is FastAPI?",
                    "context": "I'm a beginner programmer."
                }
            }


    class QueryResponse(BaseModel):
        response: str

        class Config:
            schema_extra = {
                "example": {
                    "response": "FastAPI is a modern, high-performance web framework."
                }
            }
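As a rough illustration of what that validation does, the sketch below builds the request model by hand and catches the error Pydantic raises for a bad payload. `try_parse` is a hypothetical helper written for this example, and the model is repeated without its schema example so the snippet runs on its own. FastAPI performs the equivalent check for you and answers invalid requests with a 422 response.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class QueryRequest(BaseModel):
    query: str
    context: Optional[str] = None


def try_parse(payload: dict) -> Optional[QueryRequest]:
    """Return a validated model, or None if the payload is rejected."""
    try:
        return QueryRequest(**payload)
    except ValidationError:
        return None


if __name__ == "__main__":
    # Valid: "query" is present, "context" falls back to None.
    print(try_parse({"query": "What is FastAPI?"}))
    # Invalid: the required "query" field is missing.
    print(try_parse({"context": "I'm a beginner programmer."}))
```

Only fields declared on the model reach your handler, so the agent code can rely on `query` being a string.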
Create the FastAPI instance, register the agent, and add the endpoints:
Sync endpoint

    from fastapi import FastAPI
    from pydantic import BaseModel
    from typing import Optional

    app = FastAPI(
        title="Agent API",
        description="A simple API that serves an AI agent",
        version="0.1.0"
    )

    agent = SimpleAgent()


    @app.get("/health")
    def health_check():
        """Check if the API is running."""
        return {"status": "ok", "message": "API is operational"}


    @app.post("/agent", response_model=QueryResponse)
    def query_agent(request: QueryRequest):
        """Get a synchronous response from the agent."""
        response = agent.generate_response(request.query)
        return QueryResponse(response=response)

Streaming endpoint

    from fastapi.responses import StreamingResponse
    import json


    @app.post("/agent/stream")
    async def stream_agent(request: QueryRequest):
        """Stream a response from the agent token by token."""
        async def event_generator():
            async for token in agent.generate_response_stream(request.query):
                data = json.dumps({"token": token})
                yield f"data: {data}\n\n"

        return StreamingResponse(
            event_generator(),
            media_type="text/event-stream"
        )

The text/event-stream media type signals to clients that they should read the response incrementally rather than waiting for the body to complete.

SSE with sse-starlette

    from sse_starlette.sse import EventSourceResponse
    import json


    @app.post("/agent/stream-sse")
    async def stream_agent_sse(request: QueryRequest):
        """Stream a response using the sse-starlette package."""
        async def event_generator():
            async for token in agent.generate_response_stream(request.query):
                yield {"data": json.dumps({"token": token})}

        return EventSourceResponse(event_generator())
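EventSourceResponse from sse-starlette takes care of SSE framing for you: it serializes each yielded dict into the data, event, id, and retry fields, sets the streaming response headers, and sends periodic keep-alive pings. That makes it a better choice for production deployments than hand-built frames.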
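On the receiving side, a client has to undo this framing. The sketch below is a deliberately simplified decoder for frames shaped like the ones these endpoints emit, with one JSON object per `data:` line. `iter_sse_tokens` is a hypothetical helper, not part of FastAPI or sse-starlette, and it ignores parts of the SSE format such as multi-line data fields.

```python
import json
from typing import Iterable, Iterator


def iter_sse_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield the "token" value from each `data:` line of an SSE stream."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # Skip blank separators, ":" comments (pings), other fields
        payload = line[len("data:"):].strip()
        yield json.loads(payload)["token"]


# Two frames in the same shape the streaming endpoint yields.
raw = 'data: {"token": "Hello "}\n\ndata: {"token": "world "}\n\n'
text = "".join(iter_sse_tokens(raw.splitlines()))
print(text)
```

With a real HTTP client you would pass in the response body line by line as it arrives rather than a pre-built string, so that tokens can be displayed as soon as each frame is received.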
For production use, protect your endpoints with an API key check. FastAPI’s dependency injection makes this a clean, reusable pattern:
    import os
    from typing import Optional

    from fastapi import Depends, Header, HTTPException


    async def verify_api_key(x_api_key: Optional[str] = Header(None)):
        """Verify the API key provided in the X-API-Key header."""
        api_key = os.environ.get("API_KEY")
        # If no key is configured in the environment, skip validation
        if not api_key:
            return True
        if not x_api_key:
            raise HTTPException(status_code=401, detail="API Key is missing")
        if x_api_key != api_key:
            raise HTTPException(status_code=403, detail="Invalid API Key")
        return True


    # Inject the dependency into any endpoint. This definition replaces the
    # unprotected /agent route shown earlier; do not register both.
    @app.post("/agent", response_model=QueryResponse)
    def query_agent(
        request: QueryRequest,
        authorized: bool = Depends(verify_api_key),
    ):
        response = agent.generate_response(request.query)
        return QueryResponse(response=response)
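The decision logic in that dependency can be checked in isolation. The sketch below restates it as a plain function that returns the status code each branch would lead to. `check_api_key` is a hypothetical name used only here, and the sketch swaps the `!=` comparison for the standard library's `hmac.compare_digest`, a constant-time comparison that is a common hardening step but is not what the dependency above uses.

```python
import hmac
from typing import Optional


def check_api_key(expected: Optional[str], provided: Optional[str]) -> int:
    """Return the HTTP status each branch leads to (200 means allowed)."""
    if not expected:
        return 200  # No key configured: validation is skipped
    if not provided:
        return 401  # X-API-Key header missing
    # compare_digest avoids leaking, through timing, where the strings differ.
    if not hmac.compare_digest(provided.encode(), expected.encode()):
        return 403  # Header present but wrong
    return 200


if __name__ == "__main__":
    print(check_api_key(None, None))          # key not configured
    print(check_api_key("s3cret", None))      # header missing
    print(check_api_key("s3cret", "nope"))    # wrong key
    print(check_api_key("s3cret", "s3cret"))  # correct key
```

Note the first branch: when no key is configured, every request is allowed, so a deployment that forgets to set the variable is effectively open.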
Set the API_KEY environment variable before starting the server: