HTTP Server Mode
Run agents as HTTP servers for production:POST /create_call- Create a new call and join it with the agentGET /health- Health check endpoint
FastAPI Integration
For more control, use FastAPI directly:Environment Configuration
Use environment variables for configuration:python-dotenv:
Docker Deployment
Create aDockerfile:
docker-compose.yml:
Kubernetes Deployment
Createdeployment.yaml:
Regional Deployment
Deploy agents close to users for optimal latency:Production Best Practices
Monitor Performance
See Observability for metrics and tracing.
Security Considerations
- Never commit secrets: Use environment variables or secret managers
- Validate input: Always validate call IDs, user input, webhooks
- Rate limiting: Implement rate limits for API endpoints
- HTTPS only: Always use HTTPS in production
- Webhook signatures: Verify Twilio webhook signatures
Scaling Considerations
- Horizontal scaling: Run multiple agent instances behind a load balancer
- Resource limits: Set appropriate CPU/memory limits
- Connection pooling: Reuse HTTP connections to AI providers
- Caching: Cache RAG embeddings and frequently used resources
- Async operations: All I/O is async for high concurrency
Example Deployment Commands
Next Steps
- Monitor agents: Observability
- Review complete examples in
examples/ - Check DEVELOPMENT.md for development guidelines