Overview
Turn detection analyzes audio to identify:
- Turn started: User began speaking
- Turn ended: User finished speaking, agent should respond
- Trailing silence: Duration of silence after speech
- Confidence: How confident the detector is that the turn has ended
TurnDetector Interface
All turn detectors extend this abstract base:
turn_detection.py:20-83
Base Class Features
The TurnDetector base class provides:
Configuration
turn_detection.py:23-31
Event Emission
Helper methods for emitting events:
turn_detection.py:33-57
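As a rough illustration of how a base class could combine configuration with emit helpers, here is a minimal sketch. It is not the code in turn_detection.py. Every name in it (`SketchTurnDetector`, `EchoDetector`, `on`, `_emit_turn_started`, `_emit_turn_ended`, `process_audio`) and the dict-shaped event payload are assumptions made for this example, and it is synchronous for simplicity.

```python
import uuid
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List

Listener = Callable[[Dict[str, Any]], None]


class SketchTurnDetector(ABC):
    """Hypothetical base class; names and signatures are illustrative only."""

    def __init__(self, confidence_threshold: float = 0.5) -> None:
        # Configuration: minimum confidence required to report a turn end.
        self.confidence_threshold = confidence_threshold
        # Each detector instance gets its own unique session ID.
        self.session_id = str(uuid.uuid4())
        self._listeners: Dict[str, List[Listener]] = {}

    def on(self, event_type: str, listener: Listener) -> None:
        """Register a callback for an event type (hypothetical API)."""
        self._listeners.setdefault(event_type, []).append(listener)

    def _emit(self, event_type: str, **payload: Any) -> None:
        # Attach the session ID to every event before notifying listeners.
        event = {"type": event_type, "session_id": self.session_id, **payload}
        for listener in self._listeners.get(event_type, []):
            listener(event)

    def _emit_turn_started(self) -> None:
        self._emit("turn_started")

    def _emit_turn_ended(self, confidence: float, trailing_silence_ms: float) -> None:
        # Only report a turn end once the configured confidence is reached.
        if confidence >= self.confidence_threshold:
            self._emit("turn_ended", confidence=confidence,
                       trailing_silence_ms=trailing_silence_ms)

    @abstractmethod
    def process_audio(self, samples: List[float]) -> None:
        """Subclasses analyze audio and call the emit helpers."""


class EchoDetector(SketchTurnDetector):
    """Toy subclass: any non-empty chunk counts as one complete turn."""

    def process_audio(self, samples: List[float]) -> None:
        if samples:
            self._emit_turn_started()
            self._emit_turn_ended(confidence=0.9, trailing_silence_ms=500.0)
```

A caller would register listeners with `on("turn_ended", callback)` and feed audio through `process_audio`. Keeping the threshold check inside the emit helper means subclasses only need to compute a confidence value.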
Turn Events
TurnStartedEvent
Emitted when a user starts speaking:
events.py:TurnStartedEvent
TurnEndedEvent
Emitted when a user finishes speaking:
events.py:TurnEndedEvent
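To make the two event kinds concrete, the sketch below models them as small dataclasses. The class names (`SketchTurnStarted`, `SketchTurnEnded`) and their fields are assumptions based on the signals listed in the Overview; the real definitions in events.py may carry different or additional fields.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SketchTurnStarted:
    """Hypothetical payload for a turn-start event."""
    session_id: str


@dataclass(frozen=True)
class SketchTurnEnded:
    """Hypothetical payload for a turn-end event."""
    session_id: str
    confidence: float           # how sure the detector is that the turn ended
    trailing_silence_ms: float  # silence observed after the last speech


def describe(event: Union[SketchTurnStarted, SketchTurnEnded]) -> str:
    """Return a one-line summary, branching on the event type."""
    if isinstance(event, SketchTurnEnded):
        return (f"[{event.session_id}] turn ended "
                f"(confidence={event.confidence:.2f}, "
                f"silence={event.trailing_silence_ms:.0f}ms)")
    return f"[{event.session_id}] turn started"
```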
Agent Integration
The agent handles turn events automatically:
agents.py:1301-1327
Barge-In Support
When a user starts speaking while the agent is talking, the agent interrupts itself:
agents.py:1308-1327
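The barge-in behavior can be pictured with a small stand-alone sketch. `FakeTTS`, `SketchAgent`, and `on_turn_started` are hypothetical names used only for illustration; the actual handling lives in agents.py and may differ in detail.

```python
class FakeTTS:
    """Stand-in for a text-to-speech player (hypothetical)."""

    def __init__(self) -> None:
        self.playing = False
        self.stop_calls = 0

    def start(self, text: str) -> None:
        self.playing = True

    def stop(self) -> None:
        self.playing = False
        self.stop_calls += 1


class SketchAgent:
    """Hypothetical agent that owns a TTS player."""

    def __init__(self, tts: FakeTTS) -> None:
        self.tts = tts

    def say(self, text: str) -> None:
        self.tts.start(text)

    def on_turn_started(self) -> bool:
        """Interrupt playback if the user starts speaking mid-response.

        Returns True when an interruption (barge-in) happened.
        """
        if self.tts.playing:
            self.tts.stop()
            return True
        return False
```

Checking the playing state first keeps the handler idempotent: a turn start that arrives while the agent is silent does nothing.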
Without Turn Detection
If you don’t provide a turn detector, the agent treats each STT transcript as a turn end:
agents.py:461-468
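The fallback described here can be sketched as a simple branch. `TranscriptRouter`, `on_transcript`, and `on_turn_ended` are hypothetical names, and buffering transcripts while a detector is present is an assumption about one reasonable design, not a description of agents.py.

```python
from typing import Callable, List, Optional


class TranscriptRouter:
    """Hypothetical glue between STT transcripts and the response step."""

    def __init__(self, respond: Callable[[str], None],
                 turn_detector: Optional[object] = None) -> None:
        self.respond = respond
        self.turn_detector = turn_detector
        self._pending: List[str] = []

    def on_transcript(self, text: str) -> None:
        if self.turn_detector is None:
            # No detector: every transcript is treated as a finished turn.
            self.respond(text)
        else:
            # Detector present: hold text until it reports the turn end.
            self._pending.append(text)

    def on_turn_ended(self) -> None:
        # Flush everything said during the turn as a single utterance.
        if self._pending:
            self.respond(" ".join(self._pending))
            self._pending.clear()
```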
Eager End-of-Turn
Some turn detectors support eager completion, where they signal a likely turn end before the user fully stops speaking:
turn_detection.py:44
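One way to think about eager completion is as a second, lower confidence threshold. The function below is a sketch under that assumption; `classify_turn_end`, `eager_threshold`, and `final_threshold` are hypothetical names, and a real detector may decide eagerness quite differently.

```python
def classify_turn_end(confidence: float,
                      eager_threshold: float = 0.6,
                      final_threshold: float = 0.85) -> str:
    """Map a confidence score to "none", "eager", or "final".

    "eager" means the turn has probably ended, so downstream work can
    start early; "final" means the detector is confident enough to commit.
    """
    if not 0.0 <= eager_threshold <= final_threshold <= 1.0:
        raise ValueError("expected 0 <= eager_threshold <= final_threshold <= 1")
    if confidence >= final_threshold:
        return "final"
    if confidence >= eager_threshold:
        return "eager"
    return "none"
```

An agent could use an "eager" result to begin preparing a response speculatively and discard it if the user keeps talking.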
Configuration
Basic Setup
Confidence Threshold
Control how confident the detector must be:
turn_detection.py:24
Session ID
Each detector instance gets a unique session ID:
turn_detection.py:28
Conversation Context
Turn detectors receive conversation history for context-aware detection:
turn_detection.py:60-65
Implementation Examples
Silence-Based Detection
Simple detector based on silence duration:
VAD-Based Detection
Use Voice Activity Detection models:
Best Practices
- Tune silence threshold: Balance responsiveness vs false positives
- Use VAD models: More accurate than simple energy detection
- Consider conversation context: Adjust detection based on dialogue flow
- Test with real users: Synthetic tests don’t capture natural pauses
- Monitor confidence scores: Log and analyze turn detection accuracy
- Support barge-in: Ensure TurnStartedEvent properly interrupts TTS
- Handle edge cases: Quick responses, long pauses, background noise
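To make the silence-threshold trade-off concrete, here is a minimal energy-based detector of the kind described under Silence-Based Detection. It is a sketch, not the library's implementation: `SilenceTurnDetector`, `process_frame`, and the default values are assumptions, frames are plain lists of floats in the range -1.0 to 1.0, and events are returned as strings rather than emitted.

```python
import math
from typing import List, Optional


def rms(frame: List[float]) -> float:
    """Root-mean-square energy of one audio frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class SilenceTurnDetector:
    """Hypothetical detector: a turn ends after enough trailing silence."""

    def __init__(self, energy_threshold: float = 0.02,
                 silence_ms: int = 600, frame_ms: int = 20) -> None:
        self.energy_threshold = energy_threshold  # below this counts as silence
        self.silence_ms = silence_ms              # silence needed to end a turn
        self.frame_ms = frame_ms                  # duration of each frame
        self.in_turn = False
        self.trailing_silence_ms = 0

    def process_frame(self, frame: List[float]) -> Optional[str]:
        """Return "turn_started", "turn_ended", or None for this frame."""
        if rms(frame) >= self.energy_threshold:
            # Speech resets the silence counter and may open a new turn.
            self.trailing_silence_ms = 0
            if not self.in_turn:
                self.in_turn = True
                return "turn_started"
            return None
        if self.in_turn:
            # Accumulate silence only while a turn is open.
            self.trailing_silence_ms += self.frame_ms
            if self.trailing_silence_ms >= self.silence_ms:
                self.in_turn = False
                return "turn_ended"
        return None
```

A smaller `silence_ms` makes the agent respond sooner but ends turns on natural mid-sentence pauses; a larger value does the opposite. A pure energy check also reacts to background noise, which is one reason the VAD-based approach is usually preferred.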
Debugging
Log turn events for analysis:
Realtime Mode
Realtime LLMs handle turn detection internally:
agents.py:120-122
Code References
- TurnDetector base class: turn_detection.py:20-83
- Turn events: events.py
- Agent integration: agents.py:1301-1327
- Barge-in handling: agents.py:1308-1327
- Without turn detection: agents.py:461-468
Next Steps
- Compare Realtime vs Interval modes
- Learn about Agents orchestration
- Explore Function Calling for tool use
- Understand Processors for custom logic