Overview
Voice tools allow the Rubber Duck voice agent to interact with your workspace through a secure, sandboxed environment. When you use voice commands, the OpenAI Realtime API makes tool calls that are executed by the daemon through the voice-tools.ts implementation.
Architecture
Voice tools are executed in Node.js by the daemon process, not in the Swift app. This architecture provides:
- Consistent behavior: Same tool implementation for both voice and CLI workflows
- Path containment: All file operations are validated against workspace boundaries
- Safe mode support: Optional restrictions on destructive operations
- Output limits: Protection against excessively large tool output
Execution Flow
Voice command captured
The macOS app captures your voice through the microphone and streams audio to OpenAI’s Realtime API at 24 kHz PCM16 mono.
Tool call triggered
OpenAI’s model determines a tool is needed and emits a function call with parameters.
Daemon receives request
The Swift app forwards the tool call to the daemon over a Unix socket using the voice_tool_call method.
Tool execution
The daemon executes the tool in voice-tools.ts with workspace path validation and returns the result.
Available Tools
Rubber Duck provides seven voice tools:
read_file
Read file contents from the workspace
write_file
Create or overwrite files
edit_file
Make targeted edits using find-and-replace
bash
Execute shell commands with streaming output
grep_search
Search file contents using patterns
find_files
Find files by glob pattern
web_search
Search the web using the Exa API (requires API key)
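To connect the execution flow to this tool list, here is a minimal sketch of how a voice_tool_call request might be typed and serialized. Only the method name and the seven tool names come from this page. The params field names (tool, args), the buildRequest helper, and the newline-delimited JSON framing are assumptions for illustration and may not match the daemon's actual wire format.

```typescript
// The seven tool names listed on this page.
const VOICE_TOOL_NAMES = [
  "read_file",
  "write_file",
  "edit_file",
  "bash",
  "grep_search",
  "find_files",
  "web_search",
] as const;

type VoiceToolName = (typeof VOICE_TOOL_NAMES)[number];

// Hypothetical request shape: the "tool" and "args" field names are assumed.
interface VoiceToolCallRequest {
  method: "voice_tool_call";
  params: {
    tool: VoiceToolName;
    args: Record<string, unknown>;
  };
}

// Type guard that narrows an arbitrary string to a known tool name.
function isVoiceToolName(name: string): name is VoiceToolName {
  return (VOICE_TOOL_NAMES as readonly string[]).includes(name);
}

// Hypothetical helper: builds a request, rejecting unknown tool names early.
function buildRequest(
  tool: string,
  args: Record<string, unknown>,
): VoiceToolCallRequest {
  if (!isVoiceToolName(tool)) {
    throw new Error(`Unknown voice tool: ${tool}`);
  }
  return { method: "voice_tool_call", params: { tool, args } };
}

// Assumed framing: one JSON document per line on the Unix socket.
const wireMessage =
  JSON.stringify(buildRequest("read_file", { path: "README.md" })) + "\n";
```

Validating the tool name before the message leaves the app keeps malformed calls from ever reaching the daemon, although the daemon would still need its own check.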
Security Features
Path Containment
All file operations validate that paths remain within the workspace root.
Safe Mode
Safe mode restricts which tools and commands can be executed:
- Disabled tools: write_file, edit_file
- Allowed bash commands: Only read-only operations like git, grep, ls, cat, and test commands
Safe mode is set through the safeMode parameter when calling executeVoiceTool().
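The two checks described above, path containment and safe mode, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in voice-tools.ts: the helper names resolveInWorkspace and checkSafeMode are hypothetical, the allowlist contains only the commands named on this page, and matching on the first word of a command is a simplification.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns the resolved absolute path, or null when the
// requested path escapes the workspace root. Symlinks are NOT resolved here;
// a real implementation would likely need fs.realpath as well.
function resolveInWorkspace(
  workspaceRoot: string,
  requested: string,
): string | null {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return escapes ? null : target;
}

// Tools this page lists as disabled in safe mode.
const SAFE_MODE_DISABLED_TOOLS = new Set<string>(["write_file", "edit_file"]);

// Assumed allowlist: only the commands named on this page. The page also
// mentions "test commands" without naming them, so none are included.
const SAFE_MODE_BASH_ALLOWLIST = new Set<string>(["git", "grep", "ls", "cat"]);

// Hypothetical helper: returns an "Error: ..." string when safe mode blocks
// the call, or null when the call may proceed.
function checkSafeMode(
  tool: string,
  args: { command?: string },
  safeMode: boolean,
): string | null {
  if (!safeMode) return null;
  if (SAFE_MODE_DISABLED_TOOLS.has(tool)) {
    return `Error: ${tool} is disabled in safe mode`;
  }
  if (tool === "bash") {
    // Simplification: only the first word of the command is inspected.
    const first = (args.command ?? "").trim().split(/\s+/)[0] ?? "";
    if (!SAFE_MODE_BASH_ALLOWLIST.has(first)) {
      return `Error: command "${first}" is not allowed in safe mode`;
    }
  }
  return null;
}
```

Checking only the first word would not catch chained commands (for example with `;` or `&&`) or write operations behind an allowed binary such as git, so a production check would need to parse the command more carefully.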
Output Limits
Tools enforce output size limits to prevent memory issues:
- Maximum bytes for bash and grep output: 100 KB
- Maximum file size for read operations: 1 MB
- Maximum number of files returned by find_files
- Bash command timeout: 30 seconds
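As a rough illustration of how limits like these could be applied, the sketch below defines them as constants and truncates oversized output. The names LIMITS, truncateOutput, and fitsReadLimit are hypothetical, and treating 1 KB as 1024 bytes is an assumption. The find_files cap is left out because this page does not state its value.

```typescript
// Hypothetical constants mirroring the limits listed above.
// Assumes 1 KB = 1024 bytes; the actual values may use 1000.
const LIMITS = {
  maxOutputBytes: 100 * 1024, // bash and grep output (100 KB)
  maxReadFileBytes: 1024 * 1024, // read operations (1 MB)
  bashTimeoutMs: 30 * 1000, // bash command timeout (30 seconds)
} as const;

// Truncates output to at most maxBytes of UTF-8 and appends a notice.
// Cutting in the middle of a multi-byte character leaves a replacement
// character at the end, which is acceptable for a sketch.
function truncateOutput(
  output: string,
  maxBytes: number = LIMITS.maxOutputBytes,
): string {
  const bytes = new TextEncoder().encode(output);
  if (bytes.length <= maxBytes) return output;
  const kept = new TextDecoder().decode(bytes.subarray(0, maxBytes));
  const omitted = bytes.length - maxBytes;
  return `${kept}\n[output truncated: ${omitted} bytes omitted]`;
}

// Returns true when a file of the given size may be read in full.
function fitsReadLimit(sizeBytes: number): boolean {
  return sizeBytes <= LIMITS.maxReadFileBytes;
}
```

Measuring in bytes rather than characters matters because non-ASCII output can take several bytes per character.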
Error Handling
All tools return string results. Errors are prefixed with "Error: " for consistent parsing.
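The convention can be illustrated with a few small helpers. The "Error: " prefix comes from this page, while the helper names toolError, isToolError, and captureErrors are hypothetical. The sketch is synchronous for brevity, even though real tools are likely asynchronous.

```typescript
const ERROR_PREFIX = "Error: ";

// Formats a failure as a tool result string.
function toolError(message: string): string {
  return ERROR_PREFIX + message;
}

// Lets a caller distinguish failures from normal output.
function isToolError(result: string): boolean {
  return result.startsWith(ERROR_PREFIX);
}

// Runs a tool body and converts any thrown exception into an error string,
// so the caller always receives a string.
function captureErrors(run: () => string): string {
  try {
    return run();
  } catch (err) {
    return toolError(err instanceof Error ? err.message : String(err));
  }
}
```

One trade-off of a string prefix is that legitimate output beginning with "Error: " (for example, a log file read with read_file) looks the same as a failure to the caller.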
Implementation
The main dispatcher in voice-tools.ts routes tool calls.
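A dispatcher of this kind might look roughly like the sketch below. The function name executeVoiceTool and the safeMode option appear on this page, but the signature shown here, the ToolContext type, and the handler map are assumptions. The handlers are stubs that only echo their arguments.

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical context passed to every handler.
interface ToolContext {
  workspaceRoot: string;
  safeMode: boolean;
}

type ToolHandler = (args: ToolArgs, ctx: ToolContext) => Promise<string>;

// Stub factory: a real handler would perform the file, shell, or search work.
const stub =
  (tool: string): ToolHandler =>
  async (args) =>
    `[stub] ${tool} called with ${JSON.stringify(args)}`;

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const handlers = new Map<string, ToolHandler>(
  [
    "read_file",
    "write_file",
    "edit_file",
    "bash",
    "grep_search",
    "find_files",
    "web_search",
  ].map((tool): [string, ToolHandler] => [tool, stub(tool)]),
);

// Assumed signature: routes a call by name and always resolves to a string.
async function executeVoiceTool(
  name: string,
  args: ToolArgs,
  ctx: ToolContext,
): Promise<string> {
  const handler = handlers.get(name);
  if (handler === undefined) {
    return `Error: Unknown tool "${name}"`;
  }
  try {
    return await handler(args, ctx);
  } catch (err) {
    return `Error: ${err instanceof Error ? err.message : String(err)}`;
  }
}
```

Keeping the routing in one table means that unknown tool names and thrown exceptions are turned into "Error: " strings in a single place.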
Lenient Argument Parsing
Voice models sometimes emit simplified formats. The implementation handles common cases.
Next Steps
File Operations
Learn about read_file, write_file, and edit_file
Bash Tool
Execute shell commands with streaming output
Search Tools
Search code with grep_search and find_files