- Rubber Duck.app — macOS menu bar application for voice I/O and session management
- CLI daemon — local Node.js daemon managing Pi processes, IPC, and persistence
- duck CLI — terminal client for workspace attachment, event streaming, and text prompts
System Diagram
Component Responsibilities
Rubber Duck.app (Swift + SwiftUI)
Purpose: Voice interface and user experience
- Captures audio with Voice Activity Detection (VAD)
- Streams audio to OpenAI Realtime API for STT/TTS
- Manages voice session lifecycle (connecting, listening, speaking, thinking)
- Executes barge-in (immediate TTS interruption on user speech)
- Displays menu bar status and settings UI
- Routes voice tool calls to daemon via Unix socket
- Syncs workspace/session state from CLI metadata
Key components:
- VoiceSessionCoordinator: Main state machine orchestrating audio, API, and daemon
- DaemonSocketClient: Unix socket client for IPC with the daemon
- AudioManager: Microphone capture with VAD
- AudioPlaybackManager: TTS playback with barge-in support
- RealtimeClient: WebSocket client for OpenAI Realtime API
CLI Daemon (Node.js)
Purpose: Process coordination, session management, and IPC hub
- Spawns and manages Pi RPC subprocesses (one per session)
- Provides Unix socket server for app and CLI clients
- Publishes Pi events to subscribed clients via event bus
- Persists workspace and session metadata
- Executes voice tool calls (read, write, edit, bash, grep, find)
- Auto-starts on first CLI invocation
Key files:
- daemon/main.ts: Entry point, lifecycle management
- daemon/pi-process-manager.ts: Session → PiProcess map
- daemon/pi-process.ts: Pi RPC subprocess wrapper
- daemon/socket-server.ts: Unix socket NDJSON server
- daemon/event-bus.ts: Pub/sub for session events
- daemon/metadata-store.ts: Atomic JSON persistence
- daemon/voice-tools.ts: Tool execution for voice calls
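To make the pub/sub role of the event bus concrete, here is a minimal sketch of a per-session bus. It is not the code in daemon/event-bus.ts: the SessionEvent shape, the EventBus class, and its method names are assumptions made for illustration.

```typescript
// Hypothetical event shape; the daemon's real types are not shown in this document.
type SessionEvent = { sessionId: string; type: string; payload?: unknown };
type Listener = (event: SessionEvent) => void;

class EventBus {
  private listeners = new Map<string, Set<Listener>>();

  // Subscribe to one session's events. Returns a function that unsubscribes.
  subscribe(sessionId: string, listener: Listener): () => void {
    let set = this.listeners.get(sessionId);
    if (!set) {
      set = new Set();
      this.listeners.set(sessionId, set);
    }
    set.add(listener);
    const owned = set;
    return () => {
      owned.delete(listener);
      // Drop the map entry only if it still points at the set we joined.
      if (owned.size === 0 && this.listeners.get(sessionId) === owned) {
        this.listeners.delete(sessionId);
      }
    };
  }

  // Deliver an event to the subscribers of its session.
  // Returns how many listeners were called.
  publish(event: SessionEvent): number {
    const set = this.listeners.get(event.sessionId);
    if (!set) return 0;
    // Iterate over a snapshot so listeners may unsubscribe during delivery.
    const snapshot = [...set];
    for (const listener of snapshot) listener(event);
    return snapshot.length;
  }
}
```

Keying subscriptions by session ID means that a CLI client following one session never sees events from another session running at the same time.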
duck CLI (Node.js)
Purpose: Workspace attachment, event streaming, text interaction
- Attaches workspaces and resumes sessions
- Streams Pi events to terminal with formatted output
- Sends text prompts via duck say
- Handles extension UI requests with @clack/prompts
- Auto-starts daemon if not running
Key files:
- commands/default.ts: duck [path] (attach and follow)
- commands/say.ts: Send prompt and wait for completion
- renderer/text-renderer.ts: Terminal output formatting
- renderer/ui-handler.ts: Interactive prompts
- client.ts: Daemon socket client
Data Flow Patterns
Voice Turn (User Speaks)
Tool Execution (Voice → Daemon → Pi)
CLI Attachment (Terminal → Daemon → Pi)
Communication Protocols
Daemon IPC (Unix Socket + NDJSON)
Socket Path:
- Primary: ~/Library/Application Support/RubberDuck/daemon.sock
- Fallback (if path too long): $TMPDIR/rubber-duck-<hash>.sock
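The fallback exists because Unix socket paths have a small fixed size limit. The sketch below shows one way the choice could be made. The 103-byte threshold assumes the 104-byte sun_path field on macOS. The hashing scheme (8 hex characters of a SHA-256 digest of the primary path) and the name resolveSocketPath are hypothetical, since the document does not say how <hash> is derived.

```typescript
import { createHash } from "node:crypto";
import { homedir, tmpdir } from "node:os";
import { join } from "node:path";

// Assumption: sun_path holds 104 bytes on macOS, including the trailing NUL.
const MAX_SOCKET_PATH_BYTES = 103;

function resolveSocketPath(home: string = homedir(), tmp: string = tmpdir()): string {
  const primary = join(home, "Library", "Application Support", "RubberDuck", "daemon.sock");
  if (Buffer.byteLength(primary) <= MAX_SOCKET_PATH_BYTES) return primary;

  // Hypothetical scheme: a short, stable hash of the primary path keeps the
  // fallback name unique per user while staying well under the limit.
  const hash = createHash("sha256").update(primary).digest("hex").slice(0, 8);
  return join(tmp, `rubber-duck-${hash}.sock`);
}
```

Because the hash depends only on the primary path, the app, the daemon, and the CLI can each compute the same fallback independently.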
Supported requests:
- ping: Health check
- attach: Create/resume workspace session
- follow: Subscribe to session events
- say: Send text prompt to session
- voice_connect: Register voice app connection
- voice_tool_call: Execute tool from voice session
- voice_state: Query current voice session state
- sessions: List all sessions
- doctor: Run health diagnostics
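NDJSON framing puts one JSON object on each line, so a reader has to buffer partial chunks until a newline arrives. The following sketch shows that framing. The NdjsonDecoder and encodeMessage names and the id/method fields in the usage example are assumptions, because the actual message schema is not given here.

```typescript
class NdjsonDecoder {
  private buffer = "";

  // Feed a chunk read from the socket; returns every message it completed.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line, or "" if the chunk ended on "\n".
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

// Serialize one message as a single newline-terminated line.
function encodeMessage(message: object): string {
  return JSON.stringify(message) + "\n";
}

// Usage: two requests arrive split across arbitrary chunk boundaries.
const decoder = new NdjsonDecoder();
const wire =
  encodeMessage({ id: "1", method: "ping" }) +
  encodeMessage({ id: "2", method: "sessions" });
const firstBatch = decoder.push(wire.slice(0, 10)); // no complete line yet
const secondBatch = decoder.push(wire.slice(10)); // both messages complete
```

When reading from a net.Socket, calling socket.setEncoding("utf8") first keeps multi-byte characters from being split between chunks.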
Pi RPC (stdin/stdout + NDJSON)
The daemon spawns Pi with --mode rpc, establishing a bidirectional JSON protocol over the subprocess's stdin and stdout. Pi emits events including:
- message_update: Streaming text/tool call deltas
- tool_execution_start/update/end: Tool output streaming
- extension_ui_request: Interactive prompt needed
- agent_end: Turn complete
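As a rough sketch of the daemon side of this protocol, the code below spawns pi --mode rpc and reads its stdout line by line. It is not the code in daemon/pi-process.ts. The PiEvent type and the handlePiLine and startPi helpers are hypothetical, and only the type field named above is assumed to exist on each event.

```typescript
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

// Hypothetical minimal event shape: only the type field is assumed.
type PiEvent = { type: string; [key: string]: unknown };

// Parse one NDJSON line from Pi and forward it.
// Returns true when the event marks the end of a turn.
function handlePiLine(line: string, onEvent: (event: PiEvent) => void): boolean {
  const event = JSON.parse(line) as PiEvent;
  onEvent(event);
  return event.type === "agent_end";
}

// Start Pi in RPC mode inside the workspace directory and stream its events.
function startPi(cwd: string, onEvent: (event: PiEvent) => void) {
  const child = spawn("pi", ["--mode", "rpc"], { cwd });
  child.on("error", (err) => {
    // Emitted, for example, when the pi binary is not on PATH.
    console.error("failed to start pi:", err.message);
  });
  const lines = createInterface({ input: child.stdout });
  lines.on("line", (line) => {
    if (line.trim() !== "") handlePiLine(line, onEvent);
  });
  return child;
}
```

Commands to Pi would travel the other way as newline-terminated JSON written to child.stdin. Their shape is not described in this section, so it is left out of the sketch.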
Session Model
Workspace: A directory (typically a git repo) with a unique ID based on path hash.
Session: One Pi conversation history (JSONL file) bound to a workspace. Multiple sessions can exist per workspace.
Active Voice Session: The session currently receiving voice input from the app. Only one session is active for voice at a time.
Concurrent Sessions: Multiple Pi processes can run simultaneously. Background sessions stream events to CLI but don’t speak.
Persistence:
- Workspace/session metadata: ~/Library/Application Support/RubberDuck/metadata.json
- Session history: ~/Library/Application Support/RubberDuck/pi-sessions/<sessionId>.jsonl (Pi-managed)
- Voice conversation history: ~/Library/Application Support/RubberDuck/pi-sessions/<sessionId>.jsonl (app-managed, separate from Pi session)
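Two ideas from this model can be sketched briefly: deriving a workspace ID from a path hash, and writing metadata atomically. Both functions are illustrative. The choice of SHA-256, the 12-character ID length, and the { workspaces: [...] } metadata shape are assumptions rather than the project's actual format.

```typescript
import { createHash } from "node:crypto";
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, resolve } from "node:path";

// Hypothetical ID scheme: hash the absolute path so that equivalent spellings
// of the same directory map to the same workspace.
function workspaceId(dir: string): string {
  return createHash("sha256").update(resolve(dir)).digest("hex").slice(0, 12);
}

// Write to a temporary file in the same directory, then rename it over the
// target. On POSIX the rename replaces the target atomically, so readers see
// either the old file or the new one, never a partial write.
function writeJsonAtomic(file: string, data: unknown): void {
  const tmp = `${file}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify(data, null, 2));
  renameSync(tmp, file);
}

// Usage: write to a scratch directory, not the real Application Support path.
const scratch = mkdtempSync(join(tmpdir(), "rubber-duck-demo-"));
const metadataFile = join(scratch, "metadata.json");
writeJsonAtomic(metadataFile, {
  workspaces: [{ id: workspaceId("."), path: resolve(".") }],
});
const reloaded = JSON.parse(readFileSync(metadataFile, "utf8"));
```

Keeping the temporary file next to the target matters, because a rename across filesystems is not atomic.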
File Locations
Runtime Dependencies
- macOS app: Swift 5, SwiftUI, macOS 15.2+ SDK, Network.framework
- CLI daemon: Node.js 22+, Pi CLI (pi binary in PATH)
- Pi: Installed separately, requires OpenAI/Anthropic API keys
Security Model
- Unix socket restricted to user (filesystem permissions)
- Daemon binds to localhost/Unix socket only (no network exposure)
- Tool execution confined to workspace directory (cwd)
- API keys stored in macOS Keychain (app) and environment variables (daemon/Pi)
- No telemetry or external services beyond LLM APIs
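The list above does not say how workspace confinement is enforced. As one possible approach, and not necessarily what daemon/voice-tools.ts does, a tool could reject any path that resolves outside the workspace root. The isInsideWorkspace helper below is hypothetical and performs a lexical check only.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Returns true when target, resolved against workspaceRoot, stays inside it.
// Lexical check only: symlinks are not followed (see the note after the code).
function isInsideWorkspace(workspaceRoot: string, target: string): boolean {
  const root = resolve(workspaceRoot);
  const full = resolve(root, target);
  const rel = relative(root, full);
  if (rel === "") return true; // the root itself
  // A path escapes when its relative form climbs out with ".." or is absolute.
  const escapes = rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel);
  return !escapes;
}

// Usage
const allowed = isInsideWorkspace("/ws", "src/index.ts"); // true
const blocked = isInsideWorkspace("/ws", "../etc/passwd"); // false
```

A stricter variant would first resolve both paths with fs.realpathSync so that a symlink inside the workspace cannot point outside it.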
Next Steps
- Voice Pipeline — Audio capture, VAD, STT, TTS, OpenAI Realtime API
- Pi Integration — RPC mode, tool execution, event streaming
- Session Model — Workspaces, sessions, concurrency, persistence