Core Concepts
Workspace
A workspace is a directory (typically a git repo) with a unique ID. Workspace ID: SHA-256 hash of the absolute, standardized path:duck ., the CLI searches upward for .git and uses the repo root as the workspace path. This ensures consistent workspace IDs across subdirectories.
Session
A session is one conversation thread bound to a workspace. Session ID: UUID v4 (e.g.,550e8400-e29b-41d4-a716-446655440000)
Metadata:
- Created: User runs
duck attachorduck new - Active: Currently selected for voice or CLI input
- Running: Pi agent is processing a turn
- Idle: Pi is waiting for next prompt
- Terminated: Pi process exited (on demand or error)
Active Voice Session
The active voice session is the session that receives voice input when the user presses the hotkey.- Only one session can be active for voice at a time
- Voice session can differ from CLI session (e.g., talk to one session, monitor another in terminal)
- Tracked in
metadata.jsonat the workspace level
metadata.json and pushes a voice_session_changed event to the app.
Persistence
Metadata Store (metadata.json)
All workspace and session metadata is stored in a single JSON file:
cli/src/daemon/metadata-store.ts:106-118):
To prevent corruption, updates use write-rename:
Pi Session Files (JSONL)
Pi stores conversation history as newline-delimited JSON:- Specifies
--session-dirwhen spawning Pi - Tracks the file path in
metadata.json - Reads history on demand for diagnostics
--session <file>, it loads the full history into context (subject to token limits). This enables seamless continuation across app restarts.
Voice Conversation History
The macOS app maintains a separate conversation history for voice sessions:- Voice sessions use OpenAI Realtime API, not Pi
- History includes audio transcripts, barge-in events, and tool calls
- Pi session files are for Pi’s internal state
RubberDuck/ConversationHistory.swift):
Concurrency
Multiple Pi Processes
Rubber Duck supports concurrent sessions with multiple Pi subprocesses:- Independent Pi subprocess
- Independent event stream
- Independent conversation history
cli/src/daemon/pi-process-manager.ts):
Voice Exclusivity
While multiple sessions can run concurrently, voice input is exclusive:- Only one session is the “active voice session”
- When the user speaks, audio goes to that session’s Realtime API connection
- Other sessions continue running but don’t speak
- Background sessions can trigger notifications (future feature)
CLI Following
Each CLI client can follow one session at a time, but multiple clients can follow the same session: Subscription Model:cli/src/daemon/event-bus.ts:23-34):
Session Operations
Attach Workspace
- CLI: Resolve workspace path (git root if in repo)
- CLI → Daemon:
attachrequest - Daemon: Upsert workspace in
metadata.json - Daemon: Check for existing active session
- If exists: Resume that session
- If none: Create new session with auto-generated ID
- Daemon: Spawn Pi with
--session <file> - Daemon: Subscribe client to session events
- Daemon → CLI: Return session metadata
- CLI: Start rendering event stream
Create New Session
- CLI → Daemon:
new_sessionrequest with workspace and optional name - Daemon: Generate UUID for session
- Daemon: Create session entry in
metadata.json - Daemon: Spawn Pi with fresh history (no
--sessionflag) - Daemon: Set as active session for workspace
- Daemon → CLI: Return session metadata
Switch Active Session
- CLI → Daemon:
use_sessionrequest with session name/ID - Daemon: Resolve session (by name, full ID, or prefix)
- Daemon: Update workspace’s
activeSessionIdinmetadata.json - Daemon: Broadcast
voice_session_changedevent to voice app - Daemon → CLI: Confirm switch
List Sessions
- CLI → Daemon:
sessionsrequest (optionally filtered by workspace) - Daemon: Query
metadata.jsonfor sessions - Daemon: Check Pi process status (alive, exit code)
- Daemon → CLI: Return session list with status
- CLI: Format as table
Abort Session
- CLI → Daemon:
abortrequest for active session - Daemon: Forward
abortcommand to Pi process - Pi: Stops current tool execution, cancels pending operations
- Pi → Daemon:
agent_endevent withreason: "aborted" - Daemon → CLI: Stream abort event
- CLI: Display abort confirmation
Session Resolution
The daemon accepts multiple session identifiers:- Session Name:
duck use debug-tests - Full Session ID:
duck use 550e8400-e29b-41d4-a716-446655440000 - Unambiguous Prefix:
duck use 550e(if no other session starts with550e) - Default: Active session for current workspace (if in workspace) or global active session
Workspace Confinement
All Pi operations are confined to the workspace directory: Pi Spawn (cli/src/daemon/pi-process.ts:76-80):
cli/src/daemon/voice-tools.ts):
- Tools can escape workspace via absolute paths or
..(not currently blocked) - Future: Add
--sandboxmode to enforce strict confinement - Bash commands can access network (e.g.,
curl) - Future: Add “Safe mode” to restrict bash to allowlist (PRD Section 13)
Session Cleanup
Manual Cleanup
- Daemon: Send
SIGTERMto Pi process - Daemon: Wait up to 5s for graceful exit
- Daemon: Send
SIGKILLif still alive - Daemon: Remove from active process map
- Daemon: Update session status in
metadata.json
Automatic Cleanup
- Daemon Shutdown: All Pi processes receive
SIGTERM→SIGKILL - Process Crash: Health monitor detects dead process, publishes
pi_diedevent, updates metadata - CLI Disconnect: Daemon unsubscribes client but keeps Pi process alive (background work)
Purge Old Sessions
- Daemon: Query sessions with
lastActiveAt> 30 days ago - Daemon: Kill any running processes for those sessions
- Daemon: Delete session entries from
metadata.json - Daemon: Optionally delete session JSONL files
- Daemon → CLI: Report purged sessions
Metadata Schema
Workspace
Session
Voice Session Selection (CLI Metadata)
The CLI writes a lightweight metadata file for the voice app to read:Performance Characteristics
Metadata Operations
- Workspace lookup: O(1) hash map
- Session lookup: O(1) hash map
- Session list: O(n) iteration (n = total sessions)
- Metadata write: ~10ms (JSON serialize + atomic rename)
Pi Process Overhead
- Spawn time: ~200ms (Pi initialization + model config)
- Memory per process: ~50-100 MB (base) + model context
- Concurrent limit: No hard limit, but recommend <10 simultaneous sessions (each runs a full agent loop)
Event Bus Throughput
- Event publish: O(m) where m = subscribed clients for that session
- Typical latency: <5ms from Pi stdout → client socket
- Backpressure: Slow clients block event delivery (trade-off: simplicity vs. async buffering)
Next Steps
- Architecture Overview — Full system diagram
- Voice Pipeline — Audio I/O and OpenAI Realtime API
- Pi Integration — RPC protocol and tool execution