Design Philosophy

Agent Safehouse is built around practical least privilege: strong default constraints with minimal workflow friction.

Core Principles

The design follows four guiding principles:

Start from Deny-All

Begin with (deny default) and only allow what’s needed
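
As a rough sketch of what deny-first looks like in Sandbox Profile Language, the shell function below prints a tiny profile that opens with (deny default) and then grants read access to a single directory. The function name, the example path, and the choice of one file-read* rule are illustrative assumptions, not the policy Safehouse actually generates.

```shell
# Hypothetical helper: print a minimal deny-first SBPL profile.
# Not Safehouse's real policy; a usable profile needs many more allow
# rules before a typical program can even start.
# Assumes the path contains no double quotes or backslashes.
minimal_profile() {
  workdir=$1
  printf '%s\n' '(version 1)'
  printf '%s\n' '(deny default)'
  # The only rule that opens anything up: read access below the workdir.
  printf '(allow file-read* (subpath "%s"))\n' "$workdir"
}

minimal_profile "/tmp/project"
```

On macOS, output like this could be saved to a file and passed to sandbox-exec with its -f option. A profile this small would likely be too strict to run real tools, which is the point: every further allow rule has to be added deliberately.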

Allow Only What's Required

Each permission must answer: does the agent need this to do useful work?

Keep Workflows Productive

Default grants should support normal coding without constant overrides

Make Risk Reduction Easy

Secure-by-default behavior should be the path of least resistance

Not a Security Boundary

Safehouse is a hardening layer, not a perfect security boundary against a determined attacker.

This distinction is critical:

What Safehouse IS

Blast radius reduction: Limits damage from mistakes, confusion, or simple attacks
Least privilege enforcement: Restricts filesystem access to what’s actually needed
Defense in depth: Adds a meaningful layer to your security posture
Practical containment: Works with real workflows without major disruption

What Safehouse IS NOT

Perfect isolation: Not a VM boundary; shares the host kernel
Escape-proof: sandbox-exec has been bypassed before and may be again
Network protection: Cannot prevent exfiltration of allowed files over the network
Credential vault: Cannot protect credentials that must be accessible for the task

Design Tradeoffs

Every security tool makes tradeoffs. Safehouse explicitly chooses:

Usability Over Paranoia

Rationale

If security is too burdensome, users will disable it. Safehouse aims to be secure enough to use daily without constant friction.

Implication: Some integrations (like network access) are allowed by default because denying them would break too many workflows.

Example: Network access is allowed by default because:

Package managers need registries (npm, pip, cargo)
Git needs to fetch/push to remotes
MCP servers need network connections
LLM APIs need network access

Denying network by default would make Safehouse impractical for most users.

Host-Native Over Isolation Purity

Rationale

VMs provide stronger isolation but require duplicate toolchains, workspace syncing, and credential management. Safehouse prioritizes native host compatibility.

Implication: You get filesystem containment without the overhead of a separate guest OS.

Example: Your sandboxed agent uses:

The same node, python, go binaries as your normal shell
The same package manager caches
The same git config
The same editor/IDE

No duplication or syncing required.

Composability Over Monolithic Policies

Rationale

Different tasks need different permissions. Modular profiles let you enable only what’s needed for the current task.

Implication: Policy assembly is more complex, but you get fine-grained control without rewriting entire policies.

Example: Three different tasks with three different permission sets:

# Basic coding: minimal permissions
safehouse aider

# Docker workflow: add docker socket
safehouse --enable=docker -- aider

# Cloud deployment: add cloud credentials
safehouse --enable=cloud-credentials --enable=kubectl -- aider
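
One way to picture policy assembly is as concatenation of small fragments: a base that starts from deny-all, followed by one fragment per enabled feature. The sketch below is only an illustration of that idea. The function names and the placeholder rule text are assumptions, and how Safehouse actually stores and merges its profiles is not described here.

```shell
# Hypothetical profile fragments keyed by feature name.
# The feature bodies are placeholder SBPL comments, not real Safehouse rules.
fragment() {
  case $1 in
    base)    printf '%s\n' '(version 1)' '(deny default)' ;;
    docker)  printf '%s\n' ';; docker: rules for the docker socket would go here' ;;
    kubectl) printf '%s\n' ';; kubectl: rules for kubectl config would go here' ;;
    *)       printf 'unknown feature: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Assemble a policy: the base always comes first, then each enabled feature.
assemble_policy() {
  fragment base || return 1
  for feature in "$@"; do
    fragment "$feature" || return 1
  done
}

assemble_policy docker
```

Because each feature lives in its own fragment, enabling or dropping one does not require touching the others, and an unknown feature name fails loudly instead of being silently ignored.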

Deny-First Over Allow-List

Rationale

Starting from deny-all means new filesystem locations are blocked by default. This is safer than an allow-by-default policy that blocks a list of known-sensitive paths, because such a list can miss some.

Implication: You may need to add --add-dirs-ro for cross-repo references, but you won’t accidentally expose sensitive files.

Example: If you create a new ~/secrets directory:

Without sandbox: Agent can access it immediately (unsafe)
With Safehouse: Agent is denied access unless you explicitly grant it (safe)
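
The effect on a newly created directory can be modeled with a small path check: anything that is not at or below an explicitly granted root is denied. This is a simplified sketch with made-up paths and a hypothetical function name. Real enforcement happens in the macOS sandbox, not in shell code, and this version does not resolve symlinks or ".." segments.

```shell
# Hypothetical allow-list check: a path is permitted only if it equals,
# or sits below, one of the explicitly granted roots. Everything else,
# including directories created later, falls through to "deny".
# Usage: is_allowed PATH ROOT...
is_allowed() {
  path=$1
  shift
  for root in "$@"; do
    case $path in
      "$root"|"$root"/*) echo allow; return 0 ;;
    esac
  done
  echo deny
}

is_allowed "/Users/me/project/src/main.c" "/Users/me/project"   # allow
is_allowed "/Users/me/secrets/token.txt"  "/Users/me/project"   # deny
```

The new ~/secrets directory needs no rule of its own to be protected; it is denied simply because nothing grants it.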

Threat Model

Safehouse is designed to protect against:

✅ Prompt Injection

Threat: Malicious instructions embedded in files, docs, or web content
Protection: Agent cannot access SSH keys, cloud credentials, or other repos even if instructed to

✅ Confused Deputy

Threat: Agent misinterprets vague instructions and performs unintended actions
Protection: Damage is limited to the workdir; cannot touch unrelated projects or personal files

✅ Buggy Commands

Threat: Agent generates a command with typos or wrong paths (rm -rf in wrong directory)
Protection: Filesystem constraints prevent deletion or modification outside the workdir

✅ Supply Chain Risks

Threat: Compromised agent tool or MCP server attempts to steal credentials
Protection: Credentials outside the sandbox policy are inaccessible

Not Designed to Protect Against

❌ Sophisticated Attackers

Threat: Adversary actively researching sandbox escapes
Why: sandbox-exec is not a VM boundary; escapes have existed and will exist again
Recommendation: Use a VM for adversarial scenarios

❌ Network Exfiltration

Threat: Agent sends allowed file contents to attacker-controlled server
Why: Network is allowed by default for functionality
Recommendation: Use network monitoring or air-gapped VMs for sensitive data

❌ Authorized IPC Abuse

Threat: Agent uses allowed IPC channels (like git operations) to leak data
Why: Blocking all IPC would break normal workflows
Recommendation: Monitor git commits and network activity

Key Design Decisions

Why Sandbox-Exec?

Native macOS Integration

sandbox-exec ships with macOS and uses the same sandboxing facility as Apple’s own apps. It is well-tested, performant, and requires no kernel extensions or system modifications.

No External Dependencies

The core wrapper is pure Bash + Sandbox Profile Language. No compilation, no build step, no external runtime.

Fine-Grained Filesystem Control

Sandbox Profile Language supports exact paths, recursive subpaths, prefixes, and regex matchers, which maps well onto the file layouts of coding workflows.
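
To make the matcher types concrete, the sketch below prints one example rule each for an exact path, a recursive subpath, and a regex. The paths are made up for illustration, the function name is hypothetical, and the prefix matcher is left out. These rules are not taken from Safehouse's own profiles.

```shell
# Print example SBPL rules, one per matcher type. All paths are invented.
example_rules() {
  cat <<'EOF'
;; exact path: matches this one file only
(allow file-read* (literal "/Users/me/.gitconfig"))
;; recursive subpath: matches the directory and everything below it
(allow file-read* (subpath "/Users/me/project"))
;; regex: matches any path that fits the pattern
(allow file-read* (regex #"^/private/tmp/build-[0-9]+/"))
EOF
}

example_rules
```

Exact paths suit single config files, subpaths suit a working directory, and a regex covers cases where names vary, such as numbered build directories.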

Why Composable Profiles?

Task-Specific Permissions

Different tasks need different permissions. Composability lets you grant only what’s needed without maintaining separate monolithic policies.

Maintainability

Small, focused profiles are easier to audit and update than large, monolithic policy files.

Extensibility

New integrations can be added as new profiles without modifying existing ones.

Why Allow Network by Default?

Fundamental to Coding Workflows

Package managers, git remotes, MCP servers, and LLM APIs all require network access. Denying by default would make Safehouse unusable for most people.

Mitigation Through Filesystem Control

Even with network access, the agent can only exfiltrate files it’s allowed to read. Filesystem containment is the primary defense.

Why Deny Shell Startup Files?

Prevents Credential Leakage

Many users put API keys, tokens, and other secrets in .zshrc or .bashrc for convenience. Denying access by default protects these.

Explicit Environment Control

Safehouse provides a sanitized environment. If you need specific env vars, use --pass-env or --keep-env explicitly.
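
The general idea of a sanitized environment can be sketched with the standard env -i utility, which starts a command with an empty environment plus only the variables named on the command line. This is an illustration of the concept, not Safehouse's implementation. The helper name and DEMO_API_KEY are made up, and the exact behavior of --pass-env and --keep-env is not specified here.

```shell
# Sketch of a sanitized environment using env -i (a standard utility).
# PATH and HOME are passed through explicitly; everything else, including
# the made-up DEMO_API_KEY below, is dropped before the child starts.
DEMO_API_KEY="not-a-real-key"
export DEMO_API_KEY

run_clean() {
  env -i PATH="$PATH" HOME="${HOME:-/}" "$@"
}

# The child sees PATH and HOME but not DEMO_API_KEY.
run_clean sh -c 'echo "key=${DEMO_API_KEY:-<unset>}"'
```

Starting from an empty environment and naming what to keep follows the same deny-first reasoning as the filesystem policy: a secret exported in the parent shell does not reach the agent unless it is passed on purpose.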

Philosophy in Practice

Example 1: SSH Keys

Decision: Deny ~/.ssh/id_* by default; allow ~/.ssh/config and ~/.ssh/known_hosts.

Rationale:

Git-over-SSH needs config and known_hosts to connect to remotes
Private keys themselves are not needed (SSH agent handles auth)
Reading private keys provides no legitimate value but high risk

Result: Git workflows work, but private keys are protected.
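
The decision can be summarized as a small classifier over file names in ~/.ssh. The sketch below only models the outcome described above. The function name is hypothetical, it is not how Safehouse enforces the rule, and files not mentioned in the decision are reported as unspecified rather than guessed.

```shell
# Hypothetical classifier mirroring the decision above for files in ~/.ssh.
ssh_file_policy() {
  case $(basename "$1") in
    id_*)               echo deny ;;         # key files matching id_*
    config|known_hosts) echo allow ;;        # needed for Git-over-SSH
    *)                  echo unspecified ;;  # not covered by the decision above
  esac
}

ssh_file_policy "/Users/me/.ssh/id_ed25519"   # deny
ssh_file_policy "/Users/me/.ssh/known_hosts"  # allow
```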

Example 2: Package Manager Caches

Decision: Allow read/write to ~/.npm, ~/.cargo, ~/.cache/pip, etc.

Rationale:

Agents frequently need to install dependencies
Denying cache access would force re-downloads and break workflows
Cache contents are generally not sensitive (mostly public packages)

Result: Package managers work normally with minimal overhead.

Example 3: Clipboard Access

Decision: Deny by default; require --enable=clipboard.

Rationale:

Not needed for most coding tasks
Users often copy sensitive data temporarily
Explicit opt-in prevents surprise clipboard access

Result: Users grant clipboard access only when needed for the current task.

Philosophical Alignment

Safehouse aligns with the principle of least privilege as defined in classic security literature:

"Every program and every user of the system should operate using the least set of privileges necessary to complete the job." (Jerome H. Saltzer and Michael D. Schroeder, "The Protection of Information in Computer Systems," Proceedings of the IEEE, 1975)

But it balances this with the reality that:

Security mechanisms that are too restrictive get disabled or bypassed. Practical security must account for human behavior and workflow needs.
