sandbox-exec to validate that allowed operations succeed and denied operations fail.
Running Tests
Run the full test suite from the repository root.
Skipped tests typically occur when optional dependencies (like git, docker, kubectl) are not installed on the test system.
Test Structure
Tests are organized under tests/sections/ by functional area.
Test Helpers
All test sections use helpers from tests/lib/common.sh.
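The helper definitions themselves are not reproduced on this page. Purely as an illustration, assertions that check "allowed operations succeed and denied operations fail" could be sketched as below. The names expect_success and expect_failure and the PASS_COUNT/FAIL_COUNT counters are hypothetical and are not taken from tests/lib/common.sh.

```shell
# Hypothetical assertion helpers; the names are illustrative and are NOT
# the real tests/lib/common.sh API.

PASS_COUNT=0
FAIL_COUNT=0

# expect_success DESCRIPTION COMMAND [ARGS...]
# Records a pass when the command exits with status 0.
expect_success() {
  local desc="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    PASS_COUNT=$((PASS_COUNT + 1))
    echo "PASS: $desc"
  else
    FAIL_COUNT=$((FAIL_COUNT + 1))
    echo "FAIL: $desc (expected success)"
  fi
}

# expect_failure DESCRIPTION COMMAND [ARGS...]
# Records a pass when the command exits with a non-zero status.
expect_failure() {
  local desc="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    FAIL_COUNT=$((FAIL_COUNT + 1))
    echo "FAIL: $desc (expected denial)"
  else
    PASS_COUNT=$((PASS_COUNT + 1))
    echo "PASS: $desc"
  fi
}
```

In a sandbox test, the command passed to such a helper would typically be wrapped in sandbox-exec, for example sandbox-exec -f "$policy_file" touch "$TEST_RW_DIR/file", where policy_file is a placeholder for a generated policy path.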
Writing New Tests
When adding or modifying policy behavior:
Create or update test section
Add a new function in the appropriate tests/sections/*.sh file, for example tests/sections/20-integrations.sh.
Use descriptive test names
Test descriptions should clearly state what behavior is being verified:
✅ Good: "write to --add-dirs path"
❌ Bad: "test 3"
Policy Assembly Tests
For changes to policy assembly logic or module dependencies, use structure and ordering assertions.
The #safehouse-test-id:*# markers in .sb files are used by ordering tests. Preserve these when editing profiles.
CI Validation
GitHub Actions runs tests automatically on:
- All pull requests
- Pushes to main
- macOS runners only (sandbox-exec is macOS-specific)
CI also checks that dist/ artifacts are up-to-date when policy or runtime files change.
Test Environment
The test harness (tests/lib/setup.sh) creates isolated directories:
| Variable | Purpose |
|---|---|
| TEST_CWD | Temporary working directory for test commands |
| TEST_HOME_CANARY | File path outside workdir (should be denied) |
| TEST_RO_DIR | Directory used with --add-dirs-ro tests |
| TEST_RW_DIR | Directory used with --add-dirs tests |
| TEST_GIT_REPO | Temporary git repository for auto-detection tests |
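A minimal sketch of how such isolated directories could be created, using the variable names from the table. The function names setup_test_env and cleanup_test_env, the TEST_ROOT variable, and the directory layout are assumptions for illustration; the actual logic lives in tests/lib/setup.sh and may differ.

```shell
# Hypothetical sketch; the real tests/lib/setup.sh may be organized differently.

setup_test_env() {
  # One throwaway root per run keeps tests away from real user files.
  TEST_ROOT="$(mktemp -d /tmp/safehouse-test-XXXXXX)" || return 1

  TEST_CWD="$TEST_ROOT/cwd"
  TEST_RO_DIR="$TEST_ROOT/ro"
  TEST_RW_DIR="$TEST_ROOT/rw"
  TEST_GIT_REPO="$TEST_ROOT/repo"
  # The canary sits outside TEST_CWD, so sandboxed access should be denied.
  TEST_HOME_CANARY="$TEST_ROOT/home-canary"

  mkdir -p "$TEST_CWD" "$TEST_RO_DIR" "$TEST_RW_DIR" "$TEST_GIT_REPO" || return 1
  echo "canary" > "$TEST_HOME_CANARY" || return 1

  # git is an optional dependency, so only initialize the repo if present.
  if command -v git >/dev/null 2>&1; then
    git init -q "$TEST_GIT_REPO" || return 1
  fi

  export TEST_ROOT TEST_CWD TEST_RO_DIR TEST_RW_DIR TEST_GIT_REPO TEST_HOME_CANARY
}

cleanup_test_env() {
  # Guard against an empty variable before removing anything.
  if [ -n "${TEST_ROOT:-}" ]; then
    rm -rf "$TEST_ROOT"
  fi
}
```

A runner built this way would call setup_test_env once before the sections and cleanup_test_env on exit, for example through trap cleanup_test_env EXIT.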
Preflight Checks
The test runner performs these checks before starting:
Sandbox nesting check
Verifies the current session is not already sandboxed (tests cannot run inside a sandbox).
2 and an explanation.
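One way a nesting check could work is sketched below. It relies on the assumption that applying a sandbox profile from inside an already sandboxed process fails, even when the profile is fully permissive. The function name check_not_nested and the SANDBOX_EXEC override are hypothetical, and this is not the runner's actual code.

```shell
# Hypothetical sketch of a nesting check; not the runner's actual code.
# SANDBOX_EXEC can be overridden, which also allows exercising the function
# on systems without sandbox-exec.
SANDBOX_EXEC="${SANDBOX_EXEC:-sandbox-exec}"

check_not_nested() {
  # Assumption: even a fully permissive profile cannot be applied when the
  # current process is already sandboxed, so this probe fails in that case.
  if "$SANDBOX_EXEC" -p '(version 1) (allow default)' true >/dev/null 2>&1; then
    return 0
  fi
  echo "preflight: cannot start a sandbox here (already sandboxed, or sandbox-exec unavailable)" >&2
  return 1
}
```

The probe cannot distinguish a nested session from a missing sandbox-exec binary, which is why the message names both possibilities.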
Debugging Test Failures
Check policy contents
Generated test policies are in /tmp/safehouse-test-*/.
Run commands manually
Execute test commands directly with sandbox-exec.
Watch denial logs
Stream sandbox denials while running tests.
Verify outside sandbox
Confirm the command works unsandboxed.
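The manual steps above (run the command under sandbox-exec, then verify it outside the sandbox) can be combined into a small helper. This is a sketch only: compare_runs and the SANDBOX_EXEC override are hypothetical names, and the policy path is whatever generated policy file is being investigated.

```shell
# Hypothetical debugging helper; not part of the project's test suite.
# SANDBOX_EXEC can be overridden (for example with a stub on non-macOS hosts).
SANDBOX_EXEC="${SANDBOX_EXEC:-sandbox-exec}"

# compare_runs POLICY_FILE COMMAND [ARGS...]
# Runs the command unsandboxed first, then under the given policy file,
# and prints which of the two runs failed.
compare_runs() {
  local policy="$1"
  shift

  # Step 1: rule out problems that have nothing to do with the sandbox.
  if ! "$@" >/dev/null 2>&1; then
    echo "unsandboxed failure: fix the command before blaming the policy"
    return 0
  fi

  # Step 2: run the same command under the policy file.
  if "$SANDBOX_EXEC" -f "$policy" "$@" >/dev/null 2>&1; then
    echo "ok: command succeeds inside the sandbox"
  else
    echo "sandbox-only failure: inspect the policy and the denial logs"
  fi
}
```

For example, compare_runs path/to/policy.sb cat "$TEST_HOME_CANARY" (with a placeholder policy path) should report a sandbox-only failure if the canary is correctly denied. The command runs twice, so this is best used with commands that have no lasting side effects.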
E2E and Live Agent Tests
For heavier integration testing:
Next Steps
Debugging
Diagnose sandbox denial events
Contributing
Learn the development workflow and PR process