AI agents that can read files, call APIs, and execute code need strict runtime boundaries. Without sandboxing, a misbehaving agent can exfiltrate data, corrupt state, exhaust resources, or cascade failures across a multi-agent workflow. The Agent Runtime (pip install agentmesh-runtime) addresses this with a hardware-inspired four-ring privilege model: just as an OS kernel uses CPU privilege rings to separate user processes from kernel operations, the Agent Hypervisor assigns agents to execution rings based on their trust scores and enforces resource boundaries at each level.
┌─────────────────────────────────────────────────┐
│ Execution Ring Model │
│ Ring 0 (Root) → Ring 3 (Sandbox) │
├─────────────────────────────────────────────────┤
│ Capability Guards │
│ Per-agent tool allow/deny lists │
├─────────────────────────────────────────────────┤
│ Saga Orchestration │
│ Multi-step transactions with auto-rollback │
├─────────────────────────────────────────────────┤
│ Session Isolation │
│ VFS namespacing, vector clocks, intent locks │
├─────────────────────────────────────────────────┤
│ Emergency Controls │
│ Kill switch, rate limiting, breach detection │
└─────────────────────────────────────────────────┘
The Four-Ring Model
The hypervisor implements exactly four execution rings. Ring ordering is numerical: lower value = higher privilege. Ring 0 is the most privileged; Ring 3 is the default for unknown or untrusted agents.
| Ring | Name | Effective Score | Description |
|---|---|---|---|
| Ring 0 | Root | N/A (SRE Witness required) | Hypervisor configuration and penalty operations. System-only; never assigned by score alone. |
| Ring 1 | Privileged | > 0.95 + consensus | Non-reversible actions with full resource access (write, deploy, delete). |
| Ring 2 | Standard | > 0.60 | Reversible actions with scoped resource access (write files, call APIs). |
| Ring 3 | Sandbox | Default | Read-only and research actions with minimal resource access. No network, no writes. |
Ring 0 is never accessible to agents through the standard API. Ring 0 operations require explicit SRE Witness attestation through an out-of-band mechanism. This prevents any compromised agent from escalating to hypervisor-level privileges.
Ring Assignment from Effective Score
Ring assignment from an agent’s effective score (eff_score) follows strict threshold rules:
from hypervisor.models import ExecutionRing
# eff_score > 0.95 AND has_consensus=True → Ring 1 (Privileged)
ring = ExecutionRing.from_eff_score(eff_score=0.98, has_consensus=True)
assert ring == ExecutionRing.RING_1_PRIVILEGED
# eff_score > 0.60 → Ring 2 (Standard)
ring = ExecutionRing.from_eff_score(eff_score=0.75)
assert ring == ExecutionRing.RING_2_STANDARD
# Otherwise → Ring 3 (Sandbox, default)
ring = ExecutionRing.from_eff_score(eff_score=0.40)
assert ring == ExecutionRing.RING_3_SANDBOX
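The threshold rules can also be restated as a small standalone function. This is a minimal sketch for illustration, not the library's implementation: `Ring` and `ring_from_eff_score` are hypothetical names, and the strict `>` comparisons are taken from the table above.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Hypothetical stand-in for ExecutionRing; lower value = higher privilege."""
    ROOT = 0
    PRIVILEGED = 1
    STANDARD = 2
    SANDBOX = 3


def ring_from_eff_score(eff_score: float, has_consensus: bool = False) -> Ring:
    """Map an effective score to a ring using the documented thresholds."""
    if eff_score > 0.95 and has_consensus:
        return Ring.PRIVILEGED
    if eff_score > 0.60:
        return Ring.STANDARD
    # Default ring; Ring 0 is never assigned from a score.
    return Ring.SANDBOX
```

Under these rules, a score of 0.98 without consensus would land in Ring 2, and a score of exactly 0.60 would stay in Ring 3.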
Action Classification
Every action is automatically classified to a required ring based on its reversibility, is_read_only, and is_admin flags:
| Condition | Required Ring |
|---|---|
| is_admin == true | Ring 0 (Root) |
| reversibility == NONE and is_read_only == false | Ring 1 (Privileged) |
| is_read_only == true | Ring 3 (Sandbox) |
| Otherwise (reversible, not read-only) | Ring 2 (Standard) |
from hypervisor.models import ExecutionRing, ReversibilityLevel
from hypervisor.rings.classifier import ActionClassifier
classifier = ActionClassifier()
result = classifier.classify_action_id("file.read")
assert result.ring == ExecutionRing.RING_3_SANDBOX
assert result.reversibility == ReversibilityLevel.REVERSIBLE
result = classifier.classify_action_id("deploy.k8s")
assert result.ring == ExecutionRing.RING_1_PRIVILEGED
assert result.reversibility == ReversibilityLevel.NON_REVERSIBLE
# Override classification for custom actions
classifier.set_override("my_custom.action", ring=ExecutionRing.RING_2_STANDARD, risk_weight=0.5)
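The classification table can be read as an ordered rule chain. The sketch below is a standalone illustration of that reading; `Action`, `Reversibility`, and `required_ring` are hypothetical names and do not come from the library.

```python
from dataclasses import dataclass
from enum import Enum


class Reversibility(Enum):
    """Hypothetical reversibility levels for this sketch."""
    REVERSIBLE = "reversible"
    NONE = "none"


@dataclass(frozen=True)
class Action:
    reversibility: Reversibility
    is_read_only: bool = False
    is_admin: bool = False


def required_ring(action: Action) -> int:
    """Return the required ring number, checking the rules in table order."""
    if action.is_admin:
        return 0
    if action.reversibility is Reversibility.NONE and not action.is_read_only:
        return 1
    if action.is_read_only:
        return 3
    # Reversible and not read-only.
    return 2
```

Checking `is_admin` first means an admin action is routed to Ring 0 regardless of its other flags, which matches the order of the table.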
Resource Limits Per Ring
Each ring has associated ResourceConstraints that govern what the agent can access:
| Ring | Network | Filesystem Scope | Subprocess | Max Concurrent Tools |
|---|---|---|---|---|
| Ring 0 (Root) | Yes | full | Yes | 32 |
| Ring 1 (Privileged) | Yes | full | Yes | 16 |
| Ring 2 (Standard) | Yes (allowlist) | scoped | Yes | 8 |
| Ring 3 (Sandbox) | No | none | No | 2 |
Filesystem scope semantics:
| Scope | Meaning |
|---|---|
| none | No filesystem access |
| session | Agent’s own session directory only |
| scoped | Agent’s directories plus explicitly granted paths |
| full | Unrestricted filesystem access |
Rate limits per ring (token bucket algorithm):
| Ring | Rate (req/s) | Burst |
|---|---|---|
| Ring 0 (Root) | 100.0 | 200.0 |
| Ring 1 (Privileged) | 50.0 | 100.0 |
| Ring 2 (Standard) | 20.0 | 40.0 |
| Ring 3 (Sandbox) | 5.0 | 10.0 |
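For quick reference, the per-ring limits from the tables above can be collected into one lookup structure. This is only a sketch: `RingLimits`, `RING_LIMITS`, and `can_start_tool` are hypothetical names, and the Ring 2 network allowlist is reduced to a plain boolean here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RingLimits:
    network: bool             # Ring 2 is additionally restricted to an allowlist
    filesystem_scope: str     # "none", "session", "scoped", or "full"
    subprocess: bool
    max_concurrent_tools: int
    rate_per_second: float    # token bucket refill rate
    burst: float              # token bucket capacity


# Values copied from the tables above, keyed by ring number.
RING_LIMITS = {
    0: RingLimits(True, "full", True, 32, 100.0, 200.0),
    1: RingLimits(True, "full", True, 16, 50.0, 100.0),
    2: RingLimits(True, "scoped", True, 8, 20.0, 40.0),
    3: RingLimits(False, "none", False, 2, 5.0, 10.0),
}


def can_start_tool(ring: int, running_tools: int) -> bool:
    """True if one more tool call fits under the ring's concurrency cap."""
    return running_tools < RING_LIMITS[ring].max_concurrent_tools
```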
Privilege Elevation
Agents can request temporary access to a more privileged (lower-numbered) ring. Elevations are time-bounded, trust-gated, and expire automatically.
from hypervisor.models import ExecutionRing
from hypervisor.rings.elevation import RingElevationManager
manager = RingElevationManager()
# Request elevation from Ring 2 → Ring 1
elevation = manager.request_elevation(
    agent_did="did:example:agent-42",
    session_id="session-001",
    current_ring=ExecutionRing.RING_2_STANDARD,
    target_ring=ExecutionRing.RING_1_PRIVILEGED,
    ttl_seconds=300,  # 5-minute window (max: 3600s)
    reason="Deploying approved release v2.1.0",
    attestation="signed-approval-token-from-sre",
    trust_score=0.92,
)
if elevation.is_active:
    # While the elevation is active, the effective ring is the elevated one
    effective = manager.get_effective_ring(
        agent_did="did:example:agent-42",
        session_id="session-001",
        base_ring=ExecutionRing.RING_2_STANDARD,
    )
    assert effective == ExecutionRing.RING_1_PRIVILEGED
# Revoke early if needed
manager.revoke_elevation(elevation.elevation_id)
Trust thresholds for elevation:
| Target Ring | Required Trust Score |
|---|---|
| Ring 1 (Privileged) | ≥ 0.85 |
| Ring 2 (Standard) | ≥ 0.50 |
Denial reasons:
| Reason | Condition |
|---|---|
| ring_0_forbidden | Target is Ring 0; never allowed via the standard API |
| insufficient_trust | Trust score below threshold |
| no_sponsorship | Ring 1 elevation without SRE attestation |
| duplicate_elevation | Agent already has an active elevation in this session |
| invalid_target | Target ring is same or lower privilege than current |
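The thresholds and denial reasons above can be combined into a single decision function. The sketch below is a standalone illustration: `elevation_denial_reason` is a hypothetical name, rings are plain integers, and the order in which the checks run is an assumption, since the source does not state it.

```python
from typing import Optional

# Minimum trust score per target ring, from the thresholds table.
ELEVATION_TRUST_THRESHOLDS = {1: 0.85, 2: 0.50}


def elevation_denial_reason(
    current_ring: int,
    target_ring: int,
    trust_score: float,
    attestation: Optional[str] = None,
    has_active_elevation: bool = False,
) -> Optional[str]:
    """Return a denial reason, or None if the request would be granted."""
    if target_ring == 0:
        return "ring_0_forbidden"
    if target_ring >= current_ring:
        # Same or lower privilege than the current ring.
        return "invalid_target"
    if has_active_elevation:
        return "duplicate_elevation"
    if trust_score < ELEVATION_TRUST_THRESHOLDS[target_ring]:
        return "insufficient_trust"
    if target_ring == 1 and not attestation:
        return "no_sponsorship"
    return None
```

With these rules, the request in the example above (Ring 2 to Ring 1, trust score 0.92, attestation present) would be granted.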
Saga Orchestration
When a step fails, a multi-step agent workflow can leave the system in a broken, half-finished state. The Saga Orchestrator wraps such workflows in distributed transactions: if a step fails, the steps that already committed are compensated (rolled back) automatically, in reverse order.
How Sagas Work
Step 1: Create PR ──→ Compensate: Close PR
Step 2: Run tests ──→ Compensate: Cancel test run
Step 3: Deploy to staging ──→ Compensate: Rollback deployment
Step 4: Notify team ──→ Compensate: Send failure notice
If Step 3 fails:
→ Compensate Step 2 (cancel tests)
→ Compensate Step 1 (close PR)
→ Saga state: COMPENSATING → FAILED
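The reverse-order compensation shown above can be sketched in a few lines of plain Python. This is an illustration of the pattern only, not the orchestrator's code: `run_saga` and the `(name, execute, compensate)` tuple shape are hypothetical, and retries, timeouts, and compensation failures are left out.

```python
from typing import Callable, List, Tuple

# (name, execute, compensate)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> Tuple[str, List[str]]:
    """Run steps in order; on failure, compensate committed steps in reverse."""
    log: List[str] = []
    committed: List[Step] = []
    for step in steps:
        name, execute, _ = step
        try:
            execute()
        except Exception:
            log.append(f"failed:{name}")
            # Undo only the steps that committed, newest first.
            for done_name, _, compensate in reversed(committed):
                compensate()
                log.append(f"compensated:{done_name}")
            return "FAILED", log
        committed.append(step)
        log.append(f"committed:{name}")
    return "COMPLETED", log
```

The failed step itself is not compensated in this sketch, and steps after it never run, which matches the Step 3 failure trace above.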
Creating a Saga
from hypervisor.saga.orchestrator import SagaOrchestrator
orchestrator = SagaOrchestrator()
# Create a new saga
saga = orchestrator.create_saga(session_id="session-deploy-42")
# Add steps with execute and undo APIs
orchestrator.add_step(
    saga_id=saga.saga_id,
    action_id="pr.create",
    agent_did="did:example:dev-agent",
    execute_api="/api/pr/create",
    undo_api="/api/pr/close",  # compensation action
    timeout_seconds=60,
    max_retries=2,
)
orchestrator.add_step(
    saga_id=saga.saga_id,
    action_id="deploy.staging",
    agent_did="did:example:deploy-agent",
    execute_api="/api/deploy/staging",
    undo_api="/api/deploy/rollback",
    timeout_seconds=600,
)
Step and Saga State Machines
Each step transitions through a well-defined state machine:
PENDING → EXECUTING → COMMITTED
↘ FAILED → COMPENSATING → COMPENSATED
↘ COMPENSATION_FAILED
Saga-level states:
| State | Meaning |
|---|---|
| RUNNING | Steps are being executed sequentially |
| COMPENSATING | A step failed; compensation is running in reverse |
| COMPLETED | All steps committed successfully |
| FAILED | All compensation finished (or some compensation failed) |
| ESCALATED | Compensation itself failed; human intervention required |
Declarative Sagas with the DSL
from hypervisor.saga.dsl import SagaDSLParser
saga_yaml = """
saga:
  id: deploy-pipeline
  steps:
    - id: create-pr
      action_id: pr.create
      agent: did:example:dev-agent
      execute_api: /api/pr/create
      undo_api: /api/pr/close
      timeout: 60
      retries: 2
    - id: run-tests
      action_id: tests.run
      agent: did:example:ci-agent
      execute_api: /api/tests/run
      undo_api: /api/tests/cancel
      timeout: 300
      depends_on: [create-pr]
    - id: deploy-staging
      action_id: deploy.staging
      agent: did:example:deploy-agent
      execute_api: /api/deploy/staging
      undo_api: /api/deploy/rollback
      timeout: 600
      depends_on: [run-tests]
      checkpoint_goal: "Staging deployment matches PR diff"
"""
parser = SagaDSLParser()
definition = parser.parse(saga_yaml)
Saga defaults: max_retries=2, retry_delay_seconds=1.0 (linear backoff), step_timeout_seconds=300.
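To make the retry defaults concrete, here is a standalone sketch of a retry loop with linear backoff. It assumes "linear backoff" means the delay grows as `retry_delay_seconds * attempt`; the source does not spell this out. `run_with_retries` is a hypothetical helper, not part of the library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    action: Callable[[], T],
    max_retries: int = 2,
    retry_delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action, retrying up to max_retries times with linearly growing delays."""
    attempt = 0
    while True:
        try:
            return action()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted; let the caller start compensation
            attempt += 1
            # Assumed linear backoff: 1x, 2x, 3x ... the base delay.
            sleep(retry_delay_seconds * attempt)
```

With the defaults, a step would be attempted at most three times, waiting 1.0 s and then 2.0 s between attempts. The `sleep` parameter is injectable so the loop can be tested without real delays.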
Kill Switch
The KillSwitch terminates an agent immediately and triggers saga compensation for any in-flight work. It is the last resort for stopping a misbehaving agent.
from hypervisor.security.kill_switch import KillSwitch, KillReason, KillResult
kill_switch = KillSwitch()
result: KillResult = kill_switch.kill(
    agent_did="did:example:rogue-agent",
    session_id="session-001",
    reason=KillReason.BEHAVIORAL_DRIFT,
    details="Agent started accessing files outside its namespace",
)
print(f"Kill ID: {result.kill_id}")
print(f"Compensation triggered: {result.compensation_triggered}")
print(f"Handoffs succeeded: {result.handoff_success_count}")
print(f"Terminated: {result.terminated}")
Kill reasons:
| Reason | When to Use |
|---|---|
| BEHAVIORAL_DRIFT | Agent deviates from expected behavior patterns |
| RATE_LIMIT | Agent exceeded its rate limit repeatedly |
| RING_BREACH | Agent attempted an action that requires a more privileged ring than its own |
| MANUAL | Human operator triggered the kill |
| QUARANTINE_TIMEOUT | Agent was quarantined and did not recover |
| SESSION_TIMEOUT | Session exceeded its max_duration_seconds |
Graceful Shutdown with Step Handoff
Register a substitute agent before killing the primary to enable zero-downtime handoff:
# Register a substitute to take over in-flight work
kill_switch.register_substitute(
    session_id="session-001",
    agent_did="did:example:backup-agent",
)
result = kill_switch.kill(
    agent_did="did:example:primary-agent",
    session_id="session-001",
    reason=KillReason.MANUAL,
    details="Planned maintenance rotation",
)
# Check handoff results
for handoff in result.handoffs:
    print(f"Step {handoff.step_id}: {handoff.status}")
    # HandoffStatus: PENDING, HANDED_OFF, FAILED, COMPENSATED
If no substitute is registered or the callback times out (5-second timeout), terminated is set to false and the kill result is still recorded.
Resource Limits
Termination Control Signals
An agent’s session can be bounded by several configurable limits:
from hypervisor.models import SessionConfig, ConsistencyMode
config = SessionConfig(
    consistency_mode=ConsistencyMode.STRONG,
    max_participants=5,
    max_duration_seconds=3600,  # session hard stop after 1 hour
    min_eff_score=0.60,  # minimum trust to join the session
    enable_audit=True,
    enable_blockchain_commitment=False,
)
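The three numeric limits in this configuration can be pictured as an admission check that runs when an agent tries to join. The sketch below is a hypothetical illustration: `SessionLimits`, `join_denial_reason`, and the reason strings are invented for this example, and treating `min_eff_score` as an inclusive lower bound is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionLimits:
    """Hypothetical subset of the SessionConfig fields shown above."""
    max_participants: int = 5
    max_duration_seconds: int = 3600
    min_eff_score: float = 0.60


def join_denial_reason(
    limits: SessionLimits,
    participants: int,
    eff_score: float,
    elapsed_seconds: float,
) -> Optional[str]:
    """Return why a join would be refused, or None if it would be allowed."""
    if elapsed_seconds >= limits.max_duration_seconds:
        return "session_expired"
    if participants >= limits.max_participants:
        return "session_full"
    if eff_score < limits.min_eff_score:
        return "insufficient_trust"
    return None
```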
Rate Limiting
Per-agent rate limits use a token bucket algorithm. When the bucket is exhausted, RateLimitExceeded is raised:
from hypervisor.models import ExecutionRing
from hypervisor.security.rate_limiter import AgentRateLimiter, RateLimitExceeded
# One limiter covers all rings; per-ring limits are configured automatically.
limiter = AgentRateLimiter()
# try_check() returns False instead of raising when limit is exceeded.
allowed = limiter.try_check(
    agent_did="did:example:new-agent",
    session_id="session-001",
    ring=ExecutionRing.RING_3_SANDBOX,
)
if not allowed:
    print("Rate limited: request denied")
# check() raises RateLimitExceeded on exhaustion.
try:
    limiter.check(
        agent_did="did:example:new-agent",
        session_id="session-001",
        ring=ExecutionRing.RING_3_SANDBOX,
    )
except RateLimitExceeded as exc:
    print(f"Rate limit exceeded: {exc}")
# Update the bucket when the agent's ring changes (e.g., after promotion).
limiter.update_ring(
    agent_did="did:example:new-agent",
    session_id="session-001",
    new_ring=ExecutionRing.RING_2_STANDARD,
)
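For readers unfamiliar with the token bucket algorithm, the following standalone sketch shows the general technique. It is not the `AgentRateLimiter` implementation; `TokenBucket` and `try_consume` are hypothetical names, and the injectable `clock` exists only to make the behavior easy to test.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/s up to `burst` tokens."""

    def __init__(
        self,
        rate: float,
        burst: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = burst
        self.tokens = burst  # start with a full bucket
        self._clock = clock
        self._last = clock()

    def try_consume(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self._last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With the Ring 3 values from the table (rate 5.0, burst 10.0), such a bucket would admit ten back-to-back requests, reject the eleventh, and admit five more after one second of idle time.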
Session Isolation
Each session uses a SessionVFS that provides per-agent isolated file views. Agents cannot read or write each other’s files without explicit cross-session access grants.
from hypervisor.session.sso import SessionVFS, VFSPermissionError
vfs = SessionVFS()
# Agent A writes; only Agent A can see it
vfs.write(path="/workspace/plan.md", agent_did="did:agent-a", value="# My Plan")
# Agent B cannot read Agent A's file
try:
    vfs.read(path="/workspace/plan.md", agent_did="did:agent-b")
except VFSPermissionError:
    print("Access denied: Agent B cannot read Agent A's namespace")
# Agent B writes to the same path and gets its own isolated copy
vfs.write(path="/workspace/plan.md", agent_did="did:agent-b", value="# Different Plan")
assert vfs.read("/workspace/plan.md", "did:agent-a") == "# My Plan"
assert vfs.read("/workspace/plan.md", "did:agent-b") == "# Different Plan"
Isolation levels:
| Level | Description | Cost |
|---|---|---|
| SNAPSHOT | Complete isolation: agent sees only its own session state | Low |
| READ_COMMITTED | Can read from granted sessions; writes to own only | Medium |
| SERIALIZABLE | Full causal ordering with vector clocks and intent locks | High |
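The per-agent file views described in this section can be approximated by keying every file on the pair (agent, path). The sketch below is a toy model of that idea, not the `SessionVFS` implementation; `NamespacedVFS` and `VFSPermissionDenied` are hypothetical names, and access grants and isolation levels are omitted.

```python
from typing import Dict, Tuple


class VFSPermissionDenied(Exception):
    """Raised when an agent reads a path that exists only in another namespace."""


class NamespacedVFS:
    def __init__(self) -> None:
        # Files are keyed by (agent_did, path), so agents never share entries.
        self._files: Dict[Tuple[str, str], str] = {}

    def write(self, path: str, agent_did: str, value: str) -> None:
        self._files[(agent_did, path)] = value

    def read(self, path: str, agent_did: str) -> str:
        key = (agent_did, path)
        if key in self._files:
            return self._files[key]
        if any(p == path for (_, p) in self._files):
            # The path exists, but only in someone else's namespace.
            raise VFSPermissionDenied(f"{agent_did} cannot read {path}")
        raise FileNotFoundError(path)
```

Because the agent identifier is part of the key, two agents writing to the same path get independent copies, which mirrors the behavior in the example above.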
Breach Detection Pipeline
Wire breach detection and rate limiting together for automated enforcement:
from hypervisor.models import ExecutionRing
from hypervisor.rings.breach_detector import RingBreachDetector, BreachSeverity
from hypervisor.security.kill_switch import KillSwitch, KillReason
from hypervisor.security.rate_limiter import AgentRateLimiter
detector = RingBreachDetector()
kill_switch = KillSwitch()
# Constructed as in the Rate Limiting section; per-ring limits are configured automatically.
limiter = AgentRateLimiter()
async def on_agent_action(agent_did: str, session_id: str, action_id: str, ring: ExecutionRing):
    """Enforcement pipeline for every agent action."""
    # Layer 1: Rate limit check (try_check() returns False when the bucket is exhausted)
    allowed = limiter.try_check(agent_did=agent_did, session_id=session_id, ring=ring)
    if not allowed:
        kill_switch.kill(
            agent_did=agent_did,
            session_id=session_id,
            reason=KillReason.RATE_LIMIT,
        )
        return
    # Layer 2: Ring enforcement
    # Breach severity: 1-ring gap → WARNING, 2-ring gap → HIGH, 3-ring gap → CRITICAL
    # CRITICAL severity triggers immediate kill
    # Layer 3: Capability guard check (handled by governance middleware)
    # Layer 4: Saga step execution (handled by orchestrator)
Ring demotion is automatic: when an agent’s effective score drops below the threshold for its current ring, should_demote() returns True and the agent is moved down to the ring matching its current score.
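The severity mapping and the demotion rule can both be expressed as small pure functions. The sketch below is a standalone illustration with rings as plain integers; `breach_severity`, `ring_for_score`, and this version of `should_demote` are hypothetical and only mirror the rules stated in this document.

```python
from typing import Optional


def breach_severity(agent_ring: int, required_ring: int) -> Optional[str]:
    """Map the gap between the agent's ring and the required ring to a severity."""
    gap = agent_ring - required_ring  # positive when the agent lacks privilege
    if gap <= 0:
        return None  # the action is within the agent's ring
    return {1: "WARNING", 2: "HIGH", 3: "CRITICAL"}[gap]


def ring_for_score(eff_score: float, has_consensus: bool = False) -> int:
    """Ring implied by the documented score thresholds."""
    if eff_score > 0.95 and has_consensus:
        return 1
    if eff_score > 0.60:
        return 2
    return 3


def should_demote(current_ring: int, eff_score: float, has_consensus: bool = False) -> bool:
    """True when the score now maps to a less privileged ring than the current one."""
    return ring_for_score(eff_score, has_consensus) > current_ring
```

For example, a Ring 3 agent attempting a Ring 0 action is a three-ring gap, which maps to CRITICAL, and a Ring 2 agent whose score falls to 0.55 would be due for demotion to Ring 3.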