PolicyEvaluator is the core governance engine of the Agent Governance Toolkit. It loads declarative YAML policies, evaluates every agent action against them, and returns a structured PolicyDecision before the action is ever executed. Every tool call, delegation request, or message send can be intercepted and checked deterministically — actions the engine denies become structurally impossible.
The evaluator sits at the centre of the governance pipeline.
Installation
Import
Constructor
An optional list of already-loaded PolicyDocument objects to seed the evaluator. When None, the policy list starts empty and policies can be added with load_policies().
Optional root directory for folder-scoped policy discovery. When set and the evaluation context contains a path key, the evaluator walks governance.yaml files from the action path up to this root and merges them hierarchically. Leave None for flat policy evaluation (the common case).
Methods
load_policies()
Loads every .yaml and .yml file in directory as a PolicyDocument and appends each one to the evaluator’s internal policy list. Can be called multiple times to load from multiple directories; all rules across all documents are merged and evaluated together, sorted by priority (descending).
Path to a directory containing YAML policy files. Files are loaded in sorted order.
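The file selection described above can be approximated with the standard library. This is a minimal sketch, not the toolkit's implementation: `discover_policy_files` is a hypothetical helper, and it assumes the scan is non-recursive and that "sorted order" means sorting by file name.

```python
from pathlib import Path


def discover_policy_files(directory):
    """Return the .yaml/.yml files directly inside `directory`, sorted by name.

    Assumption: the scan does not descend into subdirectories.
    """
    root = Path(directory)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )
```

Because the order is deterministic, loading the same directory twice yields the documents in the same sequence, which matters when two rules share a priority.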
evaluate()
Evaluates the loaded rules against context. Rules are sorted by priority (highest first); the first matching rule determines the decision. If no YAML rule matches, registered external backends are consulted in registration order. If nothing matches, the default action from the first loaded policy is applied. If no policies are loaded at all, the result is a global deny (fail-closed).
A flat dictionary of action-level properties. Common keys include tool_name, token_count, confidence, and message. Keys must match the field values in your policy condition blocks.
Optional runtime context for v1 dynamic conditions (time-window, day-of-week, token budget, cost budget). Existing callers that omit this argument are unaffected. Structure:
Returns a PolicyDecision (see PolicyDecision Fields below).
Error behaviour: All exceptions inside the evaluator are caught and converted to a fail-closed PolicyDecision(allowed=False, action="deny"). The engine never raises.
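The evaluation order and the fail-closed guarantee can be modelled in a few lines. The sketch below is a standalone illustration rather than the toolkit's code: `Rule`, `Decision`, and their field names (including `matched_rule`) are hypothetical stand-ins for PolicyRule and PolicyDecision, conditions are reduced to a plain predicate, and external backends are left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Rule:  # hypothetical stand-in for PolicyRule
    name: str
    field: str
    predicate: Callable[[Any], bool]
    action: str
    priority: int = 0
    message: str = ""


@dataclass
class Decision:  # hypothetical stand-in for PolicyDecision
    allowed: bool
    action: str
    matched_rule: Optional[str]
    reason: str


PERMITTING = {"allow", "audit"}


def evaluate(rules, context, default_action="deny"):
    """First match wins, highest priority first; any exception becomes a deny."""
    try:
        # sorted() is stable, so rules with equal priority keep document order.
        for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
            if rule.field in context and rule.predicate(context[rule.field]):
                return Decision(rule.action in PERMITTING, rule.action,
                                rule.name, rule.message)
        return Decision(default_action in PERMITTING, default_action, None,
                        "No rules matched; default action applied")
    except Exception as exc:
        # Fail closed: the caller never sees an exception.
        return Decision(False, "deny", None, f"evaluation error: {exc}")
```

Wrapping the whole loop in one `try` keeps the caller's contract simple: it always receives a decision object and only has to check `allowed`.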
add_backend()
Registers an external policy backend. A backend must implement evaluate(context) -> BackendDecision and expose a name property.
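A backend only needs the two members named above. The following sketch shows a toy backend and a consultation loop that stops at the first error. It is an illustration under stated assumptions: the `BackendDecision` fields (`allowed`, `reason`, `error`) are hypothetical, `StaticBackend` and `consult` are invented for the example, and the rule that every backend must allow is a guess, since this page does not say how non-error answers from several backends are combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendDecision:
    # Hypothetical fields; the real BackendDecision may differ.
    allowed: bool
    reason: str = ""
    error: Optional[str] = None


class StaticBackend:
    """Toy backend that always returns the same decision."""

    def __init__(self, name, decision):
        self._name = name
        self._decision = decision
        self.calls = 0  # lets the caller see whether it was consulted

    @property
    def name(self):
        return self._name

    def evaluate(self, context):
        self.calls += 1
        return self._decision


def consult(backends, context):
    """Return (allowed, reason), or None when no backend is registered.

    Assumption: backends run in registration order and all must allow.
    """
    if not backends:
        return None
    for backend in backends:
        decision = backend.evaluate(context)
        if decision.error is not None:
            # Fail closed and stop: later backends are never consulted.
            return False, f"{backend.name} error: {decision.error}"
        if not decision.allowed:
            return False, f"{backend.name}: {decision.reason}"
    return True, "all backends allowed"
```

Returning `None` for an empty backend list lets the caller fall through to the policy's default action instead of treating "no backends" as an allow.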
An ExternalPolicyBackend implementation such as OPABackend or CedarBackend from agent_os.policies.backends.
When a backend’s evaluate() returns an error, the evaluator denies access immediately (fail-closed) without consulting the next backend.
load_rego()
Creates and registers an OPABackend in a single call.
Path to a .rego file.
Inline Rego policy string (use instead of rego_path).
Rego package name for query construction.
Evaluation mode: "local", "remote", or "builtin".
Returns the OPABackend instance.
load_cedar()
Creates and registers a CedarBackend in a single call.
Path to a .cedar policy file.
Inline Cedar policy string.
Cedar entities for authorization context.
Evaluation mode: "auto", "cedarpy", "cli", or "builtin".
Returns the CedarBackend instance.
PolicyDecision Fields
Every call to evaluate() returns a PolicyDecision (a Pydantic BaseModel).
True if the action is permitted. False if it was denied or blocked.
Name of the policy rule that fired. None when the default action was applied or an external backend responded.
The action taken: "allow", "deny", "audit", or "block".
Human-readable explanation of the decision. Comes from the matched rule’s message field, the backend’s reason, or "No rules matched; default action applied".
Structured audit data automatically attached to every decision.
Structured adaptation hints populated by dynamic-context rules. Empty for standard static rules.
PolicyDocument
PolicyDocument is a Pydantic BaseModel that represents the top-level structure of a YAML policy file.
Schema version string.
Human-readable policy identifier.
Free-text description of what the policy enforces.
Ordered list of PolicyRule objects. Evaluated by priority (descending).
Default action and budget limits applied when no rule matches.
When using folder-scoped evaluation, setting this field to False stops loading parent governance.yaml files.
Glob pattern. The policy only applies when the action path in context matches this pattern.
Host patterns the sandbox may reach (e.g. "pypi.org", "*.github.com"). Combined with defaults.network_default to form the sandbox egress policy. Consumed by sandbox providers; ignored by the rule engine.
Tool names the agent may invoke. Enforced host-side by PolicyEvaluator before any sandbox call.
Class methods
PolicyRule Fields
Unique rule identifier within the policy document.
The condition evaluated against the context dictionary.
The action to take when the condition matches.
Rules with higher values are evaluated first. Two rules with the same priority are evaluated in document order.
Human-readable explanation returned in PolicyDecision.reason.
Optional runtime condition (time window, day-of-week, token budget, cost budget). Evaluated alongside the static condition.
If True, replaces a parent rule with the same name during folder-level policy merging.
PolicyCondition Fields
The key to look up in the evaluation context dictionary. Examples: "tool_name", "token_count", "confidence".
The comparison operator. See PolicyOperator Values below.
The right-hand side of the comparison. Type must be compatible with the operator (e.g., a list for in, a number for gt).
PolicyDefaults Fields
Fallback action when no rule matches. Defaults to deny (fail-closed). Set to allow explicitly to opt into a permissive posture.
Maximum token count per request evaluated by the rule engine.
Maximum tool invocations per request evaluated by the rule engine.
Minimum confidence score [0.0–1.0] evaluated by the rule engine.
Sandbox CPU limit in vCPUs (e.g. 0.5, 1.0). None = provider default. Consumed by sandbox providers; ignored by the rule engine.
Sandbox memory limit in MiB. None = provider default. Consumed by sandbox providers.
Per-execute wall-clock cap in seconds. None = provider default.
Default sandbox egress action when a host is not on network_allowlist. "deny" is fail-closed and is the default. Set to "allow" only for trusted dev/research workloads.
PolicyAction Enum
| Value | allowed | Description |
|---|---|---|
| ALLOW / "allow" | True | Permit the request. |
| DENY / "deny" | False | Reject the request. |
| AUDIT / "audit" | True | Permit but write an audit entry. |
| BLOCK / "block" | False | Hard block; the reason is surfaced to the caller. |
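The table maps each action to the boolean a caller sees. A small sketch of that mapping follows; `Action` and `is_allowed` are hypothetical stand-ins written for illustration, not the toolkit's PolicyAction class.

```python
from enum import Enum


class Action(str, Enum):  # hypothetical mirror of the PolicyAction values
    ALLOW = "allow"
    DENY = "deny"
    AUDIT = "audit"
    BLOCK = "block"


# Per the table, only ALLOW and AUDIT let the request proceed.
PERMITTING = frozenset({Action.ALLOW, Action.AUDIT})


def is_allowed(action):
    """Accept an Action member or its YAML string; unknown strings raise ValueError."""
    return Action(action) in PERMITTING
```

Raising on an unrecognised string, rather than returning False silently, makes a typo in a policy file visible at load or evaluation time.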
PolicyOperator Values
| Enum member | YAML string | Behaviour |
|---|---|---|
| EQ | "eq" | Exact equality |
| NE | "ne" | Not equal |
| GT | "gt" | Greater than |
| LT | "lt" | Less than |
| GTE | "gte" | Greater than or equal |
| LTE | "lte" | Less than or equal |
| IN | "in" | Context value is in the target list |
| NOT_IN | "not_in" | Context value is not in the target list |
| CONTAINS | "contains" | Target is a substring of the context value |
| MATCHES | "matches" | Context value matches the regex in target |
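The operator semantics in the table translate directly into a lookup of two-argument functions. This is a sketch under assumptions, not the engine's code: `OPERATORS` and `check` are hypothetical names, a missing context key is assumed never to match, and "matches" is assumed to search anywhere in the string rather than anchor at the start.

```python
import operator
import re

# Each function receives (context_value, target) in that order.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "in": lambda value, target: value in target,
    "not_in": lambda value, target: value not in target,
    "contains": lambda value, target: target in value,
    # Assumption: unanchored search; the real engine may anchor the pattern.
    "matches": lambda value, target: re.search(target, value) is not None,
}


def check(context, field, op, target):
    """Evaluate one condition against a flat context dictionary."""
    if field not in context:
        return False  # assumption: an absent key never satisfies a condition
    return bool(OPERATORS[op](context[field], target))
```

For example, `check({"token_count": 5000}, "token_count", "gt", 4096)` compares the context value against the target, so the argument order matters for the asymmetric operators such as "in" and "contains".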
Exceptions
PolicyViolationError
Inherits from PolicyError → AgentOSError → Exception.
"POLICY_VIOLATION" by default.
Structured details including category, matched_rule, detail, scope, operation, tool_name, and all fields from the audit_entry.
ISO 8601 UTC timestamp when the error was raised.
The underlying PolicyCheckResult if the error was created from one, otherwise None.
PolicyDeniedError
error_code is "POLICY_DENIED".
Code Examples
Basic: Load policies and evaluate
Programmatic policy construction
Handling a deny decision
Serialise to YAML for version control
Policy YAML Reference
OPA Backend Integration
Register an OPA/Rego backend for policies that require external evaluation:
See Also
- govern() wrapper — two-line integration for any callable
- AuditLogger — write decisions to a tamper-evident log
- AgentIdentity — cryptographic agent identity