Agent OS Policy Engine Specification 1.0 — RFC 2119

This specification defines the behavioral contract for the Agent OS policy engine: the single enforcement point through which all governed agent actions flow. All SDK implementations — Python, TypeScript, Rust, .NET, and Go — MUST conform to this specification. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHOULD”, “RECOMMENDED”, and “MAY” are interpreted as described in RFC 2119 and RFC 8174.

Status: Draft
Date: 2026-05-16
Conformance Tests: 68 tests

Policy Document Schema

A PolicyDocument encodes declarative governance rules in YAML or JSON. Implementations MUST support loading from YAML and SHOULD also support JSON.

Required and Optional Fields

Field	Type	Required	Default	Description
`version`	string	No	`"1.0"`	Schema version identifier.
`name`	string	No	`"unnamed"`	Human-readable name for audit logs.
`description`	string	No	`""`	Free-form description.
`rules`	array	No	`[]`	Ordered list of `PolicyRule` objects.
`defaults`	object	No	See below	Default settings when no rule matches.
`inherit`	boolean	No	`true`	Whether parent policies are loaded during discovery.
`scope`	string or null	No	`null`	Glob pattern restricting which action paths this policy applies to.

Defaults Object

Field	Type	Default	Description
`action`	PolicyAction	`"allow"`	Default action when no rule matches.
`max_tokens`	integer	`4096`	Maximum tokens per request.
`max_tool_calls`	integer	`10`	Maximum tool invocations per request.
`confidence_threshold`	float	`0.8`	Minimum confidence score (0.0–1.0).
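As a rough illustration of how the defaults in the two tables above might be applied after a document has been parsed, the sketch below fills in any missing fields. The names `PolicyDocumentInput`, `NormalizedPolicyDocument`, `orDefault`, and `withDefaults` are hypothetical and are not taken from any SDK; YAML or JSON parsing is assumed to have happened elsewhere.

```typescript
// Hypothetical shapes mirroring the field tables above; not an SDK API.
type PolicyActionName = "allow" | "deny" | "audit" | "block";

interface PolicyDefaults {
  action: PolicyActionName;
  max_tokens: number;
  max_tool_calls: number;
  confidence_threshold: number;
}

interface PolicyDocumentInput {
  version?: string;
  name?: string;
  description?: string;
  rules?: unknown[];
  defaults?: Partial<PolicyDefaults>;
  inherit?: boolean;
  scope?: string | null;
}

interface NormalizedPolicyDocument {
  version: string;
  name: string;
  description: string;
  rules: unknown[];
  defaults: PolicyDefaults;
  inherit: boolean;
  scope: string | null;
}

// Return the supplied value, or the documented default when the field is absent.
function orDefault<T>(value: T | undefined, fallback: T): T {
  return value !== undefined ? value : fallback;
}

function withDefaults(input: PolicyDocumentInput): NormalizedPolicyDocument {
  const d: Partial<PolicyDefaults> = input.defaults || {};
  return {
    version: orDefault<string>(input.version, "1.0"),
    name: orDefault<string>(input.name, "unnamed"),
    description: orDefault<string>(input.description, ""),
    rules: orDefault<unknown[]>(input.rules, []),
    defaults: {
      action: orDefault<PolicyActionName>(d.action, "allow"),
      max_tokens: orDefault<number>(d.max_tokens, 4096),
      max_tool_calls: orDefault<number>(d.max_tool_calls, 10),
      confidence_threshold: orDefault<number>(d.confidence_threshold, 0.8),
    },
    inherit: orDefault<boolean>(input.inherit, true),
    scope: orDefault<string | null>(input.scope, null),
  };
}
```

Calling `withDefaults({ name: "demo", defaults: { max_tokens: 1024 } })` keeps the supplied `max_tokens` and fills every other field from the tables.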

Policy Rule Schema

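Each PolicyRule has the following fields. The name, condition, and action fields are REQUIRED; the remaining fields are optional: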

Field	Type	Required	Default	Description
`name`	string	Yes	—	Unique rule identifier within the document.
`condition`	PolicyCondition	Yes	—	The matching condition.
`action`	PolicyAction	Yes	—	Action taken when condition matches.
`priority`	integer	No	`0`	Higher values are evaluated first.
`message`	string	No	`""`	Human-readable explanation in decisions and audit entries.
`override`	boolean	No	`false`	If true, replaces a parent rule with the same name during folder-level merge.

A PolicyCondition MUST contain exactly three fields:

Field	Type	Description
`field`	string	Dot-path into the execution context (`"tool_name"`, `"token_count"`).
`operator`	PolicyOperator	Comparison operator (see table below).
`value`	any	Target value for comparison.

Condition Operators

Conforming implementations MUST support all nine operators:

Operator	Semantics	Example
`eq`	Context value equals target value.	`tool_name eq "execute_code"`
`ne`	Context value does not equal target value.	`agent_id ne "admin"`
`gt`	Context value is greater than target.	`token_count gt 4096`
`lt`	Context value is less than target.	`priority lt 5`
`gte`	Context value is greater than or equal to target.	`confidence gte 0.8`
`lte`	Context value is less than or equal to target.	`retries lte 3`
`in`	Context value is a member of target collection.	`tool_name in ["read", "write"]`
`contains`	Target value is contained within context value.	`arguments contains "password"`
`matches`	Context value matches target regex pattern.	`tool_name matches "^exec_.*"`

Missing field behavior: If the condition references a context field that does not exist, the condition MUST evaluate to false. A missing field MUST NOT cause an error or exception.

For the matches operator, both values MUST be coerced to strings before regex evaluation. For all other operators, no implicit type coercion is performed.
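The operator table, the missing-field rule, and the coercion rule can be read together as one small evaluator. The following is a minimal sketch under stated assumptions, not the reference implementation: `lookupField`, `compareNumbers`, and `evaluateCondition` are hypothetical names, `eq` and `ne` use strict equality on primitive values, and the ordered comparisons are only defined here for numbers.

```typescript
type PolicyOperator =
  | "eq" | "ne" | "gt" | "lt" | "gte" | "lte" | "in" | "contains" | "matches";

interface PolicyCondition {
  field: string;
  operator: PolicyOperator;
  value: unknown;
}

// Resolve a dot-path such as "request.tool_name"; undefined when any segment is missing.
function lookupField(context: Record<string, unknown>, path: string): unknown {
  let current: unknown = context;
  const parts = path.split(".");
  for (let i = 0; i < parts.length; i++) {
    if (current === null || typeof current !== "object") return undefined;
    const obj = current as Record<string, unknown>;
    if (!Object.prototype.hasOwnProperty.call(obj, parts[i])) return undefined;
    current = obj[parts[i]];
  }
  return current;
}

// Assumption: ordered comparisons apply to numbers only (no implicit coercion).
function compareNumbers(a: unknown, b: unknown, test: (x: number, y: number) => boolean): boolean {
  return typeof a === "number" && typeof b === "number" && test(a, b);
}

function evaluateCondition(cond: PolicyCondition, context: Record<string, unknown>): boolean {
  const actual = lookupField(context, cond.field);
  if (actual === undefined) return false; // missing field: false, never an error
  const target = cond.value;
  switch (cond.operator) {
    case "eq": return actual === target;
    case "ne": return actual !== target;
    case "gt": return compareNumbers(actual, target, function (x, y) { return x > y; });
    case "lt": return compareNumbers(actual, target, function (x, y) { return x < y; });
    case "gte": return compareNumbers(actual, target, function (x, y) { return x >= y; });
    case "lte": return compareNumbers(actual, target, function (x, y) { return x <= y; });
    case "in": return Array.isArray(target) && target.indexOf(actual) !== -1;
    case "contains":
      if (typeof actual === "string") {
        return typeof target === "string" && actual.indexOf(target) !== -1;
      }
      return Array.isArray(actual) && actual.indexOf(target) !== -1;
    case "matches":
      // Both sides are coerced to strings; a malformed pattern throws here.
      return new RegExp(String(target)).test(String(actual));
    default:
      return false;
  }
}
```

Note that `ne` against a missing field also yields false in this sketch, because the missing-field rule is applied before the operator is considered.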

Policy Actions

Action	Allowed	Semantics
`allow`	Yes	The action is permitted.
`deny`	No	The action is blocked. The agent MUST NOT proceed.
`audit`	Yes	The action is permitted but MUST be logged for review.
`block`	No	Alias for `deny`. The action is blocked.

An action is considered “allowing” if it is allow or audit. An action is considered “denying” if it is deny or block.

Evaluation Semantics

Evaluation Order

When evaluating a set of rules against an execution context:

1. Rules MUST be sorted by priority in descending order (highest priority first).
2. Rules MUST be evaluated in sorted order.
3. The first rule whose condition matches determines the decision. Subsequent rules are NOT evaluated.
4. If no rule matches and external backends are registered, backends MUST be consulted in registration order. The first backend returning a non-error result determines the decision.
5. If no rule matches and no backend produces a result, the default action from the policy’s defaults object is applied.
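The evaluation order above can be sketched as a single function. This is an illustrative simplification rather than the SDK's evaluator: each rule carries a hypothetical `matches` predicate in place of real condition evaluation, a backend is modelled as a synchronous function that returns `null` when it has no result, the `"review"` backend outcome is left out for brevity, and backend error handling is not modelled here.

```typescript
type PolicyActionName = "allow" | "deny" | "audit" | "block";
type ExecContext = Record<string, unknown>;

interface SketchRule {
  name: string;
  priority: number;
  action: PolicyActionName;
  message: string;
  matches: (ctx: ExecContext) => boolean; // stand-in for condition evaluation
}

interface SketchDecision {
  allowed: boolean;
  action: PolicyActionName;
  matched_rule: string | null;
  reason: string;
}

// Assumption: null means "this backend produced no result".
type SketchBackend = (ctx: ExecContext) => "allow" | "deny" | null;

function isAllowing(action: PolicyActionName): boolean {
  return action === "allow" || action === "audit";
}

function evaluateRules(
  rules: SketchRule[],
  ctx: ExecContext,
  backends: SketchBackend[],
  defaultAction: PolicyActionName
): SketchDecision {
  // Sort a copy by priority, highest first, so the caller's array is untouched.
  const sorted = rules.slice().sort(function (a, b) { return b.priority - a.priority; });

  // First matching rule wins; later rules are never evaluated.
  for (let i = 0; i < sorted.length; i++) {
    const rule = sorted[i];
    if (rule.matches(ctx)) {
      return { allowed: isAllowing(rule.action), action: rule.action, matched_rule: rule.name, reason: rule.message };
    }
  }

  // No rule matched: consult backends in registration order.
  for (let j = 0; j < backends.length; j++) {
    const result = backends[j](ctx);
    if (result !== null) {
      return { allowed: isAllowing(result), action: result, matched_rule: null, reason: "Decided by external backend" };
    }
  }

  // Nothing matched anywhere: fall back to the default action.
  return { allowed: isAllowing(defaultAction), action: defaultAction, matched_rule: null, reason: "No rule matched; default action applied" };
}
```

Sorting a copy keeps the loaded rule list in document order, which makes repeated evaluations independent of one another.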

Scoped vs. Flat Evaluation

When a root_dir is configured and the execution context contains a path field, the evaluator MUST use folder-scoped evaluation (governance files discovered from the action path up to the root, loaded, filtered by scope, and merged before evaluation). When no root_dir is configured or the context lacks a path field, the evaluator MUST use flat evaluation against the loaded policy list.

Default Action Determination

Mode	Default Action Source
Flat evaluation	First loaded PolicyDocument’s `defaults.action`; `allow` if none loaded
Scoped evaluation	Most specific (last) PolicyDocument in the merged chain

Fail-Closed Behavior

Security boundary: Fail-closed behavior is a deliberate security design. Systems that default to allow on error create exploitable failure modes. Never change the default to fail-open.

If any unhandled exception occurs during policy evaluation, the implementation MUST:

Return a deny decision (allowed: false, action: "deny").
Include reason: "Policy evaluation error — access denied (fail closed)".
Include error: true in the audit entry.
Log the exception at ERROR level.

The deny decision MUST be produced even if the exception occurs in an external backend, a condition evaluator, or the conflict resolver.
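One way to satisfy these requirements is to wrap the whole evaluation in a single guard. The sketch below assumes a hypothetical `evaluate` callback that performs the real work and a hypothetical `logError` callback standing in for an ERROR-level logger; it also carries the `error` flag on the decision so that the audit entry can copy it.

```typescript
interface FailClosedDecision {
  allowed: boolean;
  action: string;
  reason: string;
  error: boolean;
}

function evaluateFailClosed(
  evaluate: () => FailClosedDecision,
  logError: (message: string) => void
): FailClosedDecision {
  try {
    return evaluate();
  } catch (err) {
    // Any exception (backend, condition evaluator, conflict resolver) ends in deny.
    logError("Policy evaluation failed: " + String(err));
    return {
      allowed: false,
      action: "deny",
      reason: "Policy evaluation error — access denied (fail closed)",
      error: true,
    };
  }
}
```

Keeping the guard at the outermost call site means a new code path cannot accidentally bypass it.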

Conflict Resolution

When multiple policy candidates conflict, the PolicyConflictResolver determines which rule wins using one of four strategies:

Strategy	Enum Value	Behaviour
Deny Overrides	`deny_overrides`	Any `deny` rule wins over any `allow`, regardless of priority.
Allow Overrides	`allow_overrides`	Any `allow` rule wins over any `deny`, regardless of priority.
Priority First Match	`priority_first_match`	Highest `priority` value wins regardless of action. Default.
Most Specific Wins	`most_specific_wins`	Agent scope > Tenant scope > Global scope; ties broken by priority.

Scope specificity order: Agent > Tenant > Global. Conforming implementations MUST support all four strategies. The strategy MUST be configurable at engine construction time.
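The four strategies can be compared side by side in a small resolver. The sketch below is hypothetical (`Candidate`, `resolveConflict`, and the helper names are not SDK identifiers) and makes two assumptions the table does not state: `block` counts as denying and `audit` as allowing, and when several candidates qualify under a strategy, the one with the highest priority is chosen.

```typescript
type ResolutionStrategy =
  | "deny_overrides" | "allow_overrides" | "priority_first_match" | "most_specific_wins";
type CandidateScope = "global" | "tenant" | "agent";

interface Candidate {
  rule: string;
  action: "allow" | "deny" | "audit" | "block";
  priority: number;
  scope: CandidateScope;
}

function isDenying(c: Candidate): boolean {
  return c.action === "deny" || c.action === "block";
}

function scopeRank(scope: CandidateScope): number {
  if (scope === "agent") return 2;
  if (scope === "tenant") return 1;
  return 0;
}

// Highest priority wins; on a tie the earlier candidate is kept.
function highestPriority(cands: Candidate[]): Candidate {
  let best = cands[0];
  for (let i = 1; i < cands.length; i++) {
    if (cands[i].priority > best.priority) best = cands[i];
  }
  return best;
}

function resolveConflict(candidates: Candidate[], strategy: ResolutionStrategy): Candidate {
  if (candidates.length === 0) throw new Error("no candidates to resolve");
  switch (strategy) {
    case "deny_overrides": {
      const denies = candidates.filter(isDenying);
      return highestPriority(denies.length > 0 ? denies : candidates);
    }
    case "allow_overrides": {
      const allows = candidates.filter(function (c) { return !isDenying(c); });
      return highestPriority(allows.length > 0 ? allows : candidates);
    }
    case "most_specific_wins": {
      let top = 0;
      for (let i = 0; i < candidates.length; i++) {
        top = Math.max(top, scopeRank(candidates[i].scope));
      }
      // Ties within the most specific scope are broken by priority.
      return highestPriority(candidates.filter(function (c) { return scopeRank(c.scope) === top; }));
    }
    case "priority_first_match":
    default:
      return highestPriority(candidates);
  }
}
```

With an agent-scope allow at priority 50 and a global deny at priority 10, only `deny_overrides` selects the deny; the other three strategies select the allow.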

Deny Immutability Invariant

During folder-level policy merge, a child policy MUST NOT override a parent deny rule with an allow rule, even if override: true is set. A child that attempts to override a parent deny MUST have its override dropped silently.

# Root governance.yaml (parent)
rules:
  - name: no-delete
    condition: { field: tool_name, operator: eq, value: delete_resource }
    action: deny
    priority: 200

# Subfolder governance.yaml (child): the override MUST be dropped
rules:
  - name: no-delete
    condition: { field: tool_name, operator: eq, value: delete_resource }
    action: allow
    override: true   # IGNORED: parent deny is immutable
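The merge behaviour in this example could be implemented roughly as follows. This is a sketch under assumptions rather than the reference merge: `MergeRule` and `mergeRules` are hypothetical names, a child rule that shares a parent's name but lacks `override: true` is assumed to leave the parent rule in place, and only a non-denying child is prevented from replacing a denying parent (a stricter implementation might refuse every override of a parent deny).

```typescript
type MergeAction = "allow" | "deny" | "audit" | "block";

interface MergeRule {
  name: string;
  action: MergeAction;
  priority: number;
  override?: boolean;
}

function isDenyingAction(action: MergeAction): boolean {
  return action === "deny" || action === "block";
}

function mergeRules(parent: MergeRule[], child: MergeRule[]): MergeRule[] {
  const merged = parent.slice(); // never mutate the parent's rule list
  for (let i = 0; i < child.length; i++) {
    const candidate = child[i];
    let index = -1;
    for (let j = 0; j < merged.length; j++) {
      if (merged[j].name === candidate.name) { index = j; break; }
    }
    if (index === -1) {
      merged.push(candidate); // new rule name: simply added
      continue;
    }
    if (candidate.override !== true) continue; // assumption: parent rule stays
    const parentIsDeny = isDenyingAction(merged[index].action);
    if (parentIsDeny && !isDenyingAction(candidate.action)) {
      continue; // deny immutability: the override is dropped silently
    }
    merged[index] = candidate;
  }
  return merged;
}
```

Applied to the two files above, the merged list still contains `no-delete` with `action: deny` and `priority: 200`.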

External Policy Backends

Implementations MUST support pluggable external policy backends for OPA/Rego and Cedar evaluation.

Backend Protocol

An external backend MUST implement:

evaluate(action: string, context: dict) → BackendDecision

where BackendDecision is one of "allow", "deny", or "review". Backends MUST be consulted only after local rules fail to match. If a backend returns an error, the engine MUST treat it as a deny (fail closed). Multiple backends are consulted in registration order; the first non-error result wins.

// TypeScript example
import { OPABackend, PolicyEngine } from '@microsoft/agent-governance-sdk';

const engine = new PolicyEngine([{ action: 'data.read', effect: 'allow' }]);
engine.registerBackend(
  new OPABackend({
    endpoint:   'https://opa.internal.example',
    policyPath: 'agentmesh/allow',
  }),
);

const result = await engine.evaluateWithBackends('data.read', { actor: 'alice' });
console.log(result.effectiveDecision);  // 'allow' | 'deny' | 'review'
console.log(result.backendResults);     // per-backend outcomes

Audit and Observability

Every policy evaluation MUST produce a structured audit entry containing:

Field	Type	Description
`timestamp`	datetime	UTC time of evaluation.
`agent_id`	string	Agent DID or identifier.
`action`	string	Action that was evaluated.
`decision`	string	Final decision: `allow`, `deny`, `audit`, `block`.
`matched_rule`	string or null	Name of the rule that matched, if any.
`policy_name`	string or null	Name of the policy document that matched.
`reason`	string	Human-readable explanation.
`evaluation_ms`	float	Duration of the evaluation in milliseconds.
`backend`	string or null	Name of the external backend consulted, if any.
`error`	boolean	`true` if the evaluation ended in a fail-closed error.

Audit entries MUST be emitted for every evaluation, including fail-closed error decisions.
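For illustration, an audit entry with the fields in the table might be assembled as shown below. The `AuditInput` shape and the `buildAuditEntry` function are hypothetical; the sketch assumes the caller records millisecond timestamps (for example from `Date.now()`) before and after evaluation, and that `datetime` is serialized as an ISO 8601 UTC string.

```typescript
type AuditDecision = "allow" | "deny" | "audit" | "block";

interface AuditEntry {
  timestamp: string; // ISO 8601, UTC
  agent_id: string;
  action: string;
  decision: AuditDecision;
  matched_rule: string | null;
  policy_name: string | null;
  reason: string;
  evaluation_ms: number;
  backend: string | null;
  error: boolean;
}

interface AuditInput {
  agent_id: string;
  action: string;
  decision: AuditDecision;
  reason: string;
  matched_rule?: string | null;
  policy_name?: string | null;
  backend?: string | null;
  error?: boolean;
}

function buildAuditEntry(input: AuditInput, startedAtMs: number, finishedAtMs: number): AuditEntry {
  return {
    timestamp: new Date(finishedAtMs).toISOString(), // toISOString always reports UTC
    agent_id: input.agent_id,
    action: input.action,
    decision: input.decision,
    matched_rule: input.matched_rule !== undefined ? input.matched_rule : null,
    policy_name: input.policy_name !== undefined ? input.policy_name : null,
    reason: input.reason,
    evaluation_ms: finishedAtMs - startedAtMs,
    backend: input.backend !== undefined ? input.backend : null,
    error: input.error === true,
  };
}
```

Because every field has a value even when nothing matched, downstream consumers can rely on a fixed schema for both normal and fail-closed decisions.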

Policy Composability

Folder-Level Hierarchy

When root_dir is configured, the engine discovers governance.yaml files from the action path up to the root directory. Policies are merged with the following rules:

Root policies apply to all paths (lowest specificity).
Subfolder policies apply only to their subtree (higher specificity).
Child policies MAY override parent rules with override: true, except parent deny rules (immutability invariant).
The inherit: false field stops the merge at that level.
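The discovery and inheritance rules above can be sketched with plain string handling. Both functions are hypothetical helpers, not SDK APIs, and assume normalized, absolute, POSIX-style paths in which the final segment of the action path names a file or resource rather than a directory. Path validation is a separate concern and is not performed here.

```typescript
// List candidate governance files from the root down to the action's directory.
function candidateGovernanceFiles(rootDir: string, actionPath: string): string[] {
  const root = rootDir.replace(/\/+$/, "");
  if (actionPath.indexOf(root + "/") !== 0) return []; // outside the root: nothing to load
  const relative = actionPath.slice(root.length + 1);
  const segments = relative.split("/").filter(function (s) { return s.length > 0; });
  segments.pop(); // assumption: the last segment is the file or resource itself
  const files = [root + "/governance.yaml"];
  let dir = root;
  for (let i = 0; i < segments.length; i++) {
    dir = dir + "/" + segments[i];
    files.push(dir + "/governance.yaml");
  }
  return files;
}

interface LoadedPolicy {
  file: string;
  inherit: boolean;
}

// Given loaded policies ordered root first, keep only the part of the chain
// that survives `inherit: false` (parents above that level are not merged).
function effectiveChain(rootFirst: LoadedPolicy[]): LoadedPolicy[] {
  let start = 0;
  for (let i = rootFirst.length - 1; i >= 0; i--) {
    if (!rootFirst[i].inherit) { start = i; break; }
  }
  return rootFirst.slice(start);
}
```

For a root of `/repo` and an action path of `/repo/services/billing/charge.py`, the candidates are the `governance.yaml` files in `/repo`, `/repo/services`, and `/repo/services/billing`, in that order.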

Path Traversal Protection

Policy discovery MUST validate that action paths are within the configured root to prevent loading policies from arbitrary filesystem locations. Any path containing .. components MUST be rejected.
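A minimal check for these two requirements might look like the following. `isPathWithinRoot` is a hypothetical helper that assumes absolute, POSIX-style paths; it does not resolve symbolic links, which a production implementation would likely also need to consider.

```typescript
function isPathWithinRoot(rootDir: string, actionPath: string): boolean {
  // Reject any ".." component outright, whichever separator is used.
  const segments = actionPath.split(/[\\\/]/);
  if (segments.indexOf("..") !== -1) return false;

  // Compare against "root/" so that "/repository" is not mistaken for "/repo".
  const root = rootDir.replace(/\/+$/, "");
  return actionPath === root || actionPath.indexOf(root + "/") === 0;
}
```

Checking components rather than searching for the substring `..` keeps legitimate names such as `notes..txt` usable while still rejecting traversal attempts.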

Conformance Requirements

A conforming implementation MUST:

Support the full PolicyDocument schema (fields, defaults, serialization).
Support all nine condition operators.
Support all four policy actions.
Implement priority-ordered, first-match evaluation.
Implement folder-level policy discovery and merge.
Enforce the deny immutability invariant.
Implement path traversal protection.
Support all four conflict resolution strategies.
Support the ExternalPolicyBackend protocol.
Enforce fail-closed semantics on all evaluation errors.
Produce structured audit entries for every decision.
Support YAML serialization for PolicyDocument.

Framework adapters using GovernancePolicy MUST additionally:

Validate policies at construction time.
Support exact, glob, and regex pattern types.
Enforce the tool call interception order.
Enforce concurrency limits.

The reference conformance test suite contains 68 tests covering all MUST requirements. Cross-language SDK compatibility is verified by running the same YAML policy files against each SDK’s evaluator.

Worked Examples

Basic Tool Blocking

version: "1.0"
name: "no-code-execution"
rules:
  - name: block-execute
    condition:
      field: tool_name
      operator: eq
      value: execute_code
    action: deny
    priority: 100
    message: "Code execution is not permitted in this environment"
defaults:
  action: allow

Context: {"tool_name": "execute_code", "agent_id": "assistant-1"}
Expected: allowed: false, matched_rule: "block-execute", reason: "Code execution is not permitted in this environment"

Conflict Resolution — DENY_OVERRIDES

Candidates:

Rule “allow-read” from agent-scope policy: action: allow, priority: 50
Rule “block-all” from global policy: action: deny, priority: 10

Strategy: DENY_OVERRIDES
Expected: Winner is “block-all” (deny overrides regardless of priority), conflict_detected: true

Fail-Closed on Exception

A malformed regex pattern at evaluation time MUST produce:

allowed: false, action: "deny"
reason: "Policy evaluation error — access denied (fail closed)"
error: true in the audit entry
