Zero-Trust Identity and Trust Scoring for AI Agents

In a multi-agent system, any agent can claim to be anything. Without cryptographic identity and continuous trust evaluation, you have no way to answer three critical questions: Who is this agent? Should I trust it? What can it do? For example, a deployment might have five agents sharing one API key, which makes incident attribution impossible. The Agent Governance Toolkit (AGT) solves this with three layers: a cryptographic identity layer backed by Ed25519 key pairs and Decentralized Identifiers (DIDs), a trust scoring layer that continuously rates behavioral trustworthiness on a 0–1000 scale, and a credential lifecycle layer with short-lived, auto-rotating tokens scoped to specific capabilities and resources.

Agent Identity

AgentIdentity

Every agent is represented by an AgentIdentity record that binds a DID, an Ed25519 key pair, a human sponsor, and a set of capabilities. The private key never appears in serialized output.

Field	Type	Description
`did`	AgentDID	The agent’s decentralized identifier (`did:mesh:<unique-id>`)
`name`	string	Human-readable name (must not be empty or whitespace-only)
`public_key`	string	Base64-encoded Ed25519 public key
`verification_key_id`	string	`key-<first-16-hex-chars-of-SHA256(public_key)>`
`sponsor_email`	string	Email of the human sponsor (must contain `@`)
`status`	enum	One of: `active`, `suspended`, `revoked`
`capabilities`	list[string]	Granted capabilities
`delegation_depth`	int	Position in the delegation chain (0 = root)
`parent_did`	string or null	Parent agent DID if this is a delegated identity
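
As a rough illustration of how the `verification_key_id` format in the table could be derived, the sketch below hashes a public key with SHA-256 and keeps the first 16 hex characters. The helper name `derive_key_id` is hypothetical, and the choice to hash the decoded key bytes (rather than the Base64 text) is an assumption, since the table does not say which form is hashed.

```python
import base64
import hashlib


def derive_key_id(public_key_b64: str) -> str:
    """Return key-<first 16 hex chars of SHA-256(public key)>.

    Assumption: the hash input is the decoded key bytes, not the Base64 text.
    """
    raw = base64.b64decode(public_key_b64)
    return "key-" + hashlib.sha256(raw).hexdigest()[:16]


# Placeholder 32-byte value standing in for a real Ed25519 public key
example_key = base64.b64encode(bytes(range(32))).decode("ascii")
key_id = derive_key_id(example_key)
print(key_id)
```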

from agentmesh import AgentIdentity

# Create an agent with a human sponsor
agent = AgentIdentity.create(
    name="DataProcessor",
    sponsor="alice@company.com",
    capabilities=["read:data", "write:reports"],
    organization="Analytics",
)

print(agent.did)          # did:mesh:a1b2c3d4e5f6...
print(agent.public_key)   # Base64-encoded Ed25519 public key
print(agent.status)       # "active"

# Sign data with the private key
signature = agent.sign(b"payload to authenticate")

# Verify the signature using only the public key
is_valid = agent.verify_signature(b"payload to authenticate", signature)
print(is_valid)  # True

# Export as JWK (JSON Web Key) for interoperability
jwk = agent.to_jwk(include_private=False)

# Export as a W3C DID Document
did_doc = agent.to_did_document()

DID Format

An Agent DID follows the format did:mesh:<unique-id>, where the unique ID is at least 128 bits of cryptographically secure randomness (hex-encoded, 32 characters).

from agentmesh import AgentDID

did = AgentDID.generate()
print(did)  # did:mesh:7f3a9b2c1d4e5f6a...

# Parse an existing DID string
did = AgentDID.from_string("did:mesh:7f3a9b2c...")
print(did.method)     # "mesh"
print(did.unique_id)  # "7f3a9b2c..."
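
To make the format concrete without depending on the library, here is a minimal standard-library sketch of generating and parsing a `did:mesh` identifier. The functions `generate_did` and `parse_did` are hypothetical names for illustration and are not part of the AGT API.

```python
import secrets

DID_PREFIX = "did:mesh:"


def generate_did() -> str:
    """Build a did:mesh identifier from 128 bits of secure randomness."""
    return DID_PREFIX + secrets.token_hex(16)  # 16 bytes -> 32 hex characters


def parse_did(did: str) -> tuple[str, str]:
    """Split a DID into (method, unique_id), rejecting malformed input."""
    scheme, method, unique_id = did.split(":", 2)  # ValueError if parts are missing
    if scheme != "did" or method != "mesh" or not unique_id:
        raise ValueError(f"not a did:mesh identifier: {did!r}")
    return method, unique_id


method, unique_id = parse_did(generate_did())
print(method, len(unique_id))
```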

AGT exports W3C DID Documents in the standard format:

{
  "@context": ["https://www.w3.org/ns/did/v1"],
  "id": "did:mesh:<unique-id>",
  "verificationMethod": [{
    "id": "did:mesh:<unique-id>#<key-id>",
    "type": "Ed25519VerificationKey2020",
    "controller": "did:mesh:<unique-id>",
    "publicKeyBase64": "<base64-public-key>"
  }],
  "authentication": ["did:mesh:<unique-id>#<key-id>"],
  "service": [{
    "id": "did:mesh:<unique-id>#agentmesh",
    "type": "AgentMeshIdentity",
    "serviceEndpoint": "https://mesh.agentmesh.dev/v1"
  }]
}

SPIFFE Credentials

For workload identity in mTLS contexts, AGT integrates with SPIFFE (Secure Production Identity Framework for Everyone). Each SVID (SPIFFE Verifiable Identity Document) binds an AgentMesh DID to a standard SPIFFE ID:

spiffe://{trust_domain}/agentmesh[/{organization}]/{agent_name}
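
The template above can be filled in with a few lines of string handling. This is only a sketch: `build_spiffe_id` is a hypothetical helper, and `example.org` is a placeholder trust domain.

```python
from typing import Optional


def build_spiffe_id(trust_domain: str, agent_name: str,
                    organization: Optional[str] = None) -> str:
    """Fill in the SPIFFE ID template; the organization segment is optional."""
    segments = ["agentmesh"]
    if organization:
        segments.append(organization)
    segments.append(agent_name)
    return f"spiffe://{trust_domain}/" + "/".join(segments)


with_org = build_spiffe_id("example.org", "DataProcessor", "Analytics")
without_org = build_spiffe_id("example.org", "DataProcessor")
print(with_org)
print(without_org)
```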

SVIDs have a configurable TTL (default 1 hour) and automatically rotate when fewer than 10 minutes remain before expiry.

Human Sponsor Binding

Every AgentIdentity must be bound to a human sponsor via sponsor_email. This is the foundational accountability mechanism: every agent action traces to a responsible person. The sponsor_verified flag indicates whether the sponsor’s identity has been independently verified (e.g., via email confirmation or SSO). No agent can operate without a sponsor.

from agentmesh import HumanSponsor

sponsor = HumanSponsor.create(
    email="alice@company.com",
    name="Alice",
    organization="Analytics",
    allowed_capabilities=["read:data", "write:reports", "execute:analysis"],
)

sponsor.verify(method="email")

print(sponsor.max_agents)           # 10
print(sponsor.max_delegation_depth) # 3

if sponsor.can_sponsor_agent():
    agent = AgentIdentity.create(
        name="NewAgent",
        sponsor=sponsor.email,
        capabilities=["read:data"],
    )

Trust Score Algorithm

The 0–1000 Scale

AGT maintains a trust score for every agent on a 0–1000 integer scale. Scores are clamped to this range on every update. New agents receive a default score of 500 (TRUST_SCORE_DEFAULT). A trust ceiling can further cap an agent’s maximum score, preventing privilege escalation through behavioral accumulation.
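
The clamping rule can be sketched as follows. `TRUST_SCORE_DEFAULT` comes from the text above; `TRUST_SCORE_MIN`, `TRUST_SCORE_MAX`, and `clamp_score` are hypothetical names introduced for this illustration.

```python
from typing import Optional

TRUST_SCORE_MIN = 0        # hypothetical constant name
TRUST_SCORE_MAX = 1000     # hypothetical constant name
TRUST_SCORE_DEFAULT = 500


def clamp_score(score: int, ceiling: Optional[int] = None) -> int:
    """Clamp a score to 0-1000, then to the agent's trust ceiling if one is set."""
    upper = TRUST_SCORE_MAX if ceiling is None else min(ceiling, TRUST_SCORE_MAX)
    return max(TRUST_SCORE_MIN, min(score, upper))


print(clamp_score(1200))              # above the scale
print(clamp_score(-5))                # below the scale
print(clamp_score(950, ceiling=700))  # capped by a trust ceiling
```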

Trust Tiers

Tier	Minimum Score	Description
`verified_partner`	900	Highest trust, verified partner agent — full access
`trusted`	700	Trusted agent with good track record — standard operations
`standard`	500	Default tier for new agents — limited operations, monitored
`probationary`	300	Below-normal trust, under observation
`untrusted`	0	No trust — should be restricted or quarantined

Tier assignment uses a descending threshold check: the first threshold met (highest score first) determines the tier.
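
A minimal sketch of the descending threshold check, using the tier names and minimum scores from the table. The function name `tier_for_score` is an assumption made for illustration.

```python
# (minimum score, tier), ordered from highest threshold to lowest
TIER_THRESHOLDS = [
    (900, "verified_partner"),
    (700, "trusted"),
    (500, "standard"),
    (300, "probationary"),
    (0, "untrusted"),
]


def tier_for_score(score: int) -> str:
    """Return the first tier whose minimum score is met."""
    for minimum, tier in TIER_THRESHOLDS:
        if score >= minimum:
            return tier
    return "untrusted"  # only reached for out-of-range negative input


for sample in (950, 700, 500, 299):
    print(sample, tier_for_score(sample))
```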

Key Operational Thresholds

Action	Threshold	Description
Allow	≥ 500	Actions generally permitted
Warn	< 400	Trigger warnings and enhanced monitoring
Revoke	< 300	Trigger automatic credential revocation
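
The three thresholds can be read as independent checks on the same score. The sketch below, with the hypothetical helper `threshold_flags`, evaluates each one separately and does not assume any behavior the table leaves unstated.

```python
def threshold_flags(score: int) -> dict[str, bool]:
    """Evaluate each operational threshold from the table independently."""
    return {
        "allow": score >= 500,
        "warn": score < 400,
        "revoke": score < 300,
    }


print(threshold_flags(650))
print(threshold_flags(350))
print(threshold_flags(250))
```

Note that a score between 400 and 499 sets none of the flags; the table does not say how that band is handled, so the sketch leaves it to the caller.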

Reward Dimensions

The full multi-dimensional trust score is composed from five weighted dimensions:

Dimension	Weight	What It Measures
`policy_compliance`	0.25	Adherence to governance policies
`security_posture`	0.25	Security behavior, credential handling, input validation
`output_quality`	0.20	Quality of agent outputs and results
`resource_efficiency`	0.15	Efficient use of compute, memory, API calls
`collaboration_health`	0.15	Quality of interactions with other agents

Dimension scores update using an exponential moving average with a smoothing factor of 0.1: new_score = current_score * 0.9 + (signal_value * 100) * 0.1.
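
The update rule and the weighting can be sketched in plain Python. The weights and smoothing factor are taken from the text; the function names are hypothetical, and the sketch works on the 0–100 dimension scale implied by the formula, since the mapping from the weighted result to the 0–1000 total is not specified here.

```python
WEIGHTS = {
    "policy_compliance": 0.25,
    "security_posture": 0.25,
    "output_quality": 0.20,
    "resource_efficiency": 0.15,
    "collaboration_health": 0.15,
}
SMOOTHING = 0.1


def update_dimension(current: float, signal_value: float) -> float:
    """Exponential moving average: 90% old score, 10% new signal (scaled to 0-100)."""
    return current * (1 - SMOOTHING) + (signal_value * 100) * SMOOTHING


def weighted_score(dimensions: dict[str, float]) -> float:
    """Combine the five dimension scores using the documented weights."""
    return sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)


dims = {name: 50.0 for name in WEIGHTS}
dims["policy_compliance"] = update_dimension(dims["policy_compliance"], 1.0)
print(dims["policy_compliance"], weighted_score(dims))
```

With a smoothing factor of 0.1, a single perfect signal moves a dimension only a tenth of the way toward 100, so one event cannot swing the score sharply.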

Trust Decay

Trust scores decay linearly at 2.0 points per hour when no positive signals are received. Decay does not reduce a score below 100. An agent that stops producing positive behavioral evidence will see its score drift downward automatically.
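
A small sketch of the decay rule, assuming decay is computed from the number of hours since the last positive signal. The constant and function names are hypothetical, and leaving scores that are already below the floor unchanged is an assumption.

```python
DECAY_RATE_PER_HOUR = 2.0
DECAY_FLOOR = 100.0


def apply_decay(score: float, idle_hours: float) -> float:
    """Linear decay that stops at the floor and never raises a score."""
    if score <= DECAY_FLOOR:
        return score  # already at or below the floor: decay has no effect
    return max(DECAY_FLOOR, score - DECAY_RATE_PER_HOUR * idle_hours)


print(apply_decay(500, 24))    # one idle day
print(apply_decay(500, 1000))  # long idle period, held at the floor
```

At this rate, an agent starting from the default score of 500 reaches the floor after (500 − 100) / 2 = 200 idle hours.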

Network Propagation (Trust Contagion)

When a trust event occurs for agent A, the impact propagates to agents that have interacted with A. The propagation factor is 0.3; impact halves at each hop; maximum propagation depth is 2 hops. This makes trust contagion a deterrent: a compromised agent damages the trust scores of its close collaborators.
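
One possible reading of these parameters is sketched below as a breadth-first walk over an interaction graph: direct collaborators receive 0.3 of the original impact, and the next hop receives half of that. This interpretation, the graph representation, and the function name `propagate` are all assumptions for illustration.

```python
from collections import deque

PROPAGATION_FACTOR = 0.3
MAX_DEPTH = 2


def propagate(graph: dict[str, set[str]], source: str, delta: float) -> dict[str, float]:
    """Return the score impact on each agent within MAX_DEPTH hops of source."""
    impacts: dict[str, float] = {}
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == MAX_DEPTH:
            continue  # do not spread beyond the maximum depth
        for peer in graph.get(node, ()):
            if peer in visited:
                continue
            visited.add(peer)
            hop = depth + 1
            # Hop 1 gets factor * delta; each further hop halves the impact
            impacts[peer] = delta * PROPAGATION_FACTOR * 0.5 ** (hop - 1)
            queue.append((peer, hop))
    return impacts


graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
impacts = propagate(graph, "A", -100.0)
print(impacts)
```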

from agentmesh import RiskScorer
from agentmesh.identity.risk import RiskSignal

scorer = RiskScorer()

# Get or create a score for an agent
score = scorer.get_score("did:mesh:abc123...")
print(score.total_score)  # 500 (default)
print(score.risk_level)   # "medium"

# Report a risk signal (value: 0.0 = no risk, 1.0 = maximum risk)
scorer.add_signal(
    agent_did="did:mesh:abc123...",
    signal=RiskSignal(
        signal_type="behavior.anomaly",
        severity="high",
        value=0.8,          # 0.0 = no risk, 1.0 = max risk
        source="anomaly_detector",
        details="Unusual data access pattern detected",
    ),
)

# Recalculate the score to reflect the new signal
updated_score = scorer.recalculate("did:mesh:abc123...")

# Find agents that have decayed into high-risk territory
high_risk_agents = scorer.get_high_risk_agents(threshold=400)

Regime Detection

AGT detects sudden behavioral shifts via KL divergence between recent (last 1 hour) and baseline (last 30 days) action distributions. If KL > 0.5, a RegimeChangeAlert is emitted with the divergence value, both distributions, and detection timestamp. This identifies potential compromise or adversarial takeover.
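
The check can be illustrated with a short standard-library sketch. The direction of the divergence (recent relative to baseline), the natural logarithm, and the small additive smoothing used to avoid division by zero are assumptions, as are the function names.

```python
import math
from collections import Counter

KL_THRESHOLD = 0.5


def distribution(actions: list[str], vocabulary: list[str],
                 smoothing: float = 1e-6) -> dict[str, float]:
    """Turn a list of action names into a smoothed probability distribution."""
    counts = Counter(actions)
    total = len(actions) + smoothing * len(vocabulary)
    return {a: (counts[a] + smoothing) / total for a in vocabulary}


def kl_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """KL(p || q) in nats over a shared vocabulary."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in p)


def regime_changed(recent: list[str], baseline: list[str]) -> bool:
    """Flag a regime change when the recent window diverges from the baseline."""
    vocabulary = sorted(set(recent) | set(baseline))
    p = distribution(recent, vocabulary)
    q = distribution(baseline, vocabulary)
    return kl_divergence(p, q) > KL_THRESHOLD


baseline = ["read"] * 90 + ["write"] * 10
print(regime_changed(baseline, baseline))
print(regime_changed(["read"] * 10 + ["write"] * 90, baseline))
```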

Delegation Chains

An agent can delegate a subset of its capabilities to a child agent. Capabilities can only narrow — the child can never receive more than the parent holds. Wildcard "*" cannot be delegated.

# Parent agent with broad capabilities
parent = AgentIdentity.create(
    name="OrchestratorAgent",
    sponsor="alice@company.com",
    capabilities=["read:data", "write:reports", "execute:analysis"],
)

# Delegate a narrower set to a child
child = parent.delegate(
    name="ReportWriter",
    capabilities=["write:reports"],  # Must be a subset of the parent's capabilities
)

print(child.parent_did)                        # Parent's DID
print(child.delegation_depth)                  # 1
print(child.has_capability("write:reports"))   # True
print(child.has_capability("read:data"))       # False — not delegated
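
The narrowing rule itself is simple set logic. The following sketch, with the hypothetical helper `validate_delegation`, shows the two checks described above: no wildcard, and no capability the parent does not hold.

```python
def validate_delegation(parent_caps: set[str], requested: set[str]) -> set[str]:
    """Return the capabilities a child may receive, or raise if the request is invalid."""
    if "*" in requested:
        raise ValueError("wildcard capability cannot be delegated")
    extra = requested - parent_caps
    if extra:
        raise ValueError(f"not held by parent: {sorted(extra)}")
    return set(requested)


parent_caps = {"read:data", "write:reports", "execute:analysis"}
print(validate_delegation(parent_caps, {"write:reports"}))
```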

Each delegation creates a DelegationLink with a cryptographic parent_signature over the link data. The ScopeChain represents the full path from root sponsor to leaf agent and enforces hash-chain integrity across all links.

from agentmesh import ScopeChain

# Create a root chain from a human sponsor
chain, root_link = ScopeChain.create_root(
    sponsor_email="alice@company.com",
    root_agent_did="did:mesh:root...",
    capabilities=["read:data", "write:reports", "execute:analysis"],
    sponsor_verified=True,
)

# Verify the entire chain is intact
is_valid, error = chain.verify()
print(is_valid)  # True

# Trace how a specific capability was granted
trace = chain.trace_capability("write:reports")
# Returns the full delegation path for that capability

Chain invariants:

delegated_capabilities must be a subset of parent_capabilities at each link.
Each link’s previous_link_hash must equal the preceding link’s link_hash.
Maximum delegation depth is 10 by default (MAX_DELEGATION_DEPTH).
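
The invariants above can be checked with a small hash-chain verifier. This sketch uses SHA-256 over a JSON encoding of each link, treats the number of links as the depth, and uses `None` as the root's previous hash; those choices and the helper names are assumptions. It omits the Ed25519 `parent_signature` check, which would require a cryptography library.

```python
import hashlib
import json

MAX_DELEGATION_DEPTH = 10


def link_hash(link: dict) -> str:
    """Hash every field of a link except the stored hash itself."""
    body = {k: v for k, v in link.items() if k != "link_hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def make_link(parent_caps, delegated_caps, previous_link_hash):
    link = {
        "parent_capabilities": sorted(parent_caps),
        "delegated_capabilities": sorted(delegated_caps),
        "previous_link_hash": previous_link_hash,
    }
    link["link_hash"] = link_hash(link)
    return link


def verify_chain(links: list[dict]):
    """Return (is_valid, error) after checking each invariant in order."""
    if len(links) > MAX_DELEGATION_DEPTH:
        return False, "chain exceeds maximum delegation depth"
    previous = None
    for i, link in enumerate(links):
        if not set(link["delegated_capabilities"]) <= set(link["parent_capabilities"]):
            return False, f"link {i}: capabilities widened"
        if link["previous_link_hash"] != previous:
            return False, f"link {i}: previous_link_hash mismatch"
        if link["link_hash"] != link_hash(link):
            return False, f"link {i}: link_hash does not match contents"
        previous = link["link_hash"]
    return True, None


caps = ["read:data", "write:reports"]
root = make_link(caps, caps, None)
child = make_link(caps, ["write:reports"], root["link_hash"])
print(verify_chain([root, child]))
```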

Trust Ceiling Propagation

When a parent delegates to a child and sets max_initial_trust_score, the child’s effective trust score can never exceed min(parent_ceiling, requested_ceiling). This provides monotonic narrowing of trust through the delegation chain and prevents trust washing — an attacker cannot spawn sub-agents to obtain higher trust scores than the parent.
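
A minimal sketch of the ceiling rule, showing that ceilings can only stay the same or shrink along a chain. The function names are hypothetical, and inheriting the parent's ceiling when none is requested is an assumption.

```python
from typing import Optional


def child_ceiling(parent_ceiling: int, requested_ceiling: Optional[int] = None) -> int:
    """A child's ceiling is min(parent_ceiling, requested_ceiling)."""
    if requested_ceiling is None:
        return parent_ceiling  # assumption: inherit when nothing is requested
    return min(parent_ceiling, requested_ceiling)


def effective_score(raw_score: int, ceiling: int) -> int:
    """Cap a behavioral score at the agent's ceiling."""
    return min(raw_score, ceiling)


# Three generations: each request is bounded by the ceiling above it
ceilings = [700]
for requested in (900, 600, 650):
    ceilings.append(child_ceiling(ceilings[-1], requested))
print(ceilings)
print(effective_score(820, ceilings[-1]))
```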

Inter-Agent Trust Negotiation via AgentMesh

Before two agents communicate, they perform a cryptographic IATP (Inter-Agent Trust Protocol) handshake: challenge-response with Ed25519 signatures, nonce verification, trust score exchange, and capability negotiation.

Agent A initiates a challenge

import asyncio
from agentmesh import TrustHandshake, AgentIdentity

agent_a = AgentIdentity.create(
    name="AgentA",
    sponsor="alice@company.com",
    capabilities=["read:data", "write:reports"],
)

handshake_a = TrustHandshake(
    agent_did=str(agent_a.did),
    identity=agent_a,
    timeout_seconds=30.0,
)

Agent A sends the challenge to Agent B

result = asyncio.run(handshake_a.initiate(
    peer_did="did:mesh:agent_b...",
    required_trust_score=700,            # Minimum score for the "trusted" tier
    required_capabilities=["read:data"],
))

Evaluate the result

if result.verified:
    print(f"Peer: {result.peer_did}")
    print(f"Trust Level: {result.trust_level}")  # "trusted"
    print(f"Trust Score: {result.trust_score}")
    print(f"Latency: {result.latency_ms}ms")
else:
    print(f"Rejected: {result.rejection_reason}")

The signed payload format for challenge-response is: {challenge_id}:{challenge_nonce}:{response_nonce}:{agent_did}. Challenges expire after 30 seconds. The initiator verifies the signature against the peer’s registered public key before accepting the handshake result.

Successful handshakes are cached for 15 minutes (cache_ttl=900). Set require_freshness=True to bypass the cache and force a live verification on every call.

HandshakeResult has the following fields:

Field	Type	Description
`verified`	bool	Whether the handshake succeeded
`peer_did`	string	Peer’s DID
`trust_score`	int	Verified trust score (0–1000)
`trust_level`	string	`verified_partner`, `trusted`, `standard`, or `untrusted`
`capabilities`	list[string]	Verified capabilities
`latency_ms`	int or null	Round-trip latency
`rejection_reason`	string or null	Reason for failure
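
The signed payload format and the 30-second challenge expiry described above can be sketched with the standard library. The helper names and nonce lengths are assumptions, and the Ed25519 signing step is left out because it needs a cryptography library; the sketch only builds the bytes that would be signed and checks expiry.

```python
import secrets
import time
from typing import Optional

CHALLENGE_TTL_SECONDS = 30.0


def new_challenge() -> dict:
    """Create a challenge with a random ID and nonce (lengths are illustrative)."""
    return {
        "challenge_id": secrets.token_hex(8),
        "challenge_nonce": secrets.token_hex(16),
        "issued_at": time.monotonic(),
    }


def signing_payload(challenge: dict, response_nonce: str, agent_did: str) -> bytes:
    """Build {challenge_id}:{challenge_nonce}:{response_nonce}:{agent_did} as bytes."""
    parts = [challenge["challenge_id"], challenge["challenge_nonce"],
             response_nonce, agent_did]
    return ":".join(parts).encode("utf-8")


def is_expired(challenge: dict, now: Optional[float] = None) -> bool:
    """A challenge is rejected once more than 30 seconds have passed."""
    now = time.monotonic() if now is None else now
    return now - challenge["issued_at"] > CHALLENGE_TTL_SECONDS


challenge = new_challenge()
payload = signing_payload(challenge, secrets.token_hex(16), "did:mesh:agent_b")
print(payload.decode("utf-8"))
```

Because the DID itself contains colons, it sits last in the payload; splitting on the first three colons still recovers all four fields.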

For persistent communication partners, TrustBridge maintains verified peer state:

from agentmesh import TrustBridge

bridge = TrustBridge(
    agent_did=str(agent_a.did),
    default_trust_threshold=700,
)

bridge.register_peer(
    peer_did="did:mesh:agent_b...",
    peer_name="DataAnalyzer",
    protocol="iatp",
)

# Verify a peer (uses cached result if available)
result = asyncio.run(bridge.verify_peer(
    peer_did="did:mesh:agent_b...",
    required_trust_score=700,
    required_capabilities=["read:data"],
))

# Quick trust check
is_trusted = asyncio.run(bridge.is_peer_trusted(
    peer_did="did:mesh:agent_b...",
    required_score=700,
))

Azure Entra ID Linkage

When sponsor_verified=True and the sponsor’s email is verified through an organizational directory (SSO, Azure Entra ID), the AgentIdentity record links the agent to its organizational identity. This enables:

Audit trail entries that resolve to a named human in your HR or identity system.
Trust score initialization at above-standard tiers for agents sponsored by verified principals.
Integration with Azure RBAC for resource access decisions that combine agent trust scores with principal-level permissions.

The organization and organization_id fields on AgentIdentity carry the Entra ID tenant context. The TrustBridge can use these to federate trust decisions across organizational boundaries.

Credential Lifecycle

Short-lived credentials (default 15-minute TTL) are the runtime proof that an agent is authorized to act. Only the SHA-256 hash of each token is stored — the raw token is returned to the caller once at issuance and treated as a secret.

from agentmesh import CredentialManager

manager = CredentialManager(default_ttl=900)  # 15 minutes

# Issue a scoped credential
cred = manager.issue(
    agent_did="did:mesh:abc123...",
    capabilities=["read:data"],
    resources=["dataset_sales", "dataset_inventory"],
    ttl_seconds=900,
)

print(cred.credential_id)    # "cred_a1b2c3..."
print(cred.status)            # "active"
print(cred.to_bearer_token()) # "Bearer <token>"

# Validate an incoming token (constant-time comparison)
# `request` is the incoming HTTP request object from your web framework
incoming_token = request.headers["Authorization"].removeprefix("Bearer ")
cred = manager.validate(incoming_token)

if cred and cred.is_valid():
    if cred.has_capability("read:data"):
        if cred.can_access_resource("dataset_sales"):
            # Authorized — proceed
            pass

# Rotate before expiry (threshold: 60 seconds)
cred = manager.rotate_if_needed(cred.credential_id)

# Revoke immediately on compromise
manager.revoke(cred.credential_id, reason="Suspected compromise")

# Revoke ALL credentials for a compromised agent
count = manager.revoke_all_for_agent(
    agent_did="did:mesh:compromised...",
    reason="Agent suspended pending investigation",
)

Token verification uses hmac.compare_digest (constant-time comparison) to prevent timing side-channel attacks. Never use standard string equality for token comparison.
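
To show why only the hash needs to be stored, here is a minimal standard-library sketch of hashed token storage with constant-time validation and TTL expiry. `TokenStore` and its methods are hypothetical and far simpler than CredentialManager; they illustrate the technique rather than the library's implementation.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional


class TokenStore:
    """Keeps SHA-256 hashes of issued tokens, never the raw tokens."""

    def __init__(self, default_ttl: float = 900.0):
        self.default_ttl = default_ttl
        self._records: dict[str, tuple[str, float]] = {}  # id -> (hash, expires_at)

    def issue(self, now: Optional[float] = None) -> tuple[str, str]:
        """Return (credential_id, raw_token); the raw token is shown only once."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        credential_id = "cred_" + secrets.token_hex(8)
        token_hash = hashlib.sha256(token.encode()).hexdigest()
        self._records[credential_id] = (token_hash, now + self.default_ttl)
        return credential_id, token

    def validate(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the matching credential ID, or None if unknown or expired."""
        now = time.time() if now is None else now
        candidate = hashlib.sha256(token.encode()).hexdigest()
        match = None
        for credential_id, (stored, expires_at) in self._records.items():
            # compare_digest takes the same time whether or not the hashes match
            if hmac.compare_digest(stored, candidate) and now < expires_at:
                match = credential_id
        return match


store = TokenStore()
cred_id, raw_token = store.issue(now=1000.0)
print(store.validate(raw_token, now=1001.0) == cred_id)
```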
