Agents Best Practices: Provider-Neutral Harness Design

Agents Best Practices is a reference skill and architectural guide for building agentic harnesses: the control plane that wraps a language model in a production system. It covers the full runtime (tool design, permissions, planning, memory, compaction, observability, and safety) and applies to any domain.

Quickstart

Get the skill running in your agent in under 5 minutes

Installation

Install via npx skills, manual clone, or agent prompt

MVP Blueprint

Generate a domain-specific MVP harness from scratch

Architecture

Understand the component model and harness boundaries

What is an agent harness?

An agent harness is the runtime control plane that sits between a user request and a language model’s actions. The model proposes; the harness validates, authorizes, executes, records, and returns observations. Without a harness, a model that can call tools is an unaudited operator. This skill teaches you to design the harness correctly:

Model-tool-observation loop — how calls, results, retries, and stopping rules connect
Typed tool registry — narrow, validated tools with declared risk classes
Permission engine — reads, drafts, writes, and destructive actions get different paths
Context builder — cache-aware assembly with retrieval and auto-compaction
Planning and goals — when to plan, how to commit, when to stop
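To make the registry and permission components concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the skill's own implementation: the names `Risk`, `Tool`, `ToolRegistry`, and `permission_path`, as well as the four path labels, are hypothetical and chosen only to show how a declared risk class can select a different permission path.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class Risk(Enum):
    """Risk classes a tool must declare at registration time."""
    READ = "read"
    DRAFT = "draft"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class Tool:
    name: str
    risk: Risk
    required_args: dict[str, type]  # argument name -> expected type
    run: Callable[..., Any]


class ToolRegistry:
    """Holds narrow, typed tools and validates arguments before execution."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool: {tool.name}")
        self._tools[tool.name] = tool

    def get(self, name: str) -> Optional[Tool]:
        return self._tools.get(name)

    def validate(self, tool: Tool, args: dict[str, Any]) -> list[str]:
        """Return a list of validation errors; an empty list means valid."""
        errors = []
        for key, expected in tool.required_args.items():
            if key not in args:
                errors.append(f"missing argument: {key}")
            elif not isinstance(args[key], expected):
                errors.append(f"{key} must be {expected.__name__}")
        for key in args:
            if key not in tool.required_args:
                errors.append(f"unexpected argument: {key}")
        return errors


def permission_path(risk: Risk) -> str:
    """Map a risk class to a permission path (hypothetical policy labels)."""
    return {
        Risk.READ: "allow",
        Risk.DRAFT: "allow_and_log",
        Risk.WRITE: "require_approval",
        Risk.DESTRUCTIVE: "require_approval_and_confirmation",
    }[risk]
```

A harness built this way would look up the tool, run `validate`, and only then consult `permission_path(tool.risk)` to decide whether to execute, ask a human, or refuse.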

Who this is for

Agents Best Practices applies to any team building an agent with real-world side effects. The pattern is the same whether the domain is:

Software engineering and code review
Research and data analysis
Customer support and CRM workflows
Finance, procurement, or legal review
Healthcare, HR, or education workflows
Operations, sales, or marketing automation

The skill is not limited to coding agents. The harness discipline (loop, tools, permissions, planning, compaction, evals) carries over unchanged to each of the domains listed above.

The central principle

“The model proposes actions; the harness validates, authorizes, executes, records, and returns observations.”

Eight runtime rules follow from this:

The harness acts, not the model — application code owns validation, authorization, execution, and logging
Every tool call gets a result — denials, timeouts, and aborts are observations too
Risk changes the loop — reads, drafts, writes, financial actions, and destructive actions need different permission paths
Draft and commit are separate — high-risk side effects require approval records outside the prompt
Context is built, not dumped — retrieve just enough, label trust boundaries, preserve active state across compaction
Long-running work needs budgets — step, time, token, cost, and tool-call budgets are part of the product
Skills and connectors are progressively disclosed — expose names first; load full workflows only when relevant
Repeated failures become harness features — validators, tools, evals, and policies beat repeating prompt advice
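Several of these rules can be sketched in a few lines of Python. The example below is a rough illustration, not a definitive implementation: `ToolCall`, `Observation`, `Harness`, the `propose` callback that stands in for the model, and the status strings are all hypothetical. It shows the harness (not the model) executing calls, every call producing an observation even when it is denied or fails, and a step budget that stops the loop.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any]


@dataclass
class Observation:
    tool: str
    status: str  # "ok", "denied", "error", or "unknown_tool"
    content: Any = None


@dataclass
class Harness:
    tools: dict[str, Callable[..., Any]]
    is_allowed: Callable[[ToolCall], bool]  # stand-in for a permission engine
    max_steps: int = 5                      # step budget for one run
    log: list[Observation] = field(default_factory=list)  # audit record

    def execute(self, call: ToolCall) -> Observation:
        """Always return an observation, even when the call does not run."""
        if call.name not in self.tools:
            obs = Observation(call.name, "unknown_tool")
        elif not self.is_allowed(call):
            obs = Observation(call.name, "denied", "policy denied this call")
        else:
            try:
                obs = Observation(call.name, "ok", self.tools[call.name](**call.args))
            except Exception as exc:  # failures become observations too
                obs = Observation(call.name, "error", str(exc))
        self.log.append(obs)
        return obs

    def run(self, propose: Callable[[list[Observation]], Optional[ToolCall]]) -> str:
        """Loop until the model stops proposing or the step budget runs out."""
        history: list[Observation] = []  # observations returned to the model
        for _ in range(self.max_steps):
            call = propose(history)
            if call is None:
                return "done"
            history.append(self.execute(call))
        return "budget_exhausted"
```

In a real system `propose` would wrap a model call and `is_allowed` would consult approval records stored outside the prompt; both are reduced to plain callables here so the control flow stays visible.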

Explore the skill

Agentic Loop

Loop invariants, retries, budgets, and stopping rules

Tools and Permissions

Typed tools, risk classes, and approval flows

Context and Memory

Retrieval, compaction, and durable state

Security and Evals

Guardrails, tracing, evals, and launch gates
