pi-steering exports a pi-steering/testing subpath with primitives that exercise the full evaluation pipeline without booting pi. loadHarness builds the same evaluator and observer dispatcher pair that production uses; expectBlocks and expectAllows then drive events through it and assert verdicts in a single call. All tests are deterministic and CI-safe: no pi runtime stub, no file-system walk-up, and no live exec unless you explicitly wire one.
Import path
Harness-level testing
loadHarness accepts a static SteeringConfig and returns a Harness — a build-once, invoke-many handle whose evaluate and dispatch methods are identical in signature to the production runtime’s. Tests that call harness.evaluate directly are invoking exactly the same code path pi does on every tool_call event.
expectBlocks
Assert that the harness returns a block verdict for the given event. Throws on allow.
- { command }: bash tool call
- { write: { path, content } }: write tool call
- { edit: { path, edits } }: edit tool call

rule checks that the named rule fired (matched against the [steering:<rule>@<source>] prefix in the reason string; the source suffix is ignored). reason accepts an exact string or a RegExp for a looser match. Omit both to assert only that something blocked.
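The prefix matching described above can be pictured with a small standalone sketch. It is not pi-steering's implementation: ruleFromReason and reasonMatches are hypothetical helpers, and the sketch assumes rule names contain neither "@" nor "]".

```typescript
// Hypothetical helper, not part of pi-steering: pulls the rule name out of a
// reason string that begins with a "[steering:<rule>@<source>]" prefix.
function ruleFromReason(reason: string): string | undefined {
  // Capture the text between "steering:" and the "@" that starts the source.
  // The source suffix is consumed but never captured, so it is ignored.
  const match = /^\[steering:([^@\]]+)@[^\]]*\]/.exec(reason);
  return match?.[1];
}

// Hypothetical matcher mirroring the "exact string or RegExp" option for reason.
function reasonMatches(actual: string, expected: string | RegExp): boolean {
  return typeof expected === "string"
    ? actual === expected // strings must match the whole reason exactly
    : expected.test(actual); // RegExps allow a looser, partial match
}
```

Because a string is compared for exact equality, a test that only cares about part of the message is usually better served by a RegExp.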
expectAllows
Assert that the harness allows the given event — no rule fires.
Full example from the work-item-plugin
The canonical plugin example wires a minimal plugin (just the predicate the rule needs) to keep each rule test focused:
Unit-level testing
For predicates and observers that you want to exercise in complete isolation, without standing up a full harness, the testing subpath provides testPredicate and testObserver.
testPredicate
Build a PredicateContext from MockContextOptions and call the handler directly. Returns the boolean verdict.
Supply only the MockContextOptions fields your predicate reads. The defaults fill in cwd: "/tmp/test", an empty env map, and an exec stub that rejects loudly if accidentally called, so a test that forgets to wire exec fails explicitly rather than silently evaluating against an empty result.
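To make the fail-loud defaults concrete, here is a minimal standalone sketch of the idea. SketchOptions and withDefaults are hypothetical names, and the exec signature and the env value type are assumptions; the real MockContextOptions may be shaped differently.

```typescript
// Hypothetical option shape; only the three defaults named above are modeled.
interface SketchOptions {
  cwd?: string;
  env?: Map<string, string>; // value type is an assumption
  exec?: (cmd: string, args: string[]) => Promise<unknown>; // signature is an assumption
}

// Fill in the documented defaults: cwd "/tmp/test", an empty env map, and an
// exec stub that rejects instead of returning an empty result.
function withDefaults(opts: SketchOptions = {}): Required<SketchOptions> {
  return {
    cwd: opts.cwd ?? "/tmp/test",
    env: opts.env ?? new Map<string, string>(),
    exec:
      opts.exec ??
      ((cmd: string) =>
        Promise.reject(new Error(`exec("${cmd}") was called but no stub was wired`))),
  };
}
```

The rejecting stub is the important design choice: a resolved empty result would let a predicate quietly evaluate against nothing, while a rejection points straight at the missing wiring.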
testObserver
Drive an observer’s onResult handler at a synthetic tool_result event and inspect what it appended.
watchMatched tells you whether the observer’s watch filter accepted the event — onResult is only called when it does, mirroring production dispatch. entries is the list of appendEntry writes captured from the handler.
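The gating between the watch filter and onResult can be sketched in isolation as follows. Every type and name here (ToolResultEvent, SketchObserver, driveObserver) is hypothetical, and the watch filter is modeled as a plain function purely for illustration; the real Observer shape may differ.

```typescript
// Hypothetical shapes, for illustration only.
interface ToolResultEvent { toolName: string; output: string; }
interface CapturedEntry { customType: string; data: unknown; }
interface SketchObserver {
  watch: (event: ToolResultEvent) => boolean;
  onResult: (
    event: ToolResultEvent,
    appendEntry: (customType: string, data: unknown) => void,
  ) => void;
}

// Call onResult only when the watch filter accepts the event, and capture
// every appendEntry write so the test can inspect it afterwards.
function driveObserver(observer: SketchObserver, event: ToolResultEvent) {
  const entries: CapturedEntry[] = [];
  const watchMatched = observer.watch(event);
  if (watchMatched) {
    observer.onResult(event, (customType, data) => {
      entries.push({ customType, data });
    });
  }
  return { watchMatched, entries };
}
```

A test can then assert both halves of the contract: an accepted event yields entries, and a rejected event yields watchMatched false with an empty entries list.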
The npm-test-tracker test suite from the work-item-plugin example demonstrates the full contract:
MockContextOptions knobs
testPredicate (and mockContext directly) accept these options:
- The effective cwd the predicate sees via ctx.cwd and ctx.walkerState.cwd.
- The engine’s agent-loop counter. Rules using when.happened: { in: "agent_loop" } scope filtering compare this against _agentLoopIndex on session entries, so set it to match the entries you supply via priorEntry.
- Which tool the predicate is evaluating under. Drives the default shape of input when no input is supplied.
- Full tool input. Derived from tool when omitted: bash gives { command: "" }, write gives { path: "", content: "" }, edit gives { path: "", edits: [] }.
- Walker-state snapshot the predicate sees. Merged over { cwd, env: new Map() }, so supply only the fields your predicate reads. Pass { branch: "main" } to test a branch predicate without wiring the full git tracker.
- Stub for ctx.exec. Defaults to rejecting with a clear error; tests that call out to exec must stub explicitly.
- Prior session entries findEntries reads from. Build entries with priorEntry(customType, data, { agentLoopIndex }) so the _agentLoopIndex tag is stamped exactly as the live engine would stamp it. A hand-rolled literal with a typo on the key name silently fails to match when.happened scope filtering.
- Per-ref speculative events for when.happened: { in: "tool_call" } simulation. Keys are the customType event literals; values are synthetic entries, the same shape the walker-level speculative-synthesis pass produces. Use this to test &&-chain allow logic in isolation without a full harness.

Adversarial matrices
An adversarial matrix pins blocking behavior across every shell variant that should trigger a rule, and confirms that false-friend inputs that look like the pattern but should not match are correctly allowed.
runMatrix never throws; failures surface in result.cases so all cases are evaluated even when early ones fail.
formatMatrix renders an ASCII-friendly report suited to CI log aggregators:
MatrixCase accepts an expect of "block", "allow", or { block: true, rule: "rule-name" } to also pin which rule fired. The optional cwd field on each case overrides the fallback "/tmp/test" for cwd-scoped rules.
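The never-throw behavior and the three expect shapes can be illustrated with a standalone sketch. It is a sketch under stated assumptions, not runMatrix itself: SketchCase, Verdict, runCases, the command field, and the evaluate callback are all hypothetical stand-ins.

```typescript
// The three expect shapes described above.
type Expect = "block" | "allow" | { block: true; rule: string };

// Hypothetical case and verdict shapes; the command field is an assumption.
interface SketchCase { command: string; expect: Expect; cwd?: string; }
interface Verdict { blocked: boolean; rule?: string; }
interface CaseResult { command: string; passed: boolean; error?: string; }

// Evaluate every case and record failures instead of throwing, so one bad
// case never hides the results of the cases after it.
function runCases(
  cases: SketchCase[],
  evaluate: (command: string, cwd: string) => Verdict,
) {
  const results: CaseResult[] = cases.map((c) => {
    try {
      // Per-case cwd overrides the "/tmp/test" fallback.
      const verdict = evaluate(c.command, c.cwd ?? "/tmp/test");
      const wantBlock = c.expect !== "allow";
      const wantRule = typeof c.expect === "object" ? c.expect.rule : undefined;
      const passed =
        verdict.blocked === wantBlock &&
        (wantRule === undefined || verdict.rule === wantRule);
      return { command: c.command, passed };
    } catch (err) {
      // An evaluator error is reported as a failed case, not rethrown.
      return { command: c.command, passed: false, error: String(err) };
    }
  });
  return { cases: results, failed: results.filter((r) => !r.passed).length };
}
```

Collecting results this way suits adversarial matrices, where the interesting signal is often the pattern of failures across shell variants and not just the first failing input.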
LoadHarnessOptions
- The SteeringConfig to test with. Passed directly into the merger pipeline, the same resolvePlugins + buildEvaluator + buildObserverDispatcher path production uses.
- Prepend DEFAULT_RULES and DEFAULT_PLUGINS to the config at the innermost position. Mirrors the production !config.disableDefaults flag, but kept explicit here so tests can exercise default rules without editing the config under test.
- Custom host to drive exec and appendEntry. Defaults to an in-memory stub whose exec rejects with a clear error and whose appendEntry is a silent sink. Pass a createRecordingHost() instance when a test needs to inspect what the engine wrote across multiple calls, for example to verify that a self-marking onFire rule wrote the correct session entry before the block verdict returned.

Note: unlike production, loadHarness does not throw on error-class diagnostics. It surfaces them in harness.diagnostics so plugin-author tests can assert on them directly.

Available exports from pi-steering/testing
All primitives are available from the pi-steering/testing subpath. They are also re-exported from the package root for discoverability.
| Export | Purpose |
|---|---|
| loadHarness | Build an evaluator + dispatcher pair from a static SteeringConfig |
| expectBlocks | Assert an event is blocked; optionally pin which rule and reason |
| expectAllows | Assert an event is allowed |
| expectRuleFires | Thin alias over expectBlocks for tests focused on which rule fired |
| runMatrix | Batch-evaluate a list of cases; never throws, failures surface in result.cases |
| formatMatrix | Render a MatrixResult as an ASCII-friendly CI report |
| testPredicate | Drive a single PredicateHandler against a mockContext |
| testObserver | Drive an Observer.onResult at a synthetic event; captures appendEntry writes |
| mockContext | Build a PredicateContext for unit-testing predicates in isolation |
| mockExtensionContext | Build a minimal ExtensionContext stub backed by a RecordedSessionEntry array |
| mockObserverContext | Build an ObserverContext for unit-testing observer handlers |
| priorEntry | Build a MockEntry with the _agentLoopIndex tag stamped correctly |
| createRecordingHost | Build a recording EvaluatorHost that captures every exec and appendEntry call |
| getAppendedEntries | Read the appendEntry capture buffer for a mock context |
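
The recording-host idea listed for createRecordingHost can be pictured with a small standalone sketch. It does not reproduce the real EvaluatorHost interface: createSketchRecordingHost, the call-record shapes, and the exec signature are hypothetical and chosen only to show the capture pattern.

```typescript
// Hypothetical call records; the real host may capture different fields.
interface ExecCall { cmd: string; args: string[]; }
interface AppendCall { customType: string; data: unknown; }

// Build a host whose exec and appendEntry both record their arguments, so a
// test can inspect what was written across multiple calls.
function createSketchRecordingHost(execResult: string = "") {
  const execCalls: ExecCall[] = [];
  const appended: AppendCall[] = [];
  return {
    execCalls,
    appended,
    exec(cmd: string, args: string[]): Promise<string> {
      execCalls.push({ cmd, args });
      return Promise.resolve(execResult); // canned result supplied by the test
    },
    appendEntry(customType: string, data: unknown): void {
      appended.push({ customType, data });
    },
  };
}
```

Unlike a silent sink, a host like this lets a test assert on the order and content of writes, which is the property the self-marking onFire scenario depends on.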