Ployz is a primitive orchestration core for small clusters. Its architectural bet is simple: small-scale infrastructure gets better when the system exposes real operational primitives instead of hiding them behind policy engines, controllers, and reconciler loops. Every state change is an explicit operation — add a machine, deploy a workload, migrate state, branch an environment, promote, roll back — with visible preconditions, a bounded effect, a clear result, and a way to verify what happened.

## Documentation Index
Fetch the complete documentation index at: https://mintlify.com/getployz/ployz/llms.txt
Use this file to discover all available pages before exploring further.
## The core model
Ployz stores enough durable state to execute and explain explicit operations. It does not store a standing desired-state document that background controllers continuously reconcile. Policy belongs at decision time. The operator decides to add, migrate, deploy, branch, promote, roll back, or remove. Ployz executes that operation with clear preconditions and returns a result the operator can inspect. There are no autoscalers, hidden schedulers, or self-healing loops silently rewriting cluster truth.

A mutating operation is foreground work with an audience. It should (see the sketch after this list):

- Inspect current intent and live preconditions
- Build a plan when the operation has meaningful choices
- Fail before mutation when preconditions are missing
- Execute bounded steps against concrete participants
- Commit durable facts at the point of no return
- Report cleanup, partial progress, or failure explicitly
- Leave enough state for safe retry or operator repair
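
As a rough illustration, here is a minimal Go sketch of that contract. Every name in it is hypothetical; it shows only the fail-before-mutation, bounded-execution, and commit-at-the-point-of-no-return shape, not the real Ployz API.

```go
package main

import (
	"errors"
	"fmt"
)

// Plan is the bounded set of steps an operation intends to execute.
type Plan struct{ Steps []string }

// Result records commitment and partial progress so an operator (or a
// retry) can see exactly what happened.
type Result struct {
	Committed bool
	Completed []string
	Err       error
}

// runDeploy sketches one explicit operation: preconditions are checked
// before any mutation, steps are bounded and visible, and durable facts
// are committed only at the point of no return.
func runDeploy(targetHealthy bool, plan Plan) Result {
	// Fail before mutation when preconditions are missing.
	if !targetHealthy {
		return Result{Err: errors.New("precondition failed: target unreachable")}
	}
	var res Result
	for _, step := range plan.Steps {
		// Execute bounded steps against concrete participants.
		fmt.Println("executing:", step)
		res.Completed = append(res.Completed, step)
	}
	// Commit durable lifecycle facts at the point of no return.
	res.Committed = true
	return res
}

func main() {
	res := runDeploy(true, Plan{Steps: []string{
		"probe participants", "apply containers", "project routes",
	}})
	fmt.Printf("committed=%v completed=%v err=%v\n", res.Committed, res.Completed, res.Err)
}
```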
## Three kinds of state
Ployz separates three kinds of truth (sketched as types after this list):

- Intent — what an operator explicitly asked the cluster to do
- Status — durable lifecycle facts emitted by operations
- Observation — live reachability, health, capacity, and freshness checked at decision time
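
A hypothetical set of Go types makes the separation concrete. None of these names come from the Ployz codebase; they only show which truth is durable and which is checked live.

```go
package state

import "time"

// Intent: what an operator explicitly asked the cluster to do. Durable.
type Intent struct {
	Operation string // e.g. "deploy", "migrate", "promote"
	Target    string
	AskedAt   time.Time
}

// Status: durable lifecycle facts emitted by operations as they run.
type Status struct {
	Operation string
	Phase     string // e.g. "committed", "cleanup", "failed"
	At        time.Time
}

// Observation: live facts gathered at decision time and never stored
// as cluster truth.
type Observation struct {
	Reachable bool
	Healthy   bool
	FreshAsOf time.Time
}
```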
## Disposable daemon, durable data plane
ployzd is a disposable control plane. It can crash, upgrade, or restart without disrupting the data plane. WireGuard tunnels stay up, NATS keeps serving state, the gateway keeps proxying, DNS keeps resolving, and workload containers keep running.
On startup, the daemon adopts what is already running and only recreates managed infrastructure whose identity has drifted.
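
Here is a sketch of what that startup pass could look like, assuming a per-service identity check. The names and shape are illustrative, not the daemon's actual code.

```go
package main

import "fmt"

type managed struct {
	name       string
	running    bool
	identityOK bool // e.g. rendered config still matches what is running
}

// adoptAll inspects what is already running, adopts matching services,
// and recreates only on identity drift. The data plane is never restarted
// just because the daemon restarted.
func adoptAll(services []managed) {
	for _, s := range services {
		switch {
		case s.running && s.identityOK:
			fmt.Println("adopt:", s.name)
		case s.running:
			fmt.Println("recreate (identity drift):", s.name)
		default:
			fmt.Println("create:", s.name)
		}
	}
}

func main() {
	adoptAll([]managed{
		{name: "gateway", running: true, identityOK: true},
		{name: "dns", running: true, identityOK: false},
		{name: "nats", running: false},
	})
}
```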
## System boundaries
Ployz is organized into four layers that interact through explicit contracts:

- Operator surfaces
- Orchestration kernel
- Runtime and substrate backends
- Data plane services

Data plane services keep serving even when ployzd is absent: workload containers, WireGuard mesh, NATS, gateway, DNS, and storage datasets. Daemon restart must not restart workloads.

## Core domains
Code is organized by domain, not by adapter pattern.

| Domain | Responsibility |
|---|---|
| machine | Machine identity, membership, join, update, remove, and operator surfaces for capacity |
| mesh | WireGuard overlay lifecycle, peer state, subnet coordination, and mesh phase state |
| store | Durable cluster facts, subscriptions, locks, streams, KV records, and memory/NATS implementations |
| coordination | Leases, participant commands, explicit foreground coordination, and failure reporting |
| deploy | Preview, placement, participant probing, apply, commit, cleanup, and deploy lifecycle facts |
| runtime | Local container/process operations through narrow backend contracts |
| storage | Volume creation, snapshot, clone, transfer, receive, migration, and rollback mechanics |
| routing | Route facts, gateway projection, DNS projection, and freshness handling |
| services | Long-lived sidecar supervision for NATS, gateway, DNS, and supporting processes |
| daemon | Composition root, request handling, startup adoption, and operation dispatch |
| SDK/API | External command surface and structured request/response types |
## Runtime targets
Runtime selection happens at the daemon composition root. Core domains receive explicit backends instead of matching on an operator-facing mode enum (see the sketch after the table).

| Runtime target | Service mode | Meaning |
|---|---|---|
| Docker | User | Docker-backed mesh/store/sidecars with loopback control-plane binding |
| Host | User | Host-backed mesh/store, child-process sidecars, overlay control-plane binding |
| Host | System | Host-backed mesh/store, system-managed sidecars, overlay control-plane binding |
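
A minimal sketch of that wiring, assuming a mesh domain that takes a narrow backend interface; all names here are hypothetical.

```go
package main

import "fmt"

// MeshBackend is the kind of narrow contract a core domain would receive.
type MeshBackend interface {
	EnsureInterface() error
}

type dockerMesh struct{}

func (dockerMesh) EnsureInterface() error { return nil } // e.g. via a sidecar container

type hostMesh struct{}

func (hostMesh) EnsureInterface() error { return nil } // e.g. via host networking

// compose selects backends exactly once, at startup. Domains receive a
// MeshBackend and never branch on the operator-facing target name again.
func compose(target string) (MeshBackend, error) {
	switch target {
	case "docker":
		return dockerMesh{}, nil
	case "host":
		return hostMesh{}, nil
	default:
		return nil, fmt.Errorf("unknown runtime target %q", target)
	}
}

func main() {
	mesh, err := compose("docker")
	if err != nil {
		panic(err)
	}
	fmt.Println("mesh ready:", mesh.EnsureInterface() == nil)
}
```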
Memory is test-only. It is not an operator-facing runtime and does not shape the daemon’s public API.

## Docker runtime on macOS
The daemon runs on the macOS host. Everything else runs inside Docker Desktop’s Linux VM. NATS, gateway, and DNS bind on the node’s overlay IPv6 address so other mesh nodes can reach them directly. In the Docker runtime they share the `ployz-networking` network namespace to access `wg0`.
OverlayBridge uses userspace WireGuard and a smoltcp TCP stack to bridge the macOS host to the container overlay network. eBPF TC classifiers intercept and redirect traffic at the kernel level where the runtime supports it.
## Upgrade and adoption contract
The daemon separates ephemeral control-plane work from persistent data-plane services.

| Component | Restart behavior |
|---|---|
| Workloads | Never touched by daemon restart |
| Gateway | Adopted if running and config matches; recreated on drift |
| DNS | Adopted if running and config matches; recreated on drift |
| NATS | Adopted if running and parent netns unchanged; recreated on drift |
| WireGuard | Adopted if healthy |
| CLI RPC, remote deploy, background command listeners | Ephemeral, restarted with daemon |
Docker-managed services are matched by the `ployz.config-hash` and `ployz.parent-container-id` labels. System services compare rendered unit identity.
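
A minimal sketch of that drift check, assuming adoption hinges on those two labels; the helper and its inputs are hypothetical.

```go
package main

import "fmt"

// shouldAdopt compares the identity labels recorded on a running container
// against the identity the daemon would render today. Any mismatch means
// the container is recreated instead of adopted.
func shouldAdopt(labels map[string]string, wantHash, wantParent string) bool {
	return labels["ployz.config-hash"] == wantHash &&
		labels["ployz.parent-container-id"] == wantParent
}

func main() {
	labels := map[string]string{
		"ployz.config-hash":         "sha256:abc",
		"ployz.parent-container-id": "c0ffee",
	}
	fmt.Println(shouldAdopt(labels, "sha256:abc", "c0ffee")) // true: adopt
	fmt.Println(shouldAdopt(labels, "sha256:def", "c0ffee")) // false: recreate
}
```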
## Explore further

- Cluster coordination with NATS
- Routing, gateway, and DNS
## Design test
When evaluating a proposed change to Ployz architecture, ask these questions:

- Does this create a new primitive or hide a procedure behind policy?
- Can the operation fail before mutation when preconditions are missing?
- Does durable state record intent and lifecycle facts rather than inferred liveness?
- Can a human or agent verify the result without knowing hidden background behavior?
- Does daemon restart leave the data plane serving last good state?

  ployzd is disposable. Any design that causes a daemon restart to interrupt WireGuard, NATS, the gateway, DNS, or running workloads has broken the separation between control plane and data plane. The daemon adopts; it does not own the data plane’s lifecycle.

- Does the design keep local, self-hosted, cloud, and future agent surfaces on one model?

  `ployzctl dev` on a Mac and a fleet operator running production share the same primitives. There is no dev-mode shortcut and no cloud-only mechanism. If a feature requires a separate model for one of these surfaces, the primitive needs to be strengthened, not forked.