Documentation Index
Fetch the complete documentation index at: https://mintlify.com/NVIDIA/OpenShell/llms.txt
Use this file to discover all available pages before exploring further.
openshell inference
Manage the gateway-level inference configuration. The inference route determines which provider and model are used when sandbox workloads callinference.local — the managed LLM endpoint injected by the policy engine.
Inference configuration is scoped to the active gateway, not to individual sandboxes.
openshell inference set
Set the gateway-level inference provider and model. Overwrites the current configuration.Provider name (must already exist — see
openshell provider create). The provider’s credentials are used for all inference calls routed through inference.local.Model identifier to use for generation calls (e.g.,
gpt-4o, claude-opus-4-5, meta/llama-3.1-70b-instruct).Configure the system inference route instead of the user-facing route. The system route is used by platform functions (e.g., the agent harness) and is not accessible to user code.
Skip endpoint verification before saving the route. By default, the CLI makes a test request to confirm the provider and model are reachable.
Examples
openshell inference update
Partially update the gateway-level inference configuration. Only the flags you provide are changed; omitted flags leave the current values in place.Provider name. Unchanged if omitted.
Model identifier. Unchanged if omitted.
Target the system inference route instead of the user-facing route.
Skip endpoint verification before saving.
Examples
openshell inference get
Show the current gateway-level inference configuration.Show the system inference route instead of the user-facing route. When omitted, both routes are displayed.
Examples
How inference routing works
When a sandbox workload callsinference.local, the policy engine:
- Intercepts the outbound request.
- Strips the caller’s credentials.
- Injects the configured provider’s credentials.
- Forwards the request to the provider’s API.
The inference route must reference a provider that exists on the gateway. Create one first with
openshell provider create if you have not already.Supported provider types
Any provider type registered withopenshell provider create can be used as an inference provider, provided the underlying API is compatible with the OpenAI chat completions interface. The following types are tested and known to work:
| Provider type | Example model values |
|---|---|
openai | gpt-4o, gpt-4-turbo, o1 |
anthropic | claude-opus-4-5, claude-sonnet-4-5 |
nvidia | meta/llama-3.1-70b-instruct |
opencode | Depends on backing service |