Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/NVIDIA/OpenShell/llms.txt

Use this file to discover all available pages before exploring further.

openshell inference

Manage the gateway-level inference configuration. The inference route determines which provider and model are used when sandbox workloads call inference.local — the managed LLM endpoint injected by the policy engine. Inference configuration is scoped to the active gateway, not to individual sandboxes.

openshell inference set

Set the gateway-level inference provider and model. Overwrites the current configuration.
openshell inference set --provider NAME --model MODEL [OPTIONS]
--provider
string
required
Provider name (must already exist — see openshell provider create). The provider’s credentials are used for all inference calls routed through inference.local.
--model
string
required
Model identifier to use for generation calls (e.g., gpt-4o, claude-opus-4-5, meta/llama-3.1-70b-instruct).
--system
boolean
Configure the system inference route instead of the user-facing route. The system route is used by platform functions (e.g., the agent harness) and is not accessible to user code.
--no-verify
boolean
Skip endpoint verification before saving the route. By default, the CLI makes a test request to confirm the provider and model are reachable.

Examples

# Set the inference route to GPT-4o via an OpenAI provider
openshell inference set --provider openai --model gpt-4o

# Set to Claude via an Anthropic provider
openshell inference set --provider anthropic --model claude-opus-4-5

# Set the system inference route
openshell inference set --provider openai --model gpt-4o --system

# Set without verifying the endpoint (useful in offline/staging setups)
openshell inference set --provider openai --model gpt-4o --no-verify

openshell inference update

Partially update the gateway-level inference configuration. Only the flags you provide are changed; omitted flags leave the current values in place.
openshell inference update [OPTIONS]
--provider
string
Provider name. Unchanged if omitted.
--model
string
Model identifier. Unchanged if omitted.
--system
boolean
Target the system inference route instead of the user-facing route.
--no-verify
boolean
Skip endpoint verification before saving.

Examples

# Swap to a different model, keeping the same provider
openshell inference update --model gpt-4-turbo

# Change provider and model together
openshell inference update --provider anthropic --model claude-sonnet-4-5

openshell inference get

Show the current gateway-level inference configuration.
openshell inference get [OPTIONS]
--system
boolean
Show the system inference route instead of the user-facing route. When omitted, both routes are displayed.

Examples

# Show all inference routes
openshell inference get

# Show only the system route
openshell inference get --system

How inference routing works

When a sandbox workload calls inference.local, the policy engine:
  1. Intercepts the outbound request.
  2. Strips the caller’s credentials.
  3. Injects the configured provider’s credentials.
  4. Forwards the request to the provider’s API.
This keeps model API credentials out of sandbox environments entirely — agents call a stable local endpoint and never see the underlying keys.
The inference route must reference a provider that exists on the gateway. Create one first with openshell provider create if you have not already.

Supported provider types

Any provider type registered with openshell provider create can be used as an inference provider, provided the underlying API is compatible with the OpenAI chat completions interface. The following types are tested and known to work:
Provider typeExample model values
openaigpt-4o, gpt-4-turbo, o1
anthropicclaude-opus-4-5, claude-sonnet-4-5
nvidiameta/llama-3.1-70b-instruct
opencodeDepends on backing service

Build docs developers (and LLMs) love