OpenShell handles inference traffic through two paths: requests to external hosts likeDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/NVIDIA/OpenShell/llms.txt
Use this file to discover all available pages before exploring further.
api.openai.com, and requests to inference.local, a special endpoint exposed inside every sandbox.
Two routing paths
| Path | How it works |
|---|---|
| External endpoints | Traffic to external hosts is treated like any other outbound request. It is allowed or denied by network_policies. See Policies for details. |
inference.local | A special HTTPS endpoint exposed inside every sandbox. The privacy router strips the sandbox-supplied credentials, injects the configured backend credentials, and forwards the request to the managed model endpoint. |
How inference.local works
When code inside a sandbox calls https://inference.local, the privacy router intercepts the request and routes it to the backend configured for that gateway. OpenShell applies the configured model to generation requests and supplies the provider credentials itself — no sandbox code needs access to the real API key.
If code calls an external inference host directly, that traffic bypasses inference.local entirely and is evaluated only by network_policies.
| Property | Detail |
|---|---|
| Credentials | No sandbox API keys needed. Credentials come from the configured provider record. |
| Configuration | One provider and one model define sandbox inference for the active gateway. Every sandbox on that gateway sees the same inference.local backend. |
| Provider support | NVIDIA NIM, any OpenAI-compatible provider, and Anthropic all work through the same endpoint. |
| Hot-refresh | Provider credential changes and inference updates propagate within about 5 seconds by default, without recreating sandboxes. |
The client-supplied
model and api_key values sent to inference.local are not forwarded upstream. The privacy router injects the real credentials from the configured provider and rewrites the model before forwarding.Supported API patterns
The patterns accepted byinference.local depend on the provider type configured for the gateway.
- OpenAI-compatible
- Anthropic-compatible
| Pattern | Method | Path |
|---|---|---|
| Chat Completions | POST | /v1/chat/completions |
| Completions | POST | /v1/completions |
| Responses | POST | /v1/responses |
| Model Discovery | GET | /v1/models |
| Model Discovery | GET | /v1/models/* |
Next steps
Configure inference routing
Set up the provider and model behind
inference.local.Sandbox policies
Control which external inference endpoints sandboxes can reach.