Use a Local Ollama Instance with Hermes USB Portable

Hermes Portable can connect to any OpenAI-compatible API endpoint — including a locally running Ollama server. This means you can run inference entirely on your own hardware: no API keys required, no data leaves your machine, and no per-token costs. It is ideal for air-gapped environments, sensitive workflows, or simply experimenting with open-weight models at no cost.

Prerequisites

Ollama Installed

Ollama must be installed on the same machine (or a reachable host). Download it from ollama.com.

Ollama Running

The Ollama server must be started before launching Hermes. By default it listens on http://127.0.0.1:11434.

Connect Hermes to Local Ollama

Start Ollama and pull a model

Open a terminal and pull the model you want to use. For example:

ollama pull qwen3:latest

You can pull any model supported by Ollama. Run ollama list to see what is already available on your machine.

Launch Hermes Portable and open Setup

Start the launcher for your platform:

launch.bat

From the interactive menu, select [2] Setup / Reconfigure Hermes.You can also run setup directly from the command line:

launch.bat setup

Choose Quick setup → More providers → Custom endpoint

Inside the Hermes setup wizard:

Choose Quick setup.
Select More providers.
Select Custom endpoint (enter URL manually).

Enter the local Ollama endpoint

When prompted for the endpoint URL, enter:

http://127.0.0.1:11434/v1

Ollama exposes an OpenAI-compatible /v1 API. The /v1 suffix is required — Hermes uses the OpenAI client format to communicate with it.

Leave the API key blank

When the wizard asks for an API key, press Enter to leave it empty. Ollama does not require authentication when running locally.

Select your model and auto-detect context length

The wizard will detect models available from the running Ollama instance. Select the model you pulled in Step 1.When prompted for context length, press Enter to leave it blank — Hermes will auto-detect the correct value from the model’s metadata.

Hermes is now configured to use your local Ollama instance. Start a chat session from the launcher menu ([1] Start Hermes Chat) to begin.

Remote Ollama Host

If Ollama is running on a different machine on your network, use the same /v1 endpoint format with the remote host’s IP address or hostname. For example:

http://192.168.1.20:11434/v1

Verify the remote Ollama host is reachable before running setup — use ping or open the URL in a browser to confirm the server responds. Ollama must also be configured to accept connections from external interfaces (set OLLAMA_HOST=0.0.0.0 on the remote machine if needed).

Follow the same wizard steps as above, substituting the remote URL when prompted for the endpoint.

Summary

Setting	Value
Endpoint (local)	`http://127.0.0.1:11434/v1`
Endpoint (remote example)	`http://192.168.1.20:11434/v1`
API Key	(leave blank)
Context length	(leave blank to auto-detect)
API format	OpenAI-compatible `/v1`

Get Started

Configuration

Using Hermes

Maintenance

Troubleshooting

Use a Local Ollama Instance with Hermes USB Portable

Prerequisites

Ollama Installed

Ollama Running

Connect Hermes to Local Ollama

Remote Ollama Host

Summary

Build docs developers (and LLMs) love

Get Started

Configuration

Using Hermes

Maintenance

Troubleshooting

Documentation Index

​Prerequisites

Ollama Installed

Ollama Running

​Connect Hermes to Local Ollama

​Remote Ollama Host

​Summary

Build docs developers (and LLMs) love

Prerequisites

Connect Hermes to Local Ollama

Remote Ollama Host

Summary