Odysseus Portable ships with two fully offline inference backends: llama.cpp and Ollama. Both run entirely on your machine without sending data to any external service, but they differ in how they acquire models, which GPU runtimes they support, and how much control they give you over memory usage. Understanding these differences lets you pick the right engine from the start and switch later if your needs change.

Backend Comparison

| Feature | llama.cpp | Ollama |
|---|---|---|
| Model format | GGUF files in models/ folder | Pulled via ollama pull |
| Model storage | models/ inside project folder | models/ollama/ inside project folder |
| API endpoint | http://127.0.0.1:8080/v1 (proxy) | http://127.0.0.1:11434/v1 |
| Context auto-scaling | Yes: retries with a smaller context on OOM | No |
| GPU support | CUDA, Vulkan, Metal, CPU | CUDA, Metal, CPU |
| Best for | Portable GGUF files, USB drives | Convenient model management via web UI |
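
Both endpoints end in /v1, which suggests an OpenAI-style HTTP API. Under that assumption, a script that talks to either backend might keep the two base URLs from the table in one place. The sketch below is illustrative only: the helper name `backend_base_url` is hypothetical and is not part of Odysseus Portable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a backend name to the base URL listed in the table.
backend_base_url() {
  case "$1" in
    llama)  echo "http://127.0.0.1:8080/v1" ;;   # llama.cpp, reached through the proxy
    ollama) echo "http://127.0.0.1:11434/v1" ;;  # Ollama
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}
```

With a backend running, something like `curl -s "$(backend_base_url ollama)/models"` may list the available models. The /models route is an assumption based on the OpenAI-style path and is not confirmed by this page.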

Selecting a Backend

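Odysseus Portable gives you four ways to choose your backend. The CLI flag and the environment variable override the stored choice, while the interactive prompt and the config file set it persistently.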
1. Interactive prompt on first launch

When no configuration exists yet, the launcher presents a menu at startup. Enter the number for the backend you want:
[1] Ollama
[2] llama.cpp
Your choice is saved to data/launcher_config.json so it persists across future launches.
2. CLI flag

Pass --backend=<name> to the start script to override the value stored in the config file:
./start.sh --backend=llama
3. Environment variable

Set ODYSSEUS_BACKEND before running the launcher. This is useful in scripts or CI-like environments where you don’t want to modify any files:
ODYSSEUS_BACKEND=llama ./start.sh
# or
ODYSSEUS_BACKEND=ollama ./start.sh
4. Persistent config file

Edit data/launcher_config.json directly to set a permanent default. The launcher reads this file on every start:
{
  "backend": "llama"
}
Replace "llama" with "ollama" to switch. CLI flags and environment variables still take precedence over this file.
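
The precedence described above (flags and environment variables win over the config file) can be sketched as a small resolver. This is a minimal illustration, not the launcher's actual code: the function name `resolve_backend` is hypothetical, the flag-before-environment order is an assumption (the page only says both override the file), and the JSON handling is a naive `sed` match rather than a real parser.

```shell
#!/usr/bin/env bash
# Hypothetical resolver. Usage: resolve_backend CONFIG_FILE [launcher args...]
resolve_backend() {
  local config_file="${1:-data/launcher_config.json}"
  if [ "$#" -gt 0 ]; then shift; fi

  # 1. A --backend=<name> flag wins (assumed to outrank the environment variable).
  local arg
  for arg in "$@"; do
    case "$arg" in
      --backend=*) echo "${arg#--backend=}"; return 0 ;;
    esac
  done

  # 2. Then the ODYSSEUS_BACKEND environment variable.
  if [ -n "${ODYSSEUS_BACKEND:-}" ]; then
    echo "$ODYSSEUS_BACKEND"
    return 0
  fi

  # 3. Then the "backend" key in the config file (naive text match, not a JSON parser).
  if [ -f "$config_file" ]; then
    local value
    value=$(sed -n 's/.*"backend"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$config_file" | head -n 1)
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
  fi

  # 4. Nothing configured: this is where the interactive menu would appear.
  return 1
}
```

For example, `ODYSSEUS_BACKEND=ollama resolve_backend data/launcher_config.json --backend=llama` prints `llama` under the assumed ordering.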

Which Backend Should I Use?

llama.cpp

Best when you need maximum portability: copy the entire project folder to a USB drive, external SSD, or another machine and run it from there. Built-in context auto-scaling handles low-VRAM situations by retrying with a smaller context window instead of crashing. Supports CUDA, Vulkan, Metal, and CPU.
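
The retry-with-a-smaller-context idea can be sketched as a loop that shrinks the context size until a launch attempt succeeds. This is only an illustration of the general technique: the function name `start_with_fallback`, the halving step, and the minimum size are assumptions, and this page does not document how Odysseus Portable detects an out-of-memory failure or which sizes it tries.

```shell
#!/usr/bin/env bash
# Hypothetical sketch. Usage: start_with_fallback START_CTX MIN_CTX COMMAND [ARGS...]
# COMMAND receives the context size as its last argument and must exit
# non-zero when the model does not fit in memory at that size.
start_with_fallback() {
  local ctx="$1" min_ctx="$2"
  shift 2
  while [ "$ctx" -ge "$min_ctx" ]; do
    if "$@" "$ctx"; then
      echo "$ctx"          # report the context size that worked
      return 0
    fi
    echo "context $ctx failed, retrying with $((ctx / 2))" >&2
    ctx=$((ctx / 2))       # halving is an assumed step size
  done
  return 1                 # even the smallest allowed context did not fit
}
```

Called as `start_with_fallback 16384 512 some_launch_command`, the loop tries 16384, 8192, 4096 and so on, and prints the first size at which `some_launch_command` (a placeholder) exits successfully.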

Ollama

Best when you prefer a polished model-management experience. Use the Cookbook/Models section in the Odysseus web UI to browse, pull, and switch models without leaving the browser. Ollama’s library covers a broad range of quantised models and its familiar CLI is well-documented. Supports CUDA, Metal, and CPU.
