TheDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/techjarves/Odysseus-Portable/llms.txt
Use this file to discover all available pages before exploring further.
models/ folder at the project root is the single source of truth for all GGUF models in Odysseus Portable. When the llama.cpp backend starts, llama-server is launched with --models-dir models/, which causes it to scan that directory recursively and register every .gguf file it finds. This means you never need to edit a config file or pass a model path manually — simply placing a GGUF file anywhere inside models/ is enough for Odysseus to detect and serve it.
Adding Models
There are three supported ways to bring a model into your workspace:Web UI Cookbook
Open the Odysseus web interface at
http://127.0.0.1:7070 and navigate to the Models / Cookbook tab. From there you can search for and download any Hugging Face GGUF model directly into the models/ folder without leaving the browser. The Cookbook handles authentication (if you have a HUGGING_FACE_HUB_TOKEN set) and progress tracking automatically.Drag and Drop
Copy any
.gguf file directly into the models/ folder — or into a subfolder of your choosing. The file will be picked up the next time the backend starts or reloads. No renaming or registration is required.First-Launch CLI Prompt
On the very first run with the llama.cpp backend (or any time the
models/ folder is empty), the orchestrator pauses and shows an interactive terminal menu. It lists both any locally detected GGUF files and a curated set of predefined models available for download from Hugging Face. Select a number and press Enter — the orchestrator downloads the file and places it in models/ before starting llama-server.Directory Structure
Flat GGUF files placed directly at the top level ofmodels/ are available immediately with no additional processing:
hub/ subfolder:
Flat Symlinks and Hardlinks
Becausellama-server works best with flat model paths, the orchestrator automatically creates a flat symlink (or a hardlink on FAT32/exFAT drives that do not support symlinks) at the top level of models/ for every nested GGUF it discovers:
GGUF file sizes vary widely by model family and quantization level — from roughly 400 MB for a 0.5B Q4 model up to 5 GB for a 7B Q4 model. Plan your drive capacity accordingly. A 64 GB or 128 GB USB 3.0 SSD is recommended if you intend to keep multiple models available at once.
Switching Models
Odysseus Portable startsllama-server with --models-max 1, which means only one model occupies memory at a time. When a request arrives that targets a different model ID, llama-server unloads the current model and loads the requested one without requiring a server restart. The proxy layer handles model ID rewriting transparently, so your API client simply uses a standard Hugging Face repo identifier (e.g. Qwen/Qwen2.5-Coder-7B-Instruct-GGUF) and the backend takes care of the rest.
- Via API
- Via Odysseus UI
Send a chat completion request specifying the target model by its Hugging Face repo ID. The proxy rewrites it to the correct local filename automatically:
Removing Models
To remove a model, delete its.gguf file from the models/ folder (or subfolder). Because flat symlinks and hardlinks are fully regenerated on every launch, any stale top-level links that pointed to the deleted file are cleaned up automatically the next time the orchestrator starts. You do not need to track or delete the links yourself.