Before downloading a single binary, Odysseus Portable needs to know exactly what hardware it is running on. TheDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/techjarves/Odysseus-Portable/llms.txt
Use this file to discover all available pages before exploring further.
detectHardware() function in src/system.js answers that question in a single synchronous sweep: it probes process.platform, process.arch, native OS commands for RAM capacity, and a cascade of GPU detection methods to determine whether the host machine can accelerate inference with CUDA, Vulkan, or Metal — or must fall back to CPU. The result is a plain object that flows through every binary selection and inference configuration decision made during startup.
Return Value Structure
gpuMemoryGB and gpuFreeMemoryGB are only populated for CUDA GPUs via nvidia-smi. Metal and Vulkan detection does not query VRAM, so those fields remain 0 for Apple Silicon and Vulkan devices. Context window sizing for those backends falls back to Math.max(4, Math.floor(hw.ramGB * 0.55)) — 55 % of system RAM with a 4 GB floor.OS and Architecture Detection
OS is mapped from Node.js’sprocess.platform string to a shorter internal identifier:
process.arch. Any value other than 'x64' or 'arm64' is normalised to 'x64' in the return value to keep downstream logic simple:
os.cpus() array, which is available on all platforms without shelling out.
RAM Detection
RAM detection uses a different native command on each platform to get the most accurate physical memory figure:- Windows
- macOS
- Linux
Queries WMI through PowerShell and converts bytes to GB:
ramGB defaults to 8.
GPU Detection Flow
GPU detection follows a strict priority cascade. Each step is only attempted if all previous steps have not produced a result:Apple Silicon Metal (macOS ARM64)
Checked first, before any external commands are run. Any macOS machine with This check is unconditional — there is no way to have an
arch === 'arm64' is an Apple Silicon Mac with a unified memory GPU that supports Metal:arm64 macOS system without Metal support.NVIDIA CUDA (nvidia-smi)
Runs VRAM is reported in MiB by
nvidia-smi with structured output to get the GPU name, total VRAM, and free VRAM in a single call:nvidia-smi and divided by 1024 to produce GB rounded to one decimal place. Only the first GPU in a multi-GPU system is used.Vulkan
Attempted when neither Metal nor CUDA is detected. The check is filesystem-based on Windows and command-based on Linux:Windows — checks for the Vulkan runtime DLL in the system directory:Linux — first tries On Windows, if Vulkan is confirmed, the GPU device name is resolved with
which vulkaninfo, then falls back to checking common library paths:vulkaninfo --summary via PowerShell and parsed from the deviceName = field.How Detection Drives Binary Selection
Thehw object returned by detectHardware() is passed directly to getLlamaCppAssets(hw) in src/downloader.js, which uses it to select the correct asset from the llama.cpp GitHub release:
hw.os | hw.gpuBackend | hw.arch | Asset pattern matched |
|---|---|---|---|
win | cuda | x64 | llama-*-bin-win-cuda*12.4*.zip |
win | vulkan | x64 | llama-*-bin-win-vulkan*.zip |
win | cpu | x64 | llama-*-bin-win-cpu*x64*.zip |
macos | metal | arm64 | llama-*-bin-macos-arm64*.tar.gz |
macos | metal | x64 | llama-*-bin-macos-x64*.tar.gz |
linux | cuda | x64 | *-cuda-*amd64.tar.gz (ai-dock fork) |
linux | vulkan | x64 | llama-*-bin-ubuntu-vulkan*x64*.tar.gz |
linux | cpu | x64 | llama-*-bin-ubuntu-x64*.tar.gz |
getLlamaCppAssets targets the ai-dock/llama.cpp-cuda repository instead of the main ggml-org/llama.cpp repository, which ships pre-linked CUDA binaries.
How Detection Drives Inference Settings
Two inference parameters are derived directly from the hardware profile: GPU layer offload (ngl) — controls how many model layers are offloaded to the GPU. A CPU-only machine gets ngl = 0; any recognised GPU backend (CUDA, Vulkan, or Metal) gets ngl = 99 (all layers):
- CPU inference (
hw.gpuBackend === 'cpu'): useshw.ramGBdirectly. - GPU inference (CUDA / Vulkan / Metal): uses
hw.gpuFreeMemoryGB || hw.gpuMemoryGB || Math.max(4, Math.floor(hw.ramGB * 0.55)). When both VRAM fields are zero (e.g. Metal, Vulkan without a query), the fallback is 55 % of system RAM capped at a minimum of 4 GB.