USB-Uncensored-LLM uses a deliberate two-tier layout to solve the most expensive problem in cross-platform portability: you should never have to download a 5 GB model twice. Platform-specific binaries (the Ollama engine, runtime libraries, and OS launchers) are isolated inside per-OS folders (Windows/, Linux/, Mac/, Android/), each completely independent of the others. Everything that is heavy and OS-agnostic (model weights, chat history, settings, and the Python chat server) lives in a single Shared/ volume that every OS folder reads from and writes to. As a result, plugging the drive into a new machine only requires downloading the small (~50 MB) engine binary for that OS; the gigabytes of model weights are already there.
Top-Level Layout
The root of the USB drive (or project clone) holds one folder per operating system (Windows/, Linux/, Mac/, Android/) alongside the Shared/ volume.
The Shared/ Volume
Every subdirectory under Shared/ has a specific responsibility. The start scripts on each platform set environment variables pointing here before launching the engine.
bin/
Contains the Ollama engine executables, one per supported platform:
| File | Platform |
|---|---|
| ollama-windows.exe | Windows |
| ollama-linux | Linux |
| ollama-darwin | macOS |
| llama-server-android | Android (Termux, ARM64) |
Because each executable carries its platform in its file name, a single bin/ directory serves all platforms without conflict.
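As an illustration of how the platform-specific file names keep one bin/ directory conflict-free, here is a minimal Python sketch that maps the running OS to the matching executable. The function name `engine_path` is hypothetical and not taken from the project's start scripts. Android is left out because `platform.system()` reports "Linux" under Termux, so it would need separate detection.

```python
import platform
from pathlib import Path

# File names come from the table above; the keys are the values that
# platform.system() returns on each desktop OS.
ENGINE_BINARIES = {
    "Windows": "ollama-windows.exe",
    "Linux": "ollama-linux",
    "Darwin": "ollama-darwin",
}


def engine_path(shared_root, system=None):
    """Return the engine executable inside Shared/bin/ for the given OS."""
    system = system or platform.system()
    try:
        return Path(shared_root) / "bin" / ENGINE_BINARIES[system]
    except KeyError:
        raise RuntimeError(f"no engine binary known for {system!r}") from None
```

Calling `engine_path("Shared")` with no second argument resolves the binary for the machine the script is running on.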
chat_data/
Holds the two persistent JSON state files written and read by chat_server.py:
- chats.json: the full conversation history array. Every chat session ever created is stored here, regardless of which OS launched the server.
- settings.json: user preferences (temperature, globalSystemPrompt, and logMode). These settings survive reboots, OS switches, and drive moves.
Because chat_data/ lives in Shared/, your chat history is available whether you boot from Windows or Linux on the same drive.
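The state files described above are plain JSON, so any OS can read them with the standard library alone. The sketch below shows one way settings.json could be loaded and saved. The key names come from this page, but the default values, the function names, and the write-to-temp-then-rename step are assumptions for illustration, not a description of what chat_server.py does.

```python
import json
from pathlib import Path

# Fallback values are illustrative assumptions, not the project's defaults.
DEFAULT_SETTINGS = {"temperature": 0.7, "globalSystemPrompt": "", "logMode": "all"}


def settings_path(shared_root):
    return Path(shared_root) / "chat_data" / "settings.json"


def load_settings(shared_root):
    """Read settings.json, falling back to defaults for missing keys."""
    settings = dict(DEFAULT_SETTINGS)
    try:
        data = json.loads(settings_path(shared_root).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return settings
    if isinstance(data, dict):
        settings.update(data)
    return settings


def save_settings(shared_root, settings):
    """Write to a temporary file first so an unplugged drive cannot leave
    a half-written settings.json behind."""
    path = settings_path(shared_root)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    tmp.replace(path)
```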
config/
Contains JSON catalog files consumed by the installer scripts:
- models.json: the curated model catalog, listing names, download URLs, file sizes, and HuggingFace paths. Read by scripts/config_query.py during install.bat to render the interactive model selection menu.
- ui-vendor-assets.json: a manifest of offline UI assets to download during install (fonts, markdown renderer, PDF viewer).
lib/ollama/
Holds the Ollama runtime libraries installed by the Linux and macOS installers, including the llama-server backend binary. This directory is created automatically by Linux/install.sh and Mac/install.command. Windows uses a self-contained ollama-windows.exe that bundles its own runtime, so it does not use this folder.
logs/
Contains the rotating chat server log file:
- chat_server.log: structured, multi-line log entries written by chat_server.py. The file rotates at 10 MB with 1 backup file kept (chat_server.log.1). Log verbosity is controlled at runtime via POST /api/settings with logMode: "all" or logMode: "errors_only".
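The rotation policy above (10 MB, one backup) matches what Python's standard `logging.handlers.RotatingFileHandler` provides. The sketch below shows how such a logger could be configured. It is an assumption that chat_server.py works this way. The logger name, the format string, the reading of 10 MB as 10 × 1024 × 1024 bytes, and the mapping of logMode onto logging levels are all illustrative.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def build_logger(shared_root, log_mode="all"):
    """Create a logger that rotates Shared/logs/chat_server.log at 10 MB."""
    log_dir = Path(shared_root) / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_dir / "chat_server.log",
        maxBytes=10 * 1024 * 1024,  # rotate once the file reaches 10 MB
        backupCount=1,              # keep only chat_server.log.1
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("chat_server_sketch")  # hypothetical name
    logger.handlers.clear()
    logger.addHandler(handler)
    # Assumed mapping: "errors_only" drops everything below ERROR.
    logger.setLevel(logging.ERROR if log_mode == "errors_only" else logging.INFO)
    return logger
```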
models/
The largest subdirectory on the drive — all AI model data lives here:
| Path | Contents |
|---|---|
| models/*.gguf | Raw GGUF weight files (e.g. gemma-2-2b-abliterated.gguf) |
| models/ollama_data/ | Ollama’s internal model registry and manifests |
| models/Modelfile-<name> | Ollama Modelfile configs (system prompt, parameters) |
| models/installed-models.txt | Plain-text record of which models have been set up |
The OLLAMA_MODELS environment variable is set to Shared/models/ollama_data by every start script before launching the engine, ensuring Ollama always reads from and writes to the USB regardless of host machine defaults.
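The real start scripts are batch and shell files. The same idea can be sketched in Python by building a copy of the environment in which OLLAMA_MODELS points at the drive. The helper name `engine_env` is hypothetical.

```python
import os
from pathlib import Path


def engine_env(shared_root, base_env=None):
    """Return a copy of the environment with OLLAMA_MODELS redirected
    to Shared/models/ollama_data on the drive."""
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_MODELS"] = str(Path(shared_root) / "models" / "ollama_data")
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.Popen` when starting the engine, which leaves the host machine's own environment untouched.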
python/
A portable Python 3.12 embed distribution for Windows. Python is not guaranteed to be installed on every Windows machine, so start-fast-chat.bat checks for Shared/python/python.exe first. If it exists, the bat file uses it directly. If it doesn’t, the script auto-downloads Python 3.12 embed (~11 MB) and extracts it here. This directory is Windows-only; Linux and macOS have system Python available via python3.
scripts/
Shared utility scripts used across platforms:
| Script | Purpose |
|---|---|
| config_query.py | Parses config/models.json for installer menus |
| download-ui-assets.sh | Fetches offline vendor assets on Linux/macOS |
| download-ui-assets.ps1 | Fetches offline vendor assets on Windows |
| uninstall-common.sh | Shared cleanup logic used by OS-specific uninstallers |
vendor/
Offline copies of UI dependencies downloaded during install:
- Markdown renderer (used to format AI responses in the browser)
- Web fonts
- PDF viewer assets
Why This Layout?
The cross-platform benefit becomes clear when you use the drive across multiple machines:
- First machine (Windows): run Windows/install.bat. This downloads ollama-windows.exe to Shared/bin/ and your selected model weights (e.g. a 5.2 GB .gguf file) to Shared/models/.
- Second machine (Linux): run bash Linux/install.sh. This downloads only ollama-linux to Shared/bin/ and the Ollama runtime libraries to Shared/lib/ollama/. The model weights in Shared/models/ are already there, so nothing is re-downloaded.
Portable Python (Windows)
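On Linux and macOS, the start scripts use the system python3 interpreter. On Windows, no Python installation is guaranteed, so Windows/start-fast-chat.bat resolves one through a three-step fallback: it first looks for Shared/python/python.exe and, when no interpreter is available, downloads the Python 3.12 embed distribution into Shared/python/.
The embed distribution does not include pip. It runs chat_server.py using only the standard library, which is all the server requires. The download happens once; subsequent launches find Shared/python/python.exe and skip straight to step 1.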
The portable Python embed is amd64 only. On ARM64 Windows devices, the system Python (if installed) will be used instead. If neither is present on an ARM64 machine, download a compatible Python 3.12 embed zip manually and extract it to Shared/python/.
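The interpreter lookup can be sketched as a small resolver. This is an illustration in Python, not the batch logic itself. The order of the second and third steps (system Python before download) is an assumption inferred from the ARM64 note, and `resolve_python` is a hypothetical name.

```python
import shutil
from pathlib import Path


def resolve_python(shared_root, which=shutil.which):
    """Return (interpreter, source) following the assumed fallback order.

    source is "portable", "system", or "download"; in the last case the
    interpreter is None and the caller would fetch the embed zip.
    """
    # Step 1: the portable interpreter already on the drive.
    portable = Path(shared_root) / "python" / "python.exe"
    if portable.is_file():
        return str(portable), "portable"
    # Step 2 (assumed order): a Python found on the host's PATH.
    system = which("python")
    if system:
        return system, "system"
    # Step 3: nothing found, so the embed distribution must be downloaded.
    return None, "download"
```

Passing `which` as a parameter keeps the lookup testable without depending on what is installed on the host.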