This guide takes you from a fresh clone or download of USB-Uncensored-LLM to a fully running chat UI in three steps. The process is the same whether you’re working from a physical USB drive, an external SSD, or a local folder on your primary machine: run the installer for your OS, pick a model, and launch.
Before you begin, check that your drive and host machine meet the minimum hardware thresholds. See System Requirements for storage, RAM, and USB speed guidance.
Initialize the Engine
The install script detects your operating system and downloads the correct Ollama engine binary (~50 MB) into Shared/bin/. This is the only step that requires an internet connection.
- Windows
- macOS
- Linux
On Windows, double-click Windows/install.bat in File Explorer. The script launches an interactive PowerShell menu that walks you through engine setup and model selection.
If the script closes instantly, right-click the file and select Run as Administrator instead; this resolves a known conflict with Windows App Execution Aliases.
Initialization downloads the small (~50 MB) Ollama engine binary specific to your OS into the Shared/bin/ folder. Your model weights are not downloaded in this step; that happens in Step 2. You only need to run the installer once per OS you plan to use; the engine binary persists on the drive for future sessions.
Choose and Download a Model
After the engine is initialized, the installer presents an interactive numbered catalog of curated uncensored models. Enter the number of the model you want to download; it will be saved into Shared/models/ and is immediately available to all OS launchers on the drive.
Recommended for first-time users: Choose [1] Gemma 2 2B Abliterated (~1.6 GB). It is the smallest model in the catalog, downloads quickly, runs fast on any hardware with 8 GB RAM, and delivers strong performance for its size. You can always download additional models later by re-running the install script.
The [C] CUSTOM option lets you paste any direct .gguf download URL from HuggingFace; the installer will download it into Shared/models/ alongside the curated options.
Launch the Chat UI
Once the engine is initialized and at least one model is downloaded, run the start script for your OS. The Ollama engine starts silently in the background, and your default web browser opens automatically at http://localhost:3333.
- Windows
- macOS
- Linux
On Windows, double-click Windows/start-fast-chat.bat in File Explorer, or run it from a terminal. The script starts the Ollama engine, waits for it to come online, then launches the Python chat server. Your browser opens to http://localhost:3333. Keep the terminal window open; closing it shuts down the engine and chat server.
What Happens Next
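One thing worth knowing before moving on: the launch sequence described above includes a wait until the engine is online. A minimal Python sketch of that kind of wait, assuming it amounts to polling a TCP port until something accepts connections, might look like the following. The function name wait_for_port and its parameters are illustrative and not taken from the project's scripts; the only port this guide confirms is 3333 for the chat UI.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or time runs out.

    Hypothetical helper for illustration; the real start script may use a
    different readiness check.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some server is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up once the deadline has passed,
            # otherwise pause briefly and try again.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example: check whether the chat UI from this guide is reachable yet.
# ready = wait_for_port("127.0.0.1", 3333, timeout=10.0)
```

The same helper can double as a quick troubleshooting check: if it returns False for port 3333 while the start script's terminal is open, the chat server has probably not finished starting.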
Chat history is saved automatically. Every conversation is written to Shared/chat_data/ in real time. Because this folder lives on your portable drive, your history is available regardless of which machine you plug into: your conversation from a Windows laptop last Tuesday is waiting for you when you plug into a Linux desktop today.
Moving to a new machine is a one-step process. When you plug your drive into a computer whose OS you have not yet initialized on this drive, run that machine’s install script once to download the OS-specific engine binary into Shared/bin/. Your model weights are already on the drive, so no re-download is required. After initialization, use the start script as normal for every subsequent session.
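To see at a glance what a drive already carries before plugging it into another machine, a small check along these lines may help. It is a sketch under stated assumptions: drive_status is a hypothetical helper, and because the internal layout of Shared/bin/ is not described in this guide, it can only report whether the folder holds any files, not whether the binary for a particular OS is present.

```python
from pathlib import Path


def drive_status(drive_root: str) -> dict:
    """Report whether the engine and model folders from this guide hold files.

    Hypothetical helper: it only tests for the presence of files and does not
    verify which OS an engine binary belongs to.
    """
    root = Path(drive_root)
    bin_dir = root / "Shared" / "bin"
    models_dir = root / "Shared" / "models"

    def has_files(folder: Path) -> bool:
        # True if the folder exists and contains at least one regular file,
        # searching subfolders as well.
        return folder.is_dir() and any(p.is_file() for p in folder.rglob("*"))

    return {
        "engine_present": has_files(bin_dir),
        "models_present": has_files(models_dir),
    }


# Example: pass the path where the drive is mounted.
# print(drive_status("."))
```

If models_present is True but the start script fails on a new machine, the most likely fix is to run that OS's installer once, as described above.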
LAN access from your phone. When the start script runs, the terminal displays a local network IP address (e.g., http://192.168.1.15:3333). Any device on the same WiFi network can open that address in a browser to use the chat UI — no app installation needed on the mobile device.
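If the terminal output has scrolled away, the LAN address can usually be worked out again by hand. The sketch below uses a common Python idiom for finding the machine's outgoing IPv4 address; how the start script itself determines the address is not documented here, and lan_url is a hypothetical name. Port 3333 is the chat UI port given in this guide.

```python
import socket


def lan_url(port: int = 3333) -> str:
    """Best-effort guess at the URL other devices on the LAN would open."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only makes the OS pick
        # the outgoing network interface, whose address is then read back.
        sock.connect(("10.255.255.255", 1))
        ip = sock.getsockname()[0]
    except OSError:
        # No usable network route: fall back to the loopback address.
        ip = "127.0.0.1"
    finally:
        sock.close()
    return f"http://{ip}:{port}"


# Example: print the address to type into a phone's browser.
# print(lan_url())
```

On machines with several network interfaces (VPN, virtual adapters), the reported address may not be the Wi-Fi one, so treat the result as a starting point rather than a guarantee.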