Android support in USB-Uncensored-LLM works through Termux, a terminal emulator that provides a Linux environment with Debian-style package management (apt) on your device. Unlike the Windows, macOS, and Linux installers, which use a pre-built Ollama binary, the Android installer clones the llama.cpp source repository and compiles it natively on your device for its ARM64 processor. This compilation step takes 10–30 minutes but produces a binary built specifically for your hardware, which helps inference performance. No PC is required at any stage.
Prerequisites
- Termux installed from F-Droid (link above)
- Android device with an ARM64 processor (virtually all modern Android phones and tablets)
- 6 GB+ RAM (8 GB+ strongly recommended; only the 2B model runs reliably on 6 GB devices)
- ~4 GB free storage (more if you want to keep the compiled build artifacts)
- WiFi or mobile data for initial setup, required for the apt package installation, the llama.cpp git clone, and the model download
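The CPU and RAM requirements above can be checked from inside Termux before starting the long compile. The following is a minimal sketch, not part of the project: the function names are hypothetical, and it assumes Python is installed in Termux and that /proc/meminfo is readable. MemTotal is usually somewhat lower than the advertised RAM size because part of memory is reserved, so a device sold as 6 GB will typically report less than 6 GiB.

```python
import platform
import re


def mem_total_gib(meminfo_text: str) -> float:
    """Return MemTotal from /proc/meminfo-style text, converted from kB to GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) / (1024 * 1024)


def is_arm64(machine: str) -> bool:
    """True for the machine strings commonly reported by 64-bit ARM kernels."""
    return machine.lower() in {"aarch64", "arm64"}


def describe_device(machine: str, meminfo_text: str) -> str:
    """Summarise architecture and total RAM in one line."""
    arch = "ARM64" if is_arm64(machine) else f"not ARM64 ({machine})"
    return f"CPU: {arch}, RAM: {mem_total_gib(meminfo_text):.1f} GiB"


def main() -> None:
    # Reads the real values on the device; call main() to print them.
    with open("/proc/meminfo") as handle:
        print(describe_device(platform.machine(), handle.read()))
```

Calling main() on the device prints a single summary line that can be compared against the 6 GB and 8 GB guidance in the list above.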
Installation
Install Termux from F-Droid
Navigate to https://f-droid.org/en/packages/com.termux/ on your Android device and install the Termux APK. Open Termux after installation and wait for the bootstrap to complete.
Copy the project to your device
Transfer the USB-Uncensored-LLM folder to your Android device using one of these methods:
- USB OTG cable: connect the USB drive and copy the folder to internal storage
- File transfer: copy over MTP/USB from a PC
- Git clone: inside Termux, run git clone <repo-url>
Navigate to the project folder in Termux
In Termux, use cd to navigate to the USB-Uncensored-LLM root directory. If you copied it to your Android Downloads folder, it will typically be at ~/storage/downloads/USB-Uncensored-LLM. Run termux-setup-storage first if you need to access shared storage.
Run the Android installer
Build tools are installed automatically
The installer updates the package list and upgrades installed packages via apt, then installs build tools via pkg. This ensures all compilers and build utilities needed for the llama.cpp compilation are present.
llama.cpp is cloned and compiled natively
The installer clones the llama.cpp repository to Shared/bin/llama.cpp/ and compiles it for your ARM64 processor. The compiled llama-server binary is copied to Shared/bin/llama-server-android.
Launching
The launcher, bash Android/start.sh, starts Shared/bin/llama-server-android with the first .gguf file found in Shared/models/, acquires a Termux wakelock to prevent Android from killing the process, and waits up to 90 seconds for the server to become ready on port 8080.
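The wait-for-ready step can be pictured as a polling loop with a deadline. The sketch below is an illustration and not the launcher's actual code: the function names, the one-second interval, and the use of llama-server's /health endpoint as the readiness signal are assumptions. Only the 90-second limit and port 8080 come from the description above.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ok(url: str) -> bool:
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, and non-2xx answers all mean "not ready yet".
        return False


def wait_until_ready(probe, timeout_s=90.0, interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Call probe() repeatedly until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def wait_for_llama_server() -> bool:
    # Assumed readiness check: llama-server's /health endpoint on port 8080.
    return wait_until_ready(lambda: http_ok("http://localhost:8080/health"))
```

Passing the probe, sleep, and clock functions as parameters keeps the loop testable without a running server or real delays.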
Once the engine is online, you are given a choice of UI:
Selecting [1] opens http://localhost:3333 (the full chat server). Selecting [2] opens http://localhost:8080 (the raw llama.cpp interface). The browser is opened automatically using Android’s intent system.
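One common way to open a URL through Android's intent system from Termux is the am start command with a VIEW action. The sketch below assumes that approach; the launcher's exact command is not shown in this guide, and the names UI_URLS, intent_command, and open_ui are hypothetical.

```python
import subprocess

# Menu choices and URLs as described in the text above.
UI_URLS = {
    "1": "http://localhost:3333",  # full chat server
    "2": "http://localhost:8080",  # raw llama.cpp interface
}


def intent_command(choice: str) -> list[str]:
    """Build an `am start` argument list that asks Android to view the chosen URL."""
    url = UI_URLS[choice]
    return ["am", "start", "-a", "android.intent.action.VIEW", "-d", url]


def open_ui(choice: str) -> int:
    """Run the command and return its exit code (only meaningful on Android)."""
    return subprocess.run(intent_command(choice), check=False).returncode
```

An unknown choice raises KeyError here, so a real menu would validate the input before calling open_ui.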
Performance Tips
Android vs Desktop Architecture
The Android installer uses a fundamentally different execution path from the desktop platforms:

| Aspect | Desktop (Windows / macOS / Linux) | Android |
|---|---|---|
| Engine | Pre-built Ollama binary (ollama-windows.exe, ollama-darwin, ollama-linux) | Natively compiled llama-server-android from llama.cpp source |
| API port | Ollama API on port 11434 | OpenAI-compatible API on port 8080 |
| Chat server flag | Standard mode | python chat_server.py --no-browser --llama-cpp |
| Model format | Imported into Ollama’s internal registry | Raw .gguf files passed directly via -m flag |
The --llama-cpp flag tells the Python chat server to translate Ollama API calls into the OpenAI-compatible format that llama-server’s /v1/ endpoint expects. This means the same FastChatUI.html interface works identically on all platforms.
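To make the translation concrete, the sketch below maps an Ollama-style chat request body onto an OpenAI-style chat completion body, and a non-streaming OpenAI-style response back to Ollama's shape. It is a simplified illustration under stated assumptions, not the code in chat_server.py: the function names and the set of translated options are hypothetical, and streaming chunks are not handled.

```python
# Ollama nests sampling settings under "options"; OpenAI-style requests use top-level keys.
OPTION_RENAMES = {
    "temperature": "temperature",
    "top_p": "top_p",
    "stop": "stop",
    "num_predict": "max_tokens",  # Ollama's name for the output token limit
}


def ollama_chat_to_openai(body: dict) -> dict:
    """Convert an Ollama /api/chat request body to a /v1/chat/completions body."""
    options = body.get("options") or {}
    request = {
        "model": body.get("model", ""),
        "messages": list(body.get("messages", [])),
        "stream": body.get("stream", True),  # Ollama streams unless told otherwise
    }
    for ollama_name, openai_name in OPTION_RENAMES.items():
        if ollama_name in options:
            request[openai_name] = options[ollama_name]
    return request


def openai_chat_to_ollama(response: dict, model: str) -> dict:
    """Convert a non-streaming chat completion into an Ollama-style reply."""
    message = response["choices"][0]["message"]
    return {
        "model": model,
        "message": {"role": message["role"], "content": message["content"]},
        "done": True,
    }
```

Because both APIs use the same role/content message list, the messages pass through unchanged and only the envelope around them differs.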
The llama-server is started with -c 2048 -cb -np 4 --port 8080, limiting context to 2048 tokens to fit in mobile RAM while allowing up to 4 parallel slots.
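Put together with the model selection described under Launching, the server invocation could be assembled roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and "first .gguf file found" is interpreted here as first in alphabetical order, which may differ from the launcher's actual ordering.

```python
from pathlib import Path


def first_gguf(names):
    """Return the alphabetically first name ending in .gguf, or None if there is none."""
    ggufs = sorted(name for name in names if name.lower().endswith(".gguf"))
    return ggufs[0] if ggufs else None


def server_command(model_path, binary="Shared/bin/llama-server-android"):
    """Build the llama-server argument list with the flags quoted in the text."""
    return [
        binary,
        "-m", model_path,   # raw .gguf model file
        "-c", "2048",       # context size in tokens
        "-cb",              # continuous batching
        "-np", "4",         # number of parallel slots
        "--port", "8080",
    ]


def command_for_models_dir(models_dir="Shared/models"):
    """Pick a model from models_dir and return the full command for it."""
    name = first_gguf(path.name for path in Path(models_dir).iterdir())
    if name is None:
        raise FileNotFoundError(f"no .gguf file in {models_dir}")
    return server_command(str(Path(models_dir) / name))
```

The resulting list can be passed directly to subprocess.Popen, which avoids shell quoting problems with model file names that contain spaces.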
LAN Access from Android
Once bash Android/start.sh is running and the engine is online, other devices on the same WiFi network can access your Android-hosted AI. The launcher detects your local IP address and displays the network URL in the terminal. Navigate to that address in any browser on the same network.
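Local IP detection of this kind is often done by asking the operating system which address it would use for outbound traffic. The sketch below shows that technique; it is not the launcher's actual method, the function names are hypothetical, and port 3333 (the chat server) is assumed to be the one exposed on the network.

```python
import socket


def local_ip() -> str:
    """Return the address the OS would use for outbound traffic, or 127.0.0.1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects a route.
        # 192.0.2.1 is a reserved documentation address, used purely as a target.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable network route
    finally:
        sock.close()


def lan_url(ip: str, port: int = 3333) -> str:
    """Format the URL other devices on the same network would open."""
    return f"http://{ip}:{port}"
```

For example, lan_url("192.168.1.5") gives http://192.168.1.5:3333, where 192.168.1.5 stands in for whatever address your router assigned to the phone.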