

Ollama lets you run open-source language models directly on your own hardware. When DeepWiki is configured to use Ollama, the entire pipeline — embedding, retrieval, and wiki generation — runs locally. No API keys, no usage costs, and no data leaves your machine.
Set DEEPWIKI_EMBEDDER_TYPE=ollama alongside the Ollama generator to achieve a fully offline setup. This means both code indexing and documentation generation happen on your local hardware with no cloud calls at all.

Why use Ollama with DeepWiki?

  • Free: No per-token charges from any cloud provider
  • Private: Source code never leaves your machine
  • Offline: Works without an internet connection after models are downloaded
  • No API key required: Skip the sign-up and key management overhead

Hardware requirements

For acceptable performance with the default models:
Component   Minimum       Recommended
CPU         4 cores       8+ cores
RAM         8 GB          16 GB+
Storage     10 GB free    20 GB+ free
GPU         Optional      Strongly recommended
A dedicated GPU significantly speeds up both embedding and generation. Without one, expect generation to be noticeably slower than cloud APIs.
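To see how your machine compares, a few standard commands cover the basics (Linux; the GPU check assumes an NVIDIA card with drivers installed):
nproc        # CPU core count
free -h      # total and available RAM
df -h .      # free disk space on the current filesystem
nvidia-smi   # GPU model and VRAM, if present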

Available Ollama models

The following generator models are pre-configured in api/config/generator.json:
Model                  Context window   Use case
qwen3:1.7b (default)   32,000 tokens    Good balance of speed and quality
llama3:8b              8,000 tokens     Higher quality, slower
qwen3:8b               32,000 tokens    Best quality, requires more RAM
You can use any model available in the Ollama library by pulling it and updating api/config/generator.json. For embeddings, DeepWiki uses nomic-embed-text when DEEPWIKI_EMBEDDER_TYPE=ollama.

Setup steps

1. Install Ollama

curl -fsSL https://ollama.com/install.sh | sh
Confirm Ollama is running:
ollama list
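On Linux the install script normally registers Ollama as a background (systemd) service, so it should already be running. If ollama list reports that it cannot reach the server, start it manually in a separate terminal:
ollama serve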
2. Pull the required models

Pull the embedding model and at least one generator model:
# Embedding model (required for DEEPWIKI_EMBEDDER_TYPE=ollama)
ollama pull nomic-embed-text

# Generator model — pick one
ollama pull qwen3:1.7b    # default, recommended starting point
ollama pull llama3:8b     # higher quality, more RAM required
ollama pull qwen3:8b      # best quality, most resource-intensive
The first pull downloads several gigabytes of model weights; subsequent runs use the local cache.
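Once the pulls finish, you can optionally confirm that the embedding model responds through Ollama's HTTP API (this assumes Ollama is listening on its default port, 11434):
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello world"}'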
3. Clone the repository

git clone https://github.com/AsyncFuncAI/deepwiki-open.git
cd deepwiki-open
4. Configure the Ollama embedder

Replace the default embedder configuration with the Ollama-specific one:
cp api/config/embedder.ollama.json.bak api/config/embedder.json
If cp asks whether to overwrite the existing embedder.json (some shells alias cp to cp -i), enter y.
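A quick sanity check is to confirm the copied file now references Ollama (the exact keys depend on the DeepWiki version you cloned):
grep -i ollama api/config/embedder.json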
5. Create a .env file

No API keys are needed for a fully local setup:
# No cloud API keys required
PORT=8001

# Only set this if Ollama is running on a remote host
# OLLAMA_HOST=http://192.168.1.50:11434

# Use Ollama for embeddings (fully offline)
DEEPWIKI_EMBEDDER_TYPE=ollama
If Ollama is running on the same machine, OLLAMA_HOST defaults to http://localhost:11434 and does not need to be set.
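If you do point OLLAMA_HOST at a remote machine, confirm the DeepWiki host can reach it before starting the backend; the address below is only an example:
curl http://192.168.1.50:11434/api/tags   # lists the models available on that Ollama server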
6. Start the backend

python -m pip install poetry==2.0.1 && poetry install -C api
python -m api.main
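The backend listens on the PORT value from .env (8001 above). Any HTTP status code from the check below, even a 404, means the process is up and accepting connections:
curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:8001/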
7. Start the frontend

In a second terminal:
npm install
npm run dev
8. Generate a wiki with Ollama

  1. Open http://localhost:3000
  2. Enter a GitHub, GitLab, or Bitbucket repository URL
  3. Enable the Use Local Ollama Model option in the model selector
  4. Click Generate Wiki

Docker + Ollama setup

If you prefer to run DeepWiki inside Docker while Ollama runs on the host, use the dedicated Ollama Dockerfile:
1. Configure the embedder

cp api/config/embedder.ollama.json.bak api/config/embedder.json
2. Build the Ollama-specific image

docker build -f Dockerfile-ollama-local -t deepwiki:ollama-local .
3. Run the container

# Standard use
docker run -p 3000:3000 -p 8001:8001 --name deepwiki \
  -v ~/.adalflow:/root/.adalflow \
  -e OLLAMA_HOST=your_ollama_host \
  deepwiki:ollama-local
To also analyse a local repository by mounting it into the container:
docker run -p 3000:3000 -p 8001:8001 --name deepwiki \
  -v ~/.adalflow:/root/.adalflow \
  -e OLLAMA_HOST=your_ollama_host \
  -v /path/to/your/repo:/app/local-repos/repo-name \
  deepwiki:ollama-local
When using a mounted local repository, enter /app/local-repos/repo-name as the repository path in the DeepWiki UI.
On Apple Silicon Macs, the Dockerfile automatically selects ARM64 binaries for better native performance.
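When Ollama runs directly on the Docker host, host.docker.internal is the usual way to reach it from inside the container. It is built into Docker Desktop; on Linux you typically need to map it yourself with --add-host, as in this example:
docker run -p 3000:3000 -p 8001:8001 --name deepwiki \
  --add-host=host.docker.internal:host-gateway \
  -v ~/.adalflow:/root/.adalflow \
  -e OLLAMA_HOST=http://host.docker.internal:11434 \
  deepwiki:ollama-local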

Using a custom generator model

To switch to any model available in ollama list, edit api/config/generator.json and change the "default_model" value under the ollama provider (adding an entry under "models" if the model needs custom options):
"ollama": {
  "default_model": "qwen3:1.7b",
  "supportsCustomModel": true,
  "models": {
    "qwen3:1.7b": {
      "options": {
        "temperature": 0.7,
        "top_p": 0.8,
        "num_ctx": 32000
      }
    }
  }
}
You can also enter a custom model identifier directly in the DeepWiki frontend without editing any config files, because supportsCustomModel is set to true.
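For example, to try a model that is not pre-configured (the model name here is only an illustration), pull it first, then either add an entry for it in generator.json or type its name into the custom model field in the frontend:
ollama pull mistral:7b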

Troubleshooting

“Cannot connect to Ollama server”
  • Run ollama list to verify Ollama is running
  • Check that nothing else is using port 11434
  • Try restarting Ollama
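A quick way to confirm the server is actually reachable on its default port (any JSON response means it is up):
curl http://localhost:11434/api/version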
Slow generation
  • Local models are inherently slower than cloud APIs
  • Start with qwen3:1.7b (smallest, fastest) and only move to larger models if quality is insufficient
  • A GPU with at least 8 GB VRAM significantly improves throughput
Out of memory errors
  • Use a smaller model such as phi3:mini
  • Close other memory-intensive applications before running Ollama
  • Reduce the num_ctx value in generator.json to lower memory usage
