Troubleshooting

Ollama issues

Ollama isn’t responding

Symptoms:

“Connection refused” errors
Scripts hang when trying to connect
No response from ollama list

Solutions:

Check if Ollama is running

# Test if Ollama is running
ollama list

# If you see your models, Ollama is running
# If you get an error, start the service:
ollama serve

Verify the model is downloaded

# List downloaded models
ollama list

# If llama3.1 is missing, download it:
ollama pull llama3.1

Check firewall settings

Ollama defaults to localhost:11434. Ensure no firewall is blocking this port.

# Test the connection
curl http://localhost:11434/api/tags

Ollama must be running BEFORE you start any demo script. Run ollama serve in a separate terminal if it’s not running as a background service.
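The same check can be done from Python before a demo script talks to the model. This is a minimal sketch using only the standard library; the function name `ollama_is_up` is illustrative and not part of the demo scripts, and it assumes the default address shown above.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/api/tags answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and HTTP error statuses.
        return False
```

Calling `ollama_is_up()` at the top of a script and exiting with a hint to run `ollama serve` gives a clearer failure than a hang or a stack trace.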

Model not found error

Error message:

Error: model 'llama3.1' not found

Solution:

# Download the model
ollama pull llama3.1

# Verify it's available
ollama list
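To check for the model programmatically, the JSON returned by `/api/tags` can be inspected. The sketch below assumes the response has a top-level `models` list whose entries carry a `name` such as `llama3.1:latest`; the helper `model_available` is hypothetical and takes the response body as a string so it can be tried without a running server.

```python
import json


def model_available(tags_json: str, wanted: str) -> bool:
    """Check whether `wanted` appears in an /api/tags response body."""
    models = json.loads(tags_json).get("models", [])
    names = {entry.get("name", "") for entry in models}
    # Ollama lists names with a tag; a bare name is treated as ":latest".
    if ":" not in wanted:
        wanted = f"{wanted}:latest"
    return wanted in names
```

For example, passing the body fetched from `http://localhost:11434/api/tags` together with `"llama3.1"` tells you whether `ollama pull llama3.1` is still needed.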

Connection refused on macOS

Symptoms:

Ollama works in terminal but not in Python scripts
“Connection refused to localhost:11434”

Solution: Ollama might not be running as a service. Start it manually:

# Terminal 1: Start Ollama
ollama serve

# Terminal 2: Run your demo script
python scripts/demo_step1_ollama.py

Performance issues

Very slow generation (< 3 tokens/sec)

Symptoms:

Responses take minutes to generate
High CPU usage
System feels sluggish

Solutions:

Close other applications

Llama 3.1 8B uses 4-5 GB of RAM. Close unnecessary applications to free memory:

Web browsers with many tabs
Electron apps (Slack, Discord, VS Code)
Video conferencing apps
Other LLM tools

Switch to a smaller model

Try a smaller, faster model:

# Download a smaller model
ollama pull llama3.2:3b

# Update the model in your scripts:
# Settings.llm = Ollama(model="llama3.2:3b")

Check CPU vs GPU inference

On Apple Silicon, check Activity Monitor:

Look for “ollama” process
If “CPU” is high but “GPU” is low, the model might be running on the CPU instead of the GPU (Ollama uses the GPU through Metal on Apple Silicon, not the Neural Engine)
Restart Ollama: killall ollama && ollama serve

CPU-only inference for 8B models typically runs at 3-8 tokens/second. This is normal and still works for live demos since the audience can see it generating in real-time.
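To put a number on generation speed, the final response from Ollama's `/api/generate` endpoint can be used. The sketch below assumes that response includes `eval_count` (tokens generated) and `eval_duration` (nanoseconds); the function name is illustrative.

```python
def tokens_per_second(response: dict) -> float:
    """Compute generation speed from an Ollama generate response."""
    eval_count = response["eval_count"]          # number of generated tokens
    eval_duration_ns = response["eval_duration"]  # generation time in nanoseconds
    if eval_duration_ns <= 0:
        return 0.0
    return eval_count / (eval_duration_ns / 1e9)
```

A response with 30 tokens generated over 10 seconds works out to 3.0 tokens/second, the threshold used in this section's heading.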

Index building is slow (> 30 seconds)

Symptoms:

Step 2 hangs during “Building vector index…”
First run takes much longer than subsequent runs

Solutions:

First run: Embedding model download

The HuggingFace embedding model (all-MiniLM-L6-v2) downloads on first use (~80 MB).

Fix: Run Step 2 once before your live demo to cache the model:

python scripts/demo_step2_rag.py city

Subsequent runs will use the cached model from ~/.cache/huggingface/hub/.

Check internet connection

If the download is stuck, verify your internet connection:

# Test HuggingFace connectivity
curl -I https://huggingface.co

App is unresponsive

Symptoms:

Gradio app loads but doesn’t respond to queries
Browser shows loading spinner indefinitely
Terminal shows no error messages

Solutions:

Check available RAM

Ensure at least 8 GB RAM is available:

macOS:

# Check memory pressure
vm_stat | grep "Pages free"

Linux:

free -h
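On macOS, `vm_stat` reports pages rather than bytes, so the figure needs converting. The following is a rough sketch that parses the command's text output; the function name is hypothetical, and it assumes the usual header line of the form `(page size of N bytes)`. Free pages understate usable memory because macOS can also reclaim inactive pages, so treat the result as a lower bound.

```python
import re


def free_bytes_from_vm_stat(output: str) -> int:
    """Convert the 'Pages free' line of vm_stat output into bytes."""
    page_size = re.search(r"page size of (\d+) bytes", output)
    free_pages = re.search(r"Pages free:\s+(\d+)\.", output)
    if page_size is None or free_pages is None:
        raise ValueError("unrecognized vm_stat output")
    return int(page_size.group(1)) * int(free_pages.group(1))
```

The output of `subprocess.run(["vm_stat"], capture_output=True, text=True).stdout` can be passed straight to this function.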

Increase request timeout

Edit the timeout in your script:

Settings.llm = Ollama(
    model="llama3.1",
    request_timeout=300.0  # Increase from 120 to 300 seconds
)

Gradio issues

Browser doesn’t open automatically

Symptoms:

Script starts but browser doesn’t launch
Terminal shows “Running on local URL: http://localhost:7860”

Solutions:

Open manually

Navigate to the URL shown in the terminal:

http://localhost:7860

Specify browser

Set the BROWSER environment variable:

# macOS/Linux
BROWSER=chrome python scripts/demo_step3_app.py

# Or Firefox
BROWSER=firefox python scripts/demo_step3_app.py

Port already in use

Error message (errno 48 on macOS; Linux reports errno 98):

OSError: [Errno 48] Address already in use

Solutions:

Kill the existing process

# Find the process using port 7860
lsof -ti:7860 | xargs kill -9

# Or for port 8861 (Step 5)
lsof -ti:8861 | xargs kill -9

Use a different port

# Run on custom port
python scripts/demo_step3_app.py --port 8080
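If you would rather find a usable port before launching, a short standard-library check can help. This is a sketch, not how the demo scripts choose ports; `port_is_free` and `first_free_port` are hypothetical helpers, and 7860 is simply the port mentioned above.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except (OSError, OverflowError):
            # OSError: address in use or not permitted; OverflowError: port > 65535.
            return False
    return True


def first_free_port(start: int = 7860, attempts: int = 20) -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found starting at {start}")
```

The value returned by `first_free_port()` can then be passed to the `--port` option shown above. Another process could still claim the port between the check and the launch, so this narrows the problem rather than eliminating it.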

Share link doesn’t work

Symptoms:

--share flag creates a URL but it’s not accessible
“Could not create share link” error

Solutions:

Check internet connection

Gradio share requires internet to create the tunnel:

# Test connectivity
curl -I https://gradio.app

Firewall blocking tunnel

Some networks block Gradio’s tunneling service. Try:

Use a different network (mobile hotspot)
Deploy to Hugging Face Spaces instead

Embedding model issues

Download hangs or fails

Symptoms:

Step 2 hangs at “Building vector index…”
“ConnectionError” or “TimeoutError”

Solutions:

Check internet connection

The embedding model downloads from HuggingFace (~80 MB):

# Test HuggingFace connectivity
curl -I https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

Retry with better network

If at a hackathon with slow wifi, try:

Move closer to the router
Use a wired connection
Pair with someone who already has the model cached
Copy their ~/.cache/huggingface/ directory

Clear cache and retry

# Remove corrupted downloads
rm -rf ~/.cache/huggingface/hub/models--sentence-transformers--all-MiniLM-L6-v2

# Retry
python scripts/demo_step2_rag.py city

“No module named ‘sentence_transformers’”

Solution: Reinstall dependencies:

pip install --upgrade llama-index-embeddings-huggingface

Cached model location

The embedding model is cached at:

~/.cache/huggingface/hub/models--sentence-transformers--all-MiniLM-L6-v2/

You can copy this directory to another machine to avoid re-downloading.
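Before going offline, it can be worth confirming that the cache is actually populated. The sketch below assumes the standard Hugging Face hub layout, where a downloaded model directory contains a `snapshots` folder with at least one revision; the function name is illustrative.

```python
from pathlib import Path
from typing import Optional

MODEL_DIR = "models--sentence-transformers--all-MiniLM-L6-v2"


def embedding_model_cached(hub_dir: Optional[Path] = None) -> bool:
    """Return True if the model directory holds at least one snapshot."""
    if hub_dir is None:
        hub_dir = Path.home() / ".cache" / "huggingface" / "hub"
    snapshots = hub_dir / MODEL_DIR / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())
```

A `False` result after copying the directory from another machine usually means the copy landed in the wrong place or was incomplete.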

Python dependency issues

Import errors

Error message:

ModuleNotFoundError: No module named 'llama_index'

Solution: Ensure you’re in the virtual environment and dependencies are installed:

# Activate virtual environment
source .venv/bin/activate    # macOS/Linux
.venv\Scripts\activate       # Windows

# Reinstall dependencies
pip install --upgrade pip
pip install -r requirements.txt
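A quick way to see whether the interpreter you are running is inside a virtual environment, and which packages it cannot find, is sketched below. The helper names are hypothetical; pass whichever top-level module names your scripts import.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when running inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For this project, `missing_modules(["llama_index", "gradio", "sentence_transformers"])` would list whatever still needs installing, and `in_virtualenv()` returning `False` suggests the activation step was skipped.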

Version conflicts

Symptoms:

“Incompatible version” warnings
Scripts fail with attribute errors

Solution:

# Create fresh virtual environment
rm -rf .venv
python3 -m venv .venv
source .venv/bin/activate

# Install clean dependencies
pip install --upgrade pip
pip install -r requirements.txt

BYOD (Bring Your Own Data) issues

File not found

Error message:

Error: File not found at path/to/file.txt

Solutions:

Check file path

Use absolute paths or relative paths from the project root:

# Absolute path
python scripts/demo_step4_byod.py ~/Downloads/report.pdf

# Relative to project root
python scripts/demo_step4_byod.py userdata/myfile.txt

Remove drag-and-drop quotes

If you dragged a file into the terminal, remove the quotes:

# Wrong: inside quotes the shell does not expand ~
python scripts/demo_step4_byod.py "~/Downloads/my file.txt"

# Right: unquoted ~ expands, and the backslash escapes the space
python scripts/demo_step4_byod.py ~/Downloads/my\ file.txt
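A script can also be made tolerant of both forms by normalizing the argument itself. The following is a sketch of one way to do that; whether `demo_step4_byod.py` already does something similar is not stated here, and `normalize_cli_path` is a hypothetical helper.

```python
import os


def normalize_cli_path(raw: str) -> str:
    """Strip stray surrounding quotes, expand ~, and return an absolute path."""
    cleaned = raw.strip()
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in "'\"":
        cleaned = cleaned[1:-1]
    # expanduser handles a literal "~" that the shell left unexpanded.
    return os.path.abspath(os.path.expanduser(cleaned))
```

With this in place, a quoted `~/Downloads/my file.txt` reaches the script as a literal tilde path and is still resolved to the home directory.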

Unsupported file type

Error message:

Error: Unsupported file type: .xlsx

Supported formats:

.txt (plain text)
.pdf (requires llama-index-readers-file)
.csv (comma-separated values)
.docx (Word documents)

Solution for Excel files: Convert to CSV first:

# In Excel: File > Save As > CSV
# Then use the CSV file
python scripts/demo_step4_byod.py data.csv
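If you are adapting the script, an up-front extension check gives users the same clear error before any parsing starts. This sketch uses the four formats listed above; the constant and function names are illustrative, not taken from the script.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".txt", ".pdf", ".csv", ".docx"}


def check_supported(path: str) -> str:
    """Return the lowercased suffix, or raise ValueError if it is unsupported."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix}")
    return suffix
```

Lowercasing the suffix means files such as `Report.PDF` are accepted rather than rejected on case alone.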

PDF parsing fails

Symptoms:

Empty or garbled text from PDFs
“No content found” errors

Solutions:

Check PDF type

LlamaIndex works best with text-based PDFs. Image-based PDFs (scanned documents) require OCR.

Test: Try copying text from the PDF. If you can’t select text, it’s image-based.

Install additional readers

pip install llama-index-readers-file

Live demo backup plan

Always have a backup plan for live demos. Hardware and wifi can fail.

Pre-record a screen capture

Before the event:

Record the full demo flow using screen recording software
Record at the venue so the environment looks authentic
Include terminal commands, browser interactions, and real responses
Keep the video file ready to play if needed

Pre-warm checklist

Run these commands before going on stage:

# 1. Verify Ollama is running
ollama list

# 2. Test Step 1
python scripts/demo_step1_ollama.py

# 3. Pre-warm Step 2 (all tracks)
python scripts/demo_step2_rag.py eco
python scripts/demo_step2_rag.py city
python scripts/demo_step2_rag.py edu
python scripts/demo_step2_rag.py justice

# 4. Test Step 3 app loads
python scripts/demo_step3_app.py
# Open browser, verify it works, then Ctrl+C

# 5. Close all other applications
# 6. Set terminal font size large enough for back row
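The non-interactive parts of this checklist can be run in one pass and summarized. The sketch below is one possible wrapper, with hypothetical function names; it only suits commands that exit on their own, so the Step 3 app (which keeps running until Ctrl+C) should still be tested by hand.

```python
import subprocess


def run_checks(commands):
    """Run each command and return a list of (command, exit_code) pairs."""
    results = []
    for cmd in commands:
        try:
            completed = subprocess.run(cmd, capture_output=True, text=True)
            code = completed.returncode
        except FileNotFoundError:
            code = 127  # conventional shell code for "command not found"
        results.append((cmd, code))
    return results


def failed(results):
    """Return only the commands that exited with a non-zero code."""
    return [cmd for cmd, code in results if code != 0]
```

A command list such as `[["ollama", "list"], ["python", "scripts/demo_step1_ollama.py"], ["python", "scripts/demo_step2_rag.py", "city"]]` mirrors items 1 to 3 above; an empty list from `failed(...)` means every command exited cleanly.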

Getting help

If you’re still stuck:

Check the architecture overview to understand how components connect
Review the deployment options if you’re trying to host remotely
Open an issue on GitHub with:
- Your OS and Python version
- Full error message
- Steps to reproduce
