Get running in 10 minutes

Follow along during the workshop or use this as a starting point for your hackathon project. You can have everything running in about 10 minutes.
At the hackathon? If the wifi is slow, pair up with someone who already has the model downloaded. Ollama only needs to download the model once; after that, everything runs offline.

Prerequisites

Before you begin, make sure you have:
  • Python 3.10 or higher
  • At least 8 GB of RAM (16 GB recommended)
  • 10 GB of free disk space
  • Reliable internet connection for initial download (~4.7 GB model)
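
The Python version and disk space requirements above can be checked before you start. The following is a minimal sketch using only the standard library; the function names are illustrative and not part of the repo, and RAM is not checked because the standard library has no portable way to read it.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)      # from the prerequisites list
MIN_FREE_DISK_GB = 10     # from the prerequisites list


def python_ok(version=sys.version_info[:2]):
    """True if the interpreter meets the minimum Python version."""
    return tuple(version) >= MIN_PYTHON


def free_disk_gb(path="."):
    """Free space, in GB (10**9 bytes), on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def disk_ok(free_gb):
    """True if there is enough room for the model and dependencies."""
    return free_gb >= MIN_FREE_DISK_GB


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}:",
          "ok" if python_ok() else "too old")
    gb = free_disk_gb()
    print(f"Free disk {gb:.1f} GB:", "ok" if disk_ok(gb) else "too low")
```

Run it from the directory where you plan to clone the repo so the disk check looks at the right drive.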
1. Clone the repository

git clone https://github.com/holzerjm/civichacks-demo.git
cd civichacks-demo
2. Install Ollama and pull the model

Choose your platform. With Homebrew on macOS:
brew install ollama
Then pull the Llama 3.1 model:
# This downloads ~4.7 GB — use the venue wifi
ollama pull llama3.1
The model downloads once and runs offline forever. If Ollama isn’t running as a background service, start it with ollama serve.
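
To confirm from Python that the server is up and the model has been pulled, something like the sketch below may help. It assumes Ollama's default local address (http://localhost:11434) and its GET /api/tags endpoint, which lists locally available models; the helper names are illustrative and not part of the repo.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama's default local address; change it if you configured another.
OLLAMA_URL = "http://localhost:11434"


def model_names(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def has_model(names, wanted="llama3.1"):
    """True if `wanted` matches a pulled model exactly, or matches any tag of it."""
    return any(n == wanted or n.split(":")[0] == wanted for n in names)


def list_local_models(base_url=OLLAMA_URL, timeout=3):
    """Return pulled model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (urllib.error.URLError, OSError):
        return None
```

If list_local_models() returns None, start the server with ollama serve; if it returns a list for which has_model() is False, run ollama pull llama3.1 again.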
3. Set up Python and install dependencies

Create a virtual environment and install the required packages:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
This installs:
  • llama-index — RAG framework for connecting AI to data
  • llama-index-llms-ollama — Ollama integration
  • llama-index-embeddings-huggingface — Local embeddings (no API key needed)
  • llama-index-readers-file — File readers for PDF, DOCX, and other formats
  • gradio — Web UI framework
4. Run the demo steps

Now you’re ready to run the demo!

Step 1: Local AI in your terminal

python scripts/demo_step1_ollama.py
You’ll see the AI generate a response in real time, followed by a cost comparison:
⏱️  12.3s · 142 tokens · 11 tok/s
⚡ Local: $0.000009 (0.051 Wh @ 15W) · GPT-4o: $0.0017 (189x more)

Step 2: Connect AI to civic data

Pick a track: eco, city, edu, or justice
python scripts/demo_step2_rag.py city
With the city track, the AI analyzes real Boston 311 service data and answers questions about it.
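
The idea behind this step (retrieve the most relevant records, then hand them to the model with the question) can be sketched in a few lines. This is a toy illustration, not the script's implementation: the repo uses llama-index with HuggingFace embeddings, while this sketch swaps in plain word-count cosine similarity so it runs with the standard library only. The records are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=2):
    """Return the top_k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents, top_k=2):
    """Put the retrieved records in front of the question for the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, top_k))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"


# Made-up records for illustration; not taken from the Boston 311 dataset.
RECORDS = [
    "Pothole reported on Main Street, repaired after 3 days",
    "Streetlight outage near the park, still open",
    "Missed trash pickup on Elm Street, resolved next day",
]
```

Calling build_prompt("Which pothole on Main Street was repaired?", RECORDS) produces a prompt whose context lists the pothole record first. Real embeddings match on meaning rather than shared words, which is why the demo uses them.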

Step 3: Launch the web app

python scripts/demo_step3_app.py
Opens at http://localhost:7860 with a full chat interface, track selector, and example questions.

Step 4: Bring your own data (interactive)

Drop any file into the userdata/ directory, then:
# Auto-discover from userdata/
python scripts/demo_step4_byod.py

# Load ALL files in userdata/
python scripts/demo_step4_byod.py --all

# Use a specific file
python scripts/demo_step4_byod.py path/to/your/file.txt
Supports .txt, .pdf, .csv, and .docx files.
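
Auto-discovery of this kind can be pictured roughly as follows. This is a hypothetical sketch of the idea, not the code in demo_step4_byod.py; the function name and the sort order are assumptions.

```python
from pathlib import Path

# Extensions listed in the step above.
SUPPORTED = {".txt", ".pdf", ".csv", ".docx"}


def discover_files(folder="userdata"):
    """Return supported files in `folder`, sorted by name; [] if it is missing."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

The extension check is case-insensitive, so report.PDF is picked up alongside notes.txt, while images and other unsupported files are skipped.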

Step 5: BYOD as a web app

python scripts/demo_step5_byod_app.py
Opens at http://localhost:8861 with drag-and-drop file upload in the browser.

What you just built

You now have a working civic AI application that:
  • Runs an open-weight Llama 3.1 model locally, with no API fees
  • Analyzes real civic datasets using RAG
  • Provides a shareable web interface
  • Works with any data you provide

Swap the data

Drop your own .txt, .pdf, .csv, or .docx files into userdata/ and run Step 4

Change the model

Try ollama pull llama3.2:3b for faster responses or ollama pull deepseek-r1:7b for stronger reasoning

Make it yours

Fork the repo, change the prompts, add new tracks, and build a hackathon project on top of it

Get help

Every script supports --help for full usage details

Understanding the output

Each demo step shows you:
| Output | Meaning |
| --- | --- |
| 12.3s | Time to generate the response |
| 142 tokens | Number of tokens generated |
| 11 tok/s | Generation speed (Apple Silicon: 15-25, CPU-only: 3-5) |
| Local: $0.000009 | Actual electricity cost at 15W |
| GPT-4o: $0.0017 | What the same query would cost on GPT-4o API |
| 189x more | Cost multiplier for cloud vs. local |
First run is slower because models need to load into memory and the embedding model (~80 MB) downloads on first use. Subsequent runs use the cache.
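
The figures in the sample output can be reproduced with a little arithmetic. The sketch below is a worked example, not the script's code. The electricity price is an assumption back-calculated from the sample line (the guide does not state the rate the script uses), and the multiplier is computed from the rounded values as displayed.

```python
# Figures from the sample output line.
SECONDS = 12.3
TOKENS = 142
WATTS = 15
LOCAL_COST_USD = 0.000009   # as displayed (rounded)
GPT4O_COST_USD = 0.0017     # as displayed (rounded)

# Assumption: electricity price back-calculated from the sample output;
# the script's real rate is not stated in this guide.
USD_PER_KWH = 0.17


def tokens_per_second(tokens, seconds):
    """Generation speed."""
    return tokens / seconds


def energy_wh(watts, seconds):
    """Energy in watt-hours: power x time, with seconds converted to hours."""
    return watts * seconds / 3600


def electricity_cost_usd(wh, usd_per_kwh):
    """Cost of `wh` watt-hours at a given price per kilowatt-hour."""
    return wh / 1000 * usd_per_kwh


speed = tokens_per_second(TOKENS, SECONDS)       # about 11.5 tok/s
wh = energy_wh(WATTS, SECONDS)                   # about 0.051 Wh
cost = electricity_cost_usd(wh, USD_PER_KWH)     # about $0.000009
multiplier = GPT4O_COST_USD / LOCAL_COST_USD     # about 189
```

Swap in your own wattage and local electricity rate to estimate the cost on your hardware.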

Try different models

Swap models to match your hardware:
# Smaller/faster model for limited hardware
ollama pull phi3:mini          # 3.8B, runs on almost anything
ollama pull llama3.2:3b        # 3B, very fast

# Larger/better model if you have the RAM
ollama pull llama3.1:70b       # Needs ~40GB RAM, incredible quality

# Best for reasoning tasks
ollama pull deepseek-r1:7b     # Strong reasoning, MIT license
Then pass the --model flag to a script that accepts it, for example:
python scripts/demo_step4_byod.py --model phi3:mini
python scripts/demo_step5_byod_app.py --model deepseek-r1:7b

Command-line options

All scripts support --help to see available options:
python scripts/demo_step1_ollama.py --help
python scripts/demo_step2_rag.py --help
python scripts/demo_step3_app.py --help
python scripts/demo_step4_byod.py --help
python scripts/demo_step5_byod_app.py --help

Common options

| Script | Options |
| --- | --- |
| demo_step2_rag.py | <track> (eco/city/edu/justice), <question_num> (1-3), --all |
| demo_step3_app.py | --port <port>, --share |
| demo_step4_byod.py | <file_path>, --all, --model <model_name> |
| demo_step5_byod_app.py | --port <port>, --model <model_name>, --share |
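
If you add your own script and want it to accept the same flags, an argparse parser along these lines is one way to do it. This is a hypothetical sketch that mirrors the options listed for demo_step5_byod_app.py, not the repo's actual parser; the default model (llama3.1, the one pulled during setup) is an assumption, and the default port is the one this guide gives for Step 5.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the demo_step5_byod_app.py options."""
    parser = argparse.ArgumentParser(description="BYOD web app (illustrative)")
    parser.add_argument("--port", type=int, default=8861,
                        help="port for the local web server")
    parser.add_argument("--model", default="llama3.1",  # assumed default
                        help="Ollama model name, e.g. phi3:mini")
    parser.add_argument("--share", action="store_true",
                        help="request a public share link")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"model={args.model} port={args.port} share={args.share}")
```

argparse generates the --help output from these definitions, which is how a script can offer --help without extra code.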

Troubleshooting

If a script cannot reach Ollama, check whether it is running:
ollama list
If not, start it:
ollama serve

If responses are slow:
  • CPU inference for 8B models: expect 3-8 tokens/second (still fine for demos)
  • Make sure no other heavy applications are competing for memory
  • Try a smaller model: ollama pull llama3.2:3b

The HuggingFace embedding model (all-MiniLM-L6-v2) downloads on first use (~80 MB). Run Step 2 at least once to cache it before presenting.

If the web app does not open in your browser automatically, try specifying the browser:
BROWSER=chrome python scripts/demo_step3_app.py
Or open it manually at http://localhost:7860

Next steps

Read the installation guide

Detailed setup instructions and hardware requirements

Understand the architecture

Learn how the RAG pipeline works

Explore the datasets

See what civic data is included

Customize your app

Change prompts, add tracks, and deploy
