Documentation Index
Fetch the complete documentation index at: https://mintlify.com/holzerjm/civichacks-demo/llms.txt
Use this file to discover all available pages before exploring further.
Overview
The CivicHacks Demo is built on a four-layer stack that runs entirely locally with zero cloud dependencies. Every component is open source and free to use.Architecture diagram
System components
Gradio Web UI
Gradio Web UI
The browser-based chat interface that provides:
- Dynamic header that updates when switching tracks
- Track selector dropdown for all four civic datasets
- Chat interface with message history
- Dynamic example questions per track
- Per-query cost comparison (local vs cloud)
- Red Hat-themed styling
scripts/demo_step3_app.pyLlamaIndex RAG Pipeline
LlamaIndex RAG Pipeline
The retrieval augmented generation layer that:
- Loads documents from the
data/directory - Builds in-memory vector indices for fast search
- Uses HuggingFace embeddings (all-MiniLM-L6-v2, ~80 MB)
- Retrieves relevant chunks before generating responses
- Caches indices for instant track switching
scripts/demo_step2_rag.py, scripts/demo_step3_app.pyOllama + Llama 3.1
Ollama + Llama 3.1
The local LLM inference engine that:
- Runs Llama 3.1 8B model locally (4.7 GB download)
- Provides OpenAI-compatible API on
localhost:11434 - Supports streaming responses for real-time generation
- Works on CPU or GPU (Apple Silicon recommended)
- Returns token counts for cost estimation
Civic Data Files
Civic Data Files
Plain text datasets covering four hackathon tracks:
- EcoHack:
ecohack_boston_environment.txt(air quality, heat islands, climate resilience) - CityHack:
cityhack_boston_311.txt(311 service requests, equity gaps) - EduHack:
eduhack_boston_schools.txt(achievement gaps, absenteeism, tech access) - JusticeHack:
justicehack_ma_justice.txt(incarceration disparities, policing data)
data/ directoryData flow
Here’s what happens when a user asks a question:Every step happens locally on your machine. No data is sent to the cloud, and there are no API keys required.
Hardware profiles
The system adapts to different hardware configurations:| Hardware | Inference Speed | Memory Usage | Notes |
|---|---|---|---|
| Apple Silicon M1/M2/M3/M4 base | 15-25 tok/s | 4-5 GB | Ideal for demos |
| Apple Silicon Pro/Max | 20-35 tok/s | 4-5 GB | Excellent performance |
| x86 laptop (CPU-only) | 3-8 tok/s | 4-5 GB | Works, but slower |
| x86 desktop (discrete GPU) | 15-40 tok/s | 4-5 GB | Fast generation |
System requirements
| Component | Minimum | Recommended |
|---|---|---|
| RAM | 8 GB | 16 GB |
| GPU VRAM | Not required | 8+ GB |
| Storage | 10 GB free | 20 GB free |
| CPU | 4-core | Apple Silicon or recent Intel/AMD |
| Python | 3.10+ | 3.12+ |
Caching strategy
Index caching (Step 3 app)
The Gradio app caches built indices in a global dictionary:Embedding model caching
The HuggingFace embedding model downloads once to~/.cache/huggingface/hub/ and is reused across all scripts.
Ollama model caching
Ollama keeps models in~/.ollama/models/. Once downloaded, models are available offline.
Configuration options
Key settings are defined inscripts/demo_step2_rag.py and scripts/demo_step3_app.py: