Deployment options

Overview

The CivicHacks Demo supports three deployment modes:

Local only (default) — Runs on localhost, no data leaves your machine
Gradio share — Temporary public URL via tunneling (72-hour limit)
Hugging Face Spaces — Free cloud hosting (requires model swap)

Local deployment is recommended for hackathons and demos. It’s private, fast, and requires zero configuration.

Local deployment (default)

The app runs entirely on your machine. No data is sent to the cloud, and no API keys are required.

Step 3: Full web app

# Start the app
python scripts/demo_step3_app.py

# Opens at http://localhost:7860

Access points:

Same machine: http://localhost:7860
Other devices on same network: http://YOUR_IP:7860

Find your local IP address

macOS:

# Get your local IP (interface names can vary by machine)
ipconfig getifaddr en0    # WiFi
ipconfig getifaddr en1    # Ethernet

Linux:

# Print the IP addresses assigned to this host
hostname -I

Windows:

ipconfig | findstr IPv4

Then share http://YOUR_IP:7860 with others on your network.
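If you would rather not remember a different command per operating system, the lookup can be sketched in Python. This is a minimal sketch, not part of the demo scripts: `get_local_ip` is a hypothetical helper name, and the address `8.8.8.8` is only used to make the operating system pick an outbound route (a UDP `connect` sends no packets).

```python
import socket


def get_local_ip() -> str:
    """Return the IP address this machine would use for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket only selects a route; nothing is transmitted.
        sock.connect(("8.8.8.8", 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable network route: fall back to the loopback address.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print(f"Share this URL: http://{get_local_ip()}:7860")
```

On a machine with several network interfaces this returns only the address of the default route, so it may differ from the interface your audience is connected to.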

Step 5: BYOD web app

# Start the BYOD app
python scripts/demo_step5_byod_app.py

# Opens at http://localhost:8861

Different port: Step 5 uses port 8861 to avoid conflicts with Step 3.
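To see whether a port is already taken before starting an app, a quick check can be sketched with the standard library. The helper names `port_is_free` and `first_free_port` are hypothetical and are not part of the demo scripts.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, so another process is using the port.
            return False
    return True


def first_free_port(candidates):
    """Return the first free port from candidates, or None if all are taken."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    # 7860 and 8861 are the Step 3 and Step 5 ports; 8080 is an arbitrary spare.
    print(first_free_port([7860, 8861, 8080]))
```

The check is only a snapshot: another process could claim the port between the check and the launch, so treat it as a convenience rather than a guarantee.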

Custom port

Change the port if needed:

# Step 3 on custom port
python scripts/demo_step3_app.py --port 8080

# Step 5 on custom port
python scripts/demo_step5_byod_app.py --port 9000
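How the scripts read these flags is not shown here. As an illustration only, a command-line flag such as `--port` could be turned into Gradio launch arguments roughly like this; `parse_launch_args` is a hypothetical helper, and the real scripts may be organized differently.

```python
import argparse


def parse_launch_args(argv=None, default_port=7860):
    """Parse command-line flags into keyword arguments for app.launch()."""
    parser = argparse.ArgumentParser(description="Launch the demo web app")
    parser.add_argument("--port", type=int, default=default_port,
                        help="Port to serve the app on")
    parser.add_argument("--share", action="store_true",
                        help="Create a temporary public Gradio share link")
    args = parser.parse_args(argv)
    return {
        "server_name": "0.0.0.0",   # listen on all interfaces
        "server_port": args.port,
        "share": args.share,
    }


# Example: the arguments produced by "--port 8080"
print(parse_launch_args(["--port", "8080"]))
```

In an app built with Gradio, the resulting dictionary could be passed straight through as `app.launch(**parse_launch_args())`.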

Server configuration

Both apps bind to 0.0.0.0 by default, making them accessible on your local network:

app.launch(
    server_name="0.0.0.0",  # Listen on all interfaces
    server_port=7860,
    # ...
)

If deploying on a public network, be aware that anyone on the network can access your app. Use a firewall or restrict server_name to "127.0.0.1" for localhost-only access.

Gradio share (temporary public URL)

Gradio can create a temporary public URL that tunnels to your local machine. Useful for:

Sharing with remote judges
Quick demos to people not on your network
Testing mobile access

How to enable

# Step 3 with share link
python scripts/demo_step3_app.py --share

# Step 5 with share link
python scripts/demo_step5_byod_app.py --share

Output:

Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://abc123xyz.gradio.live

This share link expires in 72 hours.

The share link works for 72 hours or until you stop the script, whichever comes first.

How it works

┌──────────────┐      Internet       ┌──────────────┐
│   Browser    │ ←──────────────────→ │    Gradio    │
│  (anywhere)  │   HTTPS tunnel       │   Servers    │
└──────────────┘                      └──────┬───────┘
                                             │
                                             ▼
                                    ┌─────────────────┐
                                    │  Your Machine   │
                                    │  localhost:7860 │
                                    └─────────────────┘

Gradio creates a secure tunnel from its servers to your local app.

Requests and responses pass through Gradio’s servers for routing, but the LLM inference still happens locally on your machine. Your civic data files are never uploaded; only the questions and the generated answers (which may quote the data) travel through the tunnel.

Troubleshooting share links

Share link creation fails

Share link is slow

Share link stopped working

Hugging Face Spaces (free cloud hosting)

Deploy the web app to Hugging Face Spaces for permanent, free hosting.

Important limitation: Hugging Face Spaces doesn’t support Ollama (which runs locally). You must swap to a cloud-based model API or use HF Inference API.

Prerequisites

Create a free account at huggingface.co
Create a new Space:
- Click “New” → “Space”
- Choose “Gradio” as the SDK
- Select “Public” or “Private”

Option 1: Use Hugging Face Inference API

Free tier: 1,000 requests/day for most models. Update your script to use HF Inference API instead of Ollama:

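# In recent llama-index releases this class lives in the separate
# llama-index-llms-huggingface-api package
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI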
from llama_index.core import Settings
import os

# Use HF Inference API
Settings.llm = HuggingFaceInferenceAPI(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",
    token=os.getenv("HF_TOKEN"),  # Add this as a Space secret
)

# Rest of your code stays the same...

Add your HF token as a Space secret:

Go to your Space settings
Add a new secret: HF_TOKEN = your token from huggingface.co/settings/tokens
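Space secrets are read through environment variables, which is why the snippet above uses `os.getenv`. A missing secret otherwise only shows up later as an authentication error, so it can help to fail early with a clear message. A minimal sketch, assuming a hypothetical helper named `require_env`:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it as a secret in your Space settings."
        )
    return value
```

You could then write `token=require_env("HF_TOKEN")` instead of `token=os.getenv("HF_TOKEN")`, so the Space build log points at the missing secret directly.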

Option 2: Use OpenAI API

Cost: ~$0.002 per query with GPT-4o-mini.

from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
import os

Settings.llm = OpenAI(
    model="gpt-4o-mini",
    api_key=os.getenv("OPENAI_API_KEY"),  # Add as Space secret
)
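If you want one script that runs both locally and on Spaces, the choice between the two options above and local Ollama could be driven by which secret is present. The sketch below only returns a label; `choose_llm_backend` is a hypothetical helper, and wiring each label to the matching `Settings.llm` assignment is left to your script.

```python
import os


def choose_llm_backend(env=None) -> str:
    """Pick an LLM backend name based on which API key is configured."""
    env = os.environ if env is None else env
    if env.get("HF_TOKEN"):
        return "huggingface"   # Option 1: HF Inference API
    if env.get("OPENAI_API_KEY"):
        return "openai"        # Option 2: OpenAI API
    return "ollama"            # No keys found: assume local Ollama


print(choose_llm_backend({"OPENAI_API_KEY": "sk-example"}))
```

Checking `HF_TOKEN` first is an arbitrary design choice in this sketch; swap the order if you prefer OpenAI when both secrets are set.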

Deploy to Spaces

Via Git (recommended)

# Clone your Space repo
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

# Copy your files
cp ../civichacks-demo/scripts/demo_step3_app.py app.py
mkdir -p data
cp ../civichacks-demo/data/* data/
cp ../civichacks-demo/requirements.txt .

# Update requirements.txt
# Remove: llama-index-llms-ollama
# Add: llama-index-llms-huggingface-api (or llama-index-llms-openai)

# Commit and push
git add .
git commit -m "Initial deployment"
git push

Your Space will rebuild automatically.

Via web interface

Go to your Space page
Click “Files” tab
Upload:
- demo_step3_app.py (rename to app.py)
- requirements.txt (updated)
- All files from data/ directory
The Space will build automatically

Required file structure

Your-Space/
├── app.py                  # Renamed from demo_step3_app.py
├── requirements.txt        # Updated dependencies
├── README.md              # Space description
└── data/                  # Your civic datasets
    ├── ecohack_boston_environment.txt
    ├── cityhack_boston_311.txt
    ├── eduhack_boston_schools.txt
    └── justicehack_ma_justice.txt
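Before pushing, the layout above can be checked with a short script. This is a sketch under the assumption that the Space needs exactly the files shown in the tree; `missing_space_files` and `REQUIRED_FILES` are hypothetical names.

```python
from pathlib import Path

# Top-level files shown in the required file structure
REQUIRED_FILES = ("app.py", "requirements.txt", "README.md")


def missing_space_files(root) -> list[str]:
    """Return the required entries that are missing under the Space root."""
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    data_dir = root / "data"
    # The data directory must exist and hold at least one .txt dataset.
    if not data_dir.is_dir() or not any(data_dir.glob("*.txt")):
        missing.append("data/*.txt")
    return missing


if __name__ == "__main__":
    problems = missing_space_files(".")
    print("Ready to push" if not problems else f"Missing: {problems}")
```

Run it from the root of your cloned Space repository before `git push`.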

Update requirements.txt for Spaces

Remove Ollama-specific dependencies:

llama-index>=0.12.0
llama-index-embeddings-huggingface>=0.5.0
llama-index-readers-file>=0.4.0
gradio>=5.0.0

# Choose ONE:
llama-index-llms-huggingface-api     # For HF Inference API
# OR
llama-index-llms-openai>=0.3.0       # For OpenAI

Hugging Face Spaces has generous free tier limits. For a hackathon demo, HF Inference API is usually sufficient.
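The requirements change can also be scripted rather than edited by hand. A minimal sketch, assuming the local file lists the Ollama integration as a line starting with `llama-index-llms-ollama`; `swap_llm_requirement` is a hypothetical helper and the sample version pin on the Ollama line is made up for illustration.

```python
def swap_llm_requirement(lines, replacement,
                         remove_prefix="llama-index-llms-ollama"):
    """Drop the Ollama requirement and append the cloud LLM package once."""
    kept = [line for line in lines if not line.strip().startswith(remove_prefix)]
    if replacement not in (line.strip() for line in kept):
        kept.append(replacement)
    return kept


# Illustrative input; the Ollama version pin here is a placeholder.
local_requirements = [
    "llama-index>=0.12.0",
    "llama-index-llms-ollama>=0.4.0",
    "gradio>=5.0.0",
]
for line in swap_llm_requirement(local_requirements, "llama-index-llms-openai>=0.3.0"):
    print(line)
```

To apply it to a real file, read `requirements.txt` with `Path.read_text().splitlines()`, pass the lines through the function, and write the result back.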

Streamlit Cloud (alternative)

Streamlit Cloud is another free hosting option. You’ll need to convert the Gradio app to Streamlit.

Conversion example

Gradio (current):

import gradio as gr

def query_fn(question, track):
    # ...
    return response

with gr.Blocks() as app:
    track = gr.Dropdown(["eco", "city", "edu", "justice"])
    question = gr.Textbox()
    output = gr.Textbox()
    question.submit(query_fn, [question, track], output)

app.launch()

Streamlit (converted):

import streamlit as st

st.title("CivicHacks AI Assistant")

track = st.selectbox("Select track", ["eco", "city", "edu", "justice"])
question = st.text_input("Ask a question")

if question:
    # query_fn is the same function used in the Gradio version above
    response = query_fn(question, track)
    st.write(response)

Deploy to Streamlit Cloud

Push code to GitHub
Go to share.streamlit.io
Connect your GitHub repo
Select the branch and main file (app.py)
Deploy

Streamlit Cloud also requires cloud-based LLM APIs (no Ollama support).

Docker deployment (advanced)

For self-hosted deployment on a server, use Docker.

Dockerfile example

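FROM python:3.12-slim

# Install curl (not included in the slim image), then Ollama
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://ollama.com/install.sh | sh

# Install Python dependencies
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application files
COPY scripts/ ./scripts/
COPY data/ ./data/

# Download model (the Ollama server must be running for pull to work)
RUN ollama serve & sleep 5 && ollama pull llama3.1

# Expose ports
EXPOSE 7860 11434

# Start Ollama, give it a moment to come up, then start the app
CMD ["sh", "-c", "ollama serve & sleep 5 && python scripts/demo_step3_app.py"]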

Build and run

# Build image
docker build -t civichacks-demo .

# Run container
docker run -p 7860:7860 -p 11434:11434 civichacks-demo

Docker deployment is complex and requires managing Ollama inside the container. Only recommended for experienced users.
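One part of managing Ollama in a container is timing: the app should not start querying the model until the Ollama server accepts connections on port 11434. A fixed delay works on most machines, but polling the port is more robust. The helper below is a sketch with a hypothetical name, `wait_for_port`, using only the standard library.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll host:port until it accepts a TCP connection or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet: wait briefly and try again.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    ready = wait_for_port("127.0.0.1", 11434, timeout=5.0)
    print("Ollama is reachable" if ready else "Ollama did not start in time")
```

A successful TCP connection only shows that the server process is listening; it does not confirm that the model has finished loading.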

Performance comparison

Deployment	Speed	Privacy	Cost	Setup
Local	Fast (local hardware)	100% private	Free (electricity)	Easy
Gradio share	Fast (local hardware)	Data routed via Gradio	Free (electricity)	Easy
HF Spaces	Slow (shared CPU)	Data sent to HF/OpenAI	Free tier or API costs	Medium
Streamlit Cloud	Slow (shared CPU)	Data sent to cloud	Free tier or API costs	Medium
Docker	Depends on server	Depends on hosting	Server costs	Hard

For hackathons and live demos, stick with local deployment. It’s faster, more private, and doesn’t require API keys or internet.

Network security considerations

Local network access

When running on 0.0.0.0, your app is accessible to anyone on your network. Restrict to localhost only:

app.launch(
    server_name="127.0.0.1",  # Only accessible on same machine
    server_port=7860,
)

Firewall rules

If you want network access but with restrictions, configure your firewall.

macOS:

# Enable the pf firewall and load the rules in /etc/pf.conf
# (add a rule for your allowed IP range to that file first)
sudo pfctl -e
sudo pfctl -f /etc/pf.conf

Linux (ufw):

# Allow from local network only
sudo ufw allow from 192.168.1.0/24 to any port 7860
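The range `192.168.1.0/24` in the rule above covers the addresses 192.168.1.0 through 192.168.1.255. To check whether a particular device falls inside the range you plan to allow, the standard `ipaddress` module can do the arithmetic. `is_allowed` is a hypothetical helper used only for illustration.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_network: str = "192.168.1.0/24") -> bool:
    """Return True if client_ip falls inside the allowed CIDR range."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(allowed_network)


print(is_allowed("192.168.1.42"))   # inside the /24 range
print(is_allowed("192.168.2.42"))   # different subnet
```

Replace the default network with your own subnet; many home and venue networks use a different range.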

Authentication

Gradio supports basic authentication:

# Option A: a single username/password pair
app.launch(auth=("username", "password"))

# Option B: a function that returns True for valid credentials
app.launch(auth=lambda u, p: u == "admin" and p == "secret")

Basic auth is not secure over HTTP. Use HTTPS if deploying on public networks.
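Hard-coding credentials in the script makes them easy to leak through version control. Since `auth` accepts a function, the check can be sketched so that the expected values come from outside the code and are compared in constant time. `make_auth` and the environment variable names `DEMO_USER` and `DEMO_PASSWORD` are assumptions for this sketch.

```python
import hmac


def make_auth(expected_user: str, expected_password: str):
    """Build an auth(username, password) function for Gradio's auth parameter."""
    def check(username: str, password: str) -> bool:
        # compare_digest avoids leaking how many characters matched via timing.
        user_ok = hmac.compare_digest(username.encode(), expected_user.encode())
        pass_ok = hmac.compare_digest(password.encode(), expected_password.encode())
        return user_ok and pass_ok
    return check


check = make_auth("admin", "secret")
print(check("admin", "secret"), check("admin", "wrong"))
```

In the app this could be used as `app.launch(auth=make_auth(os.environ["DEMO_USER"], os.environ["DEMO_PASSWORD"]))`, with both variables set in your shell before starting the script.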
