Overview
This step lets you plug in any data file and start querying it with AI instantly. You can load a single file or multiple files at once for cross-file exploration, with no code changes and no configuration.
Duration: ~3-5 minutes (interactive)
What you’ll do:
- Drop files into the userdata/ directory or provide a file path
- Analyze file metadata (type, size, word count, content preview)
- Build a vector index and get an AI summary
- Ask questions about your data in an interactive Q&A loop
- See cost comparison on every query
Prerequisites
Complete Step 2: RAG with Civic Data first, then install the file reader package, llama-index-readers-file.
Supported file types
| Extension | Type | Notes |
|---|---|---|
| .txt | Plain text | Simplest option, works everywhere |
| .pdf | PDF document | Requires llama-index-readers-file (already installed) |
| .csv | CSV spreadsheet | Read as text content |
| .docx | Word document | Requires llama-index-readers-file |
For PDFs: Text-based PDFs work. Image-only or encrypted PDFs may not extract text successfully.
Running the script
Auto-discovery from userdata/
Drop files into the userdata/ directory before running the script. The script then decides what to do based on how many supported files it finds:
- 0 files found: Prompt for a file path
- 1 file found: Automatically use that file
- 2+ files found: Show a numbered list to pick from (or type a to load all)
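The three cases above can be sketched as a small selection function. This is a minimal illustration, not the script's actual code: the name `choose_files` and its `answer` parameter are hypothetical, and the handling of invalid menu input is an assumption.

```python
from pathlib import Path


def choose_files(found, answer=None):
    """Return the files to load, or None when the user must type a path.

    `found` is the list of supported files discovered in userdata/.
    `answer` is what the user typed at the numbered menu (only used for 2+ files).
    """
    if not found:
        return None                  # 0 files: caller prompts for a file path
    if len(found) == 1:
        return [found[0]]            # 1 file: use it automatically
    if answer is None:
        raise ValueError("a menu answer is required when 2+ files are found")
    choice = answer.strip().lower()
    if choice == "a":
        return list(found)           # "a": load every file
    index = int(choice) - 1          # the menu is numbered from 1
    if not 0 <= index < len(found):
        raise ValueError(f"pick a number between 1 and {len(found)}")
    return [found[index]]


files = [Path("userdata/budget.pdf"), Path("userdata/notes.docx")]
print(choose_files(files, "2"))      # selects the second file in the list
```

Returning `None` for the empty case keeps the prompt-for-path step in the caller, so the function itself never blocks on input.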
Load all files at once
Pass --all to load every file in userdata/ into a single combined index, enabling cross-document questions like:
- “Compare the findings across these reports”
- “What themes are common across all the data?”
- “Which document discusses budget constraints?”
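With a specific file path
Pass the path to your data file as the optional positional file argument instead of relying on auto-discovery.
Use a different model
Pass --model to choose a different Ollama model. The default is llama3.1.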
Command-line options
| Option | Default | Description |
|---|---|---|
| file | (auto-discover) | Path to data file (positional, optional) |
| --all | off | Load ALL files in userdata/ into a single index for cross-file exploration |
| --model | llama3.1 | Ollama model to use (lets you try different models) |
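For readers who want to see how options like these are commonly declared, here is a minimal `argparse` sketch that mirrors the table. It is an assumption about how such a parser could look, not the script's own source; `build_parser` and the `load_all` attribute name are hypothetical, and `mistral` is only an illustrative model name.

```python
import argparse


def build_parser():
    """Build a parser with the three options listed in the table above."""
    parser = argparse.ArgumentParser(description="Query your own data files")
    # Optional positional: when omitted, the caller falls back to auto-discovery.
    parser.add_argument("file", nargs="?", default=None,
                        help="Path to data file")
    # store_true makes the flag default to False ("off").
    parser.add_argument("--all", action="store_true", dest="load_all",
                        help="Load all files in userdata/ into a single index")
    parser.add_argument("--model", default="llama3.1",
                        help="Ollama model to use")
    return parser


args = build_parser().parse_args(["report.pdf", "--model", "mistral"])
print(args.file, args.load_all, args.model)  # prints: report.pdf False mistral
```

The flag is stored as `load_all` because `all` would shadow a Python built-in when unpacked into local names.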
Expected output
Interactive commands
| Command | Action |
|---|---|
| (any question) | Query the AI about your data |
| summary | Re-generate the AI summary |
| help | Show available commands |
| quit / exit / q | End the session |
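One way to route input in a Q&A loop like this is to classify each line before acting on it. The sketch below is illustrative only: `classify_command` is a hypothetical helper, and treating commands as case-insensitive and ignoring blank input are assumptions, not documented behavior.

```python
def classify_command(line):
    """Map one line of user input to an action name."""
    text = line.strip()
    lowered = text.lower()
    if lowered in {"quit", "exit", "q"}:
        return "quit"       # end the session
    if lowered == "summary":
        return "summary"    # re-generate the AI summary
    if lowered == "help":
        return "help"       # show available commands
    if not text:
        return "empty"      # assumption: blank input is skipped
    return "query"          # anything else is sent to the AI as a question


for sample in ["help", "Q", "Which departments saw cuts?"]:
    print(sample, "->", classify_command(sample))
```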
How it works
The script performs these steps:
File discovery and validation
- find_userdata_files() scans the userdata/ directory for supported file types
- validate_file() resolves the path, checks extension and file size, handles drag-and-drop quote stripping
- Displays file metadata (type, size, modified date)
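The discovery and validation step can be approximated with the standard library alone. The functions below are hypothetical stand-ins for `find_userdata_files()` and `validate_file()`, written from the description above rather than from the script's source. Only the empty-file check is shown for size, since the script's upper size limits are not stated here.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".csv", ".docx"}


def find_supported_files(directory="userdata"):
    """Return supported files in `directory`, sorted by path."""
    folder = Path(directory)
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir()
                  if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS)


def check_file(raw_path):
    """Resolve a user-supplied path and reject files that cannot be loaded."""
    # Dragging a file into a terminal often wraps the path in quotes.
    cleaned = raw_path.strip().strip("'\"")
    path = Path(cleaned).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"File not found: {path}")
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {path.suffix}")
    if path.stat().st_size == 0:
        raise ValueError("File is empty")
    return path
```

Raising distinct exceptions per failure lets the caller print a specific message for each case, in the spirit of the errors listed under Troubleshooting.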
Load documents
Uses LlamaIndex’s SimpleDirectoryReader to load the file. For PDFs, this extracts text content. For CSVs, reads as plain text.
Generate AI summary
Queries the index with a summary prompt. For multiple files, uses a MULTI_SUMMARY_PROMPT variant.
Cross-file exploration with --all
The --all flag loads every file into a single combined index, which supports cross-document questions such as:
- “What changed between the 2024 and 2025 reports?”
- “Which document discusses staffing shortages?”
- “What themes appear across all three files?”
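A combined index of this kind can be sketched with LlamaIndex's public API. This is a hedged illustration, not the script's implementation: `build_combined_engine` is a hypothetical helper, `top_k=5` and the 120-second timeout are arbitrary example values, and the sketch assumes an embedding model has already been configured on `Settings.embed_model` (otherwise LlamaIndex falls back to its default embedding provider).

```python
from pathlib import Path


def build_combined_engine(paths, model="llama3.1", top_k=5):
    """Index several files together and return a query engine over all of them."""
    files = [str(p) for p in paths]
    if not files:
        raise ValueError("no files to index")

    # Imported here so this module still loads when LlamaIndex is not installed.
    from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
    from llama_index.llms.ollama import Ollama

    # Assumption: a local Ollama server is running and serves `model`.
    Settings.llm = Ollama(model=model, request_timeout=120.0)

    # One reader call over every file, so all documents share one vector index.
    documents = SimpleDirectoryReader(input_files=files).load_data()
    index = VectorStoreIndex.from_documents(documents)

    # similarity_top_k controls how many chunks are retrieved per question.
    return index.as_query_engine(similarity_top_k=top_k)


# Example usage (requires LlamaIndex, Ollama, and files in userdata/):
# engine = build_combined_engine(sorted(Path("userdata").glob("*.pdf")))
# print(engine.query("Which document discusses staffing shortages?"))
```

Because every file lands in the same index, a single query can retrieve chunks from different documents, which is what makes comparison questions answerable.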
File size limits
Troubleshooting
Error: Could not extract text from PDF
The PDF may be:
- Image-based (scanned document) — use OCR first
- Encrypted/password-protected — remove protection first
- Corrupted — try re-downloading or exporting to a new PDF
Error: File is empty
The file has 0 bytes. Check that the file actually contains content.
Error: Unsupported file type
Only .txt, .pdf, .csv, and .docx files are supported. Convert other formats to one of these first.
No files found in userdata/
Create the userdata/ directory and drop files there.
Response doesn't match file content
Increase similarity_top_k so that each query retrieves more chunks from the index.
Real-world use cases
Budget analysis
Load city budget PDFs and ask:
- “What are the biggest line items?”
- “How does this year compare to last year?”
- “Which departments saw cuts?”
Meeting notes
Load DOCX meeting notes and ask:
- “What action items were assigned?”
- “What decisions were made?”
- “Who attended and what were the key topics?”
Data reports
Load CSV or TXT data files and ask:
- “What are the key trends?”
- “Which metrics are concerning?”
- “What correlations exist?”
Research papers
Load academic PDFs and ask:
- “What is the main finding?”
- “What methodology was used?”
- “What are the limitations?”