The JusticeHack dataset (justicehack_ma_justice.txt) is a synthetic criminal justice reform progress report for Massachusetts, 2025. It contains detailed metrics on incarceration rates, racial disparities, pretrial detention, recidivism, policing data, juvenile justice, and access to legal representation.

Dataset overview

File: data/justicehack_ma_justice.txt
Scope: Massachusetts Criminal Justice Reform Progress Report, 2025
Size: ~31 lines, 14,500+ characters
Format: Plain text, structured report
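To check the size figures above against a local copy of the file, a small sketch like the following may help. The helper name `dataset_stats` is hypothetical and not part of the demo repository; the path is the one listed in the overview.

```python
from pathlib import Path


def dataset_stats(text: str) -> tuple[int, int]:
    """Return (line_count, character_count) for a block of text."""
    return len(text.splitlines()), len(text)


# Path taken from the overview above; the file is only read if it exists locally.
data_file = Path("data/justicehack_ma_justice.txt")
if data_file.exists():
    lines, chars = dataset_stats(data_file.read_text(encoding="utf-8"))
    print(f"{lines} lines, {chars:,} characters")
```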

Data sections

Prison and jail populations

  • State prison population: 6,847 (down 18% from 2019)
  • County jail population: 4,291 (down 12% from 2019)
  • Incarceration rate: 152 per 100,000 (lowest in US, national avg: 350)

Racial disparities persist despite the 2018 reforms:
  • Black residents incarcerated at 7.9x the rate of White residents
  • Hispanic/Latino residents incarcerated at 4.2x the rate of White residents
  • No improvement since the 2018 Criminal Justice Reform Act

Sample queries

The JusticeHack track has three built-in queries. The following command runs the track's RAG demo with query 1:

python scripts/demo_step2_rag.py justice 1

Query 1: What racial disparities exist in pretrial detention in Massachusetts?

This query surfaces:
  • Black: 38% of pretrial detainees (population: ~9%)
  • Hispanic/Latino: 31% of pretrial detainees
  • White: 27% of pretrial detainees (population: ~70%)
  • 62% held on bail ≤$5,000
  • Average detention: 94 days
  • 71% of low-level detainees in Suffolk County ultimately released without further incarceration
  • Pretrial detention increases guilty plea likelihood 2.5x
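One way to read the shares above is as a representation ratio: a group's share of pretrial detainees divided by its share of the state population. The sketch below is illustrative only. The function name `representation_ratio` is hypothetical, and the population shares are the approximate values quoted in the list, so the results are rough.

```python
# Shares quoted above, in percent: (share of pretrial detainees, share of state population).
# Hispanic/Latino is omitted because no population share is given for that group.
shares = {
    "Black": (38, 9),
    "White": (27, 70),
}


def representation_ratio(detainee_pct: float, population_pct: float) -> float:
    """Detainee share divided by population share; 1.0 means proportional."""
    return detainee_pct / population_pct


ratios = {group: representation_ratio(d, p) for group, (d, p) in shares.items()}
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}x")
```

With these inputs, Black residents are overrepresented by a factor of about 4.2, while White residents appear at about 0.39 times their population share.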

Query 2: How effective are reentry programs at reducing recidivism?

This query identifies:
  • Overall 3-year recidivism: 32%
  • Prison education participants: 28% lower recidivism
  • Employment within 30 days: 47% lower recidivism
  • 68% report employment difficulties
  • 41% face housing instability within 6 months
  • Geographic disparity: Suffolk 14 programs, Berkshire 2
  • $28M budget serves only 35% of those released annually
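The budget figure can be turned into a rough estimate of what full coverage might cost. This is a back-of-the-envelope sketch, not a figure from the dataset: it assumes cost scales linearly with the number of people served, which real programs may not follow.

```python
# Figures quoted above: a $28M budget that reaches 35% of people released annually.
budget_musd = 28
coverage_pct = 35

# ASSUMPTION: cost scales linearly with the number of people served.
full_coverage_musd = budget_musd * 100 / coverage_pct
funding_gap_musd = full_coverage_musd - budget_musd

print(f"Full coverage: ${full_coverage_musd:.0f}M, gap: ${funding_gap_musd:.0f}M")
```

Under that linear assumption, serving everyone released would take about $80M, leaving a gap of about $52M.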

Query 3: What does the data reveal about policing patterns in Boston?

This query highlights:
  • 14,827 FIO stops in 2025
  • Black individuals: 63% of stops (23% of population)
  • 89% of stops result in no action
  • Most common reason: “suspicious behavior” (47%)
  • Use of force: 72% involve Black/Hispanic individuals
  • 94% body camera compliance
  • Internal affairs: 7.5% complaint sustain rate, 278-day avg resolution
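The percentages above can be converted into approximate stop counts, which are often easier to reason about. The sketch below uses only the figures in the list; the variable names are illustrative, and the counts are rounded because the source percentages are themselves rounded.

```python
# Figures quoted above for Boston FIO stops in 2025.
total_stops = 14_827
black_stop_pct = 63        # share of stops involving Black individuals
black_population_pct = 23  # share of Boston's population
no_action_pct = 89         # share of stops that resulted in no action

# Approximate counts implied by the percentages.
black_stops = round(total_stops * black_stop_pct / 100)
no_action_stops = round(total_stops * no_action_pct / 100)

# Stop share relative to population share; 1.0 would mean proportional.
stop_disparity = black_stop_pct / black_population_pct

print(f"Stops of Black individuals: ~{black_stops:,}")
print(f"Stops with no action: ~{no_action_stops:,}")
print(f"Stop share vs. population share: {stop_disparity:.2f}x")
```

That works out to roughly 9,341 stops of Black individuals, roughly 13,196 stops with no action, and a stop share about 2.7 times the population share.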

Key metrics reference

Incarceration

| Metric | Value | Context |
| --- | --- | --- |
| State prison population | 6,847 | Down 18% from 2019 |
| County jail population | 4,291 | Down 12% from 2019 |
| MA incarceration rate | 152 per 100K | Lowest in US (national: 350) |
| Black incarceration rate | 7.9x White | No improvement since 2018 reforms |
| Hispanic/Latino rate | 4.2x White | No improvement since 2018 reforms |

Pretrial detention

| Metric | Value | Context |
| --- | --- | --- |
| Daily pretrial population | 3,412 | 48% of total jail population |
| Average detention length | 94 days | |
| Held on bail ≤$5K | 62% | |
| Cost per person per day | $182 | Suffolk County |
| Black detainees | 38% | |
| Hispanic/Latino detainees | 31% | |
| White detainees | 27% | |
| Low-level → no further incarceration | 71% | Suffolk County |
| Pretrial → guilty plea multiplier | 2.5x | vs. released defendants |
| Cash bail usage | -23% | Since 2018 reforms |
| Dangerousness hearings | +45% | Alternative detention path |
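
The pretrial figures combine into a rough cost estimate. The sketch below is an illustration, not a number from the dataset: the $182 daily cost is reported for Suffolk County only, so applying it to the statewide daily population is an assumption.

```python
# Figures from the pretrial detention metrics.
daily_pretrial_population = 3_412
avg_detention_days = 94
cost_per_person_day_usd = 182  # reported for Suffolk County only

# Cost of one average-length detention at the Suffolk County rate.
cost_per_detention_usd = avg_detention_days * cost_per_person_day_usd

# ASSUMPTION: the Suffolk County rate is applied to the whole daily population.
statewide_daily_cost_usd = daily_pretrial_population * cost_per_person_day_usd

print(f"Per average detention: ${cost_per_detention_usd:,}")
print(f"Per day, all detainees: ${statewide_daily_cost_usd:,}")
```

At that rate, an average 94-day detention costs $17,108, and holding the full daily pretrial population costs $620,984 per day.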

Recidivism

| Factor | Recidivism rate | Impact |
| --- | --- | --- |
| Overall 3-year | 32% | Baseline |
| Property crimes | 41% | +9 pts |
| Drug offenses | 38% | +6 pts |
| Violent offenses | 24% | -8 pts |
| Prison education programs | | -28% |
| Employment within 30 days | | -47% |

Barriers:
  • 68% difficulty finding employment
  • 41% housing instability within 6 months
  • FY2025 budget serves only 35% of releases
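
Policing (Boston FIO stops)

| Metric | Value | Population % / notes |
| --- | --- | --- |
| Total FIO stops | 14,827 | |
| Black individuals stopped | 63% | 23% of Boston |
| Hispanic/Latino stopped | 18% | |
| White individuals stopped | 15% | |
| Stops with no action | 89% | |
| Reason: “suspicious behavior” | 47% | |
| Use of force incidents | 387 | Down 8% YoY |
| UoF involving Black/Hispanic | 72% | |
| Body camera compliance | 94% | |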

Juvenile justice

| Metric | Value | Context |
| --- | --- | --- |
| DYS commitments | 186 | Down 42% from 2019 |
| Average commitment length | 14.2 months | |
| Black youth | 41% | |
| Hispanic/Latino youth | 33% | |
| White youth | 22% | |
| Mental health diagnosis | 73% | Only 48% received treatment |
| Community alternatives | -19% re-offense | vs. facility commitments |
| Suspended/expelled → justice contact | 3.1x | Within 2 years |
| Black BPS suspension rate | 4x White | School-to-prison pipeline |

Using this dataset in the web app

When you select JusticeHack in the Gradio app (Step 3), the interface displays:
  • Header: “⚖️ JusticeHack — Analyze Massachusetts criminal justice reform, incarceration disparities, and policing data”
  • Example questions: All three queries above as clickable examples
  • Chat responses: AI answers grounded in the specific metrics from this dataset
Try asking follow-up questions like “Why is the pretrial detention rate so high?” or “What explains the school-to-prison pipeline?” to explore the data interactively.

Querying from code

Here’s how the RAG pipeline loads and queries this dataset:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Configure the local LLM and embedding model
Settings.llm = Ollama(model="llama3.1", request_timeout=120.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="all-MiniLM-L6-v2")

# Load the JusticeHack dataset
data_file = "data/justicehack_ma_justice.txt"
documents = SimpleDirectoryReader(input_files=[data_file]).load_data()

# Build the vector index and a streaming query engine
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True, similarity_top_k=3)

# Run the query and stream the answer to stdout
response = query_engine.query(
    "What racial disparities exist in pretrial detention in Massachusetts?"
)
response.print_response_stream()

See Step 2: RAG with civic data for the full implementation.
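
To run all three sample queries in one pass, a loop along these lines could work. This is a sketch rather than the demo's own code: `run_queries` and `JUSTICE_QUERIES` are hypothetical names, and it assumes a non-streaming engine (for example `index.as_query_engine(similarity_top_k=3)`), where `str(response)` gives the answer text.

```python
from typing import Any

# The three built-in questions listed under "Sample queries".
JUSTICE_QUERIES = [
    "What racial disparities exist in pretrial detention in Massachusetts?",
    "How effective are reentry programs at reducing recidivism?",
    "What does the data reveal about policing patterns in Boston?",
]


def run_queries(query_engine: Any, questions: list[str]) -> dict[str, str]:
    """Send each question to the engine and collect the answers as text.

    `query_engine` is any object with a `.query(str)` method, such as a
    LlamaIndex query engine built without streaming.
    """
    answers: dict[str, str] = {}
    for question in questions:
        response = query_engine.query(question)
        answers[question] = str(response)
    return answers
```

Calling `run_queries(query_engine, JUSTICE_QUERIES)` returns a dictionary keyed by question, which makes it straightforward to compare answers after changing the model or `similarity_top_k`.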
