This page covers every requirement and setup step for running Halgorithem. The library is currently distributed as source code, so installation means cloning the repository and installing Python dependencies manually. A virtual environment is strongly recommended to avoid conflicts with other projects.

Requirements

  • Python 3.8 or higher: the code uses type annotations and f-strings, and 3.8 is the minimum supported version.
  • Virtual environment: isolates the project’s dependencies from your system Python.
  • git: required to clone the repository.

Note: the requirements.txt file in the repository does not list all runtime dependencies. Additional packages, including sentence-transformers, negspacy, quantulum3, sympy, markdown-it-py, and nltk, must be installed separately. The steps below cover the complete install. Also, Halgorithem/nlp.py loads en_core_web_lg (the large spaCy model), not en_core_web_sm as shown in the README, so download the large model to avoid a load error.
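A quick preflight check can confirm the requirements above before you run anything. The following is a minimal sketch, not part of Halgorithem: the helper names `python_ok` and `missing_packages` are hypothetical, and the import names are the standard ones for each pip package (two of them differ from the pip name).

```python
import importlib.util
import sys

# pip name -> import name for the packages named above.
EXTRA_MODULES = {
    "sentence-transformers": "sentence_transformers",
    "negspacy": "negspacy",
    "quantulum3": "quantulum3",
    "sympy": "sympy",
    "markdown-it-py": "markdown_it",
    "nltk": "nltk",
    "en_core_web_lg": "en_core_web_lg",  # spaCy models install as importable packages
}


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_packages(modules=EXTRA_MODULES):
    """Return the pip names whose import name cannot be located."""
    return [
        pip_name
        for pip_name, import_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    ]


print("Python version OK:", python_ok())
print("Missing:", missing_packages() or "none")
```

Run it with the interpreter from your virtual environment. Anything listed as missing can be installed with pip, or with the spaCy download command in the case of the model.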

Installation steps

1. Clone the repository

git clone https://github.com/TangibleResearch/Halgorithem.git
cd Halgorithem

2. Create and activate a virtual environment

python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
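
To confirm that the environment is actually active before installing anything, you can ask the interpreter itself. This is a small sketch that relies on the standard `sys.prefix` and `sys.base_prefix` attributes. It assumes an environment created with `python -m venv` as shown above, and the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

If the first line prints False, activate the environment again and re-run the check before continuing with the install.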

3. Install dependencies

# Install packages listed in requirements.txt
pip install -r requirements.txt

# Install packages used by the library but missing from requirements.txt
pip install sentence-transformers negspacy quantulum3 sympy markdown-it-py nltk

4. Download the spaCy language model

The library loads en_core_web_lg at import time. Download it before running any verification:
python -m spacy download en_core_web_lg
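
After the download finishes, you may want to confirm that the model landed in the active environment. The sketch below uses the standard library's `importlib.metadata` and assumes the model is registered as a distribution named `en_core_web_lg`, which is how spaCy models are normally packaged. The helper `installed_version` is a hypothetical name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


for name in ("spacy", "en_core_web_lg"):
    print(name, "->", installed_version(name) or "not installed")
```

This only reports what is installed. It does not check that the model version is compatible with the installed spaCy release; the `python -m spacy validate` command covers that.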

5. Download the NLTK WordNet corpus

WordNet is used for synonym expansion during claim scoring. Download it once from a Python session:
import nltk
nltk.download("wordnet")
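If you script the setup, it can be convenient to trigger the download only when the corpus is absent. The sketch below shows one way to do that. The helper `ensure_resource` is hypothetical, and it takes the lookup and download functions as arguments so the logic can be read and tested without NLTK installed. It assumes the lookup raises `LookupError` when the resource is missing, which is how `nltk.data.find` behaves.

```python
def ensure_resource(find, download, resource="corpora/wordnet", package="wordnet"):
    """Download `package` only if `find(resource)` reports it missing.

    Returns True if a download was triggered, False if the resource
    was already present.
    """
    try:
        find(resource)
    except LookupError:
        download(package)
        return True
    return False


# With NLTK installed, the real call would be:
#   import nltk
#   ensure_resource(nltk.data.find, nltk.download)
```

On some installations the corpus is stored only as a zip archive, in which case the lookup path may need adjusting. Treat the default `resource` value as an assumption to verify locally.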

What each key package does

| Package | Role in Halgorithem |
| --- | --- |
| spacy + en_core_web_lg | Tokenization, POS tagging, named entity recognition, and dependency parsing for claim filtering |
| sentence-transformers | Generates semantic embeddings (all-MiniLM-L6-v2) used to score claims against truth document chunks |
| pysbd | Sentence boundary detection: splits AI output into individual claims |
| nltk + WordNet | Synonym expansion: lets a claim token match source tokens with the same meaning |
| negspacy | Negation detection: flags claims where the AI inverts the meaning of a source statement |
| quantulum3 | Extracts numbers and quantities from text (handles “seven billion”, “$4.2B”, ordinals) |
| sympy | Evaluates and verifies math claims such as 2 + 2 = 4 |
| textacy | Unicode normalization and text preprocessing |
| clean-text | Strips URLs, emails, and non-ASCII characters from raw text |
| markdown-it-py | Converts Markdown-formatted AI output to plain text before processing |
| scikit-learn | Provides the English stop words list used during tokenization |
| beautifulsoup4, html2text, requests | Web scraping: fetch and convert HTML pages to plain text for use as truth sources |
| flask | Powers the optional web server interface |
| openai | Used only by the Engine class to generate AI responses; not needed for standalone verification |

Verify the install

Run the benchmark to confirm everything is working:
python bench.py
A passing run prints an accuracy score at the end:
================================================================================
Accuracy: 100.0%
================================================================================
If spaCy or a missing package causes an import error, re-check that you completed all five installation steps above.
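
For automated setups such as CI, the accuracy line shown above can be checked programmatically. This is a hedged sketch: the `Accuracy: N%` format is taken from the sample output on this page, while the assumption that bench.py writes it to standard output, and the helper names `parse_accuracy` and `run_benchmark`, are not confirmed by the project.

```python
import re
import subprocess
import sys

# Matches a line such as "Accuracy: 100.0%" anywhere in the output.
ACCURACY_RE = re.compile(r"^Accuracy:\s*([0-9]+(?:\.[0-9]+)?)%\s*$", re.MULTILINE)


def parse_accuracy(output):
    """Return the last reported accuracy as a float, or None if absent."""
    matches = ACCURACY_RE.findall(output)
    return float(matches[-1]) if matches else None


def run_benchmark(script="bench.py"):
    """Run the benchmark with the current interpreter and return its accuracy."""
    # Assumption: the benchmark prints its score to standard output.
    result = subprocess.run([sys.executable, script], capture_output=True, text=True)
    return parse_accuracy(result.stdout)


# Demonstration on the sample output shown above (does not run bench.py).
sample = "=" * 80 + "\nAccuracy: 100.0%\n" + "=" * 80 + "\n"
print(parse_accuracy(sample))  # 100.0
```

Calling `run_benchmark()` from the repository root would return the parsed score, or None if no accuracy line was printed, which usually points to an import error earlier in the run.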
