The Halgorithm class is the core of the library: it handles document loading, chunking, claim extraction, and verification. Import it from Halgorithem.
Constructor
Number of sentences to include in each truth document chunk. Larger values give more context per chunk but reduce granularity.
Number of sentences to overlap between consecutive chunks. Overlap helps avoid splitting claims across chunk boundaries.
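To make the interaction between the two constructor settings concrete, here is a minimal sketch of overlapping sentence windows. The names chunk_size and overlap are hypothetical stand-ins for the two settings described above, and the stepping logic is an assumption rather than the library's actual implementation.

```python
def chunk_sentences(sentences, chunk_size=3, overlap=1):
    """Group sentences into overlapping windows.

    chunk_size and overlap are hypothetical names for the two constructor
    settings; the library's real parameter names may differ.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(sentences[start:start + chunk_size])
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(sentences):
            break
    return chunks


sentences = ["A.", "B.", "C.", "D.", "E."]
chunks = chunk_sentences(sentences, chunk_size=3, overlap=1)
```

With five sentences, a chunk size of 3, and an overlap of 1, the sentence "C." appears in both windows, so a claim that straddles the boundary can still be matched against a single chunk.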
Methods
compare_to_docs
Verifies ai_output against the provided truth documents. This is the primary verification method; it accepts pre-loaded document dicts, raw strings, or lists of strings.
The source-of-truth content to verify against. Accepts:
- A single string (treated as one inline document)
- A list of strings (each treated as a separate inline document)
- A list of dicts with file_id (int), file_path (str), and text (str) keys, the format returned by load_files()
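The three accepted shapes listed above could be coerced into a single list of document dicts along these lines. This is a sketch under stated assumptions, not the library's code: normalize_truth_docs is a hypothetical helper, and the "<inline>" label used as the file_path of inline strings is a placeholder invented for this example.

```python
def normalize_truth_docs(truth_docs):
    """Coerce a string, a list of strings, or a list of dicts into document dicts."""
    if isinstance(truth_docs, str):
        truth_docs = [truth_docs]  # a single string is one inline document

    required = {"file_id", "file_path", "text"}
    normalized = []
    for position, doc in enumerate(truth_docs, start=1):
        if isinstance(doc, str):
            # "<inline>" is a placeholder label chosen for this sketch.
            normalized.append(
                {"file_id": position, "file_path": "<inline>", "text": doc}
            )
        elif isinstance(doc, dict):
            missing = required - set(doc)
            if missing:
                raise ValueError(f"document dict is missing keys: {sorted(missing)}")
            normalized.append(doc)
        else:
            raise TypeError(f"unsupported document type: {type(doc).__name__}")
    return normalized


docs = normalize_truth_docs(["The sky is blue.", "Grass is green."])
```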
The AI-generated text to verify. Each sentence is extracted and checked independently.
Minimum cosine similarity score required to avoid a HALLUCINATION classification. Claims scoring between threshold and 0.65 are classified as WEAK_SUPPORT; claims scoring 0.65 or above are SUPPORTED.
compare_to_files
Loads the given files as truth documents and passes them to compare_to_docs. Use this when you have local source files rather than pre-loaded text.
List of file path strings to load as truth documents. Each file is read as UTF-8 text.
The AI-generated text to verify.
Minimum cosine similarity score to avoid a HALLUCINATION classification.
Returns results in the same format as compare_to_docs.
compare_with_reasoning
Same interface as compare_to_files: identical signature and return value. Provided for API compatibility.
List of file path strings to load as truth documents.
The AI-generated text to verify.
Minimum cosine similarity score to avoid a HALLUCINATION classification.
load_files
Loads files into the list-of-dicts format accepted by compare_to_docs.
List of file path strings to load.
- file_id (int): 1-indexed position of the document in the list.
- file_path (str): The original file path string as provided.
- text (str): Full UTF-8 text content of the file.
load_file
Path to the file to read. Raises FileNotFoundError if the path does not exist, and ValueError if the path is not a file.
Returns a str with the full UTF-8 content of the file.
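The documented behavior of load_file, and a load_files-style wrapper that produces the file_id, file_path, and text keys, can be approximated with pathlib. This is a sketch of the described contract only; the error messages and the internals are assumptions, not the library's source.

```python
import tempfile
from pathlib import Path


def load_file(file_path):
    """Return the UTF-8 text of file_path, validating the path first."""
    path = Path(file_path)
    if not path.exists():
        raise FileNotFoundError(f"File not found: {file_path}")
    if not path.is_file():
        raise ValueError(f"Not a file: {file_path}")
    return path.read_text(encoding="utf-8")


def load_files(file_paths):
    """Load several files into dicts with 1-indexed file_id values."""
    return [
        {"file_id": position, "file_path": file_path, "text": load_file(file_path)}
        for position, file_path in enumerate(file_paths, start=1)
    ]


# Demonstration with a throwaway file; the file name is arbitrary.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "facts.txt"
    sample.write_text("Water boils at 100 C at sea level.", encoding="utf-8")
    docs = load_files([str(sample)])
```

Checking exists() before is_file() keeps the two error cases distinct: a missing path raises FileNotFoundError, while a directory raises ValueError.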
print_report
List of claim result dicts returned by compare_to_docs or compare_to_files.
Returns None. Output format:
chunk_text
Splits a document's text into chunks. compare_to_docs calls it internally.
The document text to chunk.
Identifier to attach to all chunks from this document.
Human-readable label for the source (e.g. a file path or URL) attached to each chunk.
The doc_id value passed to chunk_text.
The source_name value passed to chunk_text.
1-indexed position of this chunk within the document.
1-indexed position of the first sentence in this chunk.
1-indexed position of the last sentence in this chunk.
The raw text of the chunk.
Lowercased tokens extracted from the chunk text, with punctuation and stop words removed (surface forms, not lemmatized).
Named entities (proper nouns, organizations, locations, etc.) found in the chunk.
Numeric values found in the chunk text.
Sentence embedding tensor produced by all-MiniLM-L6-v2 via sentence-transformers.
split_sentences
The text to split into sentences.
Returns a list[str] of cleaned, non-empty sentence strings.
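A regex-based splitter gives a rough idea of what "cleaned, non-empty sentence strings" might look like. The splitting rule here (break after ., !, or ? followed by whitespace) is an assumption for illustration; the library may use a different tokenizer.

```python
import re


def split_sentences(text):
    """Split text on sentence-ending punctuation and drop empty pieces."""
    # Collapse runs of whitespace (including newlines) into single spaces.
    collapsed = re.sub(r"\s+", " ", text).strip()
    # Break after '.', '!' or '?' when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", collapsed)
    return [part.strip() for part in parts if part.strip()]


result = split_sentences("Hello world.  How are you?\nFine!")
```

A rule this simple also splits after abbreviations such as "e.g.", so it is only suitable as an illustration.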
support_score
The claim sentence to score.
A chunk dict containing at minimum an embedding key (a sentence embedding tensor, as produced by chunk_text).
Returns a float in the range [-1.0, 1.0] representing the cosine similarity between the claim and the chunk. Higher values indicate stronger semantic alignment.
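For intuition, cosine similarity and the three labels described under compare_to_docs can be sketched in plain Python. The vectors below are toy lists rather than real embeddings, label_for_score is a hypothetical helper, and the threshold of 0.4 is an arbitrary example value, not a documented default.

```python
import math

SUPPORTED_CUTOFF = 0.65  # upper band boundary stated for compare_to_docs


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1.0, 1.0]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors, chosen for this sketch
    return dot / (norm_a * norm_b)


def label_for_score(score, threshold):
    """Map a similarity score onto the documented classification bands."""
    if score >= SUPPORTED_CUTOFF:
        return "SUPPORTED"
    if score >= threshold:
        return "WEAK_SUPPORT"
    return "HALLUCINATION"


score = cosine_similarity([1.0, 0.0], [1.0, 1.0])
label = label_for_score(score, threshold=0.4)  # 0.4 is a hypothetical threshold
```

Here the two toy vectors sit 45 degrees apart, so the score is about 0.707 and falls in the SUPPORTED band.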
is_meaningful_claim
The sentence to evaluate.
True if the claim is verifiable, False if it should be skipped.
classify_claim_type
The claim sentence to classify.
"MATH" if the claim contains an arithmetic expression (e.g. 2 + 2 = 4, 50%), or "SOURCE" otherwise.
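One way to approximate the MATH versus SOURCE distinction is a pair of regular expressions covering the two examples given above. The patterns are assumptions made for this sketch; the library's actual detection rules are not documented here and may be broader or narrower.

```python
import re

# A number, an operator or equals sign, then another number: "2 + 2", "3*4".
ARITHMETIC = re.compile(r"\d+(?:\.\d+)?\s*[-+*/=]\s*\d+(?:\.\d+)?")
# A number followed by a percent sign: "50%", "12.5 %".
PERCENT = re.compile(r"\d+(?:\.\d+)?\s*%")


def classify_claim_type(claim):
    """Return "MATH" for claims with arithmetic or percentages, else "SOURCE"."""
    if ARITHMETIC.search(claim) or PERCENT.search(claim):
        return "MATH"
    return "SOURCE"


kinds = [
    classify_claim_type("2 + 2 = 4"),
    classify_claim_type("Sales rose 50% last year."),
    classify_claim_type("Paris is the capital of France."),
]
```

Patterns this loose also match hyphenated numbers such as dates (2020-01), so a real classifier would need stricter rules.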