The Engine class in engine.py provides a high-level pipeline that combines web scraping, AI text generation via OpenAI, and claim verification. The module also exposes three module-level convenience functions that delegate to a shared _engine instance, so you can use the module directly without instantiating Engine yourself.
Engine and the module-level functions require a valid OPENAI_API_KEY environment variable. Set it before calling generate or run.
To change the default OpenAI model, set the OPENAI_MODEL environment variable before importing the module. If the variable is not set, the model defaults to "gpt-4o".
Constructor
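The constructor's default model follows the environment lookup described above, which is why the variable has to be set before the module is imported. A minimal sketch of that kind of fallback, assuming a plain os.environ read; the helper name resolve_default_model is hypothetical and is not part of engine.py:

```python
import os


def resolve_default_model(env=None):
    """Return the model named by OPENAI_MODEL, or "gpt-4o" when it is unset.

    `env` is a hypothetical hook for testing; by default the real process
    environment is consulted.
    """
    env = os.environ if env is None else env
    return env.get("OPENAI_MODEL", "gpt-4o")


print(resolve_default_model({}))                               # gpt-4o
print(resolve_default_model({"OPENAI_MODEL": "gpt-4o-mini"}))  # gpt-4o-mini
```

A value resolved once at import time will not pick up later changes to the environment, so set OPENAI_MODEL first and import afterwards.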
OpenAI model name to use for generation. Defaults to the value of the OPENAI_MODEL environment variable, falling back to "gpt-4o" if the variable is not set.
Number of sentences per chunk, passed directly to the underlying Halgorithm instance.
Number of sentences to overlap between consecutive chunks, passed to Halgorithm.
Stored on the instance but not currently forwarded to WebScraper. Web scraping uses a hardcoded 5-second timeout per request.
Methods
run
Raises ValueError if neither urls nor truth_file_paths produce any source documents.
The question or instruction to send to the OpenAI model.
List of URLs to scrape as truth sources. Pages are scraped to plain text before being passed to the model and verifier.
List of local file paths to load as truth sources. Can be combined with urls.
Minimum cosine similarity score passed through to Halgorithm.compare_to_docs.
List of claim result dicts, one per extracted claim. See claim result object reference.
Human-readable summary string, e.g. "3/4 supported, 1/4 weak, 0/4 contradictions, 0/4 hallucinations".
The raw text generated by the OpenAI model.
List of file paths and/or URLs that were used as truth sources.
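The summary string is written for people, but it can be turned into counts when a script needs them. The following is only a sketch: it assumes every summary follows the "N/T label" pattern of the example above, and parse_summary is a hypothetical helper, not something engine.py provides.

```python
import re


def parse_summary(summary):
    """Turn 'N/T label, N/T label, ...' into {label: (N, T)}."""
    counts = {}
    for part in summary.split(","):
        match = re.fullmatch(r"\s*(\d+)/(\d+)\s+(\w+)\s*", part)
        if match is None:
            # Fail loudly if the format differs from the documented example.
            raise ValueError(f"unrecognized summary segment: {part!r}")
        hits, total, label = match.groups()
        counts[label] = (int(hits), int(total))
    return counts


example = "3/4 supported, 1/4 weak, 0/4 contradictions, 0/4 hallucinations"
print(parse_summary(example))
# {'supported': (3, 4), 'weak': (1, 4), 'contradictions': (0, 4), 'hallucinations': (0, 4)}
```

Where the per-claim detail matters, the list of claim result dicts is the better source; the summary only carries totals.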
generate
Sends prompt to the OpenAI model, optionally grounded by source_docs. When source documents are provided, the model is instructed not to add facts beyond what is present in those documents.
The question or instruction to send to the model.
List of source document dicts with file_path and text keys, the format returned by scrape_urls() or load_truth_files(). When omitted, the model answers from its own training data.
Returns a str containing the model’s response text.
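To make the grounding idea concrete, here is one way a prompt could be assembled from source document dicts. This is an illustrative sketch only: the source does not show the wording Engine.generate actually sends to the model, and build_grounded_prompt and its instruction text are hypothetical.

```python
def build_grounded_prompt(prompt, source_docs=None):
    """Prefix the prompt with the source texts and a 'no extra facts' instruction.

    With no source documents the prompt is returned unchanged, mirroring the
    ungrounded case where the model answers from its own training data.
    """
    if not source_docs:
        return prompt
    # Each dict is assumed to carry the documented file_path and text keys.
    sections = [f"[{doc['file_path']}]\n{doc['text']}" for doc in source_docs]
    instruction = (
        "Answer using only the sources below. "
        "Do not add facts that are not present in them."
    )
    return instruction + "\n\n" + "\n\n".join(sections) + "\n\nQuestion: " + prompt


docs = [{"file_path": "notes.txt", "text": "The bridge opened in 1932."}]
print(build_grounded_prompt("When did the bridge open?", docs))
```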
verify
The AI-generated text to verify.
List of source document dicts (with file_id, file_path, and text keys) to verify against.
Minimum cosine similarity score to avoid a HALLUCINATION classification.
Returns claims (list of claim result dicts) and summary (str).
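The threshold rule can be illustrated with plain vectors. The sketch below assumes a claim is compared against its best-matching source chunk and is flagged when that score falls below the threshold; how Halgorithm actually embeds text and assigns the other labels is not shown in this page, and both function names and the 0.5 value are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def below_threshold(claim_vec, chunk_vecs, threshold):
    """True when even the best-matching chunk scores under the threshold."""
    best = max((cosine_similarity(claim_vec, v) for v in chunk_vecs), default=0.0)
    return best < threshold


claim = [1.0, 0.0]
print(below_threshold(claim, [[0.0, 1.0]], 0.5))              # True: no similar chunk
print(below_threshold(claim, [[0.0, 1.0], [1.0, 0.0]], 0.5))  # False: one chunk matches
```

Raising the threshold makes the check stricter, so more claims fall under it; lowering it does the opposite.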
scrape_urls
Scrapes the given URLs with WebScraper and returns them as source document dicts. Pages are written to a temporary directory during scraping and cleaned up automatically.
URLs to scrape. A warning is printed for any URL that fails to scrape; failed URLs are omitted from the result.
file_id is the 1-indexed position of the URL in the input list.
file_path is the original URL string.
text is the plain text content scraped from the page.
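The shape of the result, including the handling of failed URLs, can be sketched as follows. This assumes the three fields map onto the file_id, file_path, and text keys used elsewhere on this page; fetch_text is a hypothetical stand-in for WebScraper, whose interface is not shown here.

```python
def build_source_docs(urls, fetch_text):
    """Build source document dicts, skipping URLs whose fetch raises.

    `fetch_text` is any callable taking a URL and returning plain text.
    """
    docs = []
    for position, url in enumerate(urls, start=1):
        try:
            text = fetch_text(url)
        except Exception as exc:  # broad on purpose: any failure means "skip this URL"
            print(f"Warning: failed to scrape {url}: {exc}")
            continue
        # file_id keeps the position in the input list, so skipped URLs leave gaps.
        docs.append({"file_id": position, "file_path": url, "text": text})
    return docs


def fake_fetch(url):
    """Pretend scraper used only for this example."""
    if "bad" in url:
        raise RuntimeError("timeout")
    return f"text from {url}"


print(build_source_docs(["https://a.example", "https://bad.example", "https://c.example"], fake_fetch))
```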
load_truth_files
Loads local files through Halgorithm.load_files(). Returns the same format as scrape_urls so that both can be combined as truth sources.
File paths to load.
Returns a list of dicts with file_id, file_path, and text keys.
Module-level functions
The module exposes three top-level convenience functions that use a shared Engine() instance created at import time. You do not need to instantiate Engine to use them.
engine.run
Calls Engine.run() on the shared instance. Raises ValueError if no sources are provided.
The question or instruction for the model.
URLs to scrape as truth sources.
Local file paths to load as truth sources.
Minimum cosine similarity for claim classification.
Returns the same result as Engine.run().
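Because a call without usable sources raises ValueError, callers may want to catch it. The sketch below takes the run function as an argument so that it can be tried without network access or an API key; it assumes the result is a dict with a summary key and that urls and truth_file_paths are accepted as keyword arguments. summarize_or_explain and fake_run are hypothetical names.

```python
def summarize_or_explain(run, prompt, urls=None, truth_file_paths=None):
    """Return the summary string, or a short message when no sources were usable."""
    try:
        result = run(prompt, urls=urls, truth_file_paths=truth_file_paths)
    except ValueError as exc:
        return f"no usable sources: {exc}"
    return result["summary"]


def fake_run(prompt, urls=None, truth_file_paths=None):
    """Stand-in that mimics only the documented error behaviour."""
    if not urls and not truth_file_paths:
        raise ValueError("no source documents")
    return {"claims": [], "summary": "0/0 supported"}


print(summarize_or_explain(fake_run, "question"))                      # no usable sources: no source documents
print(summarize_or_explain(fake_run, "question", urls=["https://a"]))  # 0/0 supported
```

With the real module, the run function itself would be passed in place of fake_run.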
engine.generate
Collects sources from urls and/or truth_file_paths, then generates a grounded response. If no sources are provided, the model answers from its own training data.
The question or instruction for the model.
URLs to scrape as grounding context.
Local file paths to use as grounding context.
Returns a str with the model’s response.
engine.verify
Collects sources from urls and/or truth_file_paths, then verifies ai_output against them. Raises ValueError if no sources are provided.
The AI-generated text to verify.
URLs to scrape as truth sources.
Local file paths to use as truth sources.
Minimum cosine similarity for claim classification.
Returns claims and summary.