The recall endpoint retrieves structured memories from a bank using a multi-strategy pipeline. When you call recall, Hindsight runs four retrieval strategies in parallel: semantic similarity, keyword (BM25), graph traversal, and temporal. It fuses their rankings using Reciprocal Rank Fusion (RRF), then re-scores the merged candidates with a cross-encoder reranker. The response contains structured facts in relevance order, not raw documents.
To learn about the four retrieval strategies and RRF fusion in depth, see the Recall Architecture guide.
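For readers unfamiliar with RRF, the fusion step can be sketched in a few lines. This is a minimal illustration of the general technique, not Hindsight's implementation: the function name, the fact IDs, and the constant `k = 60` (a value commonly used with RRF) are assumptions, and the constant Hindsight actually uses is not stated here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID earns 1 / (k + rank) from every list it appears in (ranks are
    1-based). k = 60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, fact_id in enumerate(ranking, start=1):
            scores[fact_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    return sorted(scores, key=lambda fact_id: (-scores[fact_id], fact_id))


# Hypothetical per-strategy rankings (IDs are illustrative only).
semantic = ["f1", "f2", "f3"]
keyword = ["f2", "f1", "f4"]
graph = ["f2", "f5"]
temporal = ["f3", "f2"]

fused = reciprocal_rank_fusion([semantic, keyword, graph, temporal])
print(fused)
```

A fact that ranks reasonably well in several strategies (`f2` here) outranks one that tops a single list, which is the property that makes rank fusion useful when the strategies' raw scores are not comparable.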
Endpoint
Request parameters
Your tenant identifier. Use `default` for single-tenant deployments.
The memory bank to search.
The natural language question or statement to search for. Drives all four retrieval strategies simultaneously: embedded for semantic search, tokenized for BM25 keyword search, used to seed graph traversal, and parsed for temporal expressions. Also passed to the cross-encoder reranker. Queries exceeding 500 tokens are rejected.
Filters which categories of memory facts are searched. Accepted values: `world` (objective facts), `experience` (events and conversations), `observation` (deduplicated, evidence-grounded beliefs consolidated from multiple memories). When omitted, all three types are searched.
Controls retrieval depth and breadth. Accepted values: `low` (fast simple lookups), `mid` (balanced everyday queries), `high` (exhaustive coverage for complex questions requiring indirect connections).
Maximum number of tokens the returned facts can collectively occupy. Only the `text` field of each fact is counted. After reranking, facts are included in relevance order until this budget is exhausted. Set this to however much of your context window you want to allocate to memories.
An ISO 8601 datetime representing when the query is being asked. Used as the anchor for resolving relative temporal expressions in the query (e.g. "last month"). Without it, the server's current time is used. Most useful for replaying historical conversations or building time-anchored recall.
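The token budget behaviour described above (facts included in relevance order until the budget is exhausted, counting only each fact's `text`) can be sketched as follows. Two assumptions are made for illustration: tokens are approximated by whitespace splitting, since the server's tokenizer is not specified here, and packing stops at the first fact that would overflow the budget. The `id` field and the function names are hypothetical.

```python
def count_tokens(text):
    # Assumption: whitespace splitting stands in for the real tokenizer.
    return len(text.split())


def pack_facts(ranked_facts, max_tokens):
    """Keep facts in relevance order until the token budget is used up.

    Only the "text" field counts toward the budget. Stopping at the first
    fact that does not fit is an assumed policy, not confirmed behaviour.
    """
    selected = []
    used = 0
    for fact in ranked_facts:
        cost = count_tokens(fact["text"])
        if used + cost > max_tokens:
            break
        selected.append(fact)
        used += cost
    return selected


# Hypothetical reranked facts, most relevant first.
ranked = [
    {"id": "a", "text": "Alice works at Acme"},            # 4 tokens
    {"id": "b", "text": "Alice moved to Berlin in 2023"},  # 6 tokens
    {"id": "c", "text": "Bob likes tea"},                  # 3 tokens
]

print([fact["id"] for fact in pack_facts(ranked, max_tokens=10)])
```

Because the list is already in relevance order, lowering the budget trims the least relevant facts first.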
Filters recall to only memories matching the specified tags. Applied at the database level across all four retrieval strategies, not as a post-processing step.
Controls how tag filtering is applied. Accepted values:
| Mode | Untagged memories | Match condition |
|---|---|---|
| `any` | Included | Memory has at least one of the specified tags |
| `any_strict` | Excluded | Memory has at least one of the specified tags |
| `all` | Included | Memory has all of the specified tags |
| `all_strict` | Excluded | Memory has all of the specified tags |
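The four modes in the table can be expressed as a small predicate. This is a client-side sketch for reasoning about which memories a filter would select. The server applies tag filtering at the database level, so the function below (a hypothetical name) only mirrors the table and is not how Hindsight implements it.

```python
VALID_MODES = {"any", "any_strict", "all", "all_strict"}


def tag_matches(memory_tags, filter_tags, mode):
    """Return True if a memory with `memory_tags` passes the tag filter."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    memory = set(memory_tags)
    wanted = set(filter_tags)
    if not memory:
        # Untagged memories pass in the non-strict modes only.
        return not mode.endswith("_strict")
    if mode.startswith("any"):
        return bool(memory & wanted)  # at least one specified tag present
    return wanted <= memory           # every specified tag present


print(tag_matches([], ["work"], "any"))             # untagged, non-strict
print(tag_matches([], ["work"], "any_strict"))      # untagged, strict
print(tag_matches(["work"], ["work", "q3"], "all"))  # missing one tag
```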
Compound boolean tag filters. Groups in the list are AND-ed together at the top level. Each group is a recursive boolean expression: a leaf node `{"tags": [...], "match": "..."}`, or a compound node `{"and": [...]}`, `{"or": [...]}`, or `{"not": ...}`. Can be combined with `tags` and `tags_match`; the filters are AND-ed together.
Controls optional supplementary data returned alongside the main facts.
When `true`, the response includes a detailed debug trace covering the query embedding, per-strategy retrieval results, RRF fusion candidates, reranked results, temporal constraints detected, and per-phase timings. Has no effect on retrieval logic.
Response fields
The main list of recalled facts, ordered by relevance. Results do not include a numeric score — what matters is the relative ordering, already reflected in list order.
A dict keyed by fact ID containing full result objects for the source facts that contributed to observation results. Only present when `include.source_facts` is enabled. Facts are deduplicated across observations.
A dict keyed by chunk ID containing raw source text chunks. Only present when `include.chunks` is enabled. Each chunk has `id`, `text`, `chunk_index`, and `truncated` (whether the text was cut to fit the token budget).
A dict keyed by canonical entity name containing entity state objects. Only present when `include.entities` is enabled. Each entry has `entity_id`, `canonical_name`, and `observations`.
A debug object present only when `trace: true` was set in the request. Contains per-phase timings, retrieval breakdowns, and RRF fusion details.
Examples
Basic recall
Recall with budget levels
Tag-scoped recall
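To illustrate how a compound tag filter narrows recall, the sketch below evaluates the recursive filter grammar from the request parameters (leaf nodes with `tags` and `match`, compound nodes with `and`, `or`, `not`) against one memory's tags. It is a client-side approximation for reasoning about filters, not the server's implementation. It assumes a leaf's `match` accepts the same four modes as the tag match table, and the function names and tag values are hypothetical.

```python
def leaf_matches(memory_tags, tags, match):
    # Assumption: `match` uses the modes any, any_strict, all, all_strict.
    memory, wanted = set(memory_tags), set(tags)
    if not memory:
        return not match.endswith("_strict")
    if match.startswith("any"):
        return bool(memory & wanted)
    return wanted <= memory


def evaluate(node, memory_tags):
    """Recursively evaluate one filter node against a memory's tags."""
    if "and" in node:
        return all(evaluate(child, memory_tags) for child in node["and"])
    if "or" in node:
        return any(evaluate(child, memory_tags) for child in node["or"])
    if "not" in node:
        return not evaluate(node["not"], memory_tags)
    return leaf_matches(memory_tags, node["tags"], node["match"])


def matches_groups(groups, memory_tags):
    # Groups in the list are AND-ed together at the top level.
    return all(evaluate(group, memory_tags) for group in groups)


# Hypothetical filter: (project:alpha OR project:beta) AND NOT archived.
groups = [
    {"or": [
        {"tags": ["project:alpha"], "match": "any_strict"},
        {"tags": ["project:beta"], "match": "any_strict"},
    ]},
    {"not": {"tags": ["archived"], "match": "any_strict"}},
]

print(matches_groups(groups, ["project:alpha"]))
print(matches_groups(groups, ["project:alpha", "archived"]))
```

Using the strict modes inside the leaves keeps untagged memories out of the `or` branch. With non-strict modes, an untagged memory would satisfy every leaf.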
Recall observations with source facts
Error codes
| Status | Code | Description |
|---|---|---|
| 400 | `invalid_request` | Malformed request body or query exceeds 500 tokens. |
| 401 | `unauthorized` | Missing or invalid API key. |
| 404 | `bank_not_found` | The specified bank does not exist. |
| 422 | `validation_error` | One or more parameters failed validation. |
| 429 | `rate_limited` | Too many requests. Retry with exponential backoff. |
| 500 | `internal_error` | Server error during retrieval. |
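The exponential backoff recommended for `429` responses might look like the sketch below. It is transport-agnostic: `send` is a hypothetical callable that performs one request and returns a `(status, body)` tuple, so no particular HTTP client or endpoint path is assumed. The delay values, the attempt limit, and the added jitter are illustrative choices, not documented server requirements.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call `send()` and retry while it reports HTTP 429.

    The wait doubles after each rate-limited attempt (capped at max_delay),
    plus random jitter so that many clients do not retry in lockstep.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(delay + random.uniform(0, delay))
        status, body = send()
    return status, body


# Simulated server: rate-limited twice, then succeeds. Delays are recorded
# instead of slept so the example runs instantly.
responses = iter([(429, None), (429, None), (200, "ok")])
delays = []
status, body = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(status, body, len(delays))
```

Other error codes in the table are returned to the caller unchanged, since retrying a `400` or `404` without changing the request would fail again.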
