The search service exposes the query interface of the GuancheData platform. It joins the Hazelcast cluster as a full member (not a lightweight client), giving it direct local access to index data. Incoming queries are tokenized, fanned out across all query terms in parallel using CompletableFuture, and then intersected to enforce AND semantics. Results are enriched with book metadata, filtered by optional author, language, and year criteria, and finally sorted before being returned as JSON.
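As a rough illustration of the tokenize, fan-out, and intersect stages described above, the following Java sketch runs the same steps against an in-memory map. The class name SearchPipelineSketch, the plain Map standing in for the Hazelcast IMap, the ASCII-only tokenizer, and the clamp of the pool size to at least one thread are all assumptions of this sketch, not details taken from the service's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SearchPipelineSketch {

    // Lowercase, strip non-alphanumeric characters (ASCII only, an assumption), split on whitespace.
    static List<String> tokenize(String query) {
        String cleaned = query.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9\\s]", "");
        List<String> terms = new ArrayList<>();
        for (String term : cleaned.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // index maps a term to "docId:frequency" postings; it stands in for the "inverted-index" IMap.
    // Assumes each document appears at most once per term.
    static Map<Integer, Integer> search(String query, Map<String, List<String>> index) {
        List<String> terms = tokenize(query);
        if (terms.isEmpty()) {
            return Map.of();
        }
        // The service sizes its pool to availableProcessors - 3; the clamp to 1 is an assumption
        // so that the sketch also runs on machines with three or fewer cores.
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors() - 3);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<Integer, Integer> frequency = new ConcurrentHashMap<>();
        Map<Integer, Integer> matchedTerms = new ConcurrentHashMap<>();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (String term : terms) {
                // One future per query term.
                futures.add(CompletableFuture.runAsync(() -> {
                    for (String posting : index.getOrDefault(term, List.of())) {
                        String[] parts = posting.split(":");
                        int docId = Integer.parseInt(parts[0]);
                        int freq = Integer.parseInt(parts[1]);
                        frequency.merge(docId, freq, Integer::sum); // summed frequency per document
                        matchedTerms.merge(docId, 1, Integer::sum); // number of terms that matched
                    }
                }, pool));
            }
            // Wait for every lookup to finish before intersecting.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        // AND semantics: keep only documents matched by every query term.
        frequency.keySet().removeIf(docId -> matchedTerms.get(docId) < terms.size());
        return frequency;
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = Map.of(
                "whale", List.of("1:3", "2:1"),
                "sea", List.of("1:2", "3:4"));
        System.out.println(search("Whale, SEA!", index)); // prints {1=5}
    }
}
```

Only document 1 contains both terms, so it is the only survivor, with a summed frequency of 3 + 2.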
Query execution flow
Parse the HTTP request
SearchRequestMapper reads the required q query parameter and the optional author, language, and year parameters from the GET /search request, producing a SearchCriteria object. A missing or blank q returns HTTP 400.
Tokenize the query
ContentSearchEngine lowercases the query string, strips all non-alphanumeric characters, and splits on whitespace to produce an array of terms.
Fan out parallel index lookups
One CompletableFuture per query term is submitted to a fixed thread pool (sized to availableProcessors - 3). Each future calls IndexStore.getDocuments(term) against the "inverted-index" IMap and parses the docId:frequency entries it receives.
Aggregate and intersect
As futures complete, per-document frequencies are summed into a ConcurrentHashMap. A second map tracks how many query terms matched each document. Once the future returned by CompletableFuture.allOf() completes, documents that did not match every query term are removed, enforcing AND semantics.
Apply metadata filters
FindBooks fetches BookMetadata for all surviving document IDs from the "bookMetadata" IMap, then discards any document whose author, language, or year does not match the filter values supplied in the request.
Sorting strategies
The active sorting strategy is selected at startup from the SORTING_CRITERIA environment variable (default: frequency).
| Value | Strategy | Behavior |
|---|---|---|
| frequency | SortByFrequency | Sorts results in descending order of summed term frequency across all query terms, so books that contain the query terms more often appear first. |
| id | SortById | Sorts results in ascending order of Gutenberg book ID. |

If SORTING_CRITERIA is set to an unrecognised value, SortByFrequency is used as the fallback.
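The selection rule above, including the fallback, could look roughly like the following sketch. The Result record, the strategyFor method, and the use of plain Comparator instances in place of the SortByFrequency and SortById classes are hypothetical; exact, case-sensitive matching of the variable's value is also an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortingSketch {

    // Hypothetical result type: Gutenberg book ID plus summed term frequency.
    record Result(int bookId, int frequency) {}

    // Maps a SORTING_CRITERIA value to a comparator.
    static Comparator<Result> strategyFor(String criteria) {
        if ("id".equals(criteria)) {
            return Comparator.comparingInt(Result::bookId); // ascending by book ID
        }
        // "frequency", an unset variable, and any unrecognised value all fall back here.
        return Comparator.comparingInt(Result::frequency).reversed(); // descending by frequency
    }

    public static void main(String[] args) {
        // Read once at startup; null when the variable is not set.
        Comparator<Result> strategy = strategyFor(System.getenv("SORTING_CRITERIA"));
        List<Result> results = new ArrayList<>(List.of(
                new Result(84, 2), new Result(11, 9), new Result(1342, 5)));
        results.sort(strategy);
        System.out.println(results);
    }
}
```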
Near-cache
The "inverted-index" IMap is configured with a near-cache named "inverted-index-near-cache" with invalidateOnChange: true. Frequently accessed index entries are served from a local in-process cache, avoiding a network hop to the partition owner. When an indexer node updates a term's entry in the distributed map, the corresponding near-cache entry on every search node is invalidated automatically, so search nodes do not keep serving stale data.
Response shape
In a successful search response, frequency is the sum of per-term frequencies across all terms in the query for that document, and filters contains only the keys that were supplied in the request. On error, the response is {"status": "error", "message": "..."}.
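The two rules about filters and the error body can be illustrated with a small sketch. The class and method names are hypothetical, treating a blank value as "not supplied" is an assumption, and serialising the maps to JSON is left to whatever JSON library the service uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ResponseSketch {

    // Builds the "filters" object: only keys the caller actually supplied are included.
    static Map<String, String> suppliedFilters(String author, String language, String year) {
        Map<String, String> filters = new LinkedHashMap<>();
        putIfSupplied(filters, "author", author);
        putIfSupplied(filters, "language", language);
        putIfSupplied(filters, "year", year);
        return filters;
    }

    // Assumption: null and blank values both count as "not supplied".
    private static void putIfSupplied(Map<String, String> target, String key, String value) {
        if (value != null && !value.isBlank()) {
            target.put(key, value);
        }
    }

    // Builds the error body described above: {"status": "error", "message": "..."}.
    static Map<String, String> errorBody(String message) {
        Map<String, String> body = new LinkedHashMap<>();
        body.put("status", "error");
        body.put("message", message);
        return body;
    }

    public static void main(String[] args) {
        System.out.println(suppliedFilters(null, "en", null)); // prints {language=en}
        System.out.println(errorBody("q is required"));        // prints {status=error, message=q is required}
    }
}
```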
HTTP endpoints
| Method | Path | Description |
|---|---|---|
| GET | /search?q=... | Full-text search. Optional parameters: author, language, year. |
| GET | /health | Returns {"status": "healthy", "service": "execute"}. |
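A client call to the search endpoint might be put together as in the sketch below, using the JDK's built-in HTTP client. The base URL http://localhost:8080 is a placeholder, since the host and port are not stated here, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class SearchClientSketch {

    // Builds GET /search?q=... with URL-encoded values; filters may hold author, language, year.
    static URI buildSearchUri(String baseUrl, String q, Map<String, String> filters) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/search?q=")
                .append(URLEncoder.encode(q, StandardCharsets.UTF_8));
        for (Map.Entry<String, String> filter : filters.entrySet()) {
            url.append('&').append(filter.getKey()).append('=')
               .append(URLEncoder.encode(filter.getValue(), StandardCharsets.UTF_8));
        }
        return URI.create(url.toString());
    }

    // Sends the request and returns the raw JSON body; not called from main, since it needs a running service.
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Placeholder base URL; adjust to wherever the search service is reachable.
        URI uri = buildSearchUri("http://localhost:8080", "white whale", Map.of("language", "en"));
        System.out.println(uri); // prints http://localhost:8080/search?q=white+whale&language=en
    }
}
```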