The index-document endpoint (POST /index/document/{documentId}) triggers indexing of a single book on demand. The indexing service reads the raw book text from the Hazelcast "datalake" distributed map, tokenizes it, writes the resulting token data into the "inverted-index" map, and saves the parsed metadata. The book must already exist in the datalake before you call this endpoint; that is, the ingestion pipeline must have ingested it successfully.
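The data flow described above can be sketched in a few lines. This is a minimal illustration, not the service's actual code: plain Python dicts stand in for the Hazelcast "datalake" and "inverted-index" maps, and the lowercase-word tokenizer, the `index_document` function name, and the term-to-document-IDs layout are all assumptions made for the example. Metadata parsing is omitted.

```python
import re

# Plain dicts stand in for the Hazelcast "datalake" and "inverted-index" maps.
datalake = {2701: "Call me Ishmael. Call me again."}
inverted_index = {}

def tokenize(text):
    """Assumed tokenizer: lowercase runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())

def index_document(document_id):
    """Read one book from the datalake and record its terms in the index."""
    text = datalake[document_id]  # raises KeyError if the book was never ingested
    for term in tokenize(text):
        # Hypothetical layout: each term maps to the set of books containing it.
        inverted_index.setdefault(term, set()).add(document_id)

index_document(2701)
print(sorted(inverted_index))  # ['again', 'call', 'ishmael', 'me']
```

The real service may store positions, frequencies, or a different structure per term; the sketch only shows the read, tokenize, write sequence.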

Path parameters

documentId
integer
required
The integer Gutenberg book ID to index. This must match a key that exists in the Hazelcast "datalake" IMap, populated during ingestion.

Idempotency

The indexing service maintains a distributed "indexingRegistry" ISet that tracks which document IDs have already been processed. If documentId is already in the registry, the service skips indexing and returns HTTP 200 without modifying any data, so the call is safe to repeat for the same document.

The book must exist in the datalake before this endpoint is called. If the document has not been ingested, the request fails with a 500 error.
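The registry check can be pictured roughly as follows. This is a hedged sketch under stated assumptions: a Python set plays the role of the "indexingRegistry" ISet, the `handle_index_request` name and the `index_runs` list are hypothetical, and the order of the checks and the moment the ID is added to the registry are guesses rather than documented behavior.

```python
# In-memory stand-ins for the distributed structures named above.
datalake = {2701: "raw book text"}
indexing_registry = set()   # plays the role of the "indexingRegistry" ISet
index_runs = []             # records each time real indexing work happens

def handle_index_request(document_id):
    """Return an HTTP-style status code for one indexing request."""
    if document_id in indexing_registry:
        return 200          # already processed: skip, change nothing
    if document_id not in datalake:
        return 500          # not ingested yet
    index_runs.append(document_id)      # placeholder for tokenizing and writing
    indexing_registry.add(document_id)  # assumed: registered after indexing
    return 200

print(handle_index_request(2701), handle_index_request(2701), index_runs)
# 200 200 [2701]
print(handle_index_request(9999))
# 500
```

The second call for 2701 returns 200 but does no work, which is the behavior that makes retries safe.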

Request

curl -X POST http://localhost:7002/index/document/2701

Response

HTTP 200 — success

{
  "status": "success",
  "message": "Document indexed successfully",
  "documentId": 2701
}

HTTP 400 — invalid document ID

Returned when documentId cannot be parsed as an integer.
{
  "status": "error",
  "message": "Invalid ID format. Must be an integer."
}

HTTP 500 — internal error

Returned when indexing fails, for example, if the document is not present in the datalake.
{
  "status": "error",
  "message": "Document not found in datalake"
}
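A client might handle the three documented responses along these lines. This is a sketch using only Python's standard library; the host and port come from the curl example above, while the `describe` and `index_document` helper names and the summary strings are hypothetical.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:7002"  # host and port from the curl example

def describe(status, body):
    """Turn a status code and JSON body into a one-line summary."""
    payload = json.loads(body)
    message = payload.get("message", "")
    if status == 200:
        return f"ok: {message}"
    if status == 400:
        return f"bad request: {message}"
    return f"indexing failed ({status}): {message}"

def index_document(document_id):
    """POST to the endpoint and summarize the outcome."""
    request = urllib.request.Request(
        f"{BASE_URL}/index/document/{document_id}", method="POST"
    )
    try:
        with urllib.request.urlopen(request) as response:
            return describe(response.status, response.read())
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the error object still carries the body.
        return describe(err.code, err.read())
```

With the indexing service running locally, `print(index_document(2701))` would print a summary built from whichever of the responses above the service returns.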
