A full index rebuild wipes all distributed index state and reconstructs the inverted index from the raw files stored in the datalake filesystem. You would trigger one after recovering from severe data corruption, after restoring a datalake backup that predates the current in-memory state, or after a configuration change that requires re-tokenizing all documents with a different strategy. On normal startup, the indexing service also performs an automatic recovery pass using the same underlying mechanism; no manual trigger is needed for that case.
Automatic startup recovery
When an indexing node starts, ReindexingExecutor.executeRecovery() is called automatically. It invokes InvertedIndexRecovery, which walks the local datalake/ filesystem, reads every {id}_header.txt and {id}_body.txt pair, saves the content back to Hazelcast, and runs the indexing use case for each book. It returns the highest book ID found, which is then used to seed the "books" IQueue so that ingestion resumes from that point forward rather than re-crawling already-stored books.
This recovery runs on each node independently at startup and does not require coordination. It is not a full cluster-wide rebuild — it only indexes the files that are locally present on that node’s datalake volume.
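The recovery pass described above can be sketched roughly as follows. Class and method names follow the text where they appear in it; the file-walking details, the regex, and the stubbed save-and-index call are illustrative assumptions, not the actual implementation (the real code stores BookContent in Hazelcast and runs the indexing use case).

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.regex.*;
import java.util.stream.Stream;

// Illustrative sketch of the startup recovery walk. The real
// InvertedIndexRecovery saves into Hazelcast and runs the indexing
// use case; both are stubbed here.
public class RecoverySketch {

    // Matches files named like "123_body.txt" and captures the book ID.
    private static final Pattern BODY_FILE = Pattern.compile("(\\d+)_body\\.txt");

    public static long executeRecovery(Path datalakeRoot) throws IOException {
        long maxBookId = 0;
        try (Stream<Path> files = Files.walk(datalakeRoot)) {
            for (Path body : (Iterable<Path>) files::iterator) {
                Matcher m = BODY_FILE.matcher(body.getFileName().toString());
                if (!m.matches()) continue;

                long bookId = Long.parseLong(m.group(1));
                // The header lives next to the body file.
                Path header = body.resolveSibling(bookId + "_header.txt");

                // Stand-in for: bookStore.save(...) into Hazelcast,
                // then indexBookUseCase.execute(bookId).
                saveAndIndex(bookId, Files.readString(header), Files.readString(body));

                maxBookId = Math.max(maxBookId, bookId);
            }
        }
        return maxBookId; // later used to seed the "books" IQueue
    }

    private static void saveAndIndex(long id, String header, String bodyText) {
        // stub
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("datalake");
        Files.writeString(root.resolve("7_header.txt"), "Title: Example");
        Files.writeString(root.resolve("7_body.txt"), "body text");
        System.out.println(executeRecovery(root)); // prints 7
    }
}
```

Because the walk is purely local, two nodes with different datalake volumes will recover different book sets, which is why this pass is per-node rather than cluster-wide.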
Manual coordinated rebuild
To trigger a full coordinated rebuild across all active indexing nodes, send a POST request to any indexing service.

Rebuild flow
1. Pause ingestion
CoordinateRebuild.execute() counts the number of active cluster members with the role=indexer attribute, then publishes an INGESTION_PAUSE command to the ingestion.control ActiveMQ topic. All ingestion nodes receive this command (via their durable subscribers) and stop emitting new indexing events.

2. Size the coordination latch
A Hazelcast CP CountDownLatch named "rebuild-latch" is created (or reset) with a count equal to the number of active indexer nodes. This ensures the coordinator waits for every participating node to finish before resuming ingestion.

3. Broadcast the rebuild command
CoordinateRebuild publishes a {"epoch": <timestampMs>} JSON message to the ActiveMQ topic index.rebuild.command. Every indexing node receives this message via its RebuildMessageListener (each subscribes with a unique UUID-based client ID).

4. Each node waits for cluster sync, then clears and re-indexes
On receipt of the RebuildCommand, each RebuildMessageListener waits 10 seconds for cluster state to stabilize, then calls ReindexingExecutor.rebuildIndex(). That method:

- Stops the queue population loop.
- Clears the "log", "indexingRegistry", "inverted-index", "bookMetadata", and "books" distributed structures.
- Resets the "queueInitialized" CP atomic long to 0.
- Calls executeRecovery(), which walks the datalake filesystem and re-indexes every book found.
- Calls countDown() on the "rebuild-latch" CP latch.

5. Coordinator waits for all nodes
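The coordinator side of the handshake described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: a plain java.util.concurrent.CountDownLatch stands in for Hazelcast's CP "rebuild-latch" (in the real code obtained from the CP subsystem), and the ActiveMQ pause, broadcast, and resume calls are stubbed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Coordinator-side sketch. A plain JDK CountDownLatch stands in for
// the Hazelcast CP "rebuild-latch"; messaging calls are stubbed.
public class CoordinateRebuildSketch {

    public static boolean execute(int indexerCount, CountDownLatch rebuildLatch)
            throws InterruptedException {
        pauseIngestion();          // INGESTION_PAUSE on ingestion.control
        // rebuildLatch is assumed to be sized to indexerCount by the caller
        broadcastRebuildCommand(); // {"epoch": ...} on index.rebuild.command

        // Block until every node has called countDown(), up to one hour.
        boolean allDone = rebuildLatch.await(1, TimeUnit.HOURS);
        if (allDone) {
            resumeIngestion();
        } else {
            System.err.println("REBUILD TIMEOUT. Ingestion remains paused.");
        }
        return allDone;
    }

    private static void pauseIngestion() { /* stub: ActiveMQ publish */ }
    private static void broadcastRebuildCommand() { /* stub: ActiveMQ publish */ }
    private static void resumeIngestion() { /* stub: ActiveMQ publish */ }

    public static void main(String[] args) throws InterruptedException {
        // Simulate two indexer nodes finishing their rebuild.
        CountDownLatch latch = new CountDownLatch(2);
        new Thread(() -> { latch.countDown(); latch.countDown(); }).start();
        System.out.println(execute(2, latch)); // prints true
    }
}
```

The timed await is what gives the flow its failure mode: a crashed node never counts down, so the boolean result distinguishes a completed rebuild from a timeout.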
The coordinator thread (started in step 1) blocks on latch.await(1, TimeUnit.HOURS). It waits until all indexing nodes have counted down, confirming that the full cluster has finished rebuilding.

If a node crashes mid-rebuild, it will never count down, and the coordinator will time out after one hour, logging "REBUILD TIMEOUT. Ingestion remains paused." at ERROR level. If this happens, manually resume ingestion by restarting the crashed node (which will re-execute its rebuild on startup and count down the latch), or by restarting the ingestion service containers directly.

How InvertedIndexRecovery reads the datalake
InvertedIndexRecovery walks the entire directory tree rooted at the datalake path. For each file whose name ends with _body.txt, it:
- Extracts the numeric book ID from the filename.
- Resolves the corresponding {id}_header.txt in the same directory.
- Reads both files and saves the BookContent to Hazelcast via bookStore.save().
- Calls indexBookUseCase.execute(bookId) to tokenize and write posting-list entries into the inverted-index IMap.
- Tracks the maximum book ID seen across all files.

The maximum book ID is returned to ReindexingExecutor, which passes it to IngestionQueueManager.setupBookQueue() to populate the "books" IQueue starting from that ID. Books with IDs below the maximum are assumed to already be on disk and are skipped.
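The queue seeding described above can be illustrated with a plain java.util.Queue standing in for the Hazelcast "books" IQueue. The batch size and the choice to start at maxBookId + 1 (rather than re-enqueueing the maximum itself) are assumptions made for this sketch, not details confirmed by the source.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of seeding the "books" queue after recovery.
// An ArrayDeque stands in for the Hazelcast IQueue; batchSize and
// the "maxBookId + 1" starting point are assumptions.
public class QueueSeedingSketch {

    public static Queue<Long> setupBookQueue(long maxBookId, int batchSize) {
        Queue<Long> books = new ArrayDeque<>();
        // IDs up to maxBookId are already on disk and skipped; enqueue
        // the next IDs so ingestion resumes where the datalake left off.
        for (long id = maxBookId + 1; id <= maxBookId + batchSize; id++) {
            books.add(id);
        }
        return books;
    }

    public static void main(String[] args) {
        System.out.println(setupBookQueue(7, 3)); // prints [8, 9, 10]
    }
}
```

Seeding from the recovered maximum is what lets ingestion resume forward after a rebuild instead of re-crawling books that recovery has already re-indexed.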