GuancheData ships a dedicated benchmark service in docker-compose.yml that measures three core performance dimensions: how fast the system ingests documents, how fast it indexes tokens, and how quickly the cluster recovers after a node failure. A fourth dimension, query latency under concurrent load, is measured separately using Apache JMeter. Each benchmark mode is controlled by a single environment variable and runs as a Docker Compose profile, making it easy to reproduce experiments at different cluster sizes.
Starting the benchmark service
Set BENCHMARK_MODE in docker-compose.yml before starting the benchmark container.
The benchmark runs to completion on its own (except recoverytime, which runs indefinitely and waits for a node to be removed).
Benchmark modes
- ingestionrate
- indexingthroughput
- recoverytime
Measures how many books per second the ingestion service downloads and stores in the datalake.
Mechanism: IngestionRate connects to the cluster as a Hazelcast client (not a full member). It accesses the "log" ISet, a BookDownloadLog that ingestion nodes append to as each book is stored. On each iteration, it reads the current size of the ISet, waits one second, reads the size again, and divides the difference by the elapsed time to compute a rate in books/second.
Iterations: 15 total, consisting of 5 warmup iterations (discarded) followed by 10 measured iterations. The warmup period allows the ingestion pipeline to reach steady-state throughput before results are recorded.
Output: the IngestionRate mean and standard deviation over the measured iterations.
What to vary: Run with 1, 2, and 4 ingestion containers to observe horizontal scalability. Note the back-pressure effect of INDEXING_BUFFER_FACTOR: if indexers are slower than ingestion, the measured rate will plateau even as ingestion nodes are added.
Full benchmarking workflow
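The ingestionrate sampling procedure (size deltas over one-second windows, 5 warmup iterations discarded, 10 measured) can be sketched as follows. This is a minimal illustration in Python, not the benchmark's actual code: `read_size` is a hypothetical callable standing in for a read of the "log" ISet size, the function names are invented here, and the use of the sample standard deviation is an assumption.

```python
import statistics
import time
from typing import Callable, List, Tuple

WARMUP_ITERATIONS = 5     # discarded, per the mode description
MEASURED_ITERATIONS = 10  # kept for the final statistics


def sample_rates(
    read_size: Callable[[], int],            # hypothetical: returns current log size
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Return one books/second rate per measured (non-warmup) iteration."""
    rates: List[float] = []
    for i in range(WARMUP_ITERATIONS + MEASURED_ITERATIONS):
        start_size, start_t = read_size(), clock()
        sleep(interval_s)
        end_size, end_t = read_size(), clock()
        # Divide by the elapsed time actually observed, not the nominal interval.
        rate = (end_size - start_size) / (end_t - start_t)
        if i >= WARMUP_ITERATIONS:  # skip warmup samples
            rates.append(rate)
    return rates


def summarize(rates: List[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation (assumed; the source does not say which)."""
    return statistics.mean(rates), statistics.stdev(rates)
```

Passing the clock and sleep functions as parameters keeps the sketch testable without waiting 15 real seconds. Dividing by the measured elapsed time rather than a fixed 1.0 keeps the rate honest when the size reads themselves take noticeable time.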
Build and deploy the cluster
From the repository root, build all service JARs and start the full stack on the main node. Wait until all containers are healthy and the Hazelcast cluster reports all members joined.
Allow the datalake to populate
Let the ingestion service run for several minutes to download and store a representative number of books. A larger datalake produces more stable benchmark results and makes recovery time measurements more meaningful, because there is more partition data to rebalance.
Run ingestion rate benchmark
Set BENCHMARK_MODE: ingestionrate in docker-compose.yml and start the benchmark. Record the final IngestionRate mean and standard deviation. Then scale to 2 ingestion nodes by starting the backend profile on a second machine and repeat the run.
Run indexing throughput benchmark
Update BENCHMARK_MODE: indexingthroughput and restart the benchmark container. Each iteration takes 10 seconds; the full run completes in approximately 150 seconds.
Run recovery time benchmark
Update BENCHMARK_MODE: recoverytime and restart the benchmark container. Once the benchmark logs CLUSTER IS SAFE, simulate a failure by stopping one of the node containers. Read the recovery time from the benchmark logs. Restart the stopped container and wait for the cluster to stabilize before triggering the next failure.
Run query latency benchmark with JMeter
With the cluster running and the index populated, open Apache JMeter and load the test plan. Configure the thread count to simulate the desired number of concurrent users and run the test against the Nginx endpoint (http://<NGINX_IP>:8080). JMeter reports mean latency, percentile latencies (p95, p99), and throughput in requests/second. The /benchmarks directory also contains previous result datasets and logs for comparison.
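To compare runs outside the JMeter GUI, the same summary figures can be recomputed from a CSV results file. The sketch below is an illustration under stated assumptions: it expects JMeter's CSV result columns `timeStamp` (request start, in milliseconds) and `elapsed` (response time, in milliseconds), it uses a nearest-rank percentile, and the function names are invented here. JMeter's own percentile method may differ slightly, so small discrepancies are expected.

```python
import csv
import io
import math
import statistics
from typing import Dict, List


def percentile(sorted_values: List[float], p: float) -> float:
    """Nearest-rank percentile of an ascending list; p in (0, 100]."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def summarize_results(csv_text: str) -> Dict[str, float]:
    """Mean, p95, p99 latency (ms) and throughput (requests/second)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    elapsed = sorted(float(r["elapsed"]) for r in rows)
    starts = [int(r["timeStamp"]) for r in rows]
    ends = [int(r["timeStamp"]) + int(r["elapsed"]) for r in rows]
    # Wall-clock span from the first request start to the last response end.
    duration_s = (max(ends) - min(starts)) / 1000
    return {
        "mean_ms": statistics.mean(elapsed),
        "p95_ms": percentile(elapsed, 95),
        "p99_ms": percentile(elapsed, 99),
        "throughput_rps": len(rows) / duration_s if duration_s > 0 else float("nan"),
    }
```

Calling `summarize_results(open(path).read())` on the results file from each thread-count setting gives one row per concurrency level, which makes it easier to see where latency starts to climb.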