All runtime configuration for GuancheData is supplied through environment variables defined in docker-compose.yml. There is no external configuration file to edit beyond the Compose file itself and nginx.conf. Each service reads its variables at container startup, so changes require a container restart. The sections below document every variable for each service, including defaults, accepted values, and the effect of tuning key parameters.
ingestion-service
The ingestion service crawls documents and writes them to a replicated datalake. It also publishes indexing events to ActiveMQ and participates in the Hazelcast cluster.
Hazelcast member port this container listens on. Must match the port in HZ_PUBLIC_ADDRESS and be exposed via Docker’s ports mapping. Typical value: 5701.
Publicly advertised Hazelcast address in host:port format. Other cluster members use this address to reach this node. Set to the host machine’s LAN IP and the same port as HZ_PORT (e.g. 192.168.1.10:5701).
Comma-separated list of seed member addresses for Hazelcast cluster discovery (e.g. 192.168.1.10:5701). All services across all nodes should point to the same seed, typically the ingestion-service on the main node.
Logical name of the Hazelcast cluster. All services in the cluster must share the same value. Change this only if you need to run isolated clusters on the same network.
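The coupling between HZ_PORT and HZ_PUBLIC_ADDRESS is easy to break when editing the Compose file by hand. The Python sketch below is not part of GuancheData; it is a hypothetical pre-flight check (the function name check_public_address is an assumption made for illustration) that verifies the advertised address is in host:port format and carries the same port as HZ_PORT.

```python
def check_public_address(hz_port: str, hz_public_address: str) -> None:
    """Raise ValueError if HZ_PUBLIC_ADDRESS disagrees with HZ_PORT.

    Hypothetical helper: both arguments are the raw strings that would be
    assigned to the HZ_PORT and HZ_PUBLIC_ADDRESS environment variables.
    """
    # Split on the last colon so the host part may itself contain dots.
    host, sep, port = hz_public_address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(
            f"HZ_PUBLIC_ADDRESS must be host:port, got {hz_public_address!r}"
        )
    # int() also raises ValueError if HZ_PORT itself is not numeric.
    if int(port) != int(hz_port):
        raise ValueError(
            f"port in HZ_PUBLIC_ADDRESS ({port}) differs from HZ_PORT ({hz_port})"
        )


# Passes silently: the advertised port equals the listening port.
check_public_address("5701", "192.168.1.10:5701")
```

Running such a check before starting the containers would surface a mismatch early, rather than leaving other members unable to reach the node.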
ActiveMQ connection URL. In a single-node deployment the default tcp://activemq:61616 resolves via the Docker network. In a multi-node deployment, replace activemq with the IP of the node running the broker profile (e.g. tcp://192.168.1.10:61616).
Number of filesystem replicas written for each ingested document in the datalake. A value of 2 means each document is stored on two nodes. Higher values improve fault tolerance at the cost of storage and write latency. Must be less than or equal to the number of ingestion-service instances in the cluster.
Maximum number of datalake entries buffered per indexer before the ingestion service pauses and waits for the indexers to catch up. Lower values reduce memory pressure on indexers under burst load; higher values allow more ingestion parallelism at the cost of increased queue depth.
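The replica-count constraint and the buffering limit can be illustrated with a short Python sketch. This is not the service's actual code: the function names, the per-indexer backlog mapping, and the rule that ingestion pauses as soon as any single indexer reaches the limit are all assumptions made for illustration.

```python
from typing import Mapping


def check_replication_factor(replication_factor: int, ingestion_instances: int) -> None:
    """Enforce the documented rule: replicas <= ingestion-service instances."""
    if replication_factor > ingestion_instances:
        raise ValueError(
            f"replication factor {replication_factor} exceeds the "
            f"{ingestion_instances} ingestion-service instance(s) in the cluster"
        )


def should_pause_ingestion(backlog: Mapping[str, int], max_buffered: int) -> bool:
    """Return True when ingestion should wait for the indexers.

    Assumption: `backlog` maps a hypothetical indexer id to the number of
    datalake entries currently buffered for it, and ingestion pauses once
    any one indexer has reached the configured maximum.
    """
    return any(count >= max_buffered for count in backlog.values())


# Two replicas on a two-node cluster satisfy the constraint.
check_replication_factor(2, 2)

# With a limit of 100, one saturated indexer is enough to pause ingestion.
paused = should_pause_ingestion({"indexer-1": 100, "indexer-2": 12}, max_buffered=100)
```

Lowering the limit in this model makes the pause trigger sooner, which matches the trade-off described above: less memory pressure on indexers, less ingestion parallelism.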
indexing-service
The indexing service consumes messages from ActiveMQ, reads documents from the shared datalake volume, and builds the distributed in-memory inverted index in Hazelcast.
Hazelcast member port for this service. Must be distinct from other services on the same host. Typical value: 5702.
Publicly advertised Hazelcast address in host:port format. Set to the host machine’s LAN IP combined with HZ_PORT (e.g. 192.168.1.10:5702).
Comma-separated seed member addresses for cluster discovery. Should point to the ingestion-service seed on the main node (e.g. 192.168.1.10:5701).
Hazelcast cluster name. Must match the cluster name used by all other services. See the ingestion-service entry for details.
ActiveMQ connection URL. Replace the hostname with the broker node’s IP in multi-node deployments (e.g. tcp://192.168.1.10:61616).
search-service
The search service exposes an HTTP API that queries the distributed Hazelcast inverted index and returns ranked results. It participates in the Hazelcast cluster as a full peer member and is fronted by the Nginx load balancer.
Hazelcast member port for this service. Must be distinct from other services on the same host. Typical value: 5703.
HTTP port on which the search API listens inside the container. This must be exposed via Docker’s ports mapping and match the port configured in nginx.conf for the search_backend upstream.
Publicly advertised Hazelcast address in host:port format. Set to the host machine’s LAN IP combined with HZ_PORT (e.g. 192.168.1.10:5703).
Comma-separated seed member addresses for cluster discovery. Should point to the ingestion-service seed on the main node (e.g. 192.168.1.10:5701).
Logical Hazelcast cluster name. Note that this service uses CLUSTER_NAME rather than HAZELCAST_CLUSTER_NAME. The value must still match the cluster name used by the ingestion and indexing services.
Result ranking strategy applied to search results. Accepted values:
frequency: results are sorted by term frequency in descending order (most relevant first).
id: results are sorted by document ID in ascending order.
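To make the two ranking strategies concrete, here is a minimal Python sketch of the ordering each one produces. It is an illustration only, not the search service's implementation: the Hit type, its field names, the integer document IDs, and the rank function are assumptions.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical search hit: a document ID and the query term's frequency."""
    doc_id: int
    frequency: int


def rank(hits: List[Hit], strategy: str) -> List[Hit]:
    """Order hits according to the configured ranking strategy."""
    if strategy == "frequency":
        # Highest term frequency first.
        return sorted(hits, key=lambda h: h.frequency, reverse=True)
    if strategy == "id":
        # Lowest document ID first.
        return sorted(hits, key=lambda h: h.doc_id)
    raise ValueError(f"unknown ranking strategy: {strategy!r}")


hits = [Hit(doc_id=2, frequency=2), Hit(doc_id=3, frequency=5), Hit(doc_id=1, frequency=1)]
by_frequency = [h.doc_id for h in rank(hits, "frequency")]  # [3, 2, 1]
by_id = [h.doc_id for h in rank(hits, "id")]                # [1, 2, 3]
```

How ties are broken is not specified in this page; the sketch simply keeps the input order for equal keys, because Python's sorted is stable.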