Metrics and health checks in WikiOasis are distributed across every server in the fleet. Prometheus exporters expose machine- and service-level metrics that the monitoring server scrapes; NRPE (Nagios Remote Plugin Executor) agents run active checks that Icinga2 polls. Both systems are wired up automatically via top.sls — role-specific exporters and checks are applied to matching host groups without any manual targeting.

Prometheus Exporters

Seven exporter types expose metrics on fixed ports. The monitoring server’s file_sd targets are auto-generated from the dns_hosts pillar.

NRPE Checks

nrpe_common runs on every host; role-specific check states add service-level probes on top.

Prometheus exporters overview

| Exporter state | Port | Deployed to | Package / binary |
| --- | --- | --- | --- |
| monitoring.node_exporter | 9100 | All servers (*) | prometheus-node-exporter |
| monitoring.mysqld_exporter | 9104 | db* | prometheus-mysqld-exporter |
| monitoring.haproxy_exporter | 9101 | proxy* | prometheus-haproxy-exporter |
| monitoring.redis_exporter | 9121 | redis* | prometheus-redis-exporter |
| monitoring.statsd_exporter | 9102 | monitoring* | Binary from GitHub releases |
| monitoring.phpfpm_exporter | 9253 | apps*, mw*, staging*, task* | Binary from GitHub releases |
| monitoring.opensearch_exporter | 9114 | opensearch* | prometheus-elasticsearch-exporter |
phpfpm_exporter and statsd_exporter are not available in the Debian apt repository and are installed by extracting upstream release archives. Their systemd unit files are managed directly by Salt.
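The fixed ports listed above are what the monitoring server's file_sd targets are built from. A minimal sketch of how one such target file could be generated follows. It assumes the dns_hosts pillar is a mapping keyed by hostname (its real structure is not shown on this page); the state ID, output path, and job label are hypothetical.

```yaml
# Sketch only: assumes the dns_hosts pillar is a mapping keyed by hostname.
{% set hosts = salt['pillar.get']('dns_hosts', {}) %}

node_exporter_file_sd:                        # hypothetical state ID
  file.serialize:
    - name: /etc/prometheus/file_sd/node.yml  # hypothetical path
    - serializer: yaml
    - user: root
    - group: prometheus
    - mode: '0644'
    - makedirs: True
    - dataset:
        - targets:
{% for host in hosts | sort %}
            - "{{ host }}:9100"
{% endfor %}
          labels:
            job: node                         # hypothetical label
```

The same loop could be repeated per role with the matching port (for example 9104 for hosts whose names start with db), filtering on the hostname prefix.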

node_exporter (port 9100)

monitoring/node_exporter.sls is applied to every server via top.sls ('*' matcher). It installs prometheus-node-exporter from the Debian repository and ensures the service is running and enabled.
node_exporter.sls
node_exporter_pkg:
  pkg.installed:
    - name: prometheus-node-exporter

prometheus-node-exporter:
  service.running:
    - enable: True
    - require:
      - pkg: node_exporter_pkg
Apply
salt '*' state.apply monitoring.node_exporter

mysqld_exporter (port 9104)

monitoring/mysqld_exporter.sls is applied to db* servers. It installs prometheus-mysqld-exporter and writes a .my.cnf credential file at /etc/prometheus/mysqld.my.cnf, owned root:prometheus with mode 0640. The exporter connects to MariaDB on 127.0.0.1:3306 as the prom_exporter user. The password is sourced from monitoring:mysqld_exporter_password in the private pillar.
/etc/prometheus/mysqld.my.cnf (rendered)
[client]
user     = prom_exporter
password = <mysqld_exporter_password>
host     = 127.0.0.1
port     = 3306
The ARGS in /etc/default/prometheus-mysqld-exporter point the exporter at that file:
/etc/default/prometheus-mysqld-exporter
ARGS="--config.my-cnf=/etc/prometheus/mysqld.my.cnf"
Apply
salt 'db*' state.apply monitoring.mysqld_exporter
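The credential file described in this section could be managed along these lines. This is a sketch rather than the actual contents of mysqld_exporter.sls: the state IDs are hypothetical, and only the path, ownership, mode, and pillar key come from the description above.

```yaml
mysqld_exporter_pkg:                          # hypothetical state ID
  pkg.installed:
    - name: prometheus-mysqld-exporter

mysqld_exporter_credentials:                  # hypothetical state ID
  file.managed:
    - name: /etc/prometheus/mysqld.my.cnf
    - user: root
    - group: prometheus
    - mode: '0640'
    - makedirs: True
    - contents: |
        [client]
        user     = prom_exporter
        password = {{ salt['pillar.get']('monitoring:mysqld_exporter_password') }}
        host     = 127.0.0.1
        port     = 3306
    - require:
      - pkg: mysqld_exporter_pkg

prometheus-mysqld-exporter:
  service.running:
    - enable: True
    # Restart the exporter whenever the credentials change.
    - watch:
      - file: mysqld_exporter_credentials
```

Rendering the password through the pillar keeps the secret out of the state tree, and mode 0640 with group prometheus lets the exporter read the file without making it world-readable.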

haproxy_exporter (port 9101)

monitoring/haproxy_exporter.sls is applied to proxy* servers. It installs prometheus-haproxy-exporter and configures it to scrape HAProxy metrics via the Unix stats socket at /run/haproxy/admin.sock. Because the socket has mode 660 and group haproxy, the state adds the prometheus system user to the haproxy group:
haproxy_exporter.sls (group membership)
prometheus_in_haproxy_group:
  user.present:
    - name: prometheus
    - groups:
      - haproxy
    - remove_groups: False
/etc/default/prometheus-haproxy-exporter
ARGS="--haproxy.scrape-uri=unix:/run/haproxy/admin.sock"
Apply
salt 'proxy*' state.apply monitoring.haproxy_exporter
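A process only picks up new supplementary groups when it starts, so the exporter has to be restarted after the prometheus user joins the haproxy group. One way to express that in Salt is sketched below. The haproxy_exporter_defaults ID is hypothetical, and the real state may wire its requisites differently.

```yaml
haproxy_exporter_defaults:                    # hypothetical state ID
  file.managed:
    - name: /etc/default/prometheus-haproxy-exporter
    - contents: 'ARGS="--haproxy.scrape-uri=unix:/run/haproxy/admin.sock"'

prometheus-haproxy-exporter:
  service.running:
    - enable: True
    # Restart when the arguments change or when group membership changes,
    # so the running process can actually open the stats socket.
    - watch:
      - file: haproxy_exporter_defaults
      - user: prometheus_in_haproxy_group
```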

redis_exporter (port 9121)

monitoring/redis_exporter.sls is applied to redis* servers. It installs prometheus-redis-exporter from the Debian repository and starts the service with no additional configuration — the exporter connects to Redis on localhost:6379 by default.
Apply
salt 'redis*' state.apply monitoring.redis_exporter

opensearch_exporter (port 9114)

monitoring/opensearch_exporter.sls is applied to opensearch* servers. It uses the prometheus-elasticsearch-exporter package (which is API-compatible with OpenSearch) and configures it to connect to the local OpenSearch instance on port 9200.
/etc/default/prometheus-elasticsearch-exporter
ARGS="--es.uri=http://localhost:9200"
Apply
salt 'opensearch*' state.apply monitoring.opensearch_exporter

phpfpm_exporter (port 9253)

monitoring/phpfpm_exporter.sls is applied to apps*, mw*, staging*, and task* servers. Because there is no Debian package, Salt downloads the upstream release archive from GitHub and installs it under /opt/phpfpm_exporter/, then creates a symlink at /usr/local/bin/prometheus-phpfpm-exporter.
phpfpm_exporter.sls (archive install)
phpfpm_exporter_binary:
  archive.extracted:
    - name: /opt/phpfpm_exporter
    - source: https://github.com/hipages/php-fpm_exporter/releases/download/v2.2.0/php-fpm_exporter_2.2.0_linux_amd64.tar.gz
    - source_hash: sha256=b1c207fcd89f9be20104fd90bc76b3c584987ea5a769c99d5759f79af8322449
    - if_missing: /opt/phpfpm_exporter/php-fpm_exporter
The systemd unit file is fully managed by Salt. The PHP version and socket path are resolved from the php:version pillar (defaulting to 8.3):
prometheus-phpfpm-exporter.service (rendered for PHP 8.3)
[Unit]
Description=Prometheus PHP-FPM Exporter
After=network.target php8.3-fpm.service

[Service]
User=www-data
ExecStart=/usr/local/bin/prometheus-phpfpm-exporter server \
  --phpfpm.fix-process-count \
  --phpfpm.scrape-uri "unix:///run/php/php8.3-fpm.sock;/status"
Restart=on-failure

[Install]
WantedBy=multi-user.target
Apply
salt -C 'apps* or mw* or staging* or task*' state.apply monitoring.phpfpm_exporter
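The symlink and the templated unit file described in this section are not shown in the excerpt. A minimal sketch of how they might look follows, assuming the unit template lives at a hypothetical salt:// path and is installed under /etc/systemd/system (also an assumption). Only phpfpm_exporter_binary, the symlink path, and the php:version pillar key come from the text above.

```yaml
{% set php_version = salt['pillar.get']('php:version', '8.3') %}

phpfpm_exporter_symlink:                      # hypothetical state ID
  file.symlink:
    - name: /usr/local/bin/prometheus-phpfpm-exporter
    - target: /opt/phpfpm_exporter/php-fpm_exporter
    - require:
      - archive: phpfpm_exporter_binary

phpfpm_exporter_unit:                         # hypothetical state ID
  file.managed:
    - name: /etc/systemd/system/prometheus-phpfpm-exporter.service
    # Hypothetical template path; the template would use {{ php_version }}
    # in both the After= line and the socket path.
    - source: salt://monitoring/files/prometheus-phpfpm-exporter.service.jinja
    - template: jinja
    - context:
        php_version: '{{ php_version }}'

prometheus-phpfpm-exporter:
  service.running:
    - enable: True
    - watch:
      - file: phpfpm_exporter_symlink
      - file: phpfpm_exporter_unit
```

Quoting the version in the context keeps YAML from parsing it as a float, which would otherwise turn a value such as 8.10 into 8.1.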

statsd_exporter (port 9102)

monitoring/statsd_exporter.sls is applied only to monitoring* servers. Like phpfpm_exporter, there is no Debian package — Salt extracts the binary from the upstream GitHub release into /opt/statsd_exporter/ and symlinks it to /usr/local/bin/prometheus-statsd-exporter. A prometheus system user (no login shell, home /var/lib/prometheus) is created before the service starts:
statsd_exporter.sls (user creation)
prometheus_user:
  user.present:
    - name: prometheus
    - system: True
    - shell: /usr/sbin/nologin
    - home: /var/lib/prometheus
    - createhome: False
prometheus-statsd-exporter.service
[Unit]
Description=Prometheus StatsD Exporter
After=network.target

[Service]
User=prometheus
ExecStart=/usr/local/bin/prometheus-statsd-exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
Apply
salt 'monitoring*' state.apply monitoring.statsd_exporter

NRPE system

NRPE (Nagios Remote Plugin Executor) is the active-check agent that Icinga2 uses to run checks on remote hosts. The base agent (monitoring.nrpe) is installed on every server, then role-specific drop-in states add check definitions to /etc/nagios/nrpe.d/.

Base NRPE agent (monitoring.nrpe)

monitoring/nrpe/init.sls installs four packages and writes the main nrpe.cfg:
  • nagios-nrpe-server — the NRPE daemon itself
  • monitoring-plugins-basic — standard check plugins
  • monitoring-plugins-standard
  • monitoring-plugins-contrib
The nrpe.cfg template dynamically builds the allowed_hosts list by filtering the dns_hosts pillar for hosts whose names start with monitoring. All drop-in check definitions are loaded from include_dir=/etc/nagios/nrpe.d.
nrpe.cfg (rendered, excerpt)
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=127.0.0.1,::1,10.0.0.5

command[check_load]=/usr/lib/nagios/plugins/check_load -r -w 0.80,0.80,0.80 -c 1.00,1.00,1.00
command[check_disk_root]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
command[check_disk_srv]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /srv
command[check_procs]=/usr/lib/nagios/plugins/check_procs -w 700 -c 1000
command[check_swap]=/usr/lib/nagios/plugins/check_swap -w 40% -c 20%
include_dir=/etc/nagios/nrpe.d
Apply
salt '*' state.apply monitoring.nrpe
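The allowed_hosts filtering described above can be sketched as follows. The page says the nrpe.cfg template does this itself; for illustration, this version builds the list in the state and passes it in through context. It assumes dns_hosts maps each hostname to an IP address, which is not confirmed here, and the state ID and template path are hypothetical.

```yaml
# Sketch only: assumes dns_hosts maps hostname -> IP address.
{% set allowed = ['127.0.0.1', '::1'] %}
{% for name, ip in salt['pillar.get']('dns_hosts', {}).items() %}
{% if name.startswith('monitoring') %}
{% do allowed.append(ip) %}
{% endif %}
{% endfor %}

nrpe_cfg:                                     # hypothetical state ID
  file.managed:
    - name: /etc/nagios/nrpe.cfg
    - source: salt://monitoring/nrpe/files/nrpe.cfg.jinja  # hypothetical path
    - template: jinja
    - context:
        allowed_hosts: '{{ allowed | join(",") }}'

nagios-nrpe-server:
  service.running:
    - enable: True
    - watch:
      - file: nrpe_cfg
```

With a single monitoring host at 10.0.0.5, the template line allowed_hosts={{ allowed_hosts }} would render as in the excerpt above.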

nrpe_common — checks on all servers

monitoring/nrpe_common.sls deploys two drop-in check definitions to every server:
| Check command | Plugin | Description |
| --- | --- | --- |
| check_mem | check_mem.sh | Memory usage: warn at 95%, crit at 100% |
| check_apt | check_apt (standard plugin) | Pending package upgrades |
/etc/nagios/nrpe.d/mem.cfg
command[check_mem]=/usr/lib/nagios/plugins/check_mem.sh 95 100
Apply
salt '*' state.apply monitoring.nrpe_common
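Each role-specific check follows the same pattern: deploy a plugin script, then write a drop-in under /etc/nagios/nrpe.d/ and restart NRPE. A sketch of that pattern for check_mem is shown below. The state IDs and the salt:// source path are hypothetical, and it assumes the base state declares the service under the ID nagios-nrpe-server.

```yaml
check_mem_script:                             # hypothetical state ID
  file.managed:
    - name: /usr/lib/nagios/plugins/check_mem.sh
    - source: salt://monitoring/files/check_mem.sh  # hypothetical path
    - user: root
    - group: root
    - mode: '0755'

nrpe_mem_cfg:                                 # hypothetical state ID
  file.managed:
    - name: /etc/nagios/nrpe.d/mem.cfg
    - contents: 'command[check_mem]=/usr/lib/nagios/plugins/check_mem.sh 95 100'
    - require:
      - file: check_mem_script
    # Assumes the base NRPE state declares this service ID.
    - watch_in:
      - service: nagios-nrpe-server
```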

nrpe_salt — salt-minion check (all servers)

monitoring/nrpe_salt.sls deploys check_systemd_service.sh (a generic systemd unit status script) to all servers and registers check_salt_minion:
/etc/nagios/nrpe.d/salt_minion.cfg
command[check_salt_minion]=/usr/lib/nagios/plugins/check_systemd_service.sh salt-minion
Apply
salt '*' state.apply monitoring.nrpe_salt

nrpe_salt_master — salt-master check (salt* servers)

monitoring/nrpe_salt_master.sls is applied only to salt* servers. It adds the check_salt_master command, reusing the check_systemd_service.sh script that nrpe_salt deploys. The script must already be present (installed by nrpe_salt) before the drop-in is written.
/etc/nagios/nrpe.d/salt_master.cfg
command[check_salt_master]=/usr/lib/nagios/plugins/check_systemd_service.sh salt-master
Apply
salt 'salt*' state.apply monitoring.nrpe_salt_master
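The dependency on the script from nrpe_salt could be expressed with an include and a require, roughly as sketched here. Salt requisites match on either a state's ID or its name, so the sketch refers to the script by path. It assumes nrpe_salt manages the script as a file state at that path; the state ID is hypothetical.

```yaml
include:
  - monitoring.nrpe_salt

nrpe_salt_master_cfg:                         # hypothetical state ID
  file.managed:
    - name: /etc/nagios/nrpe.d/salt_master.cfg
    - contents: 'command[check_salt_master]=/usr/lib/nagios/plugins/check_systemd_service.sh salt-master'
    # Matches the file state in nrpe_salt by its name (the script path).
    - require:
      - file: /usr/lib/nagios/plugins/check_systemd_service.sh
```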

nrpe_haproxy — HAProxy backend checks (proxy* servers)

monitoring/nrpe_haproxy.sls adds two custom HAProxy check scripts and the haproxy.cfg drop-in. The nagios user is added to the haproxy group so it can read the stats socket (/run/haproxy/admin.sock, mode 660 haproxy:haproxy).
| Check command | Script | Description |
| --- | --- | --- |
| check_haproxy | check_haproxy.sh | Overall HAProxy health via stats socket |
| check_haproxy_backends | check_haproxy_backends.sh | Status of all configured backends |
Apply
salt 'proxy*' state.apply monitoring.nrpe_haproxy

nrpe_mediawiki — MediaWiki HTTP health check (mw*, staging*)

monitoring/nrpe_mediawiki.sls deploys check_mediawiki.sh and registers the check_mediawiki command. The script performs an HTTP health check against the local MediaWiki installation.
Apply
salt -C 'mw* or staging*' state.apply monitoring.nrpe_mediawiki

nrpe_metal — RAID and SMART disk checks (metal* servers)

monitoring/nrpe_metal.sls handles bare-metal disk monitoring. It installs smartmontools, creates a sudoers entry allowing nagios to run smartctl without a password, and deploys two check scripts:
| Check command | Script | Description |
| --- | --- | --- |
| check_smart | check_smart.sh | SMART disk health |
| check_raid | check_raid.sh | Software RAID array status |
/etc/sudoers.d/nagios-smartctl
nagios ALL=(root) NOPASSWD: /usr/sbin/smartctl
Apply
salt 'metal*' state.apply monitoring.nrpe_metal
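A broken file in /etc/sudoers.d/ can disable sudo on the host, so it is worth validating the entry before it is installed. The sketch below shows one way to do that with file.managed and check_cmd. The state IDs and the 0440 mode are assumptions, and the real nrpe_metal.sls may not validate the file this way.

```yaml
smartmontools:
  pkg.installed: []

nagios_smartctl_sudoers:                      # hypothetical state ID
  file.managed:
    - name: /etc/sudoers.d/nagios-smartctl
    - user: root
    - group: root
    - mode: '0440'
    - contents: 'nagios ALL=(root) NOPASSWD: /usr/sbin/smartctl'
    # Salt appends the path of a temporary copy to this command and only
    # replaces the live file if visudo accepts it.
    - check_cmd: /usr/sbin/visudo -c -f
    - require:
      - pkg: smartmontools
```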

nrpe_nginx — Nginx error log check (nginx servers)

monitoring/nrpe_nginx.sls adds the nagios user to the adm group (which has read access to /var/log/nginx/) and deploys check_nginx_errors.sh along with nginx.cfg.
| Check command | Script | Description |
| --- | --- | --- |
| check_nginx_errors | check_nginx_errors.sh | Nginx error log rate check |
Applied to apps*, mw*, staging*, task*, and monitoring* servers (any host running Nginx).
Apply
salt -C 'apps* or mw* or staging* or task* or monitoring*' state.apply monitoring.nrpe_nginx

nrpe_opensearch — OpenSearch cluster health (opensearch* servers)

monitoring/nrpe_opensearch.sls deploys check_opensearch.sh and the opensearch.cfg drop-in.
| Check command | Script | Description |
| --- | --- | --- |
| check_opensearch | check_opensearch.sh | OpenSearch cluster health (green/yellow/red) |
Apply
salt 'opensearch*' state.apply monitoring.nrpe_opensearch

nrpe_php — PHP-FPM pool and error log checks (apps*, mw*, staging*, task*)

monitoring/nrpe_php.sls installs libfcgi-bin (required to query the FPM status page via FastCGI), adds nagios to the adm group for log access, and deploys two check scripts with templated .cfg files (the PHP version and pool name are resolved from the php pillar).
| Check command | Script | Description |
| --- | --- | --- |
| check_php_fpm | check_php_fpm.sh | PHP-FPM pool status page (process count, queue depth) |
| check_php_errors | check_php_errors.sh | PHP error log rate |
Apply
salt -C 'apps* or mw* or staging* or task*' state.apply monitoring.nrpe_php

nrpe_redis — Redis ping check (redis* servers)

monitoring/nrpe_redis.sls deploys check_redis.sh and the redis.cfg.jinja drop-in (templated for the configured Redis port/socket).
| Check command | Script | Description |
| --- | --- | --- |
| check_redis | check_redis.sh | Redis PING/PONG health check |
Apply
salt 'redis*' state.apply monitoring.nrpe_redis

top.sls assignments

The following excerpt from top.sls shows how all exporter and NRPE states are distributed:
top.sls (monitoring-related assignments)
base:
  '*':
    - monitoring.nrpe
    - monitoring.nrpe_common
    - monitoring.nrpe_salt
    - monitoring.node_exporter
  'apps*':
    - monitoring.nrpe_nginx
    - monitoring.nrpe_php
    - monitoring.phpfpm_exporter
  'db*':
    - monitoring.mysqld_exporter
  'metal*':
    - monitoring.nrpe_metal
  'proxy*':
    - monitoring.nrpe_haproxy
    - monitoring.haproxy_exporter
  'monitoring*':
    - monitoring
    - monitoring.director
    - monitoring.nrpe_nginx
    - monitoring.prometheus
    - monitoring.grafana
    - monitoring.statsd_exporter
  'mw* or staging*':
    - match: compound
    - monitoring.nrpe_nginx
    - monitoring.nrpe_php
    - monitoring.nrpe_mediawiki
    - monitoring.phpfpm_exporter
  'task*':
    - monitoring.nrpe_nginx
    - monitoring.nrpe_php
    - monitoring.phpfpm_exporter
  'opensearch*':
    - monitoring.nrpe_opensearch
    - monitoring.opensearch_exporter
  'redis*':
    - monitoring.nrpe_redis
    - monitoring.redis_exporter
  'salt*':
    - monitoring.nrpe_salt_master
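Because a minion receives the merged state list from every top.sls target it matches, the three PHP-serving blocks above could in principle be collapsed into one compound target. The following is an illustrative alternative layout, not the repository's actual top.sls:

```yaml
base:
  # One compound target for every host that runs Nginx and PHP-FPM.
  'apps* or mw* or staging* or task*':
    - match: compound
    - monitoring.nrpe_nginx
    - monitoring.nrpe_php
    - monitoring.phpfpm_exporter
  # MediaWiki hosts match both targets and get the extra check on top.
  'mw* or staging*':
    - match: compound
    - monitoring.nrpe_mediawiki
```

Keeping separate blocks per role, as the real file does, trades some repetition for being able to read off each role's states at a glance.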
To apply all monitoring states to every server in the fleet at once (e.g. after a top.sls change), run a full highstate: salt '*' state.apply. Each minion will only pick up the states assigned to it.
monitoring.phpfpm_exporter and monitoring.statsd_exporter download binaries directly from GitHub. Ensure the target servers have outbound HTTPS access to github.com when these states are first applied, or pre-stage the archives and update the source: URLs in the respective .sls files.
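For hosts without outbound access, the pre-staging option mentioned above might look like this for phpfpm_exporter. It assumes the archive has been copied into the Salt file server at a hypothetical path; the hash and target paths are the ones from the phpfpm_exporter section.

```yaml
phpfpm_exporter_binary:
  archive.extracted:
    - name: /opt/phpfpm_exporter
    # Hypothetical location on the Salt file server instead of GitHub.
    - source: salt://monitoring/files/php-fpm_exporter_2.2.0_linux_amd64.tar.gz
    - source_hash: sha256=b1c207fcd89f9be20104fd90bc76b3c584987ea5a769c99d5759f79af8322449
    - if_missing: /opt/phpfpm_exporter/php-fpm_exporter
```

Keeping source_hash in place means the pre-staged copy is still verified against the upstream checksum. The same change would apply to the archive source in statsd_exporter.sls.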
