The monitoring state provisions the dedicated monitoring server with the full WikiOasis observability stack — Icinga2 for active checks and alerting, Prometheus for metrics collection, and Grafana for dashboards. It is applied exclusively to hosts matching monitoring* in top.sls, alongside monitoring.prometheus, monitoring.grafana, monitoring.nrpe_nginx, and monitoring.statsd_exporter.

Icinga2 + Icingaweb2

Active checks, IDO-MySQL backend, Director module, and Nginx/PHP-FPM front-end

Prometheus

Metric scraping with file_sd auto-discovery; all targets are registered from the dns_hosts pillar

Grafana

Dashboard server pre-configured with a Prometheus datasource and served behind Nginx

Packages installed

monitoring/init.sls installs the following packages after adding the official Icinga apt repository and running apt-get install -f to fix any broken dependencies.

| Package | Purpose |
| --- | --- |
| icinga2 | Monitoring and alerting engine |
| icinga2-ido-mysql | IDO MySQL back-end for Icinga2 |
| icingaweb2 | Web UI for Icinga2 |
| icingacli | CLI tool for Icingaweb2 management |
| icinga-director | Config management module for Icingaweb2 |
| mariadb-server / mariadb-client | Local database for IDO and Icingaweb2 |
| nginx | Reverse proxy for Icingaweb2 and Grafana |
| php-fpm | PHP FastCGI process manager |
| php-mysql, php-intl, php-curl, php-gd, php-mbstring, php-xml | PHP extensions required by Icingaweb2 |
| nagios-nrpe-plugin | NRPE check runner on the monitoring server itself |
| jq, curl | Used by notification scripts |

On Debian Trixie the state falls back to the icinga-bookworm repository because Icinga does not yet publish a Trixie-specific repository. The icinga_dist variable is resolved at render time from grains['oscodename'].
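
The codename fallback can be pictured with a short Python sketch. The function name, the fallback table, and the `icinga-<codename>` naming pattern are assumptions made for illustration; the real logic is Jinja inside monitoring/init.sls and may differ.

```python
# Hypothetical illustration of the render-time fallback; not the state's actual code.
# Assumption: codenames without an upstream Icinga repository map to bookworm.
FALLBACKS = {"trixie": "bookworm"}


def resolve_icinga_dist(oscodename: str) -> str:
    """Return the repository name to use for a given grains['oscodename']."""
    return "icinga-" + FALLBACKS.get(oscodename, oscodename)


print(resolve_icinga_dist("trixie"))    # icinga-bookworm
print(resolve_icinga_dist("bookworm"))  # icinga-bookworm
```

Keeping the fallback in a small lookup table means a future Trixie repository only requires removing one entry.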

Icinga2 configuration

1. APT repository

The GPG key is fetched from https://packages.icinga.com/icinga.key, dearmored into /usr/share/keyrings/icinga-archive-keyring.gpg, and the source line is written to /etc/apt/sources.list.d/icinga.list. A cmd.run guard (creates:) ensures the key is only imported once.
2. zones.conf

/etc/icinga2/zones.conf is rendered from salt://monitoring/files/icinga2/zones.conf.jinja. It defines a single Endpoint named after the minion ID (grains['id']) and a Zone named master that contains that endpoint.
zones.conf.jinja
{%- set hostname = grains['id'] %}

object Endpoint "{{ hostname }}" { }

object Zone "master" {
  endpoints = [ "{{ hostname }}" ]
}
3. API feature (api.conf)

/etc/icinga2/features-available/api.conf is rendered from api.conf.jinja. It creates an ApiUser with full permissions and enables command/config acceptance. The feature is enabled with icinga2 feature enable api.
api.conf.jinja
{%- set api_user = salt['pillar.get']('monitoring:icinga_api_user', 'director') %}
{%- set api_password = salt['pillar.get']('monitoring:icinga_api_password') %}

object ApiUser "{{ api_user }}" {
  password = "{{ api_password }}"
  permissions = [ "*" ]
}

object ApiListener "api" {
  accept_commands = true
  accept_config   = true
}
4. IDO-MySQL feature (ido-mysql.conf)

/etc/icinga2/features-available/ido-mysql.conf is rendered from ido-mysql.conf.jinja. Pillar values for ido_db_name, ido_db_user, and the IDO password (from private pillar) are injected. The feature is enabled with icinga2 feature enable ido-mysql.
ido-mysql.conf.jinja
{%- set db_name = salt['pillar.get']('monitoring:ido_db_name', 'icingadb') %}
{%- set db_user = salt['pillar.get']('monitoring:ido_db_user', 'icingadb') %}
{%- set db_password = salt['pillar.get']('monitoring:ido_db_password') %}

library "db_ido_mysql"

object IdoMysqlConnection "ido-mysql" {
  user     = "{{ db_user }}"
  password = "{{ db_password }}"
  host     = "localhost"
  database = "{{ db_name }}"
  enable_ha = false
}
5. Notification feature

The notification feature is enabled via icinga2 feature enable notification. A creates: guard prevents re-running once /etc/icinga2/features-enabled/notification.conf exists.
6. Default conf.d cleanup

The four default configuration files that ship with icinga2 are removed to prevent conflicts with the Salt-managed host and service objects:
  • /etc/icinga2/conf.d/hosts.conf
  • /etc/icinga2/conf.d/services.conf
  • /etc/icinga2/conf.d/users.conf
  • /etc/icinga2/conf.d/notifications.conf
7. notification-commands.conf

Four NotificationCommand objects are written to /etc/icinga2/conf.d/notification-commands.conf — one each for Discord and Slack host/service notifications. Each command invokes a shell script under /etc/icinga2/scripts/ and passes context via environment variables.
notification-commands.conf (excerpt)
object NotificationCommand "notify-host-by-discord" {
  command = [ "/etc/icinga2/scripts/discord_host_notification.sh" ]
  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME         = "$host.name$"
    HOSTSTATE        = "$host.state$"
    HOSTOUTPUT       = "$host.output$"
    LONGDATETIME     = "$icinga.long_date_time$"
  }
}
8. salt-hosts.conf (dynamic host objects)

/etc/icinga2/conf.d/salt-hosts.conf is rendered from salt-hosts.conf.jinja. It iterates the dns_hosts pillar and generates Host, Service, and Notification objects for every registered server. Host role is inferred from the hostname prefix (e.g. proxy*, db*, mw*) and role-specific services are added automatically.
salt-hosts.conf.jinja (excerpt)
{%- set hosts = salt['pillar.get']('dns_hosts', {}) %}
{%- for hostname, host_data in hosts.items() %}
object Host "{{ hostname }}" {
  import  "generic-salt-host"
  address = "{{ host_data.ip }}"
  vars.os = "Linux"
}
{%- endfor %}
Every host gets both Discord and Slack Notification objects for each service, so alerts fire on both channels without manual configuration.
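
The prefix-based role inference described for salt-hosts.conf can be sketched in Python. Only the proxy, db, and mw prefixes are named in the text; the role labels and the `generic` fallback below are made up for this sketch and are not taken from the template.

```python
# Hypothetical prefix-to-role table; the role labels are illustrative only.
ROLE_PREFIXES = {
    "proxy": "proxy",
    "db": "database",
    "mw": "mediawiki",
}


def infer_role(hostname: str) -> str:
    """Return the role for the first matching hostname prefix, else 'generic'."""
    for prefix, role in ROLE_PREFIXES.items():
        if hostname.startswith(prefix):
            return role
    return "generic"


for name in ("db-other-us-east-011", "mw01", "monitoring01"):
    print(name, "->", infer_role(name))
```

A template following this pattern would then emit the extra Service objects for each inferred role.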

Notification webhook scripts

The notification scripts live in /etc/icinga2/scripts/ and are deployed by the state. All four scripts (discord_host, discord_service, slack_host, slack_service) source a shared config file that injects the webhook URLs from pillar.
webhook_config.sh.jinja
DISCORD_WEBHOOK_URL="{{ salt['pillar.get']('notifications:discord_webhook_url') }}"
SLACK_WEBHOOK_URL="{{ salt['pillar.get']('notifications:slack_webhook_url') }}"
The config file is written to /etc/icinga2/scripts/webhook_config.sh with mode 0640 (readable only by root and the nagios group) and each notification script requires it before running.
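
As a rough picture of what a host notification script has to do, the Python sketch below turns the environment variables from the NotificationCommand excerpt into webhook bodies. The deployed scripts are shell scripts built on jq and curl, so this is not their code; the message layout and the function name are assumptions. The only external facts relied on are that Discord webhooks read the message from a `content` field and Slack incoming webhooks from a `text` field.

```python
import json


def build_payloads(env: dict) -> dict:
    """Build minimal Discord and Slack webhook bodies from the Icinga2 env vars."""
    message = (
        f"[{env['NOTIFICATIONTYPE']}] {env['HOSTNAME']} is {env['HOSTSTATE']}: "
        f"{env['HOSTOUTPUT']} ({env['LONGDATETIME']})"
    )
    # Discord expects "content", Slack expects "text".
    return {
        "discord": json.dumps({"content": message}),
        "slack": json.dumps({"text": message}),
    }


# Sample values for the variables listed in the NotificationCommand excerpt.
env = {
    "NOTIFICATIONTYPE": "PROBLEM",
    "HOSTNAME": "db-other-us-east-011",
    "HOSTSTATE": "DOWN",
    "HOSTOUTPUT": "PING CRITICAL - Packet loss = 100%",
    "LONGDATETIME": "2025-01-01 00:00:00 +0000",
}
payloads = build_payloads(env)
print(payloads["discord"])
print(payloads["slack"])
```

Each body would be sent as an HTTP POST with a JSON content type to the matching URL from webhook_config.sh.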

Icingaweb2 configuration

The Icingaweb2 configuration directory /etc/icingaweb2 is owned by www-data:icingaweb2 with mode 2770 (setgid so new files inherit the group). Four INI files are templated into it:

| File | Purpose |
| --- | --- |
| config.ini | Global Icingaweb2 settings; points config_resource at the icingaweb2 DB resource |
| resources.ini | Database resource definitions: icinga2, icingadb, icinga_director, icingaweb2, all pointing at the remote MariaDB host |
| authentication.ini | Sets up an autologin backend and a DB-backed auth_db backend, both using the icingaweb2 resource |
| roles.ini | Grants all users (*) full permissions (*) under the Administrators role |

resources.ini (rendered example)
[icinga2]
type     = "db"
db       = "mysql"
host     = "db-other-us-east-011"
dbname   = "icingadb"
username = "icingadb"
password = "<ido_db_password>"
charset  = "utf8mb4"

Director module

The Director module is configured directly within monitoring/init.sls (not a separate state file). It:
  1. Creates /etc/icingaweb2/modules/director/ with ownership www-data:icingaweb2 and mode 2770.
  2. Writes /etc/icingaweb2/modules/director/config.ini pointing the Director at the Icinga2 API on 127.0.0.1:5665.
  3. Enables the module via icingacli module enable director (runs as www-data, idempotent with unless: guard).
modules/director/config.ini.jinja
{%- set api_user = salt['pillar.get']('monitoring:icinga_api_user', 'root') %}
{%- set api_pass = salt['pillar.get']('monitoring:icinga_api_password') %}

[db]
resource = "icingaweb2"

[icinga2]
api_host     = "127.0.0.1"
api_port     = 5665
api_username = "{{ api_user }}"
api_password = "{{ api_pass }}"
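
Both /etc/icingaweb2 and the Director module directory use mode 2770. The short Python snippet below decodes that mode using only the standard library, to show where the setgid bit appears; it is an illustration of the permission bits, not part of the state.

```python
import stat

mode = 0o2770

# Render the mode the way `ls -ld` would show it for a directory.
print(stat.filemode(stat.S_IFDIR | mode))  # drwxrws---

# The leading 2 is the setgid bit: new files inherit the directory's group.
print(bool(mode & stat.S_ISGID))           # True
```

The `s` in the group triplet is the setgid bit combined with group execute; "other" has no access at all.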

Prometheus state

monitoring/prometheus.sls installs and configures the Prometheus metrics server. It manages four categories of resources: the package, the defaults file (retention), the prometheus.yml configuration, and the file_sd target files.

Retention

Retention is controlled via ARGS in /etc/default/prometheus. The state reads the pillar key monitoring:prometheus:retention with a fallback of 15d, but the public pillar (pillar/monitoring/init.sls) sets it to 30d, so 30d is the effective value unless the pillar is overridden.
/etc/default/prometheus
ARGS="--storage.tsdb.retention.time=30d --web.enable-lifecycle"
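
The interplay between the state's fallback and the pillar value can be sketched in Python. `pillar_get` below is a simplified stand-in for Salt's colon-delimited pillar.get lookup, written only for this illustration; it is not Salt's implementation.

```python
def pillar_get(pillar: dict, key: str, default=None):
    """Simplified stand-in for Salt's colon-delimited pillar.get lookup."""
    node = pillar
    for part in key.split(":"):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


public_pillar = {"monitoring": {"prometheus": {"retention": "30d"}}}

# With the public pillar present, its value wins over the state's fallback.
retention = pillar_get(public_pillar, "monitoring:prometheus:retention", "15d")
args = f"--storage.tsdb.retention.time={retention} --web.enable-lifecycle"
print(f'ARGS="{args}"')

# Without the pillar key, the state's own fallback of 15d applies.
print(pillar_get({}, "monitoring:prometheus:retention", "15d"))
```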

prometheus.yml

The configuration file at /etc/prometheus/prometheus.yml defines a global scrape interval of 30s and one scrape_config per exporter type, all backed by file_sd_configs pointing at JSON files under /etc/prometheus/file_sd/. Prometheus watches those files for changes and, as a fallback, re-reads them every 5 minutes (refresh_interval).
prometheus.yml
global:
  scrape_interval:     30s
  evaluation_interval: 30s

scrape_configs:
  - job_name: 'node'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/node.json']
        refresh_interval: 5m

  - job_name: 'mysqld'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/mysqld.json']
        refresh_interval: 5m

  - job_name: 'haproxy'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/haproxy.json']
        refresh_interval: 5m

  - job_name: 'redis'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/redis.json']
        refresh_interval: 5m

  - job_name: 'statsd'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/statsd.json']
        refresh_interval: 5m

  - job_name: 'phpfpm'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/phpfpm.json']
        refresh_interval: 5m

  - job_name: 'opensearch'
    file_sd_configs:
      - files: ['/etc/prometheus/file_sd/opensearch.json']
        refresh_interval: 5m

file_sd auto-discovery

The file_sd directory at /etc/prometheus/file_sd/ contains one JSON file per exporter type. Each file is rendered from a Jinja template that iterates the dns_hosts pillar, filters by hostname prefix, and writes one target entry per matching host. Adding a host to dns_hosts and re-applying monitoring.prometheus is all that is required to register it as a new scrape target.
file_sd/node.json.jinja — all hosts
{%- set dns_hosts = salt['pillar.get']('dns_hosts', {}) %}
{%- set entries = [] %}
{%- for hostname, data in dns_hosts.items() %}
{%- set clean = hostname.split('.')[0] %}
{%- do entries.append('  {"targets": ["' ~ data.ip ~ ':9100"], "labels": {"instance": "' ~ clean ~ '"}}') %}
{%- endfor %}
[
{{ entries | join(',\n') }}
]
file_sd/mysqld.json.jinja — db* hosts only
{%- set dns_hosts = salt['pillar.get']('dns_hosts', {}) %}
{%- set entries = [] %}
{%- for hostname, data in dns_hosts.items() if hostname.startswith('db') %}
{%- set clean = hostname.split('.')[0] %}
{%- do entries.append('  {"targets": ["' ~ data.ip ~ ':9104"], "labels": {"instance": "' ~ clean ~ '"}}') %}
{%- endfor %}
[
{{ entries | join(',\n') }}
]
The complete set of scrape jobs and their target filters:
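
| Job | Port | Target filter (dns_hosts prefix) |
| --- | --- | --- |
| node | 9100 | All hosts |
| mysqld | 9104 | db* |
| haproxy | 9101 | proxy* |
| redis | 9121 | redis* |
| statsd | 9102 | monitoring* |
| phpfpm | 9253 | apps*, mw*, staging* |
| opensearch | 9114 | opensearch* |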
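
For readers more comfortable with Python than Jinja, the same filtering logic can be sketched as follows. The function name, the sample hostnames, and the IP addresses (taken from the 192.0.2.0/24 documentation range) are assumptions; only the ports, prefixes, and output shape come from the templates and the table above.

```python
import json

# Port and prefix filter per job, copied from the table above; None means all hosts.
JOBS = {
    "node": (9100, None),
    "mysqld": (9104, ("db",)),
    "phpfpm": (9253, ("apps", "mw", "staging")),
}


def file_sd_entries(dns_hosts: dict, port: int, prefixes=None) -> list:
    """Build file_sd target entries for hosts whose name matches a prefix."""
    entries = []
    for hostname, data in dns_hosts.items():
        if prefixes and not hostname.startswith(tuple(prefixes)):
            continue
        entries.append({
            "targets": [f"{data['ip']}:{port}"],
            # Mirror the template: label with the short hostname.
            "labels": {"instance": hostname.split(".")[0]},
        })
    return entries


# Hypothetical pillar data for illustration.
dns_hosts = {
    "db1.example.org": {"ip": "192.0.2.10"},
    "mw1.example.org": {"ip": "192.0.2.20"},
}
for job, (port, prefixes) in JOBS.items():
    print(job, json.dumps(file_sd_entries(dns_hosts, port, prefixes)))
```

Building a list of dictionaries and serialising it with json.dumps avoids the manual comma joining that the Jinja templates need.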

Grafana state

monitoring/grafana.sls installs Grafana from the official APT repository and configures it with a pre-provisioned Prometheus datasource.

grafana.ini

The main configuration file is rendered from grafana.ini.jinja. It sets the HTTP port to 3000, locks down sign-ups and anonymous access, and injects the admin credentials from pillar.
grafana.ini.jinja
[server]
http_port = 3000
domain    = grafana.wikioasis.org
root_url  = %(protocol)s://%(domain)s/
serve_from_sub_path = false

[database]
type = sqlite3
path = grafana.db

[security]
admin_user     = {{ admin_user }}
admin_password = {{ admin_pass }}

[users]
allow_sign_up = false

[auth.anonymous]
enabled = false

Prometheus datasource provisioning

/etc/grafana/provisioning/datasources/prometheus.yml is written from datasource.yml.jinja at apply time. This means Grafana starts with the Prometheus datasource already registered — no manual UI steps required.

Nginx vhost

A dedicated Nginx site grafana.conf is enabled (symlinked into sites-enabled/) and triggers an Nginx reload on change via watch_in.

Pillar reference

The following keys from the public pillar (pillar/monitoring/init.sls) are consumed by the monitoring state. Passwords and other secrets are stored in the private pillar and are not listed here — see the private pillar reference for those values.

| Pillar key | Default | Description |
| --- | --- | --- |
| monitoring:icinga_api_user | root | Username for the Icinga2 API user object |
| monitoring:ido_db_name | icingadb | MariaDB database name for IDO |
| monitoring:ido_db_user | icingadb | MariaDB user for IDO |
| monitoring:web_db_name | icingaweb | MariaDB database for Icingaweb2 session/config |
| monitoring:director_db_name | icingaweb | MariaDB database for Director module |
| monitoring:director_db_user | icingadb | MariaDB user for Director |
| monitoring:grafana:admin_user | admin | Grafana admin username |
| monitoring:prometheus:retention | 30d | TSDB retention period passed to --storage.tsdb.retention.time |

| Pillar key | Description |
| --- | --- |
| notifications:discord_webhook_url | Discord incoming webhook URL for Icinga2 alerts |
| notifications:slack_webhook_url | Slack incoming webhook URL for Icinga2 alerts |

pillar/monitoring/init.sls (defaults)
monitoring:
  icinga_api_user: root
  ido_db_name: icingadb
  ido_db_user: icingadb
  web_db_name: icingaweb
  director_db_name: icingaweb
  director_db_user: icingadb

  grafana:
    admin_user: admin

  prometheus:
    retention: 30d
Passwords for IDO, Director, the Icinga2 API, Grafana admin, and both webhook URLs are stored in the private encrypted pillar with no defaults. The state will error at render time if any are missing.

Apply commands

Apply everything assigned to the monitoring server (monitoring state + all sub-states via top.sls):
salt 'monitoring*' state.apply
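After adding a new host to the dns_hosts pillar, run salt 'monitoring*' state.apply monitoring.prometheus to regenerate the file_sd JSON files. Prometheus picks up the new targets without a restart: file_sd files are watched for changes and are re-read at the latest after the 5-minute refresh_interval. This does not depend on --web.enable-lifecycle, which only enables the HTTP reload and shutdown endpoints and is useful when prometheus.yml itself changes.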
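
To confirm that a newly added host is being scraped, the targets endpoint of the Prometheus HTTP API can be queried. The sketch below groups active targets by job. The base URL (Prometheus listening on 127.0.0.1:9090) and the function names are assumptions; the offline sample is a trimmed response of the same shape so the parsing can be tried without a running server.

```python
import json
from urllib.request import urlopen


def targets_by_job(payload: dict) -> dict:
    """Group the instance labels of active targets by scrape job."""
    grouped = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        grouped.setdefault(labels["job"], []).append(labels["instance"])
    return grouped


def fetch_targets(base_url: str = "http://127.0.0.1:9090") -> dict:
    """Query a running Prometheus; the default base URL is an assumption."""
    with urlopen(f"{base_url}/api/v1/targets") as response:
        return targets_by_job(json.load(response))


# Offline example using a trimmed response of the same shape.
sample = {"status": "success", "data": {"activeTargets": [
    {"labels": {"job": "node", "instance": "db1"}, "health": "up"},
    {"labels": {"job": "mysqld", "instance": "db1"}, "health": "up"},
]}}
print(targets_by_job(sample))
```

Calling fetch_targets() on the monitoring server after the state run should list the new host under the node job once Prometheus has re-read the file.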
