Common Infrastructure Maintenance Operations for WikiOasis

This runbook covers the day-to-day maintenance operations you will perform most frequently on the WikiOasis infrastructure: scheduling monitoring downtime, removing servers from the HAProxy load-balancer pool, applying configuration changes through Salt, and validating that minions and their pillar data are healthy. All operations use either native Salt commands or the custom execution modules (haproxy and host) built into this repository.

Scheduling downtime in Icinga2

Before performing any disruptive work on a server — deployments, restarts, configuration changes — schedule a downtime period in Icinga2 to suppress alerts. The host.downtime Salt module communicates with the Icinga2 REST API directly, covering the host object and all of its service checks in a single call. The duration argument accepts a plain integer (seconds) or a string with a unit suffix: s (seconds), m (minutes), h (hours), d (days).

# 2-hour downtime for a deployment
salt 'mw-us-east-011*' host.downtime 'mw-us-east-011.ovvin.wonet' '2h' 'Scheduled deployment'

# 30-minute downtime for a quick service restart
salt 'mw-us-east-011*' host.downtime 'mw-us-east-011.ovvin.wonet' '30m' 'PHP-FPM restart'

# 1-day downtime for extended maintenance
salt 'db-c1-us-east-021*' host.downtime 'db-c1-us-east-021.ovvin.wonet' '1d' 'Extended maintenance window'
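
The duration formats described above can be illustrated with a small parser. This is a minimal sketch of how such strings might be converted to seconds, not the host module's actual code; the names parse_duration and _UNIT_SECONDS are hypothetical.

```python
import re

# Seconds per unit suffix, matching the suffixes listed in this section.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value):
    """Convert 7200, '7200', '30m', '2h' or '1d' into a number of seconds."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)([smhd]?)", str(value).strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    amount, unit = match.groups()
    # A missing suffix means the value is already in seconds.
    return int(amount) * _UNIT_SECONDS.get(unit, 1)
```

Under these assumptions, '2h' becomes 7200 seconds and '1d' becomes 86400 seconds, while an unsupported suffix such as '2w' is rejected rather than silently misread.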

The module reads monitoring:icinga_api_host, monitoring:icinga_api_user, monitoring:icinga_api_password, and monitoring:icinga_api_port from pillar. These must be present in the private pillar before the module can connect.

The hostname argument must match the Icinga2 host object name exactly, not the minion ID. In the WikiOasis setup the convention is <minion-id>.ovvin.wonet.
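
To make the naming convention and the single-call behaviour concrete, here is a hedged sketch of the kind of request a client could send to Icinga2's schedule-downtime action. It is not taken from the host module: the function names, the "salt" author value, and the example API host are assumptions, and the real module may build its request differently.

```python
import time


def icinga_host_name(minion_id, domain="ovvin.wonet"):
    """Apply the <minion-id>.ovvin.wonet naming convention."""
    return f"{minion_id}.{domain}"


def build_downtime_request(api_host, api_port, hostname, seconds, comment,
                           author="salt", now=None):
    """Return (url, payload) for Icinga2's schedule-downtime action.

    api_host and api_port stand in for the monitoring:icinga_api_host and
    monitoring:icinga_api_port pillar values; author is a hypothetical default.
    """
    start = int(now if now is not None else time.time())
    url = f"https://{api_host}:{api_port}/v1/actions/schedule-downtime"
    payload = {
        "type": "Host",
        "filter": f'host.name=="{hostname}"',  # must equal the host object name
        "start_time": start,
        "end_time": start + seconds,
        "author": author,
        "comment": comment,
        "all_services": True,  # also cover the host's service checks
    }
    return url, payload


# Hypothetical API endpoint, for illustration only.
url, payload = build_downtime_request(
    "icinga.example.net", 5665, icinga_host_name("mw-us-east-011"),
    7200, "Scheduled deployment",
)
```

The payload would be sent as a JSON POST using the API user and password from pillar. If the filter does not match a host object name exactly, no downtime is scheduled, which is why the hostname argument matters.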

Depooling and repooling servers

The haproxy Salt module manages HAProxy’s runtime state via the stats socket at /run/haproxy/admin.sock. All operations take effect immediately — no HAProxy reload or restart is required.

Depooling a server

Remove a server from a backend before performing maintenance to ensure it receives no new traffic:

# Depool mw-us-east-011 from the mediawiki backend
salt 'proxy*' haproxy.depool mediawiki mw-us-east-011

Repooling a server

Return a server to the pool after maintenance is complete:

# Repool mw-us-east-011 into the mediawiki backend
salt 'proxy*' haproxy.repool mediawiki mw-us-east-011
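
For readers curious what happens on the stats socket, the sketch below shows one way to toggle a server with HAProxy's Runtime API commands disable server and enable server. It is an illustration only: the haproxy module may use different commands (for example set server ... state maint), and the helper names here are hypothetical.

```python
import socket

ADMIN_SOCKET = "/run/haproxy/admin.sock"  # path given in this runbook


def server_command(action, backend, server):
    """Build a Runtime API line such as 'disable server mediawiki/mw-us-east-011'."""
    if action not in ("enable", "disable"):
        raise ValueError(f"unsupported action: {action!r}")
    return f"{action} server {backend}/{server}\n"


def send_command(command, path=ADMIN_SOCKET):
    """Send one command to the stats socket and return HAProxy's reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(command.encode())
        chunks = []
        # In non-interactive mode HAProxy closes the connection after replying.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

Calling send_command(server_command("disable", "mediawiki", "mw-us-east-011")) on a proxy node would be the rough equivalent of a depool in this sketch. Because the change is made in the running process, no reload is involved.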

Checking pool status

Inspect the current state of all backend servers across all proxy nodes:

salt 'proxy*' haproxy.status

The output is a list of dicts, one per server, with backend, server, status, and weight fields. A healthy server shows status: UP.
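
Given that shape, spotting unhealthy servers is a simple filter. The following sketch assumes the four fields named above and uses made-up sample rows; the exact values the module returns may differ.

```python
def servers_not_up(status_rows):
    """Return (backend, server, status) for every row whose status is not UP."""
    return [
        (row["backend"], row["server"], row["status"])
        for row in status_rows
        if row["status"] != "UP"
    ]


# Hypothetical sample shaped like the fields described above.
sample = [
    {"backend": "mediawiki", "server": "mw-us-east-011", "status": "MAINT", "weight": 1},
    {"backend": "mediawiki", "server": "mw-us-east-031", "status": "UP", "weight": 1},
]
print(servers_not_up(sample))
```

An empty result means every server reported UP on that proxy node.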

Combined maintenance workflow

The recommended sequence for any maintenance that requires traffic isolation is: schedule downtime, depool, do the work, repool, and verify.

# 1. Schedule downtime in Icinga2 (suppresses alerts for 2 hours)
salt 'mw-us-east-011*' host.downtime 'mw-us-east-011.ovvin.wonet' '2h' 'Scheduled maintenance'

# 2. Depool from HAProxy (takes effect immediately, no reload needed)
salt 'proxy*' haproxy.depool mediawiki mw-us-east-011

# 3. Perform maintenance work...
#    e.g. apply a state, restart a service, update packages

# 4. Repool when maintenance is complete
salt 'proxy*' haproxy.repool mediawiki mw-us-east-011

# 5. Verify pool status
salt 'proxy*' haproxy.status

Always depool before making changes that could cause request failures. Never skip straight to the work — even a state.apply that touches nginx or PHP-FPM will cause a brief service interruption.
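
The ordering above can be captured in a small wrapper so that no step is skipped. This is a sketch under stated assumptions, not a tool shipped in the repository: maintain and run_salt are hypothetical names, and the wrapper assumes the minion ID doubles as the HAProxy server name, as in the examples in this runbook.

```python
import subprocess


def run_salt(args):
    """Run one salt command on the master and raise if it exits non-zero."""
    subprocess.run(["salt", *args], check=True)


def maintain(minion, backend, work, duration="2h",
             reason="Scheduled maintenance", run=run_salt):
    """Downtime, depool, work, repool, verify, in that order."""
    run([f"{minion}*", "host.downtime", f"{minion}.ovvin.wonet", duration, reason])
    run(["proxy*", "haproxy.depool", backend, minion])
    work()  # if this raises, the server deliberately stays depooled
    run(["proxy*", "haproxy.repool", backend, minion])
    run(["proxy*", "haproxy.status"])
```

The run parameter makes the sequence easy to rehearse with a stub before pointing it at a real master. Leaving the server depooled when the work fails is a design choice: a half-maintained server should not receive traffic until someone has looked at it.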

Applying configuration changes

Single state

Apply one specific state to one or more minions:

salt '<target>' state.apply <state>

# Examples
salt 'mw-us-east-031' state.apply nginx
salt 'db*' state.apply mariadb.backup
salt 'proxy*' state.apply haproxy

Highstate (all states)

Apply every state assigned to a minion via salt/top.sls:

salt '<target>' state.highstate

# Example
salt 'mw-us-east-031' state.highstate

Dry run (test mode)

Preview what changes Salt would make without applying them. Always use this before running a highstate on a production server you are unsure about:

salt '<target>' state.apply <state> test=True

# Example
salt 'mw-us-east-031' state.highstate test=True

Test mode output uses colour coding in a terminal: green lines are states that would succeed with no change, yellow lines would make changes, and red lines would fail. Investigate any red lines before proceeding.
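
The colours correspond to the result field of each state's return: True (no change needed), None (would change), and False (would fail). When the output is captured as JSON instead of read in a terminal, the same triage can be sketched as below. The state IDs in the sample are invented, and the function name classify is hypothetical.

```python
def classify(state_results):
    """Group state IDs by what test mode reported for them."""
    groups = {"unchanged": [], "would_change": [], "would_fail": []}
    for state_id, ret in state_results.items():
        if ret["result"] is True:
            groups["unchanged"].append(state_id)
        elif ret["result"] is None:
            groups["would_change"].append(state_id)
        else:
            groups["would_fail"].append(state_id)
    return groups


# Hypothetical per-minion return, as parsed from --out=json output.
sample_results = {
    "nginx-package": {"result": True, "changes": {}},
    "nginx-config": {"result": None, "changes": {"diff": "..."}},
    "nginx-service": {"result": False, "changes": {}},
}
print(classify(sample_results))
```

Anything in would_fail corresponds to a red line and deserves investigation before the real run.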

Verbose debug output

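Add -l debug to see the full Salt execution log:

salt '<target>' state.apply <state> -l debug

This shows the log of the salt command on the master. Rendered Jinja templates and exact file paths are logged by the minion, so to see them, run the state with salt-call on the minion itself:

salt-call state.apply <state> -l debug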

Checking minion connectivity

Use test.ping to verify that one or all minions are reachable before starting any bulk operation:

# Ping all minions
salt '*' test.ping

# Ping with a longer timeout (useful if many minions are slow to respond)
salt '*' test.ping --timeout=30

# Ping a single minion
salt 'mw-us-east-031' test.ping

A return of True means the minion is up and the master can communicate with it. No response (or a timeout) means the minion is down, the salt-minion service is stopped, or there is a network issue.
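
Before a bulk operation it can help to compare the responders against the minions you expect. The sketch below assumes the ping results have already been parsed into a dict of minion ID to return value (for example from salt '*' test.ping --out=json --static); the exact shape of entries for non-responding minions is an assumption, so anything other than True is treated as unreachable.

```python
def unreachable(expected, ping_results):
    """Return expected minion IDs that did not answer test.ping with True."""
    return sorted(m for m in expected if ping_results.get(m) is not True)


# Hypothetical parsed output: one minion answered, one is missing entirely.
expected_minions = ["mw-us-east-011", "mw-us-east-031"]
results = {"mw-us-east-031": True}
print(unreachable(expected_minions, results))
```

Any ID this returns should be checked for a stopped salt-minion service or a network problem before continuing.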

Verifying pillar data

Pillar data drives nearly every decision Salt makes. Verifying pillar output before applying states is essential — especially after editing pillar/top.sls or adding a new host-specific pillar file.

# Show all pillar data for a minion
salt '<minion>' pillar.items

# Show a specific pillar subtree (nested keys are separated by colons)
salt '<minion>' pillar.get <key>

# Examples
salt 'db-c1-us-east-021' pillar.items
salt 'db-c1-us-east-021' pillar.get mariadb:backup
salt 'proxy-us-east-011' pillar.get haproxy

If pillar.get returns an empty value (an empty string, an empty dict, or None) for a key you expect to be populated, check:

The correct glob is in pillar/top.sls.
The per-host .sls file exists in the right directory.
The private pillar is present on the master and contains the expected keys.
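
It also helps to know how a colon-delimited key such as mariadb:backup is resolved, since a typo at any level yields an empty result. The following is a simplified sketch of that traversal, not Salt's own implementation (which handles more cases); the sample pillar contents are invented.

```python
def pillar_lookup(pillar, key, default="", delimiter=":"):
    """Walk nested dicts using a colon-delimited key, like 'mariadb:backup'."""
    node = pillar
    for part in key.split(delimiter):
        if not isinstance(node, dict) or part not in node:
            return default  # any missing level yields the default
        node = node[part]
    return node


# Hypothetical pillar data for illustration.
sample_pillar = {"mariadb": {"backup": {"retention_days": 7}}}
print(pillar_lookup(sample_pillar, "mariadb:backup"))
print(pillar_lookup(sample_pillar, "mariadb:backups"))
```

The second lookup misspells the last level and falls through to the default, which is the same symptom you see when a pillar file is not being included for the minion.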

Managing HAProxy routes

HAProxy hostname-to-backend routing is stored in /etc/haproxy/routes.map. The haproxy module lets you inspect and modify these routes at runtime without touching any config files or reloading HAProxy.

List all active routes

salt 'proxy*' haproxy.route_list

Add or update a route dynamically

# Route a new wiki hostname to the mediawiki backend — takes effect immediately
salt 'proxy*' haproxy.route_set newwiki.example.com mediawiki

Dynamic route changes via haproxy.route_set are applied to the live HAProxy instance only and do not survive a restart. To persist the route, update the haproxy:routes pillar key and re-apply the HAProxy route state:

salt 'proxy*' state.apply haproxy.route
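
To picture what is being persisted: an HAProxy map file holds one key and value per line, with blank lines and lines starting with # ignored. The sketch below parses and renders that format. It assumes the haproxy:routes pillar key is a plain hostname-to-backend mapping, which this runbook does not confirm, and the function names are hypothetical.

```python
def parse_routes(text):
    """Parse 'hostname backend' lines, skipping blanks and # comment lines."""
    routes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore malformed lines with no backend
        hostname, backend = parts
        routes[hostname] = backend
    return routes


def render_routes(routes):
    """Render a hostname -> backend mapping back into map-file lines."""
    return "".join(f"{host} {backend}\n" for host, backend in sorted(routes.items()))


# Hypothetical map contents.
example_map = "# wiki routes\n\nnewwiki.example.com mediawiki\n"
print(parse_routes(example_map))
```

Comparing the parsed file against the output of haproxy.route_list is one way to spot live routes that were never written back to pillar.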

Remove a route

salt 'proxy*' haproxy.route_del oldwiki.example.com

Quick reference

Task	Command
Schedule 2h downtime	`salt '<minion-id>' host.downtime '<minion-id>.ovvin.wonet' '2h' '<reason>'`
Depool from backend	`salt 'proxy*' haproxy.depool <backend> <server>`
Repool to backend	`salt 'proxy*' haproxy.repool <backend> <server>`
Check pool status	`salt 'proxy*' haproxy.status`
Apply a state	`salt '<target>' state.apply <state>`
Apply all states	`salt '<target>' state.highstate`
Dry-run a state	`salt '<target>' state.apply <state> test=True`
Ping all minions	`salt '*' test.ping`
View all pillar data	`salt '<minion>' pillar.items`
View pillar subtree	`salt '<minion>' pillar.get <key>`
List live routes	`salt 'proxy*' haproxy.route_list`
Add a route (live)	`salt 'proxy*' haproxy.route_set <hostname> <backend>`

Adding a Server

Full runbook for provisioning a new minion, accepting its key, and applying role states.

Database Backup

Reference for the MariaDB backup system, schedule, scripts, and NRPE health checks.
