A rolling deploy pushes updated MediaWiki code to production servers one at a time (or in small batches), temporarily removing each server from HAProxy before syncing and returning it to the pool only after a canary health check passes. This page walks through the full procedure from pre-deploy checks through post-deploy verification, including what to do when something goes wrong.

Pre-Deploy Checklist

Before starting a deploy, confirm that the environment is in a healthy baseline state. Deploying into an already-degraded fleet risks turning a partial outage into a complete one.

1. Verify all servers are pooled and healthy in HAProxy

On any proxy server, check the current pool status using the Salt haproxy module:
salt 'proxy*' haproxy.status
Every application server in the mediawiki backend should show status: UP. Investigate any server showing DOWN, MAINT, or DRAIN before proceeding.

2. Check monitoring for active alerts

Review the Icinga2 dashboard for any open CRITICAL or WARNING alerts on MediaWiki application servers or their upstream services (databases, object caches, NFS). A deploy that touches an already-sick service can mask the underlying issue and complicate diagnosis.

3. Confirm staging is up to date

On the staging server, verify that the branches you intend to deploy are at the expected commits:
sudo -u www-data git -C /srv/mediawiki-staging log --oneline -5
sudo -u www-data git -C /srv/mediawiki-staging/versions/1.45 log --oneline -5

4. Announce the deploy

Post a heads-up in the ops channel. mwdeploy will send its own start/finish/failure notifications to the configured Discord or Slack webhook, but a human announcement helps the team know to hold off on unrelated changes.
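
The pool check in step 1 can also be scripted against HAProxy's runtime socket rather than through Salt. The sketch below is an alternative under stated assumptions, not part of mwdeploy: it reads the CSV that HAProxy's show stat command produces (the status is the 18th field) and lists every server in a backend that is not UP. The function name unhealthy_servers is hypothetical; the backend name mediawiki and the socket path are the ones used elsewhere on this page.

```shell
# List servers in a backend whose HAProxy status is anything other than UP.
# Reads "show stat" CSV on stdin: field 1 = proxy name, 2 = server name, 18 = status.
# unhealthy_servers is a hypothetical helper name, not an mwdeploy or Salt command.
unhealthy_servers() {
    awk -F, -v backend="$1" '
        $1 == backend && $2 != "FRONTEND" && $2 != "BACKEND" && $18 != "UP" {
            print $2, $18
        }'
}

# Usage on a proxy host (assumes read access to the admin socket):
#   echo "show stat" | socat stdio /run/haproxy/admin.sock | unhealthy_servers mediawiki
```

An empty result means every server in the backend reports UP. A server that is mid-transition (for example UP 1/3) is listed as well, which errs on the side of caution before a deploy.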

Running the Deploy

1. SSH to the staging server as mwdeploy

All deployments must be run from the staging server. The mwdeploy user owns the staging checkout and holds the SSH key trusted by all production servers.
ssh mwdeploy@staging-us-east-021.ovvin.wonet

2. Choose your deploy flags

Pick the flags that match what changed. Minimal flag sets produce faster, lower-risk deploys.
Pull core and the active MediaWiki version, rebuild the localisation cache, then roll out to all servers through HAProxy:
mwdeploy --core --mediawiki --l10n --servers all --rollout

3. Watch the TUI dashboard

The curses dashboard opens automatically when you pass deploy flags. It shows live status for every server.
mwdeploy                                          2025-01-15T14:32:10Z
──────────────────────────────────────────────────────────────────────
Preparation  ✓ done  [1m 12s]
  14:33:22  Staging ready ✓

Deploying  (1/4 done)
──────────────────────────────────────────────────────────────────────
mw-us-east-011        ✓ done  [45s]
  14:33:45  Deploy complete

mw-us-east-012        ⟳ running  [23s]
  [████████░░░░░░░░░░░░░░░░]  33%  rsync
  14:33:55  Syncing files…

mw-us-east-021        ○ pending
mw-us-east-022        ○ pending

  [↑↓] select   [Ctrl+O] detail logs   [Ctrl+R] depool/repool   [Ctrl+X] exit
Use ↑/↓ to select a server and Ctrl+O to open the full log view for that server.

The Canary Check

Before deploying to any production server, mwdeploy performs a canary check on the staging host itself. It then repeats the check on each production server immediately after rsyncing and before repooling. The check works as follows:
  1. curl fetches http://127.0.0.1/wiki/Main_Page on the target host with a Host: test.wikioasis.org header and a 15-second timeout.
  2. The HTTP response code must be 200.
  3. The response body must contain the string wikioasis (case-insensitive).
  4. Up to 3 attempts are made with a 3-second pause between failures.
A passing canary confirms that MediaWiki is serving pages correctly on that host before it receives live traffic. If the canary fails on staging, the deploy is aborted before touching any production server. If it fails on a production server, the server is either repooled (and the failure reported) or the operator is prompted to continue.
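
The four rules above can be sketched as a small shell function. This is an illustration of the described behaviour, not mwdeploy's actual implementation: the function names (fetch_page, response_ok, canary) and the variable names are hypothetical, while the URL, Host header, timeout, attempt count, and pause come from the list above.

```shell
# Sketch of the canary rules; names are hypothetical, values come from the text above.
CANARY_URL="http://127.0.0.1/wiki/Main_Page"
CANARY_HOST="test.wikioasis.org"
CANARY_ATTEMPTS=3
CANARY_PAUSE=3   # seconds to wait after a failed attempt

# Print the response body, then the HTTP status code on a final line of its own.
fetch_page() {
    curl -s --max-time 15 -H "Host: ${CANARY_HOST}" -w '\n%{http_code}' "${CANARY_URL}"
}

# Succeed only if the last line is 200 and the body mentions "wikioasis" (any case).
response_ok() {
    code=$(printf '%s\n' "$1" | tail -n 1)
    body=$(printf '%s\n' "$1" | sed '$d')
    [ "$code" = "200" ] && printf '%s\n' "$body" | grep -qi 'wikioasis'
}

# Try up to CANARY_ATTEMPTS times, pausing between failed attempts.
canary() {
    attempt=1
    while [ "$attempt" -le "$CANARY_ATTEMPTS" ]; do
        response=$(fetch_page) || true
        if response_ok "$response"; then
            return 0
        fi
        if [ "$attempt" -lt "$CANARY_ATTEMPTS" ]; then
            sleep "$CANARY_PAUSE"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

Running canary on a host returns 0 when the page is served correctly and 1 after the last failed attempt. Keeping the fetch separate from the check makes the pass/fail rule easy to exercise without a live web server.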

Canary Failure Prompt

When a canary fails and --force is not set, the dashboard freezes and displays an interactive prompt:
  Canary failed on mw-us-east-012 (HTTP 502). Continue?

  [Y]  Yes, continue
  [N]  Stop here
  • Press Y to skip this server and continue deploying to the next. The failed server’s status is set to failed and it is repooled immediately (if --rollout was set).
  • Press N to abort. All pending servers are marked skipped. Any server already in running state finishes its current step before stopping.
If you know the canary will fail (e.g. the wiki is in maintenance mode) and still want to push, use --force to bypass prompts entirely.

HAProxy Depool/Repool Integration

When --rollout is passed, mwdeploy uses HAProxy’s runtime socket to take each server out of rotation during its deploy window, so that no user requests reach a server mid-rsync. The sequence for each server is:
depool ──► rsync ──► [l10n] ──► canary ──► repool
  1. Depool: the command disable server mediawiki/<server> is sent to every proxy in proxy_servers via socat stdio /run/haproxy/admin.sock. Existing connections drain naturally; no new requests are sent to the server.
  2. Rsync: files are pushed. Because the server is out of rotation, no users are affected by a partially updated file tree.
  3. Canary: the health check runs locally on the production server, not through HAProxy, so it tests the new code directly.
  4. Repool: the command enable server mediawiki/<server> returns the server to full rotation.
If rsync or the canary fails, the server is immediately repooled (even in failure) to avoid leaving it permanently depooled. The deploy is then aborted (unless --force is set).
Without --rollout, servers stay in the HAProxy pool throughout the deploy. Each file is still replaced atomically (rsync writes to a temporary path and renames it into place), but the tree as a whole is not, so there is a brief window where a mix of old and new files may be served. Only omit --rollout for low-risk config-only or emergency pushes.
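
The per-server sequence, including the guarantee that a server is repooled even when a step fails, can be sketched as follows. This is not mwdeploy's code. The disable server and enable server commands are standard HAProxy runtime commands, but reaching each proxy over ssh is an assumption, and PROXY_SERVERS, haproxy_cmd, sync_files, run_canary, and deploy_one are hypothetical names. sync_files and run_canary are placeholders for the real rsync and canary steps.

```shell
# Sketch of depool -> rsync -> canary -> repool for one server (names are hypothetical).
BACKEND="mediawiki"
HAPROXY_SOCK="/run/haproxy/admin.sock"
PROXY_SERVERS="${PROXY_SERVERS:-}"   # space-separated proxy hosts (stands in for proxy_servers)

# Send one runtime command to every proxy. Using ssh as the transport is an assumption.
haproxy_cmd() {
    for proxy in $PROXY_SERVERS; do
        ssh "$proxy" "echo '$1' | socat stdio $HAPROXY_SOCK" || return 1
    done
}

depool() { haproxy_cmd "disable server $BACKEND/$1"; }
repool() { haproxy_cmd "enable server $BACKEND/$1"; }

# Placeholders: replace with the real file sync and health check.
sync_files() { echo "rsync to $1 (placeholder)"; }
run_canary() { echo "canary on $1 (placeholder)"; }

deploy_one() {
    server="$1"
    depool "$server" || return 1
    deploy_rc=0
    # Stop at the first failing step and remember its exit status.
    sync_files "$server" && run_canary "$server" || deploy_rc=$?
    # Always repool, whether or not the steps above succeeded.
    repool "$server" || deploy_rc=$?
    return "$deploy_rc"
}
```

The key design point is that repool sits outside the success path: a failed rsync or canary changes the return status but never skips the enable server call.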

Manual Pool Control from the TUI

You can manually depool or repool any server at any time from the dashboard by selecting it and pressing Ctrl+R:
  HAProxy pool action for mw-us-east-012:

  [D]  Depool (disable server)
  [R]  Repool (enable server)
  [N]  Cancel
This is useful for holding a server out of rotation while you investigate an issue, independent of the automated deploy cycle.

Notification Webhooks

mwdeploy posts deploy events to Discord and/or Slack webhooks configured under webhooks in /etc/mwdeploy/config.yaml. The following events generate notifications:
Event                                                      Level
Deploy starting (with target server list)                  INFO
Deploy complete (with server count)                        INFO
Deploy finished with failures (with failed server list)    ERROR
Individual canary failures                                 WARNING
Notifications are sent via direct HTTP POST to the webhook URL, so there is no external dependency beyond network access. Suppress all notifications for a specific run with --no-log.
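
To illustrate what a direct HTTP POST to a webhook involves, here is a minimal sketch. It relies on two facts about the services themselves: Discord webhooks accept a JSON body with a content field, and Slack incoming webhooks accept one with a text field. The message layout, the function names (json_escape, build_payload, notify), and the argument order are assumptions for illustration; this page does not describe how mwdeploy formats its messages.

```shell
# Escape backslashes and double quotes so the message is safe inside a JSON string.
# (Newlines and control characters are not handled in this sketch.)
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the request body: Discord expects "content", Slack expects "text".
# Arguments: webhook kind (discord|slack), level, message.
build_payload() {
    kind="$1"; level="$2"; message="$3"
    case "$kind" in
        discord) key="content" ;;
        slack)   key="text" ;;
        *)       echo "unknown webhook kind: $kind" >&2; return 1 ;;
    esac
    printf '{"%s":"[%s] %s"}' "$key" "$level" "$(json_escape "$message")"
}

# POST the payload. Arguments: webhook URL, kind, level, message.
# --fail makes curl exit non-zero when the server answers with an HTTP error.
notify() {
    payload=$(build_payload "$2" "$3" "$4") || return 1
    curl -s --fail --max-time 10 -H 'Content-Type: application/json' -d "$payload" "$1"
}
```

For example, build_payload slack INFO 'Deploy starting' prints {"text":"[INFO] Deploy starting"}.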

Post-Deploy Verification

After the TUI shows all servers as ✓ done, run a brief verification pass before considering the deploy complete.

1. Check the deploy summary

The bottom of the TUI dashboard shows a summary line:
✓ 4 succeeded   ✗ 0 failed   – 0 skipped/rolled back
Any non-zero failed count warrants investigation before declaring the deploy successful.

2. Spot-check a live page

Load a representative wiki page in a browser or with curl to verify MediaWiki is responding correctly on the live vhost:
curl -sI https://wikioasis.org/wiki/Main_Page | head -5

3. Verify all servers are back in the pool

Confirm that every application server is showing UP in HAProxy after the deploy:
salt 'proxy*' haproxy.status

4. Check Icinga2 for new alerts

Wait 2–3 minutes and review the monitoring dashboard for any new alerts that may have appeared as a result of the deploy. Pay attention to MediaWiki-specific checks (API, job queue, parser cache).
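
The live-page spot-check can be turned into a pass/fail test instead of reading headers by eye. The helper below is a hypothetical convenience, not part of mwdeploy: it inspects the status line that curl -sI prints first and succeeds only on a 200.

```shell
# Succeed only if the first header line on stdin reports HTTP status 200.
# Handles both "HTTP/1.1 200 OK" and "HTTP/2 200" status lines (CRLF endings included).
# is_http_ok is a hypothetical helper name.
is_http_ok() {
    awk 'NR == 1 { sub(/\r$/, ""); ok = ($2 == "200") } END { exit ok ? 0 : 1 }'
}

# Usage:
#   curl -sI https://wikioasis.org/wiki/Main_Page | is_http_ok && echo "live page OK"
```

Note that a redirect (301 or 302) counts as a failure here. Whether that is the right rule for the live vhost is a judgment call this sketch does not settle.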

Rollback

mwdeploy does not have an automated rollback command. The rolled_back status shown in the TUI refers to a server that was repooled after a failure — it does not mean code was reverted. To roll back to the previous code state:
  1. On the staging server, revert the git change in staging:
    sudo -u www-data git -C /srv/mediawiki-staging/versions/1.45 reset --hard HEAD~1
    
  2. Re-run mwdeploy --rsync --servers all --rollout to push the previous content to all servers.
For extension or skin rollbacks, revert only the affected extension directory and use --extension <Name> to target just that path.
For fast partial rollbacks during an incident, use mwdeploy --rsync --servers <affected-server> --rollout --force to push to a single server quickly without waiting for canary prompts.
