Alerts

Overview

Alerts evaluate live server data and trigger actions when conditions are met. In xyOps, an alert is defined once (the “definition”) and may fire many times across servers (each firing is an “invocation”).

Alerts are evaluated every minute on the conductor using the most recent ServerMonitorData collected from each server.

Use alerts to:

Detect system conditions (high CPU, low memory, disk full, job spikes)
Notify teams via email, webhooks, or channels
Attach context via snapshots
Open tickets automatically
Run jobs in response to conditions
Limit or abort jobs on affected servers

Concepts

Definition (object) - The configuration that specifies the trigger condition and actions.
Invocation (instance) - A single firing instance against a server. Stored in the database and visible in the Alerts view.
Evaluation cadence (schedule) - Once per minute per server, alongside monitor sampling.
Scope (groups) - By server group. Leave blank to apply to all groups.
Warm-up / cool-down (samples) - Optionally require N consecutive true evaluations before firing, and N consecutive false evaluations before clearing.
Actions (array) - Execute on alert fired and/or cleared. Can be defined on the alert, augmented by groups, and extended with universal defaults.
Job control (toggles) - Optionally prevent new jobs from launching while active, or abort all running jobs when the alert fires.

How Alerts Are Evaluated

For each minute of incoming server data:

Match Scope

xyOps evaluates each enabled alert definition whose group scope matches the server

Evaluate Expression

The alert’s expression (JavaScript format) runs against the current ServerMonitorData snapshot

Sample Counter

If true, the alert’s internal sample counter increments. If false and previously incremented, the counter decrements toward zero.

Fire or Clear

When the counter first reaches max samples, an invocation is created and actions run. When the counter subsequently returns to zero, the invocation is cleared and cleared actions run.
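
The counter behavior described in the steps above can be sketched as a small simulation. This is an illustration only, not the xyOps implementation: simulateAlert is a hypothetical helper, and the event labels simply borrow the alert_new and alert_cleared condition names. It assumes, going by the monitor.js excerpt under Code Implementation, that a false sample decrements the counter rather than resetting it.

```javascript
// Hypothetical simulation of the alert sample counter (not xyOps source code).
// `samples` is the definition's sample count; `results` holds one boolean per minute.
function simulateAlert(samples, results) {
    const needed = Math.max(1, samples || 1); // treat missing/zero samples as 1
    let count = 0;       // warm-up / cool-down counter
    let active = false;  // true while an invocation is open
    const events = [];

    results.forEach((result, minute) => {
        // true increments (capped at `needed`), false decrements (floored at 0)
        count = result ? Math.min(count + 1, needed) : Math.max(count - 1, 0);

        if (!active && count >= needed) {
            active = true;
            events.push({ minute, event: 'alert_new' });
        }
        else if (active && count === 0) {
            active = false;
            events.push({ minute, event: 'alert_cleared' });
        }
    });
    return events;
}

// Three true minutes, then three false minutes, with samples = 3
console.log(simulateAlert(3, [true, true, true, false, false, false]));
```

With samples set to 3, this run records alert_new at minute index 2 and alert_cleared at minute index 5. Under the same assumption, a single false sample in the middle of a warm-up only delays firing by lowering the counter one step.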

Code Implementation

Alert evaluation, from lib/monitor.js:292-354:

alert_defs.forEach( function(alert_def) {
    if (!alert_def.enabled) return;
    
    var global_id = server.id + '-' + alert_def.id;
    
    // only check alerts assigned to current server
    if (alert_def.groups && alert_def.groups.length && !Tools.includesAny(alert_def.groups, params.groups)) return;
    
    // evaluate alert expression, default to false
    var exp = self.expressionCache[ alert_def.id ];
    var result = false;
    try { 
        result = exp.evalSync( params.data ); 
    }
    catch (err) {
        result = false;
    }
    
    if (result) {
        // alert is active (but may need multiple samples before triggering)
        if (!self.warmAlerts[global_id]) self.warmAlerts[global_id] = {
            id: Tools.generateShortID('a'),
            server: server.id,
            alert: alert_def.id,
            date: params.date,
            exp: alert_def.expression,
            message: '',
            count: 0,
            modified: params.date,
            notified: false
        };
        var warm_alert = self.warmAlerts[global_id];
        
        // recompute message every time, as it may change
        warm_alert.message = self.messageSub( alert_def.message, params.data );
        
        // increase count, stop at def samples
        warm_alert.count++;
        if (warm_alert.count > alert_def.samples) warm_alert.count = alert_def.samples;
    }
    else if (self.warmAlerts[global_id]) {
        // if previous alert, decrement count
        var warm_alert = self.warmAlerts[global_id];
        
        // decrease count, stop at 0
        warm_alert.count--;
        if (warm_alert.count < 0) warm_alert.count = 0;
    }
} );

Expressions are compiled ahead of time, and syntax errors are rejected when an alert is created or updated. The alert message is re-evaluated each minute while the alert is active, so macros reflect current server values.
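
The group scope check in the excerpt relies on Tools.includesAny. The sketch below uses a stand-in written for illustration: the real helper's implementation is not shown in this document, and its behavior and argument order are assumed from the call site. alertAppliesToServer is a hypothetical wrapper that mirrors the two early-return conditions, showing that an empty or missing groups list makes a definition apply to every server.

```javascript
// Stand-in for Tools.includesAny; assumed to return true when the arrays share any element.
function includesAny(haystack, needles) {
    return needles.some((item) => haystack.includes(item));
}

// Hypothetical helper mirroring the enabled and scope checks in the excerpt.
function alertAppliesToServer(alertDef, serverGroups) {
    if (!alertDef.enabled) return false;
    if (alertDef.groups && alertDef.groups.length && !includesAny(alertDef.groups, serverGroups)) return false;
    return true;
}

// Group IDs here are made up for the example.
const dbOnly = { enabled: true, groups: ['db'] };
const everywhere = { enabled: true, groups: [] };

console.log(alertAppliesToServer(dbOnly, ['web', 'prod']));     // no shared group
console.log(alertAppliesToServer(dbOnly, ['db', 'prod']));      // shares "db"
console.log(alertAppliesToServer(everywhere, ['web', 'prod'])); // blank scope matches all
```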

Alert Expressions

An alert expression is evaluated using xyOps Expression Format (JavaScript-style) with the current ServerMonitorData as context.

Common Entry Points

cpu - CPU stats and hardware information
memory - Total/available memory
load - 1/5/15 minute load averages
monitors - Values from configured monitors (absolute values)
deltas - Computed deltas for counter-style monitors since last sample
jobs - Running job count for the server

Examples

// High load: fire if 1-min load >= CPU cores + 1
monitors.load_avg >= (cpu.cores + 1)

// Low memory: less than 5% available
memory.available < (memory.total * 0.05)

// High I/O wait
monitors.io_wait >= 75

// Disk full
monitors.disk_usage_root >= 90

// High active jobs
monitors.active_jobs >= 50

// Delta example (for counter-style monitors); 33554432 bytes = 32 MiB
deltas.os_bytes_out_sec >= 33554432
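
To see how the example expressions behave, the sketch below runs them against a made-up data snapshot. The snapshot values are invented for illustration, and each expression is rewritten as an ordinary JavaScript function instead of going through the xyOps expression engine, so this only demonstrates the comparison logic.

```javascript
// Made-up snapshot shaped like the entry points listed above (values are illustrative only).
const GiB = 1024 ** 3;
const sampleData = {
    cpu: { cores: 4 },
    memory: { total: 16 * GiB, available: 0.5 * GiB },
    monitors: { load_avg: 5.2, io_wait: 12, disk_usage_root: 91, active_jobs: 3 },
    deltas: { os_bytes_out_sec: 1048576 }
};

// Each example expression, rewritten as a plain function of the snapshot.
const checks = {
    'High load': (d) => d.monitors.load_avg >= (d.cpu.cores + 1),
    'Low memory': (d) => d.memory.available < (d.memory.total * 0.05),
    'High I/O wait': (d) => d.monitors.io_wait >= 75,
    'Disk full': (d) => d.monitors.disk_usage_root >= 90,
    'High active jobs': (d) => d.monitors.active_jobs >= 50,
    'Network out': (d) => d.deltas.os_bytes_out_sec >= 33554432
};

// Evaluate every check and return a name -> boolean map.
function evaluateChecks(data) {
    const out = {};
    for (const [name, fn] of Object.entries(checks)) out[name] = Boolean(fn(data));
    return out;
}

console.log(evaluateChecks(sampleData));
```

For this snapshot, the load, memory, and disk checks evaluate to true and the other three to false.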

Helper Functions

Available in expressions and message macros:

min(a, b), max(a, b) - Math functions
integer(x), float(x) - Type conversion
bytes(x) - Render human-readable bytes
number(x) - Render localized numbers
pct(x) - Render a percentage
stringify(obj) - JSON stringify a value
find(array, key, substr) - Filter array items

Use monitors.MONITORID for absolute values and deltas.MONITORID for per-minute rates. Guard against missing values with sensible defaults: integer(monitors.foo || 0) > 10
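
The guard pattern matters because a monitor may be absent from a server's data. The sketch below uses a stand-in integer() function, assumed here to behave like parseInt with a fallback of 0 (the actual xyOps helper may differ), to show how plain JavaScript comparison semantics treat a missing value.

```javascript
// Stand-in for the integer() helper; assumed behavior, the real helper may differ.
function integer(x) {
    const n = parseInt(x, 10);
    return Number.isNaN(n) ? 0 : n;
}

const monitors = {}; // hypothetical server data with no "foo" monitor

// Unguarded: undefined converts to NaN, and every comparison with NaN is false...
const unguarded = monitors.foo > 10;
// ...so a negated condition silently flips to true even though no value exists.
const unguardedNegated = !(monitors.foo < 10);

// Guarded: the missing value becomes 0 before the comparison runs.
const guarded = integer(monitors.foo || 0) > 10;
const guardedNegated = !(integer(monitors.foo || 0) < 10);

console.log({ unguarded, unguardedNegated, guarded, guardedNegated });
```

Here the unguarded negated form evaluates to true while the guarded one evaluates to false, which is the kind of surprise the default value avoids.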

Alert Messages

The alert message is a string with {{ ... }} macros evaluated against the same ServerMonitorData context used for expressions.

CPU load average is too high: {{float(monitors.load_avg)}} ({{cpu.cores}} CPU cores)

Less than 5% of total memory is available ({{bytes(memory.available)}} of {{bytes(memory.total)}})

Disk I/O wait is too high: {{pct(monitors.io_wait)}}

Root filesystem is {{pct(monitors.disk_usage_root)}} full.
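
As a rough illustration of how {{ ... }} macros could be expanded, the sketch below evaluates each macro as a JavaScript expression against a data object. renderMessage and the helper implementations are hypothetical stand-ins, not the xyOps messageSub code, and the real helpers may format values differently. The Function constructor is used only for brevity and should not be fed untrusted templates.

```javascript
// Stand-in helpers; the real xyOps helpers may format values differently.
const helpers = {
    float: (x) => parseFloat(x),
    integer: (x) => parseInt(x, 10)
};

// Hypothetical renderer: evaluates each {{ ... }} macro as a JavaScript expression
// with the helpers and the data's top-level keys in scope.
function renderMessage(template, data) {
    const scope = { ...helpers, ...data };
    const names = Object.keys(scope);
    const values = Object.values(scope);
    return template.replace(/\{\{\s*(.+?)\s*\}\}/g, (match, expr) => {
        try {
            const fn = new Function(...names, 'return (' + expr + ');');
            return String(fn(...values));
        }
        catch (err) {
            return ''; // assumption: a failed macro renders as an empty string
        }
    });
}

// Made-up data values for the example
const data = { cpu: { cores: 4 }, monitors: { load_avg: 5.25 } };
console.log(renderMessage(
    'CPU load average is too high: {{float(monitors.load_avg)}} ({{cpu.cores}} CPU cores)',
    data
));
```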

Additional Variables in Actions

def - The alert definition object (def.title, def.notes, etc.)
alert - The alert invocation object (alert.id, alert.message, etc.)
nice_* - Friendly strings for host, IP, CPU, OS, memory, uptime, groups, notes
links - server_url and alert_url direct links

Creating and Editing Alerts

Click on “Alert Setup” in the sidebar:

Title (string, required) - Display name for the alert.
Status (boolean) - Enable/disable notifications and actions.
Icon (string) - Optional Material Design Icon for the alert.
Server Groups (array) - One or more groups where the alert applies (blank for all groups).
Expression (string, required) - Trigger condition evaluated each minute. Use the Server Data Explorer to discover paths.
Message (string, required) - Text with {{macros}} for dynamic context. Evaluated on fire and each minute while active.
Samples (integer, default: 1) - Consecutive minutes that must evaluate true to fire; also used as cool-down to clear.
Overlay (string) - Optional monitor to overlay alert annotations on charts.
Job Limit (boolean) - While active, prevent new jobs from starting on the server.
Job Abort (boolean) - When fired, abort all running jobs on the server.
Alert Actions (array) - Optional actions to run on alert_new and/or alert_cleared.
Notes (string) - Optional text included in emails and other notifications.

Use the “Test…” button to evaluate the current Expression and Message against a selected live server. The dialog shows whether it would trigger right now and previews the computed message.
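
For orientation, the form fields above might map onto a definition object roughly like the sketch below. The property names id, enabled, groups, expression, message, samples, title and abort_jobs appear in the code excerpts on this page, and monitor_id and notes are mentioned elsewhere in the text; limit_jobs is an assumed name, and normalizeAlertDef is a hypothetical helper, not an xyOps API.

```javascript
// Hypothetical helper: checks required fields and fills in defaults described above.
function normalizeAlertDef(input) {
    const missing = ['title', 'expression', 'message'].filter((key) => !input[key]);
    if (missing.length) {
        throw new Error('Missing required fields: ' + missing.join(', '));
    }
    return {
        enabled: true,
        groups: [],          // blank means all groups
        samples: 1,          // default sample count
        limit_jobs: false,   // assumed property name for the Job Limit toggle
        abort_jobs: false,
        notes: '',
        ...input             // caller-supplied values override the defaults
    };
}

const highLoad = normalizeAlertDef({
    id: 'load_avg_high', // made-up ID for the example
    title: 'High CPU Load',
    expression: 'monitors.load_avg >= (cpu.cores + 1)',
    message: 'CPU load average is too high: {{float(monitors.load_avg)}} ({{cpu.cores}} CPU cores)',
    samples: 3
});

console.log(highLoad);
```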

Actions on Fire and Clear

When an alert fires (alert_new) and when it clears (alert_cleared), xyOps executes actions in parallel from three sources:

Alert actions

Configured on the alert definition itself

Group actions

Each matching server group can contribute actions

Universal actions

From config.json → alert_universal_actions (defaults to a snapshot on alert_new)

Universal Actions Configuration

config.json:58-65

"alert_universal_actions": [
    {
        "enabled": true,
        "hidden": true,
        "condition": "alert_new",
        "type": "snapshot"
    }
]
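
The way the three sources combine for a given condition can be sketched as follows. This is a hypothetical illustration, not the xyOps implementation: collectActions is an invented helper, and alertDef.actions and group.alert_actions are assumed property names. Only the enabled, condition and type properties are taken from the configuration shown above.

```javascript
// Hypothetical merge of alert, group, and universal actions for one condition.
function collectActions(alertDef, matchingGroups, universalActions, condition) {
    const all = [
        ...(alertDef.actions || []),                                      // assumed property name
        ...matchingGroups.flatMap((group) => group.alert_actions || []),  // assumed property name
        ...universalActions
    ];
    // keep only enabled actions registered for this condition
    return all.filter((action) => action.enabled && action.condition === condition);
}

// Example inputs; the "email" and "channel" type strings are illustrative guesses.
const alertDef = { actions: [{ enabled: true, condition: 'alert_new', type: 'email' }] };
const groups = [{ alert_actions: [{ enabled: true, condition: 'alert_cleared', type: 'channel' }] }];
const universal = [{ enabled: true, hidden: true, condition: 'alert_new', type: 'snapshot' }];

console.log(collectActions(alertDef, groups, universal, 'alert_new').map((a) => a.type));
console.log(collectActions(alertDef, groups, universal, 'alert_cleared').map((a) => a.type));
```

In this example the alert_new list contains the alert's own action plus the universal snapshot, while alert_cleared contains only the group action.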

Supported Action Types

Email

Send an email to specified users and/or custom addresses

Channel

Fire a notification channel (a preset bundle of users, web hooks, etc.)

Run Job

Start a job by event with optional parameters

Create Ticket

Open or update a ticket tied to the alert

Web Hook

Fire a preconfigured outbound web hook with templated payload

Plugin

Run a custom plugin with arguments

Snapshot

Capture a point-in-time server snapshot (included by default via universal actions)

Job Control During Alerts

Limit Jobs (boolean) - While the alert is active on a server, that server is excluded from job scheduling (prevents new jobs from launching). Workflow parent jobs are exempt from this restriction.
Abort Jobs (boolean) - When the alert fires, all running jobs on the affected server are aborted immediately.

Code Example: Job Abortion

monitor.js:442-447

// optionally abort all jobs on server
if (alert_def.abort_jobs) {
    jobs.forEach( function(job) {
        self.abortJob(job, "Alert Fired: " + alert_def.title);
    } );
}
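
The Limit Jobs side of job control could be pictured as a filter over candidate servers. The sketch below is a hypothetical illustration rather than the scheduler's real logic: eligibleServers is an invented helper and limit_jobs is an assumed property name. The server and alert properties on each active alert follow the warm alert object in the monitor.js excerpt, and the workflow parent job exemption is not modeled.

```javascript
// Hypothetical filter: which servers may receive new jobs right now?
function eligibleServers(servers, activeAlerts, alertDefs) {
    const defsById = new Map(alertDefs.map((def) => [def.id, def]));

    // collect IDs of servers that have at least one active job-limiting alert
    const blocked = new Set(
        activeAlerts
            .filter((alert) => {
                const def = defsById.get(alert.alert);
                return Boolean(def && def.limit_jobs); // assumed property name
            })
            .map((alert) => alert.server)
    );

    return servers.filter((server) => !blocked.has(server.id));
}

// Made-up servers, definitions, and active alerts for the example
const servers = [{ id: 's1' }, { id: 's2' }, { id: 's3' }];
const alertDefs = [
    { id: 'disk_full', limit_jobs: true },
    { id: 'high_load', limit_jobs: false }
];
const activeAlerts = [
    { server: 's1', alert: 'disk_full' },
    { server: 's2', alert: 'high_load' }
];

console.log(eligibleServers(servers, activeAlerts, alertDefs).map((s) => s.id));
```

Only s1 is excluded here, because its active alert has the limiting toggle set; s2 has an active alert without it and stays eligible.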

Default Alert Examples

Alert Title	Expression	Message
High CPU Load	`monitors.load_avg >= (cpu.cores + 1)`	CPU load average is too high: `{{float(monitors.load_avg)}}` (`{{cpu.cores}}` CPU cores)
Low Memory	`memory.available < (memory.total * 0.05)`	Less than 5% of total memory is available (`{{bytes(memory.available)}}` of `{{bytes(memory.total)}}`)
High I/O Wait	`monitors.io_wait >= 75`	Disk I/O wait is too high: `{{pct(monitors.io_wait)}}`
Disk Full	`monitors.disk_usage_root >= 90`	Root filesystem is `{{pct(monitors.disk_usage_root)}}` full.
High Active Jobs	`monitors.active_jobs >= 50`	Active job count is too high: `{{number(monitors.active_jobs)}}`

Best Practices

Tune samples to balance noise and responsiveness. For spiky metrics, require multiple samples (e.g., 3-5 consecutive minutes).

Prefer relative thresholds when available (e.g., compare load to cpu.cores)
Use bytes()/pct()/number() to produce readable messages in notifications
Overlay alerts on monitors users already watch to provide context
Use group-level alert actions for standard responses (e.g., page an on-call channel)
Keep per-alert actions focused on specifics
Consider limiting jobs for conditions that would degrade runtime reliability (disk full, high I/O wait)

Viewing and Searching Alerts

Active alerts: Shown in the header counter and the Alerts tab with evaluated message, server context, snapshot link and related jobs/tickets
Timelines: If monitor_id is set, alert annotations appear on the corresponding monitor chart
History search: Search for historical alerts on the Alerts page

Related APIs

get_alerts - List all alert definitions
get_alert - Fetch a single definition by ID
create_alert / update_alert / delete_alert - Manage definitions
test_alert - Compile and evaluate an expression/message against a server
search_alerts - Query historical and active alert invocations
