
Quick start

This guide will walk you through processing your first astronomical alerts with BOOM using archived ZTF data.
This quickstart assumes you have completed the installation guide and have all services running via Docker Compose.

Overview

BOOM’s alert processing pipeline consists of three main components:
  1. Kafka Producer: Publishes alerts to Kafka topics (for testing with archived data)
  2. Kafka Consumer: Reads alerts from Kafka and queues them in Redis/Valkey
  3. Scheduler: Manages workers that process, enrich, and filter alerts
Let’s run through a complete example using public ZTF alerts.

Verify services are running

First, ensure all Docker services are running:
docker ps
You should see containers for:
  • mongo (MongoDB)
  • valkey (Redis fork)
  • broker (Kafka)
  • prometheus (Metrics)

Step 1: Produce test alerts

BOOM can process alerts from the ZTF public archive. Let’s produce alerts from June 17, 2024:
cargo run --release --bin kafka_producer ztf 20240617 public
The producer will download archived ZTF alerts and publish them to a Kafka topic. Leave this running in your terminal.

Producer command syntax

The kafka_producer binary accepts the following arguments:
cargo run --release --bin kafka_producer <SURVEY> [DATE] [PROGRAMID]
  • <SURVEY>: Survey name (e.g., ztf, lsst, decam)
  • [DATE]: Observation date in YYYYMMDD format
  • [PROGRAMID]: Program identifier (e.g., public for ZTF public alerts)
To see all available options:
cargo run --release --bin kafka_producer -- --help
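Since the DATE argument is positional and easy to mistype, a small sanity check can help before launching the producer. The helper below is our own sketch, not part of BOOM:

```shell
# A YYYYMMDD sanity check (our helper, not a BOOM utility) to run before
# invoking kafka_producer with a date argument.
is_yyyymmdd() {
  [[ "$1" =~ ^[0-9]{8}$ ]]
}

is_yyyymmdd 20240617 && echo "valid"     # prints: valid
is_yyyymmdd 2024-06-17 || echo "invalid" # prints: invalid
```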

Clear Kafka topic (optional)

If you want to clear the Kafka topic before starting:
docker exec -it broker /opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server broker:9092 \
  --delete \
  --topic ztf_20240617_public
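The topic name above follows the `<survey>_<date>_<programid>` pattern of the producer arguments. A small helper (our own convenience, not a BOOM utility) keeps the pieces consistent when the survey or date changes:

```shell
# Build the Kafka topic name from the same arguments passed to the producer.
topic_name() {
  echo "${1}_${2}_${3}"
}

topic="$(topic_name ztf 20240617 public)"
echo "$topic"   # prints: ztf_20240617_public
```

Pass "$topic" to the --topic flag above instead of retyping the name.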

Step 2: Consume alerts

In a new terminal, start the Kafka consumer to read alerts and queue them for processing:
cargo run --release --bin kafka_consumer ztf 20240617 --programids public
The consumer will:
  1. Read alerts from the Kafka topic
  2. Transfer them to Redis/Valkey in-memory queues
  3. Make them available for the processing pipeline

Consumer command syntax

cargo run --release --bin kafka_consumer <SURVEY> [DATE] --programids [PROGRAMIDS]
  • <SURVEY>: Survey name (must match producer)
  • [DATE]: Observation date (must match producer)
  • --programids: Comma-separated list of program IDs
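Because the survey and date must match what the producer used, it can help to define them once and reuse them in both terminals. This is a sketch of our own, not a BOOM feature; replace the echo commands with the real invocations to actually launch:

```shell
# Shared settings reused by the producer and consumer terminals, so the
# positional arguments cannot drift apart.
SURVEY=ztf
DATE=20240617
PROGRAMIDS=public

# Producer terminal:
echo cargo run --release --bin kafka_producer "$SURVEY" "$DATE" "$PROGRAMIDS"
# Consumer terminal:
echo cargo run --release --bin kafka_consumer "$SURVEY" "$DATE" --programids "$PROGRAMIDS"
```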

Step 3: Start the scheduler

In a third terminal, start the scheduler to process alerts:
cargo run --release --bin scheduler ztf
The scheduler will:
  1. Spawn alert ingestion workers to format and enrich alerts
  2. Spawn enrichment workers to run classifiers
  3. Spawn filter workers to execute user-defined filters
  4. Manage worker lifecycle and monitor system health

What you’ll see

The scheduler prints diagnostic messages as it runs:
  1. Alert ingestion
     Processed alert with candid: 2479571234567, queueing for classification
     Alert workers are processing alerts and queuing them for enrichment.
  2. Enrichment
     received alerts len: 50
     Enrichment workers are processing batches of alerts and running classifiers.
  3. Heartbeat
     heart beat (MAIN)
     The scheduler is running and managing workers correctly.
You won’t see filter worker output initially because no filters are defined. Filter management via the API is coming in a future release.

Step 4: Monitor metrics

BOOM exposes Prometheus metrics for monitoring pipeline performance. Open your browser to:
http://localhost:9090

Useful metrics queries

Kafka consumer

View total alerts processed and instantaneous throughput

Alert workers

Monitor alert ingestion worker throughput and processing time

Enrichment workers

Track enrichment worker performance and batch sizes

Filter workers

Monitor filter execution and alert matching
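Throughput queries typically wrap a counter in rate() over a time window. The metric name below is hypothetical; use the expression browser's autocomplete at localhost:9090 to discover the names BOOM actually exports:

```promql
# Hypothetical metric name -- substitute one of BOOM's real counters,
# discoverable in the Prometheus expression browser.
rate(alerts_processed_total[5m])
```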

Configuration

BOOM’s configuration is defined in config.yaml and can be overridden with environment variables.

Worker configuration

The number of workers for each survey is configured in config.yaml:
config.yaml
workers:
  ztf:
    command_interval: 500
    alert:
      n_workers: 3
    enrichment:
      n_workers: 3
    filter:
      n_workers: 3

Crossmatch catalogs

BOOM enriches alerts by crossmatching with astronomical catalogs:
config.yaml
crossmatch:
  ztf:
    - catalog: PS1_DR1
      radius: 2.0  # arcseconds
      use_distance: false
      projection:
        _id: 1
        gMeanPSFMag: 1
        rMeanPSFMag: 1
        iMeanPSFMag: 1
    - catalog: Gaia_DR3
      radius: 2.0
      use_distance: false
      projection:
        _id: 1
        parallax: 1
        phot_g_mean_mag: 1
    - catalog: NED
      radius: 300.0
      use_distance: true
      distance_key: "z"
      distance_max: 30.0
Supported catalogs:
  • PS1_DR1: Pan-STARRS1 Data Release 1
  • Gaia_DR3: Gaia Data Release 3
  • milliquas_v6: Million Quasars Catalog
  • NED: NASA/IPAC Extragalactic Database
  • LSPSC: Legacy Survey Proper Star Catalog

Database configuration

Configure MongoDB connection:
config.yaml
database:
  host: localhost
  port: 27017
  name: boom
  max_pool_size: 200
  username: mongoadmin
  password: ""  # Set via BOOM_DATABASE__PASSWORD
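As the comment in config.yaml indicates, secrets are supplied through environment variables rather than committed to the file. Double underscores appear to separate nesting levels, so BOOM_DATABASE__PASSWORD maps to database.password (matching the Kafka examples below). For example:

```shell
# Set the MongoDB password via the environment instead of config.yaml.
# "changeme" is a placeholder value.
export BOOM_DATABASE__PASSWORD="changeme"
```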

Kafka configuration

Configure Kafka consumers for different surveys:
config.yaml
kafka:
  consumer:
    ztf:
      server: "localhost:9092"
      group_id: ""  # Set via BOOM_KAFKA__CONSUMER__ZTF__GROUP_ID
    lsst:
      server: "usdf-alert-stream-dev.lsst.cloud:9094"
      schema_registry: "https://usdf-alert-schemas-dev.slac.stanford.edu"
      group_id: ""  # Set via BOOM_KAFKA__CONSUMER__LSST__GROUP_ID
      username: ""  # Set via BOOM_KAFKA__CONSUMER__LSST__USERNAME
      password: ""  # Set via BOOM_KAFKA__CONSUMER__LSST__PASSWORD

Logging

BOOM uses the RUST_LOG environment variable to control logging levels:
RUST_LOG=debug cargo run --release --bin scheduler ztf

Log levels

  • trace: Most verbose, includes all events
  • debug: Debugging information
  • info: Informational messages (default)
  • warn: Warning messages
  • error: Error messages only
  • off: Disable logging

Per-crate logging

Control logging for specific crates:
# INFO for all crates, ERROR for ort (ONNX Runtime)
RUST_LOG=info,ort=error cargo run --release --bin scheduler ztf

# DEBUG for all, WARN for ort
RUST_LOG=debug,ort=warn cargo run --release --bin scheduler ztf

Span events

Enable span lifecycle events for profiling:
BOOM_SPAN_EVENTS=new,close cargo run --release --bin scheduler ztf
Available span events:
  • new: Span creation
  • enter: Entering a span
  • exit: Exiting a span
  • close: Span closure (includes execution time)
  • active: Active span changes
  • full: All span events
  • none: No span events

Complete example

RUST_LOG=debug,ort=warn BOOM_SPAN_EVENTS=new,close \
  cargo run --release --bin scheduler ztf

Stop BOOM

To stop BOOM gracefully:
  1. Stop the scheduler: press CTRL+C in the scheduler terminal. The scheduler will send interrupt signals to all workers.
  2. Stop the consumer: press CTRL+C in the consumer terminal.
  3. Stop the producer: press CTRL+C in the producer terminal.
  4. Stop the Docker services:
docker compose down
Graceful shutdown is still in development; you may see errors logged during shutdown.

Next steps

Architecture

Learn more about BOOM’s modular architecture and worker types

Configuration

Explore advanced configuration options for production deployments

API documentation

Explore the HTTP API for querying alerts and managing filters

Contributing

Contribute to BOOM development on GitHub

Troubleshooting

No alerts being processed

If the scheduler shows no activity:
  1. Verify the producer is running and publishing alerts
  2. Check the consumer is reading from the correct Kafka topic
  3. Ensure the survey name and date match across producer, consumer, and scheduler
  4. Check Docker services are healthy: docker ps

Connection errors

If you see connection errors:
  1. Verify MongoDB is running: docker ps | grep mongo
  2. Verify Redis/Valkey is running: docker ps | grep valkey
  3. Verify Kafka is running: docker ps | grep broker
  4. Check environment variables in .env match config.yaml

High memory usage

If BOOM consumes too much memory:
  1. Reduce the number of workers in config.yaml
  2. Reduce max_pool_size for MongoDB
  3. Monitor metrics in Prometheus to identify bottlenecks
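For example, dropping the counts from the earlier configuration (illustrative values only; tune for your hardware):

```yaml
workers:
  ztf:
    command_interval: 500
    alert:
      n_workers: 1
    enrichment:
      n_workers: 1
    filter:
      n_workers: 1

database:
  max_pool_size: 100  # down from 200
```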

Slow processing

If alert processing is slow:
  1. Increase the number of workers in config.yaml
  2. Ensure you’re running with --release flag for optimized builds
  3. Check system resources (CPU, memory, disk I/O)
  4. Review Prometheus metrics to identify bottlenecks
