The Catalogs API provides endpoints to explore available data catalogs in BOOM, inspect their indexes, and retrieve sample documents. Catalogs represent MongoDB collections containing astronomical data.

List available catalogs

Retrieve a list of all available data catalogs in the system.
curl http://localhost:4000/catalogs \
  -H "Authorization: Bearer YOUR_TOKEN"

Query parameters

get_details
boolean
default:"false"
Whether to include detailed collection statistics for each catalog

Response

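Returns an array of catalog objects; the Python examples on this page read that array from the data field of the JSON response body. If get_details=true, each catalog also includes MongoDB collection statistics.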
name
string
Name of the catalog (MongoDB collection name)
details
object
Collection statistics (only when get_details=true). Includes document count, size, indexes, and other MongoDB collection metadata.
Catalogs cannot have names that start with system. or match protected collection names used internally by BOOM (e.g., users, filters, babamul_users).
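The naming rule above can be sketched as a small client-side check. This is only an illustration: the helper name is hypothetical, the protected set contains just the three examples named on this page (the real list is likely longer), and the comparison is assumed to be an exact match.

```python
# Hypothetical helper: only the example names from this page are listed,
# so the real protected set used by BOOM is likely larger.
PROTECTED_EXAMPLES = {"users", "filters", "babamul_users"}


def is_reserved_catalog_name(name: str) -> bool:
    """Return True if `name` is reserved under the two rules described above."""
    # Rule 1: the "system." prefix. Rule 2: a protected internal collection name.
    return name.startswith("system.") or name in PROTECTED_EXAMPLES


for candidate in ["ztf", "system.profile", "users"]:
    print(candidate, is_reserved_catalog_name(candidate))
```

A check like this is only a convenience; the server remains the authority on which names are accepted.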

Get catalog indexes

Retrieve index information for a specific catalog. Indexes improve query performance for frequently accessed fields.
curl http://localhost:4000/catalogs/ztf/indexes \
  -H "Authorization: Bearer YOUR_TOKEN"

Path parameters

catalog_name
string
required
Name of the catalog (case insensitive), e.g., ztf, gaia, panstarrs

Response

Returns an array of index objects with MongoDB index information.
name
string
Name of the index
key
object
Index key specification (fields and sort order)
v
number
Index version
unique
boolean
Whether the index enforces uniqueness
sparse
boolean
Whether the index is sparse (only indexes documents with the indexed field)
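As a rough illustration of how these fields fit together, the sketch below formats index objects into one-line summaries. The sample objects and their field names (objectId, jd) are hypothetical, not taken from a real catalog. MongoDB itself usually omits unique and sparse from index listings when they are not set; whether this API fills them in is an assumption, so the code reads them defensively.

```python
def describe_index(index: dict) -> str:
    """Summarize one index object as 'name: field(order), ... [flags]'."""
    keys = ", ".join(f"{field}({order})" for field, order in index["key"].items())
    # Treat a missing flag the same as False.
    flags = [flag for flag in ("unique", "sparse") if index.get(flag, False)]
    suffix = f" [{', '.join(flags)}]" if flags else ""
    return f"{index['name']}: {keys}{suffix}"


# Hypothetical index objects, shaped like the response fields listed above.
sample_indexes = [
    {"v": 2, "key": {"_id": 1}, "name": "_id_"},
    {"v": 2, "key": {"objectId": 1, "jd": -1}, "name": "objectId_1_jd_-1", "unique": True},
]

for index in sample_indexes:
    print(describe_index(index))
```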

Error responses

404 Not Found - Catalog doesn’t exist:
Catalog ztf does not exist

Get sample documents

Retrieve random sample documents from a catalog to explore its schema and data structure.
curl "http://localhost:4000/catalogs/ztf/sample?size=5" \
  -H "Authorization: Bearer YOUR_TOKEN"

Path parameters

catalog_name
string
required
Name of the catalog (case insensitive), e.g., ztf, gaia, panstarrs

Query parameters

size
number
default:"1"
Number of random sample documents to return. Must be between 1 and 1000.

Response

Returns an array of random documents from the catalog. The exact structure depends on the catalog schema.

Error responses

404 Not Found - Catalog doesn’t exist:
Catalog ztf does not exist
400 Bad Request - Invalid size parameter:
Size must be between 1 and 1000
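To avoid a round trip that ends in a 400, the size limit can be checked before the request is sent. The sketch below is one way to do that; the function name and BASE_URL constant are hypothetical, and the host is simply the one used in the curl examples on this page.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:4000"  # host used in the examples on this page


def build_sample_url(catalog_name: str, size: int = 1) -> str:
    """Build the sample URL, rejecting sizes outside the documented 1-1000 range."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(size, bool) or not isinstance(size, int) or not 1 <= size <= 1000:
        raise ValueError("Size must be between 1 and 1000")
    path = f"/catalogs/{quote(catalog_name, safe='')}/sample"
    return f"{BASE_URL}{path}?{urlencode({'size': size})}"


print(build_sample_url("ztf", 5))
```

The server still validates the parameter, so a 400 response should be handled as well.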

Use cases

Explore available data

Before querying a catalog, list all available catalogs to see what data sources are loaded:
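import requests

token = "YOUR_TOKEN"  # replace with your API token

response = requests.get(
    "http://localhost:4000/catalogs",
    # Send a lowercase string: requests would serialize Python's True as "True"
    params={"get_details": "true"},
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()  # stop early on 4xx/5xx responses

for catalog in response.json()["data"]:
    print(f"Catalog: {catalog['name']}")
    print(f"  Documents: {catalog['details']['count']}")
    print(f"  Size: {catalog['details']['size'] / 1024 / 1024:.2f} MB")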

Understand catalog schema

Retrieve sample documents to understand the structure and fields available for querying:
import json

import requests

token = "YOUR_TOKEN"  # replace with your API token

# Get a few sample documents
response = requests.get(
    "http://localhost:4000/catalogs/ztf/sample",
    params={"size": 3},
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()  # raises on 400 (invalid size) or 404 (unknown catalog)

samples = response.json()["data"]
print(json.dumps(samples[0], indent=2))  # Pretty print first sample

Check query optimization

Inspect indexes to understand which fields are optimized for querying:
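import requests

token = "YOUR_TOKEN"  # replace with your API token

response = requests.get(
    "http://localhost:4000/catalogs/ztf/indexes",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()  # raises on 404 if the catalog does not exist

for index in response.json()["data"]:
    print(f"Index: {index['name']}")
    # Key order matters for compound indexes, so keep it as returned
    print(f"  Fields: {list(index['key'].keys())}")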

Best practices

  1. Use details sparingly - Request get_details=true only when you need collection statistics, as it requires additional database queries
  2. Check indexes - Before building complex queries, check which fields are indexed so that your filters can use an index
  3. Sample before querying - Review sample documents to understand the catalog schema before writing queries
  4. Case insensitive - Catalog names are case insensitive, but use consistent casing for clarity
  5. Reasonable sample sizes - Keep sample sizes small (10-100 documents) to avoid unnecessary data transfer, even though up to 1000 are allowed
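The "check indexes" advice can be turned into a rough heuristic. MongoDB can use a compound index for queries on a prefix of its keys, so a field that is the leading key of some index is the easiest to filter on efficiently. The sketch below flags query fields that lead no index. The function names and the field names (objectId, jd, magpsf) are hypothetical, and this is a simplification rather than a substitute for inspecting an actual query plan.

```python
def leading_indexed_fields(indexes: list) -> set:
    """Collect the first key field of every index."""
    return {next(iter(index["key"])) for index in indexes if index["key"]}


def fields_without_leading_index(query_fields, indexes) -> list:
    """Return the query fields that are not the leading key of any index, sorted."""
    return sorted(set(query_fields) - leading_indexed_fields(indexes))


# Hypothetical index objects, shaped like the indexes endpoint response.
indexes = [
    {"name": "_id_", "key": {"_id": 1}},
    {"name": "objectId_1_jd_-1", "key": {"objectId": 1, "jd": -1}},
]

print(fields_without_leading_index(["objectId", "jd", "magpsf"], indexes))
```

Here jd is reported even though it appears in an index, because it is only the second key of a compound index and is generally not served efficiently when queried on its own.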
