Zoobot makes it easy to adapt state-of-the-art galaxy morphology classifiers to your own science questions. Trained on over 100 million Galaxy Zoo volunteer votes, Zoobot’s pretrained encoders generalize remarkably well — finetune to find rings, mergers, bars, or any other morphological feature with a fraction of the labels you’d need from scratch.

Quickstart

Finetune Zoobot to find ringed galaxies in under 20 lines of Python.

Installation

Install Zoobot locally with PyTorch and optional CUDA support.

Pretrained Models

Browse ConvNeXT, MaxViT, EfficientNet, and ResNet encoders on HuggingFace.

Finetuning Guide

Step-by-step walkthrough of the finetuning process with real examples.

Why Zoobot?

Zoobot is the foundation model for galaxy morphology. It has been trained on the GZ Evo dataset — 820k galaxy images with 100M+ volunteer votes from every major Galaxy Zoo campaign (GZ2, GZ Hubble, GZ CANDELS, GZ DECaLS/DESI, and GZ Cosmic Dawn). This breadth means the learned representations transfer exceptionally well to new surveys, instruments, and science questions.

Classification

Finetune for binary or multi-class morphology labels with cross-entropy loss.

Regression

Predict continuous morphology measurements like ellipticity or Sérsic index.

Vote Counts

Train on Galaxy Zoo decision trees with the Dirichlet-Multinomial loss.

Representations

Extract frozen feature vectors for similarity search and anomaly detection.

HuggingFace Models

Auto-download pretrained encoders directly in your training script.

Science Data

Access precomputed morphology catalogs and PCA representations.

Quick Example

Finetune Zoobot to find ringed galaxies using a small labelled dataset:
quickstart.py
import pandas as pd
from galaxy_datasets.pytorch.galaxy_datamodule import CatalogDataModule
from zoobot.pytorch.training import finetune

# CSV with an 'id_str' column (unique ID), a 'file_loc' column (path to image),
# and a 'ring' label column (0 or 1)
labelled_df = pd.read_csv('/your/path/labelled_galaxies.csv')

datamodule = CatalogDataModule(
    label_cols=['ring'],
    catalog=labelled_df,
    batch_size=32
)

# Load pretrained Zoobot encoder from HuggingFace
model = finetune.FinetuneableZoobotClassifier(
    name='hf_hub:mwalmsley/zoobot-encoder-convnext_nano',
    num_classes=2
)

# Finetune to find rings
trainer = finetune.get_trainer(save_dir='./results')
trainer.fit(model, datamodule)

Get Started

1. Install Zoobot
   Install the package with PyTorch support: pip install zoobot[pytorch]

2. Choose a pretrained model
   Browse the pretrained models page and pick an encoder. ConvNeXT-Nano is recommended for most users.

3. Prepare your catalog
   Create a CSV with id_str, file_loc, and your label columns. See the loading data guide.

4. Finetune and predict
   Run the quickstart or follow the finetuning guide for a detailed walkthrough.

Zoobot is deployed on the Euclid space mission pipeline to produce morphology catalogs for the OU-MER data releases. The same pretrained models you use locally power one of the most ambitious galaxy surveys ever conducted.
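
The "Prepare your catalog" step asks for a CSV with id_str, file_loc, and label columns. A minimal sketch of building such a catalog with pandas follows. The build_catalog helper is hypothetical (it is not part of Zoobot), and the galaxy IDs, labels, image directory, and .jpg suffix are made-up assumptions for illustration; adapt them to however your images are actually stored.

```python
from pathlib import Path

import pandas as pd


def build_catalog(image_dir, labels, label_col="ring", suffix=".jpg"):
    """Hypothetical helper: build a catalog DataFrame from {galaxy_id: label}.

    Assumes each image lives at <image_dir>/<galaxy_id><suffix>.
    """
    rows = [
        {
            "id_str": str(galaxy_id),  # unique identifier, stored as a string
            "file_loc": str(Path(image_dir) / f"{galaxy_id}{suffix}"),
            label_col: int(label),  # 0 or 1 for a binary label
        }
        for galaxy_id, label in labels.items()
    ]
    return pd.DataFrame(rows, columns=["id_str", "file_loc", label_col])


# Made-up example labels: 1 = ring, 0 = no ring
catalog = build_catalog(
    "/your/path/images",
    {"J0001": 1, "J0002": 0, "J0003": 0},
)
print(catalog)
```

Saving the result with catalog.to_csv('/your/path/labelled_galaxies.csv', index=False) would give a file that can be read back with pd.read_csv, as in the quick example.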
