Zoobot is a PyTorch deep learning library built for galaxy morphology classification and analysis. Trained on over 107 million volunteer votes collected through the Galaxy Zoo citizen science project, Zoobot’s pretrained encoders capture rich representations of galaxy structure that transfer remarkably well to new tasks, even when you have only a few hundred labelled examples. Whether you need to identify rings, bars, or merging pairs, or to predict continuous morphological quantities, Zoobot gives you a production-quality starting point without training from scratch.
What Problem Does Zoobot Solve?
Labelling galaxy images is expensive. Professional astronomers and citizen scientists can only inspect so many images by hand, yet modern sky surveys like DESI now catalogue tens of millions of galaxies. Zoobot addresses this by providing encoders that have already learned powerful, general-purpose galaxy representations. You supply a small labelled dataset; Zoobot supplies decades of collective morphological knowledge, baked into pretrained weights.
Zoobot 2.0 introduces larger, more capable pretrained models backed by the GZ Evo dataset, a unified collection of 823k galaxy images and 107M volunteer labels. See the Scaling Laws for Galaxy Images paper for details.
How Zoobot Works
Zoobot’s encoders are pretrained on the GZ Evo dataset, which aggregates volunteer classifications from five major Galaxy Zoo campaigns:

| Survey | Approximate Size |
|---|---|
| Galaxy Zoo 2 (GZ2) | ~240k galaxies |
| Galaxy Zoo Hubble (GZ Hubble) | ~100k galaxies |
| Galaxy Zoo CANDELS (GZ CANDELS) | ~50k galaxies |
| Galaxy Zoo DECaLS / DESI (GZD) | ~310k galaxies |
| Galaxy Zoo Cosmic Dawn (HSC H2O) | ~120k galaxies |
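As a quick sanity check, the approximate per-campaign sizes in the table can be summed and compared with the 823k GZ Evo total quoted earlier on this page. The figures below are the rounded values from the table, so only rough agreement is expected; the short dictionary keys are just labels chosen for this sketch.

```python
# Approximate campaign sizes, copied from the table (rounded values).
campaign_sizes = {
    "GZ2": 240_000,
    "GZ Hubble": 100_000,
    "GZ CANDELS": 50_000,
    "GZD": 310_000,
    "HSC H2O": 120_000,
}

quoted_total = 823_000  # GZ Evo size quoted on this page

approx_total = sum(campaign_sizes.values())
relative_gap = abs(approx_total - quoted_total) / quoted_total

print(f"Sum of rounded sizes: {approx_total:,}")
print(f"Relative gap to quoted total: {relative_gap:.2%}")
```

The rounded sizes land within one percent of the quoted total, which is consistent with the table being an approximate breakdown of the same dataset.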
Finetuning Modes
Zoobot exposes three ready-to-use finetuning classes, all inheriting from a common FinetuneableZoobotAbstract base:
FinetuneableZoobotClassifier
Multi-class or binary classification with cross-entropy loss. Ideal for tasks like ring/not-ring detection or morphological type labelling. Reports accuracy during training.
FinetuneableZoobotRegressor
Single-value regression with MSE or MAE loss. Useful for predicting continuous quantities such as Sérsic index, ellipticity, or concentration. Reports RMSE during training.
FinetuneableZoobotTree
Vote-count / decision-tree prediction using the Dirichlet-Multinomial loss introduced in GZ DECaLS. Designed for reproducing full Galaxy Zoo answer distributions.
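To make the Dirichlet-Multinomial loss more concrete, here is a minimal sketch of the negative log-likelihood for a single question, written with the standard library only. It uses the textbook Dirichlet-Multinomial formula; the function name and the example numbers are illustrative assumptions, and Zoobot's own implementation may differ in detail (for example by working on batched tensors and several questions at once).

```python
from math import exp, lgamma


def dirichlet_multinomial_nll(votes, concentrations):
    """Negative log-likelihood of observed vote counts for one question.

    votes: volunteer vote counts per answer, e.g. [1, 3] for a two-answer question.
    concentrations: positive Dirichlet parameters (one per answer), which a
        model would predict for each galaxy.
    """
    if len(votes) != len(concentrations):
        raise ValueError("votes and concentrations must have the same length")
    if any(a <= 0 for a in concentrations):
        raise ValueError("concentrations must be strictly positive")

    total_votes = sum(votes)
    total_conc = sum(concentrations)

    # log of Gamma(A) * Gamma(n + 1) / Gamma(n + A)
    log_prob = (
        lgamma(total_conc) + lgamma(total_votes + 1) - lgamma(total_votes + total_conc)
    )
    # plus the sum over answers of log[Gamma(k + a) / (Gamma(a) * Gamma(k + 1))]
    for k, a in zip(votes, concentrations):
        log_prob += lgamma(k + a) - lgamma(a) - lgamma(k + 1)
    return -log_prob


# With concentrations (1, 1) every split of 4 votes is equally likely,
# so each of the 5 possible outcomes has probability 1/5.
nll = dirichlet_multinomial_nll([1, 3], [1.0, 1.0])
probability = exp(-nll)
print(f"NLL = {nll:.4f}, probability = {probability:.4f}")
```

Minimising this quantity rewards predictions that match the whole vote distribution and not only the majority answer, which is why it suits vote-count targets.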
All three classes accept the same core arguments (learning_rate, layer_decay, weight_decay, head_dropout_prob, training_mode) and can be loaded from a HuggingFace Hub name, a local checkpoint, or an in-memory PyTorch module.
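The interaction between learning_rate and layer_decay is easiest to see with numbers. The sketch below assumes the common layer-wise decay convention, in which the head trains at the base rate and each encoder block further from the head has its rate multiplied by layer_decay once more. Whether Zoobot applies exactly this schedule is an assumption, and the function name, block count, and values are illustrative only.

```python
def layerwise_learning_rates(learning_rate, layer_decay, n_blocks):
    """Return learning rates ordered from the head down to the deepest block.

    Index 0 is the head (base rate); index i is the i-th encoder block
    counted back from the head, trained at learning_rate * layer_decay**i.
    """
    if not 0.0 < layer_decay <= 1.0:
        raise ValueError("layer_decay is expected to lie in (0, 1]")
    return [learning_rate * layer_decay**depth for depth in range(n_blocks + 1)]


# Illustrative values, not recommended defaults.
rates = layerwise_learning_rates(learning_rate=1e-4, layer_decay=0.5, n_blocks=3)
for depth, rate in enumerate(rates):
    label = "head" if depth == 0 else f"block -{depth}"
    print(f"{label:>9}: {rate:.2e}")
```

Under this convention a smaller layer_decay keeps the early, general-purpose layers close to their pretrained values, while a value of 1.0 trains every layer at the same rate.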
Pretrained Architectures
Zoobot 2.0 ships pretrained weights for several modern CNN and transformer architectures, all available on HuggingFace:

| Architecture | Example Hub Name |
|---|---|
| ConvNeXT (nano / tiny / small / base) | hf_hub:mwalmsley/zoobot-encoder-convnext_nano |
| MaxViT | hf_hub:mwalmsley/zoobot-encoder-maxvit_tiny_tf_224 |
| EfficientNetV2 | hf_hub:mwalmsley/zoobot-encoder-efficientnet_b0 |
| ResNet | hf_hub:mwalmsley/zoobot-encoder-resnet50 |
Greyscale variants are also available in the hf_hub:mwalmsley/zoobot-encoder-greyscale-convnext_nano family.
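The Hub names listed here use timm's hf_hub: prefix convention, so the part after the prefix is an ordinary HuggingFace repository id. The helper below is a hypothetical convenience (not part of Zoobot) that splits such a name into owner and repository. The commented-out lines show how a name like this is typically passed to timm.create_model; whether every encoder listed here loads this way without extra arguments is an assumption.

```python
HUB_PREFIX = "hf_hub:"


def split_hub_name(name):
    """Split 'hf_hub:<owner>/<repo>' into (owner, repo)."""
    if not name.startswith(HUB_PREFIX):
        raise ValueError(f"expected a name starting with {HUB_PREFIX!r}")
    repo_id = name[len(HUB_PREFIX):]
    owner, sep, repo = repo_id.partition("/")
    if not sep or not owner or not repo:
        raise ValueError(f"expected '<owner>/<repo>' after the prefix, got {repo_id!r}")
    return owner, repo


owner, repo = split_hub_name("hf_hub:mwalmsley/zoobot-encoder-convnext_nano")
print(owner, repo)

# Typical timm usage (requires timm and network access, so left commented out):
# import timm
# encoder = timm.create_model(
#     "hf_hub:mwalmsley/zoobot-encoder-convnext_nano", pretrained=True, num_classes=0
# )
```

Passing num_classes=0 asks timm for the encoder without a classification head, which is the form needed when attaching a new finetuning head.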
Scientific Impact
Zoobot is not a research prototype; it is actively deployed in production astronomical pipelines:

- GZ DECaLS: classified detailed morphologies for 314,000 galaxies in the DESI Legacy Imaging Surveys
- GZ DESI: scaled to 8.7 million galaxies, one of the largest morphology catalogues ever produced
- Euclid pipeline: Zoobot powers the OU-MER morphology catalogue for ESA’s Euclid space mission (Q1 data, 2025), including strong-lensing discovery, bar fraction measurements, and dwarf galaxy census studies
Where to Go Next
Quickstart
Finetune a pretrained model to find ringed galaxies in under 20 lines of Python.
Pretrained Models
Browse all available encoder architectures and their HuggingFace Hub names.
Finetuning Guide
Deep-dive into finetuning options: training modes, schedulers, and class weights.
API Reference
Full reference for FinetuneableZoobotClassifier, Regressor, Tree, and utilities.