Zoobot provides a collection of pretrained encoder weights across a wide range of architectures, from lightweight ResNets to large-scale transformer-hybrid MaxViT models. All weights are hosted on HuggingFace and can be loaded directly into your training workflow with a single line of code. Every model was pretrained on the GZ Evo dataset and learns a rich, transferable representation of galaxy morphology that you can finetune for your own tasks.
Loading Models
The simplest way to load a pretrained Zoobot model is via the finetuning interface. Pass the HuggingFace model ID with the hf_hub: prefix and Zoobot will automatically download the encoder weights:
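A minimal sketch of that call, assuming Zoobot v2 with its PyTorch dependencies is installed. `FinetuneableZoobotClassifier` with its `name` and `num_classes` arguments follows the Zoobot README; the `hub_name` and `build_classifier` helpers are hypothetical and exist only for illustration. The non-greyscale model ID is assumed by analogy with the greyscale naming pattern on this page, so check the exact ID on HuggingFace before relying on it.

```python
def hub_name(architecture: str) -> str:
    """Build a model ID carrying the hf_hub: prefix.

    Assumption: encoders follow the 'mwalmsley/zoobot-encoder-<architecture>'
    naming pattern. Verify the exact ID on HuggingFace.
    """
    return f"hf_hub:mwalmsley/zoobot-encoder-{architecture}"


def build_classifier(architecture: str = "convnext_nano", num_classes: int = 2):
    """Create a finetuneable classifier on top of a pretrained encoder."""
    # Imported lazily so that this module can be imported without Zoobot.
    # The encoder weights are downloaded when the model is constructed.
    from zoobot.pytorch.training import finetune

    return finetune.FinetuneableZoobotClassifier(
        name=hub_name(architecture),
        num_classes=num_classes,
    )
```

Calling `build_classifier()` should return a model with a freshly initialised two-class head on a pretrained ConvNeXT-Nano encoder, ready to be passed to your training loop.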
You can also load the timm encoder directly, for example to extract representations or plug the backbone into a custom architecture:
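A sketch of loading the bare encoder, assuming `timm` is installed and the model ID exists on HuggingFace. `timm.create_model` accepts `hf_hub:` IDs, and `num_classes=0` removes the classifier head so the model returns pooled features. The `load_encoder` helper and its default ID are hypothetical names used for illustration.

```python
HF_PREFIX = "hf_hub:"


def load_encoder(model_id: str = "hf_hub:mwalmsley/zoobot-encoder-convnext_nano"):
    """Load a pretrained encoder through timm, without a classifier head."""
    if not model_id.startswith(HF_PREFIX):
        # Without the prefix, timm would look for a built-in architecture name.
        raise ValueError(f"model_id must start with '{HF_PREFIX}', got '{model_id}'")

    # Imported lazily so the prefix check works even without timm installed.
    import timm

    # pretrained=True downloads the weights from HuggingFace;
    # num_classes=0 makes the forward pass return pooled features.
    return timm.create_model(model_id, pretrained=True, num_classes=0)
```

Passing a batch of images through the returned encoder should give one feature vector per image, which you can store as representations or feed into your own head.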
Available Models
Zoobot includes pretrained weights for the following architectures. Test loss is measured on a held-out split of the GZ Evo dataset; lower is better.

| Architecture | Parameters | Test Loss | HuggingFace |
|---|---|---|---|
| ConvNeXT-Pico | 9.1M | 19.33 | Link |
| ConvNeXT-Nano | 15.6M | 19.23 | Link |
| ConvNeXT-Tiny | 44.6M | 19.08 | Link |
| ConvNeXT-Small | 58.5M | 19.06 | Link |
| ConvNeXT-Base | 88.6M | 19.05 (best) | Link |
| ConvNeXT-Large | 197.8M | 19.09 | Link |
| MaxViT-Tiny | 29.1M | 19.22 | Link |
| MaxViT-Small | 64.9M | 19.20 | Link |
| MaxViT-Base | 124.5M | 19.09 | Link |
| MaxViT-Large | 211.8M | 19.18 | Link |
| EfficientNetB0 | 5.33M | 19.48 | WIP |
| EfficientNetV2-S | 48.3M | 19.33 | WIP |
| ResNet18 | 11.7M | 19.83 | Link |
| ResNet50 | 25.6M | 19.43 | Link |
| ResNet101 | 44.5M | 19.37 | Link |
Missing a model you need? Reach out! There’s a good chance we can train any model supported by timm.
Which Model Should I Use?
For most users, start with ConvNeXT-Nano. It delivers good performance while still being small enough to train on a single gaming GPU, so you can iterate quickly without cluster access.
For maximum performance, consider upgrading to ConvNeXT-Small or ConvNeXT-Base. MaxViT-Base is another excellent option: it achieves competitive results and incorporates an attention mechanism that may be of scientific interest. All of these larger models require cluster-grade GPUs (e.g. V100 or above).
For benchmarks or reproducibility:
- EfficientNetB0 is equivalent to the model used in the original GZ DECaLS and GZ DESI papers.
- ResNet18 and ResNet50 are well-established baselines useful for comparison or as backbones in other frameworks (e.g. object detection).
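The guidance above can be roughly reproduced from the table with a small helper that picks the lowest test loss within a parameter budget. This is an illustrative sketch, not part of Zoobot: `MODELS` and `best_under_budget` are hypothetical names, and using parameter count as a proxy for what fits on your GPU is an assumption. The EfficientNet rows are included although their weights are listed as WIP.

```python
# (parameters in millions, test loss), copied from the Available Models table.
MODELS = {
    "ConvNeXT-Pico": (9.1, 19.33),
    "ConvNeXT-Nano": (15.6, 19.23),
    "ConvNeXT-Tiny": (44.6, 19.08),
    "ConvNeXT-Small": (58.5, 19.06),
    "ConvNeXT-Base": (88.6, 19.05),
    "ConvNeXT-Large": (197.8, 19.09),
    "MaxViT-Tiny": (29.1, 19.22),
    "MaxViT-Small": (64.9, 19.20),
    "MaxViT-Base": (124.5, 19.09),
    "MaxViT-Large": (211.8, 19.18),
    "EfficientNetB0": (5.33, 19.48),
    "EfficientNetV2-S": (48.3, 19.33),
    "ResNet18": (11.7, 19.83),
    "ResNet50": (25.6, 19.43),
    "ResNet101": (44.5, 19.37),
}


def best_under_budget(max_params_m: float) -> str:
    """Return the architecture with the lowest test loss within the budget."""
    candidates = {
        name: loss
        for name, (params_m, loss) in MODELS.items()
        if params_m <= max_params_m
    }
    if not candidates:
        raise ValueError(f"no model has at most {max_params_m}M parameters")
    return min(candidates, key=candidates.get)


print(best_under_budget(20))   # ConvNeXT-Nano
print(best_under_budget(100))  # ConvNeXT-Base
```

The differences in test loss between the larger models are small, so treat the result as a starting point rather than a ranking.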
Greyscale Models
Available from Zoobot v2.0.1 onwards.
Greyscale (single-channel) encoder variants are available at the zoobot-encoders-greyscale HuggingFace collection. Load them with the hf_hub:mwalmsley/zoobot-encoder-greyscale-convnext_nano naming pattern.
Training Data
All Zoobot encoders are trained on the GZ Evo dataset, which aggregates data from every major Galaxy Zoo campaign:
- 820,000 galaxy images
- 100 million+ volunteer classification votes
- Campaigns covered: GZ2, GZ UKIDSS, GZ Hubble, GZ CANDELS, GZ DECaLS/DESI, and GZ Cosmic Dawn (HSC)