Advanced Finetuning: Custom Tasks and Architectures

Zoobot provides FinetuneableZoobotClassifier, FinetuneableZoobotRegressor, and FinetuneableZoobotTree for the most common finetuning scenarios. But what if your task doesn’t fit neatly into classification, regression, or vote-count prediction? This guide covers three advanced integration patterns:

Using Zoobot’s encoder directly in your own pipeline
Extracting frozen galaxy representations at scale
Subclassing FinetuneableZoobotAbstract to implement a custom head and loss

Using Zoobot’s Encoder Directly

Because Zoobot encoders are standard timm models, you can plug them into any PyTorch pipeline.

Method 1: Via a FinetuneableZoobot Class

Load any FinetuneableZoobot class and access its .encoder attribute:

from zoobot.pytorch.training.finetune import FinetuneableZoobotClassifier

model = FinetuneableZoobotClassifier(name='hf_hub:mwalmsley/zoobot-encoder-convnext_nano')
encoder = model.encoder

Method 2: Via `timm` Directly

Because Zoobot encoders are published as timm-compatible HuggingFace Hub models, you can load them without Zoobot at all:

import timm

encoder = timm.create_model(
    'hf_hub:mwalmsley/zoobot-encoder-convnext_nano',
    pretrained=True,
    num_classes=0  # removes the classification head, returns raw features
)

You can then use encoder like any other timm model — wrap it in a custom head, combine it with other networks, or pass it to a contrastive learning framework.
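As a rough illustration of wrapping an encoder in a custom head, the sketch below attaches a small trainable head to any module that returns pooled features. The `EncoderWithHead` class and the tiny stand-in encoder are hypothetical and exist only so the example runs offline. In practice you would pass the encoder loaded above and read its feature dimension from timm's `num_features` attribute.

```python
import torch
from torch import nn


class EncoderWithHead(nn.Module):
    """Attach a small trainable head to an encoder that returns pooled features."""

    def __init__(self, encoder: nn.Module, encoder_dim: int, n_outputs: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(encoder_dim, n_outputs),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.encoder(x)   # (batch, encoder_dim)
        return self.head(features)   # (batch, n_outputs)


# Hypothetical stand-in encoder so the sketch runs without downloading weights.
# With a real timm encoder, use encoder_dim=encoder.num_features instead.
stand_in_encoder = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),  # (batch, 8)
)

model = EncoderWithHead(stand_in_encoder, encoder_dim=8, n_outputs=2)
model.eval()  # disable dropout for a deterministic forward pass
with torch.no_grad():
    outputs = model(torch.randn(4, 3, 64, 64))
print(outputs.shape)
```

Keeping the encoder and head as separate attributes makes it straightforward to freeze one and train the other, or to give them different learning rates.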

If you only need frozen feature vectors and no finetuning, use the approach in Extracting Frozen Representations below instead. It handles the batching and I/O boilerplate automatically.

Extracting Frozen Representations

Once you have a pretrained or finetuned Zoobot encoder, you can store its output vectors as fixed-dimensional features for downstream tasks like similarity search, anomaly detection, or clustering. Zoobot includes ZoobotEncoder, a LightningModule wrapper around the encoder. With it, you can pass the encoder to the same predict_on_catalog.predict() utility used for full model inference, which handles batching, looping, and file I/O automatically.

from zoobot.pytorch.training.representations import ZoobotEncoder
from zoobot.pytorch.predictions import predict_on_catalog

# Wrap the encoder in a LightningModule
lightning_encoder = ZoobotEncoder.load_from_name(
    'hf_hub:mwalmsley/zoobot-encoder-convnext_nano'
)

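# Run inference exactly as you would with a full Zoobot model.
# catalog, label_cols, save_loc, datamodule_kwargs and trainer_kwargs are
# defined by you, as for any other predict_on_catalog.predict() call.
# label_cols names the saved output columns (for an encoder, one per feature dimension).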
predict_on_catalog.predict(
    catalog,
    lightning_encoder,
    n_samples=1,
    label_cols=label_cols,
    save_loc=save_loc,
    datamodule_kwargs=datamodule_kwargs,
    trainer_kwargs=trainer_kwargs
)

See zoobot/pytorch/examples/representations for a complete working example.
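To make the similarity search use case concrete, here is a minimal sketch that ranks galaxies by cosine similarity to a query galaxy. It assumes the saved representations have already been loaded into a NumPy array of shape (n_galaxies, n_features); how you load them depends on the file written to save_loc. The `most_similar` helper and the synthetic features are illustrative and not part of Zoobot.

```python
import numpy as np


def most_similar(features, query_index, k=5):
    """Return indices of the k rows most similar (cosine) to the query row."""
    # L2-normalise rows so that a dot product equals cosine similarity
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    similarity = unit @ unit[query_index]
    similarity[query_index] = -np.inf  # exclude the query itself
    return np.argsort(similarity)[::-1][:k]


# Synthetic stand-in for real representations: 100 galaxies, 16 features
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))
# Make galaxy 7 a near-duplicate of galaxy 3 so there is a known answer
features[7] = features[3] + 0.01 * rng.normal(size=16)

neighbours = most_similar(features, query_index=3, k=5)
print(neighbours)
```

For catalogs with millions of galaxies, a brute-force comparison like this becomes slow, and an approximate nearest-neighbour index is the usual replacement.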

Dimensionality Reduction

Zoobot representations are high-dimensional (e.g. 1280 features for EfficientNetB0) and, in practice, highly redundant. We recommend using PCA to compress them to a more manageable size (e.g. 40 dimensions) while retaining most of the information. This was the approach taken in the Practical Morphology Tools paper.
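The compression step can be sketched with plain NumPy as below. This is an illustrative PCA via singular value decomposition on synthetic data built to have 40 underlying degrees of freedom. It is not the exact procedure from the paper, and the `pca_compress` helper is hypothetical.

```python
import numpy as np


def pca_compress(features, n_components):
    """Project features onto their top principal components.

    Returns the compressed array and the fraction of variance retained.
    """
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    compressed = centered @ vt[:n_components].T
    variance = singular_values ** 2
    retained = variance[:n_components].sum() / variance.sum()
    return compressed, retained


# Synthetic stand-in: 500 galaxies x 1280 features generated from 40 latent
# factors plus a little noise, mimicking a redundant representation
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 40))
mixing = rng.normal(size=(40, 1280))
features = latent @ mixing + 0.01 * rng.normal(size=(500, 1280))

compressed, retained = pca_compress(features, n_components=40)
print(compressed.shape, retained)
```

On real representations, checking the retained variance for a few values of n_components is a quick way to choose the compressed size. scikit-learn's `sklearn.decomposition.PCA` offers the same operation with a fit/transform interface.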

Pre-calculated representations for all DESI galaxies are available — see the Science Data page. HSC representations are coming soon.

Subclassing FinetuneableZoobotAbstract

For tasks that don’t fit the built-in classes (for example, multi-output regression, ordinal classification, or custom loss functions), you can subclass FinetuneableZoobotAbstract and plug in your own head and loss. Your subclass must:

Set self.head — a torch.nn.Module that maps encoder features to outputs.
Set self.loss — a callable with signature loss(y_pred, y).
Implement batch_to_supervised_tuple(self, batch) — extracts (x, y) from a batch dict.
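As an illustration of the loss(y_pred, y) signature, the sketch below defines a mean squared error for multi-output regression that skips missing labels. The `masked_mse_loss` function and the convention of marking missing labels with NaN are assumptions made for this example, not part of Zoobot.

```python
import torch


def masked_mse_loss(y_pred: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Per-example MSE over several targets, ignoring labels that are NaN."""
    observed = ~torch.isnan(y)
    # Replace NaN labels with zeros so arithmetic stays finite, then mask them out
    y_filled = torch.where(observed, y, torch.zeros_like(y))
    squared_error = (y_pred - y_filled) ** 2 * observed
    n_observed = observed.sum(dim=1).clamp(min=1)  # avoid division by zero
    return squared_error.sum(dim=1) / n_observed


y_pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = torch.tensor([[1.0, 4.0], [float('nan'), 4.0]])
loss = masked_mse_loss(y_pred, y)
print(loss)  # tensor([2., 0.])
```

Any callable with this signature can be assigned to self.loss, whether a plain function like this one or a torch.nn loss module.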

Example: Custom Regression Head

Imagine you want to finetune Zoobot on a regression task with a custom loss. Here’s how you’d implement it:

import torch
from zoobot.pytorch.training.finetune import FinetuneableZoobotAbstract


class FinetuneableZoobotCustomRegression(FinetuneableZoobotAbstract):

    def __init__(
        self,
        foo,
        **super_kwargs
    ):
        super().__init__(**super_kwargs)

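        self.foo = foo  # placeholder for any extra hyperparameter your task needs
        # any callable with signature loss(y_pred, y) works; Huber loss is one option
        self.loss = torch.nn.SmoothL1Loss()
        # map encoder features to a single output per galaxy
        # (assumes the parent class sets self.encoder_dim, as in finetune.py)
        self.head = torch.nn.Sequential(
            torch.nn.Dropout(p=0.5),
            torch.nn.Linear(self.encoder_dim, 1),
            torch.nn.Flatten(start_dim=0)  # (batch, 1) -> (batch,) to match the labels
        )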

    # batch_to_supervised_tuple must be implemented
    def batch_to_supervised_tuple(self, batch):
        return batch['image'], batch['my_label']

# see zoobot/pytorch/training/finetune.py for more examples and all methods required

Once defined, you can train this custom class exactly like any built-in FinetuneableZoobot class:

from zoobot.pytorch.training import finetune

model = FinetuneableZoobotCustomRegression(
    foo='bar',
    name='hf_hub:mwalmsley/zoobot-encoder-convnext_nano',
    learning_rate=1e-4
)

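# save_dir (where checkpoints are written) and datamodule (yielding batches
# with 'image' and 'my_label' keys) are assumed to be defined by you
trainer = finetune.get_trainer(save_dir, accelerator='gpu', max_epochs=100)
trainer.fit(model, datamodule)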

All the inherited machinery (AdamW optimization, layer-wise learning rate decay, early stopping, checkpointing, and scheduler support) works out of the box.

Look at the source code of FinetuneableZoobotClassifier and FinetuneableZoobotRegressor in zoobot/pytorch/training/finetune.py for concrete examples of the full interface you can override.
