OpenCLIP provides tools to easily upload your trained models to the Hugging Face Hub. This makes your models discoverable, shareable, and easy to load for others using the OpenCLIP library.
Overview
The push_to_hf_hub module provides:
- Command-line tool for uploading models
- Python API for programmatic uploads
- Automatic configuration file generation
- Model card creation
- Support for safetensors format
Installation
Ensure you have the required dependencies:
pip install huggingface_hub safetensors
Log in to Hugging Face, for example by running huggingface-cli login.
Alternatively, provide an access token directly instead of logging in.
Command-Line Usage
Use the push_to_hf_hub module as a command-line tool:
python -m open_clip.push_to_hf_hub \
    --model MODEL_NAME \
    --pretrained PRETRAINED_PATH_OR_TAG \
    --repo-id YOUR_HF_USERNAME/MODEL_REPO_NAME
Required Parameters
- --model: Name of the model architecture (e.g., ViT-B-32, ViT-L-14)
- --pretrained: Path to a checkpoint file, or a pretrained tag
- --repo-id: Hugging Face Hub repository ID (format: username/repo-name)
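A quick sanity check on the username/repo-name format can catch typos before an upload starts. The sketch below is only an approximation: `looks_like_repo_id` is a hypothetical helper, and its pattern is a simplified stand-in for the Hub's own validation rules, not a copy of them.

```python
import re

# Simplified shape check for "username/repo-name": each part starts with a
# letter or digit and may contain letters, digits, '.', '_' and '-'.
# Assumption: this is stricter/looser than the Hub's real rules in edge cases.
_REPO_ID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*")


def looks_like_repo_id(repo_id: str) -> bool:
    """Return True if repo_id has the 'username/repo-name' shape."""
    return _REPO_ID_RE.fullmatch(repo_id) is not None


print(looks_like_repo_id("myusername/my-clip-model"))
print(looks_like_repo_id("my-clip-model"))
```

A check like this only confirms the shape of the ID; whether the repository exists or is writable is still decided by the Hub.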
Optional Parameters
- --precision: Model precision (fp32, fp16, bf16); default: fp32
- --image-mean: Override image mean values for preprocessing
- --image-std: Override image std values for preprocessing
- --image-interpolation: Image resize interpolation method (bicubic, bilinear)
- --image-resize-mode: Image resize mode (shortest, longest, squash)
- --hf-tokenizer-self: Make tokenizer config point to the uploaded model itself
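When uploads are scripted, the flags listed above can be assembled into an argument list and passed to subprocess.run. This is a minimal sketch: `build_push_command` is a hypothetical helper, not part of OpenCLIP, and it only emits the flags documented on this page.

```python
import shlex
from typing import List, Optional, Sequence


def build_push_command(
    model: str,
    pretrained: str,
    repo_id: str,
    precision: Optional[str] = None,
    image_mean: Optional[Sequence[float]] = None,
    image_std: Optional[Sequence[float]] = None,
    image_interpolation: Optional[str] = None,
    image_resize_mode: Optional[str] = None,
    hf_tokenizer_self: bool = False,
) -> List[str]:
    """Build the argv list for `python -m open_clip.push_to_hf_hub`."""
    cmd = [
        "python", "-m", "open_clip.push_to_hf_hub",
        "--model", model,
        "--pretrained", pretrained,
        "--repo-id", repo_id,
    ]
    # Optional flags are added only when a value was supplied.
    if precision is not None:
        cmd += ["--precision", precision]
    if image_mean is not None:
        cmd += ["--image-mean", *map(str, image_mean)]
    if image_std is not None:
        cmd += ["--image-std", *map(str, image_std)]
    if image_interpolation is not None:
        cmd += ["--image-interpolation", image_interpolation]
    if image_resize_mode is not None:
        cmd += ["--image-resize-mode", image_resize_mode]
    if hf_tokenizer_self:
        cmd.append("--hf-tokenizer-self")
    return cmd


cmd = build_push_command(
    "ViT-L-14",
    "/path/to/checkpoint.pt",
    "myusername/vitl14-custom",
    image_mean=(0.5, 0.5, 0.5),
    image_std=(0.5, 0.5, 0.5),
    image_interpolation="bicubic",
)
# Print a shell-quoted version for inspection before running it.
print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell-quoting problems when a checkpoint path contains spaces.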
Examples
Example 1: Upload Trained Model
Upload a model you trained locally:
python -m open_clip.push_to_hf_hub \
    --model ViT-B-32 \
    --pretrained /path/to/checkpoints/epoch_32.pt \
    --repo-id myusername/my-clip-model
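A checkpoint directory often holds several files named like epoch_32.pt. The sketch below picks the one with the highest epoch number; `latest_checkpoint` is a hypothetical helper, and the highest epoch is assumed to be the one you want, which is not necessarily the best-performing checkpoint.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

# Matches files named epoch_<number>.pt and captures the number.
_EPOCH_RE = re.compile(r"epoch_(\d+)\.pt")


def latest_checkpoint(checkpoint_dir: Path) -> Optional[Path]:
    """Return the epoch_N.pt file with the largest N, or None if there is none."""
    best_path = None
    best_epoch = -1
    for path in checkpoint_dir.iterdir():
        match = _EPOCH_RE.fullmatch(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_path = path
    return best_path


# Demonstration on a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp)
    for epoch in (1, 9, 10, 32):
        (ckpt_dir / f"epoch_{epoch}.pt").write_bytes(b"")
    (ckpt_dir / "epoch_latest.pt").write_bytes(b"")  # ignored: no epoch number
    chosen_name = latest_checkpoint(ckpt_dir).name

print(chosen_name)
```

Epochs are compared as integers, so epoch_10.pt sorts after epoch_9.pt, which a plain alphabetical sort would get wrong.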
Example 2: Upload with Custom Preprocessing
Upload a model with custom preprocessing parameters:
python -m open_clip.push_to_hf_hub \
    --model ViT-L-14 \
    --pretrained /path/to/checkpoint.pt \
    --repo-id myusername/vitl14-custom \
    --image-mean 0.5 0.5 0.5 \
    --image-std 0.5 0.5 0.5 \
    --image-interpolation bicubic
Example 3: Re-upload Existing Model
Re-upload an existing OpenCLIP pretrained model under your own Hub account:
python -m open_clip.push_to_hf_hub \
    --model convnext_large_d_320 \
    --pretrained laion2b_s29b_b131k_ft \
    --repo-id myusername/CLIP-convnext_large_d_320
This example comes from the README. It uploads a ConvNeXt model trained on LAION-2B.
Example 4: Upload with Self-Referencing Tokenizer
Upload a model with a custom tokenizer whose configuration references the uploaded repository itself:
python -m open_clip.push_to_hf_hub \
    --model roberta-ViT-B-32 \
    --pretrained /path/to/checkpoint.pt \
    --repo-id myusername/roberta-clip \
    --hf-tokenizer-self
The --hf-tokenizer-self flag makes the tokenizer configuration point to the uploaded model repository instead of the original tokenizer source.
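The effect of the flag can be pictured as a small change to the model configuration. The sketch below assumes the tokenizer source is stored under text_cfg['hf_tokenizer_name'] and uses illustrative config values; `point_tokenizer_at_repo` is a hypothetical helper, not an OpenCLIP function.

```python
import copy
from typing import Any, Dict


def point_tokenizer_at_repo(model_config: Dict[str, Any], repo_id: str) -> Dict[str, Any]:
    """Return a copy of model_config whose tokenizer source is repo_id."""
    updated = copy.deepcopy(model_config)  # leave the caller's config untouched
    # Assumption: the tokenizer source lives under text_cfg['hf_tokenizer_name'].
    updated.setdefault("text_cfg", {})["hf_tokenizer_name"] = repo_id
    return updated


# Illustrative configuration values, not copied from a real model config.
original = {
    "embed_dim": 512,
    "text_cfg": {
        "hf_model_name": "roberta-base",
        "hf_tokenizer_name": "roberta-base",
    },
}
updated = point_tokenizer_at_repo(original, "myusername/roberta-clip")
print(updated["text_cfg"]["hf_tokenizer_name"])
```

With this change, loading the tokenizer no longer depends on the original tokenizer repository staying available.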
Python API
You can also upload models programmatically:
Basic Upload
from open_clip.push_to_hf_hub import push_pretrained_to_hf_hub
push_pretrained_to_hf_hub(
    model_name='ViT-B-32',
    pretrained='/path/to/checkpoint.pt',
    repo_id='myusername/my-clip-model',
    commit_message='Upload trained CLIP model',
)
Upload with Custom Configuration
from open_clip.push_to_hf_hub import push_pretrained_to_hf_hub
push_pretrained_to_hf_hub(
    model_name='ViT-L-14',
    pretrained='/path/to/epoch_32.pt',
    repo_id='myusername/vitl14-laion400m',
    precision='fp16',
    image_mean=(0.48145466, 0.4578275, 0.40821073),
    image_std=(0.26862954, 0.26130258, 0.27577711),
    image_interpolation='bicubic',
    commit_message='Add ViT-L/14 trained on LAION-400M',
    private=False,
)
Upload with Model Card
from open_clip.push_to_hf_hub import push_pretrained_to_hf_hub
model_card = {
    'description': 'CLIP ViT-B/32 trained on CC12M dataset',
    'details': {
        'Dataset': 'CC12M',
        'Architecture': 'ViT-B/32',
        'Training samples': '12M',
        'Epochs': '32',
    },
    'usage': """
## Usage
```python
import open_clip
import torch
from PIL import Image
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:myusername/my-clip-model'
)
model.eval()
image = preprocess(Image.open('image.jpg')).unsqueeze(0)
text = open_clip.tokenize(['a photo of a cat', 'a photo of a dog'])
with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
# ... rest of inference code
```
""",
    'license': 'mit',
}
push_pretrained_to_hf_hub(
    model_name='ViT-B-32',
    pretrained='/path/to/checkpoint.pt',
    repo_id='myusername/my-clip-model',
    model_card=model_card,
)
Advanced Usage
Save Model Locally First
You can save the model files locally before uploading:
import open_clip
from open_clip.push_to_hf_hub import save_for_hf
from pathlib import Path
# Load model
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-B-32',
    pretrained='/path/to/checkpoint.pt'
)
# Get model config and tokenizer
model_config = open_clip.get_model_config('ViT-B-32')
tokenizer = open_clip.get_tokenizer('ViT-B-32')
# Save to local directory
save_directory = Path('./model_for_hub')
save_for_hf(
    model=model,
    tokenizer=tokenizer,
    model_config=model_config,
    save_directory=save_directory,
    safe_serialization='both',  # Save both .safetensors and .bin
)
print(f'Model saved to {save_directory}')
print('Files:', list(save_directory.glob('*')))
Manual Upload with Custom Files
from huggingface_hub import HfApi
from pathlib import Path
api = HfApi()
# Upload entire directory
api.upload_folder(
    folder_path='./model_for_hub',
    repo_id='myusername/my-clip-model',
    repo_type='model',
    commit_message='Upload CLIP model'
)
What Gets Uploaded
When you push a model to the Hub, the following files are created:
Model Weights
- open_clip_pytorch_model.bin: PyTorch weights (pickle format)
- open_clip_model.safetensors: SafeTensors weights (recommended)
Configuration
open_clip_config.json: Model architecture and preprocessing configuration
{
  "model_cfg": {
    "embed_dim": 512,
    "vision_cfg": {...},
    "text_cfg": {...}
  },
  "preprocess_cfg": {
    "mean": [0.48145466, 0.4578275, 0.40821073],
    "std": [0.26862954, 0.26130258, 0.27577711],
    "interpolation": "bicubic",
    "resize_mode": "shortest"
  }
}
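To make the preprocess_cfg values concrete, the sketch below reads the mean and std from a config like the one above and applies the usual per-channel normalization, (value - mean) / std, to one RGB pixel scaled to [0, 1]. The helper `normalize_pixel` is hypothetical and only illustrates the arithmetic; the model_cfg section is omitted because its contents are elided above.

```python
import json

# Only the preprocess_cfg section of open_clip_config.json is reproduced here.
config_text = """
{
  "preprocess_cfg": {
    "mean": [0.48145466, 0.4578275, 0.40821073],
    "std": [0.26862954, 0.26130258, 0.27577711],
    "interpolation": "bicubic",
    "resize_mode": "shortest"
  }
}
"""

preprocess_cfg = json.loads(config_text)["preprocess_cfg"]


def normalize_pixel(rgb, mean, std):
    """Apply (value - mean) / std to each channel of an RGB pixel in [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A mid-gray pixel ends up slightly above zero in every channel.
normalized = normalize_pixel(
    [0.5, 0.5, 0.5], preprocess_cfg["mean"], preprocess_cfg["std"]
)
print([round(x, 4) for x in normalized])
```

Overriding --image-mean or --image-std changes these stored values, so anyone loading the model normalizes inputs the same way the model was trained.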
Tokenizer Files
- tokenizer_config.json: Tokenizer configuration
- vocab.json, merges.txt: Tokenizer vocabulary (for BPE tokenizers)
- Other tokenizer-specific files
Model Card
README.md: Automatically generated model card with metadata
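Before pushing a locally saved directory, it can help to confirm that the core files listed above are present. A minimal sketch, assuming a hypothetical helper `missing_hub_files` that requires the config file plus at least one of the two weight formats (tokenizer files and README.md are not checked here):

```python
import tempfile
from pathlib import Path
from typing import List

WEIGHT_FILES = ("open_clip_model.safetensors", "open_clip_pytorch_model.bin")
CONFIG_FILE = "open_clip_config.json"


def missing_hub_files(directory: Path) -> List[str]:
    """List the core files that are absent from a saved model directory."""
    missing = []
    # Either weight format is enough; both may be present.
    if not any((directory / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    if not (directory / CONFIG_FILE).is_file():
        missing.append(CONFIG_FILE)
    return missing


# Demonstration on a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / CONFIG_FILE).write_text("{}")
    report_without_weights = missing_hub_files(demo_dir)
    (demo_dir / "open_clip_model.safetensors").write_bytes(b"")
    report_complete = missing_hub_files(demo_dir)

print(report_without_weights)
print(report_complete)
```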
Loading Uploaded Models
Once uploaded, anyone can load your model:
import open_clip
# Load from Hub
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:myusername/my-clip-model'
)
# Get tokenizer
tokenizer = open_clip.get_tokenizer('hf-hub:myusername/my-clip-model')
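The model strings on this page follow a prefix:location pattern, with hf-hub: for Hub repositories and local-dir: for local directories. The sketch below only illustrates that naming convention; `split_model_spec` is a hypothetical helper and does not reproduce how OpenCLIP itself parses these strings.

```python
from typing import Tuple

# Prefixes that appear on this page; other schemes are not covered here.
KNOWN_PREFIXES = ("hf-hub", "local-dir")


def split_model_spec(spec: str) -> Tuple[str, str]:
    """Split 'prefix:location' into (prefix, location).

    Strings without a known prefix are returned as ('', spec), which
    corresponds to a plain architecture name such as 'ViT-B-32'.
    """
    prefix, sep, rest = spec.partition(":")
    if sep and prefix in KNOWN_PREFIXES:
        return prefix, rest
    return "", spec


print(split_model_spec("hf-hub:myusername/my-clip-model"))
print(split_model_spec("local-dir:./model_for_hub"))
print(split_model_spec("ViT-B-32"))
```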
See the Loading Models guide for more details.
Model Card Customization
Create comprehensive model cards for better documentation:
model_card = {
    'description': 'Detailed description of your model',
    'details': {
        'Model Type': 'Contrastive Vision-Language Model',
        'Architecture': 'ViT-B/32',
        'Dataset': 'Custom dataset description',
        'Training Samples': '10M image-text pairs',
        'Training Duration': '7 days on 8x A100',
        'Preprocessing': 'Standard CLIP preprocessing',
    },
    'usage': 'Code examples...',
    'comparison': 'Performance comparison with other models...',
    'license': 'mit',
    'citation': r"""
@software{my_clip_model,
  title={My CLIP Model},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/myusername/my-clip-model}
}
""",
}
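To preview how such a dictionary might read as a README, the sketch below renders it to Markdown with a YAML metadata header. `render_model_card` is a hypothetical helper, and its layout (section titles, bullet style) is an assumption; the README generated by OpenCLIP may be laid out differently.

```python
from typing import Any, Dict, List


def render_model_card(card: Dict[str, Any], title: str) -> str:
    """Render a model_card dict as Markdown with a YAML metadata header."""
    lines: List[str] = ["---"]
    if card.get("tags"):
        lines.append("tags:")
        lines.extend(f"- {tag}" for tag in card["tags"])
    if card.get("license"):
        lines.append(f"license: {card['license']}")
    lines += ["---", f"# {title}"]
    if card.get("description"):
        lines += ["", card["description"]]
    if card.get("details"):
        lines += ["", "## Model Details"]
        lines.extend(f"- **{key}:** {value}" for key, value in card["details"].items())
    # Free-form sections are emitted in a fixed order when present.
    for key, heading in (("usage", "Usage"), ("comparison", "Comparison"), ("citation", "Citation")):
        if card.get(key):
            lines += ["", f"## {heading}", "", card[key].strip()]
    return "\n".join(lines) + "\n"


example_card = {
    "description": "CLIP ViT-B/32 trained on CC12M dataset",
    "details": {"Dataset": "CC12M", "Epochs": "32"},
    "license": "mit",
    "tags": ["clip", "zero-shot"],
}
readme = render_model_card(example_card, "my-clip-model")
print(readme)
```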
Best Practices
- Use Descriptive Repo Names
  myusername/CLIP-ViT-B-32-CC12M-32epochs
  myusername/CLIP-convnext-large-LAION400M
- Include Training Information
  - Dataset name and size
  - Training duration
  - Key hyperparameters
  - Performance metrics
- Provide Usage Examples
  - Include code snippets in model card
  - Show both inference and fine-tuning
  - Document any special requirements
- Use SafeTensors Format
  safe_serialization='both'  # Upload both formats
- Version Your Models
  - Use tags or branches for different versions
  - Document changes between versions
- Test Before Uploading
  # Test loading locally saved model
  model, _, preprocess = open_clip.create_model_and_transforms(
      'local-dir:./model_for_hub'
  )
- Add Relevant Tags
  model_card = {
      'tags': ['clip', 'vision', 'text', 'multimodal', 'zero-shot'],
      ...
  }
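The naming pattern from the first practice above can be generated rather than typed by hand, which keeps repository names consistent across a series of runs. A minimal sketch, assuming a hypothetical helper `make_repo_id` and the CLIP-architecture-dataset-epochs pattern shown in the examples:

```python
from typing import Optional


def make_repo_id(
    username: str,
    architecture: str,
    dataset: str,
    epochs: Optional[int] = None,
) -> str:
    """Build 'username/CLIP-<architecture>-<dataset>[-<N>epochs]'."""
    parts = ["CLIP", architecture, dataset]
    if epochs is not None:
        parts.append(f"{epochs}epochs")
    return f"{username}/" + "-".join(parts)


print(make_repo_id("myusername", "ViT-B-32", "CC12M", epochs=32))
print(make_repo_id("myusername", "convnext-large", "LAION400M"))
```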
Troubleshooting
Authentication Error
HTTPError: 401 Client Error: Unauthorized
Solution: Log in to the Hugging Face Hub (for example, by running huggingface-cli login) and retry the upload.
Repository Already Exists
HTTPError: 409 Client Error: Conflict
Solution: The repository name is already taken. Choose a different name or use your existing repo.
Large File Upload Issues
OSError: File is too large
Solution: Ensure git-lfs is installed and initialized (the commands below assume a Debian or Ubuntu system):
sudo apt-get install git-lfs
git lfs install
Missing Configuration
RuntimeError: Model config not found
Solution: Ensure the model name is correct and the config exists:
import open_clip
print(open_clip.list_models()) # Check available models
Tokenizer Issues
ValueError: Tokenizer type not recognized
Solution: For custom tokenizers, ensure the tokenizer files are included or use --hf-tokenizer-self.
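In an automated upload script, the error messages above can be mapped to their suggested fixes so that a failure log points straight at the remedy. A minimal sketch, assuming a hypothetical helper `suggest_fix` that matches on substrings of the messages shown in this section:

```python
# (substring of the error message, suggested fix from this section)
HINTS = (
    ("401", "Log in to the Hugging Face Hub."),
    ("409", "Choose a different repository name or use your existing repo."),
    ("File is too large", "Ensure git-lfs is installed."),
    ("Model config not found", "Check the model name with open_clip.list_models()."),
    ("Tokenizer type not recognized", "Include the tokenizer files or use --hf-tokenizer-self."),
)


def suggest_fix(error_message: str) -> str:
    """Return the first matching hint, or a fallback if none matches."""
    for needle, hint in HINTS:
        if needle in error_message:
            return hint
    return "No known hint for this error."


print(suggest_fix("HTTPError: 401 Client Error: Unauthorized"))
print(suggest_fix("ValueError: Tokenizer type not recognized"))
```

Substring matching is deliberately loose here; real error text may differ between library versions, so the fallback branch matters.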
Example Workflow
Complete workflow from training to Hub upload:
#!/bin/bash
# 1. Train model
python -m open_clip_train.main \
    --model ViT-B-32 \
    --train-data "/data/train.tar" \
    --batch-size 256 \
    --epochs 32 \
    --logs ./logs \
    --name my-clip-training
# 2. Find best checkpoint
ls -lh ./logs/my-clip-training/checkpoints/
# 3. Upload to Hub
python -m open_clip.push_to_hf_hub \
    --model ViT-B-32 \
    --pretrained ./logs/my-clip-training/checkpoints/epoch_32.pt \
    --repo-id myusername/clip-vitb32-custom
# 4. Test loading from Hub
python -c "
import open_clip
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:myusername/clip-vitb32-custom'
)
print('Successfully loaded model from Hub!')
"