Deploy CVAT on Amazon Web Services (AWS) to leverage cloud infrastructure, GPU instances for serverless auto-annotation, and managed services for production deployments.
Deployment Options
CVAT can be deployed on AWS in multiple ways:
- EC2 with Docker Compose: Simple deployment on a single instance
- EC2 with GPU (P3 instances): For auto-annotation with TensorFlow models
- Amazon EKS: Production Kubernetes deployment
- Hybrid: EKS with managed AWS services (RDS, ElastiCache, EFS)
EC2 Deployment
Prerequisites
- AWS account with appropriate permissions
- AWS CLI installed and configured
- SSH key pair created in your AWS region
1. Launch EC2 Instance
Instance Types:
| Use Case | Instance Type | vCPU | RAM | Storage |
|---|---|---|---|---|
| Development/Testing | t3.large | 2 | 8GB | 50GB |
| Small Production | t3.xlarge | 4 | 16GB | 100GB |
| Medium Production | m5.2xlarge | 8 | 32GB | 200GB |
| With GPU (Auto-annotation) | p3.2xlarge | 8 | 61GB | 200GB |
Launch using AWS CLI:
# Set variables
REGION=us-east-1
INSTANCE_TYPE=t3.xlarge
KEY_NAME=your-key-pair
SECURITY_GROUP=sg-xxxxxxxxx
SUBNET=subnet-xxxxxxxxx
# Launch instance
aws ec2 run-instances \
--image-id $(aws ec2 describe-images \
--owners 099720109477 \
--filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*" \
--query 'sort_by(Images, &CreationDate)[-1].ImageId' \
--output text \
--region $REGION) \
--instance-type $INSTANCE_TYPE \
--key-name $KEY_NAME \
--security-group-ids $SECURITY_GROUP \
--subnet-id $SUBNET \
--block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=100,VolumeType=gp3}' \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=cvat-server}]' \
--region $REGION
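The launch command returns before the instance is usable, and later steps need its instance ID. A minimal sketch of one way to wait for it, assuming the `cvat-server` Name tag used above; the function name `wait_for_cvat_instance` is illustrative and not part of the AWS CLI or CVAT:

```shell
# Hypothetical helper (not part of the AWS CLI): find the instance by its
# Name tag, wait until it is running, and print its instance ID.
wait_for_cvat_instance() {
  local region="$1"
  local name="${2:-cvat-server}"
  local instance_id
  instance_id=$(aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=${name}" \
              "Name=instance-state-name,Values=pending,running" \
    --query 'Reservations[0].Instances[0].InstanceId' \
    --output text \
    --region "$region") || return 1
  # With --output text the CLI prints "None" when the query matches nothing.
  if [ -z "$instance_id" ] || [ "$instance_id" = "None" ]; then
    echo "no pending or running instance tagged ${name}" >&2
    return 1
  fi
  aws ec2 wait instance-running \
    --instance-ids "$instance_id" --region "$region" || return 1
  echo "$instance_id"
}

# Usage: INSTANCE_ID=$(wait_for_cvat_instance "$REGION")
```

Capturing the result as `INSTANCE_ID` also provides the variable that the Elastic IP commands later in this guide expect.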
2. Configure Security Group
Allow incoming traffic on the required ports.
Using AWS CLI:
# Create security group (add --vpc-id if you are not using the default VPC)
SG_ID=$(aws ec2 create-security-group \
  --group-name cvat-sg \
  --description "CVAT security group" \
  --query 'GroupId' \
  --region $REGION \
  --output text)
# Allow SSH
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 22 \
--cidr 0.0.0.0/0 \
--region $REGION
# Allow HTTP
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0 \
--region $REGION
# Allow HTTPS
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 443 \
--cidr 0.0.0.0/0 \
--region $REGION
# Allow CVAT port (if not using 80/443)
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp \
--port 8080 \
--cidr 0.0.0.0/0 \
--region $REGION
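Opening SSH to 0.0.0.0/0 exposes port 22 to the whole internet. One possible alternative is to allow only your current public address. The sketch below assumes `https://checkip.amazonaws.com` is reachable from your workstation; `to_host_cidr` is a hypothetical helper name:

```shell
# Hypothetical helper: turn a dotted-quad IPv4 address into a single-host CIDR block.
to_host_cidr() {
  local ip="$1"
  # Accept only four dot-separated groups of 1-3 digits; reject anything else.
  if printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    printf '%s/32\n' "$ip"
  else
    echo "not an IPv4 address: $ip" >&2
    return 1
  fi
}

# Usage (needs network access plus SG_ID and REGION from the commands above):
#   MY_CIDR=$(to_host_cidr "$(curl -s https://checkip.amazonaws.com)")
#   aws ec2 authorize-security-group-ingress --group-id $SG_ID \
#     --protocol tcp --port 22 --cidr "$MY_CIDR" --region $REGION
```

The format check is deliberately loose (it does not reject octets above 255); the AWS CLI rejects an invalid CIDR anyway.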
Using AWS Console:
- Navigate to EC2 → Security Groups
- Create security group with inbound rules:
- SSH (22): Your IP or 0.0.0.0/0
- HTTP (80): 0.0.0.0/0
- HTTPS (443): 0.0.0.0/0
- Custom TCP (8080): 0.0.0.0/0 (if needed)
3. Connect and Install Docker
# Get instance public IP
INSTANCE_IP=$(aws ec2 describe-instances \
--filters "Name=tag:Name,Values=cvat-server" "Name=instance-state-name,Values=running" \
--query 'Reservations[0].Instances[0].PublicIpAddress' \
--output text \
--region $REGION)
# SSH into instance
ssh -i ~/.ssh/${KEY_NAME}.pem ubuntu@${INSTANCE_IP}
Once connected, install Docker:
# Update packages
sudo apt-get update
sudo apt-get upgrade -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add user to docker group
sudo usermod -aG docker ubuntu
newgrp docker
# Verify installation
docker --version
docker compose version
4. Deploy CVAT
# Clone repository
git clone https://github.com/cvat-ai/cvat
cd cvat
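# Request an IMDSv2 session token (also works on instances that still allow IMDSv1)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
# Get public hostname
CVAT_HOST=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/public-hostname)
# Or use public IP
# CVAT_HOST=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
#   http://169.254.169.254/latest/meta-data/public-ipv4)
# Export hostname
export CVAT_HOST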
# Start CVAT
docker compose pull
docker compose up -d
# Create superuser
docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
5. Access CVAT
# Get your access URL
echo "Access CVAT at: http://${CVAT_HOST}:8080"
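The containers can take a while to become responsive after `docker compose up -d`. A rough sketch of a readiness poll, assuming CVAT answers on port 8080 as in the URL above; `wait_for_http` is a hypothetical helper, and the attempt count and delay are arbitrary defaults:

```shell
# Hypothetical helper: poll a URL until curl succeeds or the attempts run out.
wait_for_http() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx responses; -o discards the body.
    if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      echo "ready after ${i} attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "gave up after ${attempts} attempts" >&2
  return 1
}

# Usage: wait_for_http "http://${CVAT_HOST}:8080/" 30 10
```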
For production with SSL:
# Install Certbot (outside Docker) or use docker-compose.https.yml
export ACME_EMAIL=admin@example.com
export CVAT_HOST=cvat.yourdomain.com
# Point your domain to instance IP first
# Then deploy with HTTPS
docker compose -f docker-compose.yml -f docker-compose.https.yml up -d
GPU Instance Deployment (P3)
For Auto-Annotation with TensorFlow
P3 instances provide NVIDIA GPUs for running deep learning models.
1. Launch P3 Instance
# P3 instances available:
# p3.2xlarge: 1 GPU (V100), 8 vCPUs, 61GB RAM
# p3.8xlarge: 4 GPUs (V100), 32 vCPUs, 244GB RAM
# p3.16xlarge: 8 GPUs (V100), 64 vCPUs, 488GB RAM
aws ec2 run-instances \
--image-id ami-xxxxxxxxx \
--instance-type p3.2xlarge \
--key-name $KEY_NAME \
--security-group-ids $SG_ID \
--block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=200,VolumeType=gp3}' \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=cvat-gpu}]' \
--region $REGION
2. Install NVIDIA Drivers
# SSH into the P3 instance (INSTANCE_IP must hold the GPU instance's public IP, not the earlier server's)
ssh -i ~/.ssh/${KEY_NAME}.pem ubuntu@${INSTANCE_IP}
# Update system
sudo apt-get update
sudo apt-get upgrade -y
# Install NVIDIA drivers
sudo apt-get install -y linux-headers-$(uname -r)
sudo apt-get install -y nvidia-driver-535
# Reboot
sudo reboot
After reboot, reconnect and verify:
# Check NVIDIA driver
nvidia-smi
# Install Docker (if not already installed)
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
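# Install NVIDIA Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker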
# Test GPU in Docker
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
4. Deploy CVAT with Serverless Functions
git clone https://github.com/cvat-ai/cvat
cd cvat
export CVAT_HOST=$(curl -s http://169.254.169.254/latest/meta-data/public-hostname)
# Enable Nuclio for serverless functions with the override file shipped in the repository
docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d
# Deploy serverless functions
cd serverless
# Follow serverless deployment guide
EKS Deployment
Prerequisites
- eksctl installed
- kubectl installed
- AWS CLI configured
1. Create EKS Cluster
# Create cluster configuration
cat > cvat-cluster.yaml <<EOF
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cvat-cluster
  region: us-east-1
  version: "1.28"
managedNodeGroups:
  - name: cvat-nodes
    instanceType: m5.xlarge
    desiredCapacity: 3
    minSize: 2
    maxSize: 5
    volumeSize: 100
    ssh:
      allow: true
      publicKeyName: your-key-pair
    tags:
      nodegroup-role: worker
    iam:
      withAddonPolicies:
        ebs: true
        efs: true
        albIngress: true
EOF
# Create cluster
eksctl create cluster -f cvat-cluster.yaml
Cluster creation takes 15-20 minutes.
2. Configure kubectl
# Update kubeconfig
aws eks update-kubeconfig --region us-east-1 --name cvat-cluster
# Verify
kubectl get nodes
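Before installing anything on the cluster, it can help to confirm that all worker nodes have joined and report `Ready` (the node group above requests three). A small sketch that parses the default `kubectl get nodes` columns; `count_ready_nodes` is a hypothetical helper name:

```shell
# Hypothetical helper: count nodes whose STATUS column is exactly "Ready".
# Nodes reported as NotReady or Ready,SchedulingDisabled are not counted.
count_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage:
#   count_ready_nodes
# To block until every node is Ready instead of counting:
#   kubectl wait --for=condition=Ready nodes --all --timeout=300s
```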
3. Install Storage Driver
# Install EBS CSI driver
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.25"
# Create storage class
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cvat-storage
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF
4. Install CVAT with Helm
# Add Helm repo
helm repo add cvat https://cvat-ai.github.io/cvat/
helm repo update
# Create namespace
kubectl create namespace cvat
# Create values file
cat > cvat-eks-values.yaml <<EOF
cvat:
  backend:
    defaultStorage:
      storageClassName: cvat-storage
      size: 100Gi
  kvrocks:
    defaultStorage:
      storageClassName: cvat-storage
      size: 200Gi
postgresql:
  primary:
    persistence:
      storageClass: cvat-storage
      size: 50Gi
ingress:
  enabled: true
  hostname: cvat.example.com
  className: alb
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
EOF
# Install
helm install cvat cvat/cvat -n cvat -f cvat-eks-values.yaml
# Install AWS Load Balancer Controller
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=cvat-cluster
# Get load balancer URL
kubectl get ingress -n cvat
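The ADDRESS column stays empty until the load balancer has been provisioned. A minimal sketch for extracting the hostname in scripts, assuming the first Ingress in the namespace is the CVAT one; `cvat_ingress_hostname` is a hypothetical helper name:

```shell
# Hypothetical helper: print the load balancer hostname of the first Ingress
# in the namespace, or fail if none has been assigned yet.
cvat_ingress_hostname() {
  local namespace="${1:-cvat}"
  local host
  host=$(kubectl get ingress -n "$namespace" \
    -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}') || return 1
  if [ -z "$host" ]; then
    echo "ingress in ${namespace} has no load balancer hostname yet" >&2
    return 1
  fi
  echo "$host"
}

# Usage: ALB_HOST=$(cvat_ingress_hostname cvat)
```

The returned hostname is what a DNS record for `cvat.example.com` would point at.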
AWS Managed Services Integration
Using Amazon RDS for PostgreSQL
1. Create RDS Instance:
aws rds create-db-instance \
--db-instance-identifier cvat-db \
--db-instance-class db.t3.medium \
--engine postgres \
--engine-version 15.4 \
--master-username cvat \
--master-user-password YourSecurePassword \
--allocated-storage 100 \
--storage-type gp3 \
--vpc-security-group-ids $SG_ID \
--backup-retention-period 7 \
--region $REGION
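The database endpoint only exists once the instance has finished provisioning. A sketch of how the hostname for the configuration step could be retrieved, using the `cvat-db` identifier from the command above; `rds_endpoint` is a hypothetical helper name:

```shell
# Hypothetical helper: wait until the RDS instance is available, then print
# the endpoint hostname to use as the PostgreSQL host.
rds_endpoint() {
  local db_id="$1"
  local region="$2"
  aws rds wait db-instance-available \
    --db-instance-identifier "$db_id" --region "$region" || return 1
  aws rds describe-db-instances \
    --db-instance-identifier "$db_id" \
    --query 'DBInstances[0].Endpoint.Address' \
    --output text \
    --region "$region"
}

# Usage: DB_HOST=$(rds_endpoint cvat-db "$REGION")
```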
2. Configure CVAT:
# Docker Compose
services:
  cvat_server:
    environment:
      CVAT_POSTGRES_HOST: cvat-db.xxxx.us-east-1.rds.amazonaws.com
      CVAT_POSTGRES_PORT: 5432
      CVAT_POSTGRES_USER: cvat
      CVAT_POSTGRES_PASSWORD: YourSecurePassword
      CVAT_POSTGRES_DBNAME: cvat
# Kubernetes Helm values
postgresql:
  enabled: false
  external:
    host: cvat-db.xxxx.us-east-1.rds.amazonaws.com
    port: 5432
  auth:
    username: cvat
    database: cvat
    password: YourSecurePassword
Using Amazon ElastiCache for Redis
1. Create ElastiCache Cluster:
aws elasticache create-cache-cluster \
--cache-cluster-id cvat-redis \
--engine redis \
--engine-version 7.0 \
--cache-node-type cache.t3.medium \
--num-cache-nodes 1 \
--security-group-ids $SG_ID \
--region $REGION
2. Configure CVAT:
redis:
  enabled: false
  external:
    host: cvat-redis.xxxx.cache.amazonaws.com
  auth:
    password: "" # Configure if AUTH enabled
Using Amazon EFS for Shared Storage
1. Create EFS:
EFS_ID=$(aws efs create-file-system \
--region $REGION \
--performance-mode generalPurpose \
--throughput-mode bursting \
--encrypted \
--tags Key=Name,Value=cvat-efs \
--query 'FileSystemId' \
--output text)
# Create mount targets
for subnet in $SUBNET1 $SUBNET2 $SUBNET3; do
aws efs create-mount-target \
--file-system-id $EFS_ID \
--subnet-id $subnet \
--security-groups $SG_ID \
--region $REGION
done
2. Install EFS CSI Driver in EKS:
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.7"
3. Create StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-xxxxxxxxx
  directoryPerms: "700"
4. Use in CVAT:
cvat:
  backend:
    defaultStorage:
      storageClassName: efs-sc
      accessModes:
        - ReadWriteMany
Using Amazon S3 for Storage
Configure CVAT to use S3 for dataset storage:
cvat:
  backend:
    additionalEnv:
      - name: AWS_S3_BUCKET_NAME
        value: cvat-datasets
      - name: AWS_S3_REGION
        value: us-east-1
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: access-key-id
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: secret-access-key
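The values above read the key pair from a Kubernetes Secret named `aws-credentials`, which has to exist in the release namespace before the pods start. A sketch of one way to create it from the current shell environment; `create_aws_credentials_secret` is a hypothetical helper name, and the `cvat` namespace default is an assumption taken from the Helm install step:

```shell
# Hypothetical helper: create the aws-credentials Secret that the Helm values
# reference, reading the key pair from the current environment.
create_aws_credentials_secret() {
  local namespace="${1:-cvat}"
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set" >&2
    return 1
  fi
  # The key names must match the secretKeyRef entries in the values file.
  kubectl create secret generic aws-credentials \
    --namespace "$namespace" \
    --from-literal=access-key-id="$AWS_ACCESS_KEY_ID" \
    --from-literal=secret-access-key="$AWS_SECRET_ACCESS_KEY"
}

# Usage: create_aws_credentials_secret cvat
```

Where the cluster supports it, an IAM role attached to the pods avoids storing long-lived keys at all, in line with the security practices listed at the end of this guide.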
Cost Optimization
1. Use Spot Instances
For EKS worker nodes:
managedNodeGroups:
  - name: cvat-spot
    instanceTypes:
      - m5.xlarge
      - m5a.xlarge
      - m5n.xlarge
    spot: true
    desiredCapacity: 3
2. Auto-Scaling
# Enable cluster autoscaler
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
3. Use Reserved Instances
For stable workloads, purchase Reserved Instances for cost savings.
4. S3 Lifecycle Policies
Save the following policy as lifecycle.json:
{
  "Rules": [
    {
      "ID": "MoveToIA",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "STANDARD_IA"
        }
      ]
    }
  ]
}
Then apply it to the bucket:
aws s3api put-bucket-lifecycle-configuration \
  --bucket cvat-datasets \
  --lifecycle-configuration file://lifecycle.json
Handling Instance Restarts
Problem
An EC2 instance without an Elastic IP gets a new public IP address and hostname every time it is stopped and started, so a CVAT_HOST value captured earlier stops working.
Solutions
1. Use Elastic IP:
# Allocate Elastic IP
ALLOCATION_ID=$(aws ec2 allocate-address --region $REGION --query 'AllocationId' --output text)
# Associate with instance
aws ec2 associate-address \
--instance-id $INSTANCE_ID \
--allocation-id $ALLOCATION_ID \
--region $REGION
# Use in CVAT
export CVAT_HOST=$(aws ec2 describe-addresses --allocation-ids $ALLOCATION_ID --query 'Addresses[0].PublicIp' --output text)
2. Use Route 53 DNS:
# Create hosted zone
HOSTED_ZONE_ID=$(aws route53 create-hosted-zone \
--name example.com \
--caller-reference $(date +%s) \
--query 'HostedZone.Id' \
--output text)
# Create A record
cat > change-batch.json <<EOF
{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "cvat.example.com",
"Type": "A",
"TTL": 300,
"ResourceRecords": [{"Value": "${INSTANCE_IP}"}]
}
}]
}
EOF
aws route53 change-resource-record-sets \
--hosted-zone-id $HOSTED_ZONE_ID \
--change-batch file://change-batch.json
# Use in CVAT
export CVAT_HOST=cvat.example.com
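3. Avoid Spot Instances:
Spot instances can be reclaimed by AWS on two minutes’ notice. Don’t run a single-instance CVAT deployment or its database on them; use On-Demand or Reserved Instances for anything that holds state.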
Monitoring and Logging
CloudWatch Integration
1. Install CloudWatch Agent:
# On EC2
wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
sudo dpkg -i amazon-cloudwatch-agent.deb
# Configure
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
2. For EKS:
# Install Fluent Bit
kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluent-bit-quickstart.yaml
Application Monitoring
CVAT includes Grafana for analytics:
# Access Grafana
echo "http://${CVAT_HOST}:8080/analytics"
Backup and Disaster Recovery
Automated Backups
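RDS: Automated backups are enabled whenever the backup retention period is between 1 and 35 days; the create-db-instance example in this guide sets it to 7.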
EBS Snapshots:
# Create snapshot
aws ec2 create-snapshot \
--volume-id $VOLUME_ID \
--description "CVAT data backup $(date +%Y%m%d)" \
--region $REGION
# Automate with AWS Backup
aws backup create-backup-plan --cli-input-json file://backup-plan.json
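The `backup-plan.json` file referenced above is not shown in this guide. The sketch below writes one possible version: the plan name, rule name, schedule, retention period, and the `Default` vault are all placeholder assumptions to adapt, not values prescribed by CVAT:

```shell
# Write a hypothetical daily backup plan; every name and number is a placeholder.
cat > backup-plan.json <<'EOF'
{
  "BackupPlan": {
    "BackupPlanName": "cvat-daily",
    "Rules": [
      {
        "RuleName": "daily-0500-utc",
        "TargetBackupVaultName": "Default",
        "ScheduleExpression": "cron(0 5 ? * * *)",
        "Lifecycle": { "DeleteAfterDays": 30 }
      }
    ]
  }
}
EOF
```

A plan on its own backs up nothing; resources such as the CVAT EBS volume still have to be assigned to it with a backup selection.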
S3 Replication:
Enable cross-region replication for S3 buckets.
Troubleshooting
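Public Hostname Unavailable
# If the metadata service returns no public hostname, use the public IP instead
# (instances that enforce IMDSv2 also need the X-aws-ec2-metadata-token header)
CVAT_HOST=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
# Or set it manually
export CVAT_HOST=YOUR_PUBLIC_IP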
Security Group Misconfiguration
# Verify security group
aws ec2 describe-security-groups --group-ids $SG_ID --region $REGION
Storage Full
# Resize EBS volume
aws ec2 modify-volume --volume-id $VOLUME_ID --size 200 --region $REGION
# Extend filesystem
sudo growpart /dev/nvme0n1 1
sudo resize2fs /dev/nvme0n1p1
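Before resizing, it is worth confirming how full the root filesystem actually is, and again afterwards to verify that the extra space is visible. A small sketch using POSIX `df` output; `root_usage_pct` is a hypothetical helper name:

```shell
# Hypothetical helper: print the used-space percentage (digits only) of the
# filesystem that holds the given path, defaulting to the root filesystem.
root_usage_pct() {
  # df -P prints one header line, then one line per filesystem; column 5 is "NN%".
  df -P "${1:-/}" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Usage:
#   root_usage_pct /
# To see how much of the disk Docker images and volumes occupy:
#   docker system df
# To confirm the device and partition names before running growpart:
#   lsblk
```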
Security Best Practices
- Use IAM roles instead of access keys
- Enable VPC for database and cache isolation
- Use Secrets Manager for credentials
- Enable AWS WAF for ingress protection
- Regular security patches with Systems Manager
- Enable CloudTrail for audit logging
- Use private subnets for EKS worker nodes
- Encrypt EBS volumes and S3 buckets