
Overview

In a GitOps setup, Git is the source of truth. Most cluster state can be restored by re-applying the Git repository. However, some components require explicit backup:
  • Sealed Secrets private key: Required to decrypt SealedSecret resources
  • etcd data: Kubernetes cluster state (optional, for rapid recovery)
  • Persistent volumes: Application data (application-specific)

GitOps Backup Strategy

Git Repository as Backup

All cluster configuration is stored in Git:
ssh://git@github.com/kim-ae/kimbernetes-k8s-flux
What’s backed up in Git:
  • Flux configuration (cluster/kimawesome/)
  • HelmRelease definitions (overlays/base/*/helm-release.yaml)
  • Kustomizations and overlays
  • Kubernetes manifests (Deployments, Services, etc.)
  • SealedSecret resources (encrypted)
What’s NOT backed up in Git:
  • Sealed Secrets private key (stored in cluster)
  • Kubernetes Secrets (generated by applications)
  • etcd cluster state
  • Persistent volume data

Backup Git Repository

1

Clone repository

git clone ssh://git@github.com/kim-ae/kimbernetes-k8s-flux kimbernetes-backup
2

Create archive

tar czf kimbernetes-$(date +%Y%m%d).tar.gz kimbernetes-backup/
3

Store securely

  • Upload to S3/cloud storage
  • Store on separate physical location
  • Encrypt if storing on untrusted storage
GitHub already stores the repository redundantly, but hosting is not a backup: a separate copy protects against account compromise or accidental deletion.
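The archive-and-store steps above can be combined into a small script that also records a checksum, so a corrupt archive is caught before it ships offsite. This is a sketch: the repository and backup paths are placeholder assumptions (the demo falls back to temporary directories when they are unset).

```shell
#!/bin/sh
# Sketch: archive a local clone of the GitOps repo with an integrity
# checksum. REPO_DIR / BACKUP_DIR are assumptions -- point them at your
# real clone and backup location.
set -eu

REPO_DIR="${REPO_DIR:-$(mktemp -d)/kimbernetes-backup}"  # demo dir if unset
BACKUP_DIR="${BACKUP_DIR:-$(mktemp -d)}"
mkdir -p "$REPO_DIR" "$BACKUP_DIR"

STAMP=$(date +%Y%m%d)
ARCHIVE="$BACKUP_DIR/kimbernetes-$STAMP.tar.gz"

# Archive the clone; -C keeps paths relative to the clone's parent
tar czf "$ARCHIVE" -C "$(dirname "$REPO_DIR")" "$(basename "$REPO_DIR")"

# Record a checksum and verify it immediately to catch a corrupt write
( cd "$BACKUP_DIR" \
    && sha256sum "$(basename "$ARCHIVE")" > "$ARCHIVE.sha256" \
    && sha256sum -c "$ARCHIVE.sha256" )

echo "backup written to $ARCHIVE"
```

Ship both the archive and its `.sha256` file offsite, and re-run the `sha256sum -c` check after download before relying on the copy.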

Backing Up Sealed Secrets

Why Backup?

The sealed-secrets controller generates a private key on first installation. This key is required to decrypt all SealedSecret resources. If it is lost, every SealedSecret in the repository becomes permanently undecryptable, and each secret must be recreated and re-sealed from scratch.

Backup the Private Key

1

Export the private key

kubectl get secret -n sealed-secrets \
  -l sealedsecrets.bitnami.com/sealed-secrets-key=active \
  -o yaml > sealed-secrets-private-key.yaml
2

Encrypt the backup

# Encrypt with GPG
gpg --symmetric --cipher-algo AES256 sealed-secrets-private-key.yaml

# Or use age
age -e -o sealed-secrets-private-key.yaml.age sealed-secrets-private-key.yaml
3

Store securely

  • Store in password manager (1Password, Bitwarden)
  • Upload to encrypted cloud storage
  • Store offline in secure location
# Upload to S3 (example)
aws s3 cp sealed-secrets-private-key.yaml.gpg \
  s3://my-backups/kimbernetes/sealed-secrets-key-$(date +%Y%m%d).yaml.gpg
4

Delete unencrypted copy

shred -u sealed-secrets-private-key.yaml
The sealed-secrets private key grants access to ALL encrypted secrets in your cluster. Treat it like a root password.
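If neither gpg nor age is available, openssl can perform the same symmetric encryption. The sketch below also verifies that the ciphertext decrypts back to the original before the plaintext is deleted; the file names and throwaway passphrase are placeholders, and in practice the passphrase should come from your password manager.

```shell
#!/bin/sh
# Sketch: encrypt the exported key with openssl as an alternative to
# gpg/age. KEYFILE contents and the passphrase are demo placeholders.
set -eu

WORK=$(mktemp -d)
KEYFILE="$WORK/sealed-secrets-private-key.yaml"
PASSFILE="$WORK/passphrase"

# Demo input -- in practice this is the kubectl export from step 1
echo "apiVersion: v1" > "$KEYFILE"
echo "change-me" > "$PASSFILE"   # use a strong passphrase from a manager

# Encrypt with AES-256-CBC and PBKDF2 key derivation
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -in "$KEYFILE" -out "$KEYFILE.enc" -pass "file:$PASSFILE"

# Verify the backup decrypts cleanly BEFORE deleting the plaintext
openssl enc -d -aes-256-cbc -pbkdf2 \
  -in "$KEYFILE.enc" -pass "file:$PASSFILE" | cmp - "$KEYFILE"

rm -f "$KEYFILE"   # plaintext no longer needed (use shred -u on real disks)
echo "encrypted key at $KEYFILE.enc"
```

Whatever tool you use, always test a decryption round-trip before destroying the unencrypted copy.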

Restore Sealed Secrets Key

If you need to restore the private key to a new cluster:
1

Decrypt backup

gpg -d sealed-secrets-private-key.yaml.gpg > sealed-secrets-private-key.yaml
2

Install sealed-secrets controller

Wait for Flux to install sealed-secrets, or install manually:
flux reconcile helmrelease sealed-secrets -n sealed-secrets
kubectl -n sealed-secrets wait --for=condition=ready pod -l app.kubernetes.io/name=sealed-secrets
3

Delete auto-generated key

kubectl delete secret -n sealed-secrets \
  -l sealedsecrets.bitnami.com/sealed-secrets-key=active
4

Restore the key

kubectl apply -f sealed-secrets-private-key.yaml
5

Restart sealed-secrets controller

kubectl -n sealed-secrets rollout restart deployment sealed-secrets-controller
kubectl -n sealed-secrets wait --for=condition=ready pod -l app.kubernetes.io/name=sealed-secrets
6

Verify decryption

# Check that SealedSecrets are being decrypted
kubectl get sealedsecrets -A
kubectl get secrets -A | grep sealed

Backing Up etcd

Why Backup etcd?

etcd stores all Kubernetes cluster state. While GitOps can recreate most resources, an etcd backup enables:
  • Rapid cluster recovery
  • Restoration of runtime state (not in Git)
  • Recovery from catastrophic failures

Backup etcd (kubeadm cluster)

1

SSH to control plane node

2

Run etcdctl snapshot

sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /var/backups/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db
3

Verify snapshot

sudo ETCDCTL_API=3 etcdctl \
  --write-out=table \
  snapshot status /var/backups/etcd-snapshot-*.db
4

Copy backup offsite

scp kim@<control-plane-host>:/var/backups/etcd-snapshot-*.db ~/backups/
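Daily snapshots accumulate quickly, so pair the backup with retention. A sketch that keeps only the newest N snapshot files; the directory, file pattern, and retention count are assumptions to adjust (the demo creates throwaway files in a temp directory):

```shell
#!/bin/sh
# Sketch: prune etcd snapshots, keeping only the KEEP newest files.
# SNAP_DIR and KEEP are assumptions -- match your layout and policy.
set -eu

SNAP_DIR="${SNAP_DIR:-$(mktemp -d)}"
KEEP="${KEEP:-7}"

# Demo files -- on a real node these are the etcdctl snapshots
for i in 1 2 3 4 5 6 7 8 9 10; do
  touch "$SNAP_DIR/etcd-snapshot-$i.db"
  sleep 0.01   # distinct mtimes so ls -t sorts deterministically
done

# List newest-first, skip the first KEEP, delete the rest
ls -1t "$SNAP_DIR"/etcd-snapshot-*.db | tail -n +$((KEEP + 1)) | \
  while read -r old; do rm -f "$old"; done

echo "kept $(ls -1 "$SNAP_DIR"/etcd-snapshot-*.db | wc -l) snapshots"
```

Run this after each snapshot (or as a second CronJob step) so the node never fills its disk with old backups.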

Automate etcd Backups

Create a CronJob to backup etcd regularly:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          containers:
          - name: etcd-backup
            image: registry.k8s.io/etcd:3.5.17-0
            command:
            - /bin/sh
            - -c
            - |
              ETCDCTL_API=3 etcdctl \
                --endpoints=https://127.0.0.1:2379 \
                --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                --cert=/etc/kubernetes/pki/etcd/server.crt \
                --key=/etc/kubernetes/pki/etcd/server.key \
                snapshot save /backup/etcd-$(date +%Y%m%d-%H%M%S).db
            volumeMounts:
            - name: etcd-certs
              mountPath: /etc/kubernetes/pki/etcd
              readOnly: true
            - name: backup
              mountPath: /backup
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
          - name: backup
            hostPath:
              path: /var/backups/etcd
              type: DirectoryOrCreate
          restartPolicy: OnFailure
Combine this with a backup tool like Velero to automatically upload etcd snapshots to S3.

Restore etcd

Restoring etcd will overwrite ALL cluster state. Only use as a last resort.
1

Stop Kubernetes components

sudo systemctl stop kubelet
sudo mv /etc/kubernetes/manifests /etc/kubernetes/manifests.bak
2

Restore etcd snapshot

sudo ETCDCTL_API=3 etcdctl snapshot restore /var/backups/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restore
3

Replace etcd data directory

sudo mv /var/lib/etcd /var/lib/etcd.old
sudo mv /var/lib/etcd-restore /var/lib/etcd
4

Restart Kubernetes

sudo mv /etc/kubernetes/manifests.bak /etc/kubernetes/manifests
sudo systemctl start kubelet
5

Verify cluster

kubectl get nodes
kubectl get pods -A

Disaster Recovery: Rebuild Cluster from Git

Complete Cluster Loss

If the entire cluster is lost, rebuild from Git:
1

Recreate Kubernetes cluster

Follow the cluster creation steps in overlays/kimawesome/README.md:
# Configure kubeadm
sudo kubeadm init --skip-phases=addon/kube-proxy \
  --apiserver-advertise-address=192.168.0.101 \
  --pod-network-cidr="10.1.0.0/16" \
  --upload-certs

# Install Cilium
cilium install --set kubeProxyReplacement=true \
  --set k8sServiceHost=192.168.0.101 \
  --set k8sServicePort=6443 \
  --set nodePort.enabled=true \
  --set gatewayAPI.enabled=true

# Wait for Cilium
cilium status --wait
2

Bootstrap Flux

export GITHUB_TOKEN="<your-token>"
flux bootstrap github \
  --owner=kim-ae \
  --repository=kimbernetes-k8s-flux \
  --private=false \
  --personal=true \
  --path=cluster/kimawesome \
  --components-extra='image-reflector-controller,image-automation-controller'
3

Restore sealed-secrets key

Restore the backed-up key before Flux applies the SealedSecret resources, otherwise they will fail to decrypt against the freshly auto-generated key:
# Wait for sealed-secrets controller
kubectl -n sealed-secrets wait --for=condition=ready pod \
  -l app.kubernetes.io/name=sealed-secrets --timeout=300s

# Delete auto-generated key
kubectl delete secret -n sealed-secrets \
  -l sealedsecrets.bitnami.com/sealed-secrets-key=active

# Restore backed-up key
kubectl apply -f sealed-secrets-private-key.yaml

# Restart controller
kubectl -n sealed-secrets rollout restart deployment sealed-secrets-controller
4

Monitor Flux reconciliation

flux get all
watch kubectl get pods -A
Flux will automatically deploy all applications from Git.
5

Verify applications

kubectl get helmreleases -A
kubectl get certificates -A
kubectl get gateways -A
Recovery time depends on the number of HelmReleases. Expect 10-30 minutes for full reconciliation.
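Since reconciliation takes tens of minutes, recovery scripts benefit from a retry loop that either confirms a check passed or fails loudly on timeout. A sketch of such a helper; in a real recovery the check would be something like a `flux get` or `kubectl wait` command, and a trivial placeholder command stands in here:

```shell
#!/bin/sh
# Sketch: retry a check until it succeeds or a timeout expires.
# The command passed to wait_for is a placeholder assumption.
set -eu

wait_for() {
  # wait_for <timeout-seconds> <command...>
  timeout=$1; shift
  elapsed=0
  until "$@"; do
    elapsed=$((elapsed + 2))
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 2
  done
}

# Example usage with a placeholder check (replace with a real probe,
# e.g. a kubectl or flux readiness command)
wait_for 10 true && echo "check passed"
```

Chaining `wait_for` calls for the controller, the key restore, and each critical HelmRelease turns the manual recovery steps into a repeatable script.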

Backing Up Persistent Volumes

Identify PVs

kubectl get pv
kubectl get pvc -A

Backup with Velero

Velero is the standard tool for Kubernetes backups:
# Install Velero
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket kimbernetes-backups \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1

# Backup entire namespace
velero backup create myapp-backup --include-namespaces myapp

# Backup specific resources
velero backup create certs-backup \
  --include-resources certificates,secrets \
  --include-namespaces cert-manager

# Schedule automatic backups
velero schedule create daily-backup --schedule="0 2 * * *"

Manual PV Backup

For local volumes:
# Find PV host path
kubectl get pv <pv-name> -o yaml | grep path

# SSH to node and backup
ssh kim@node-01
sudo tar czf /var/backups/pv-backup-$(date +%Y%m%d).tar.gz /var/lib/kubelet/volumes
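Before trusting a PV archive, confirm it is actually readable: listing the full archive forces gzip and tar to walk every block, so truncation or corruption surfaces as a nonzero exit. A sketch using a demo archive; the path is a placeholder for your real backup file.

```shell
#!/bin/sh
# Sketch: verify a PV backup archive is complete and readable.
# ARCHIVE is a placeholder -- point it at your real backup file.
set -eu

WORK=$(mktemp -d)
mkdir -p "$WORK/data"
echo "app state" > "$WORK/data/state.txt"

ARCHIVE="$WORK/pv-backup.tar.gz"
tar czf "$ARCHIVE" -C "$WORK" data

# Listing every entry reads the whole stream; a truncated or corrupt
# file makes this command fail with a nonzero exit status.
if tar tzf "$ARCHIVE" > /dev/null; then
  echo "archive ok: $ARCHIVE"
else
  echo "archive corrupt: $ARCHIVE" >&2
  exit 1
fi
```

Running this check right after the `tar czf` on the node, and again after copying the file offsite, catches corruption at each hop.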

Backup Checklist

  • Git repository backed up offsite
  • Sealed Secrets private key backed up and encrypted
  • etcd snapshots scheduled (daily recommended)
  • Persistent volumes backed up (if applicable)
  • Backup restoration tested (quarterly recommended)
  • Disaster recovery plan documented
  • Team members have access to backups

Testing Recovery

Test your backups regularly:
# Create test cluster (minikube)
minikube start --kubernetes-version=v1.33.0

# Bootstrap Flux
flux bootstrap github --path=cluster/minikube

# Restore sealed-secrets key
kubectl apply -f sealed-secrets-private-key.yaml

# Verify applications deploy
flux get helmreleases -A

Best Practices

  • Test backups: Regularly test restoration procedures
  • Automate: Use CronJobs or external tools for automatic backups
  • Offsite storage: Store backups in different physical location/cloud
  • Encrypt: Encrypt sensitive backups (sealed-secrets keys, etcd)
  • Version backups: Keep multiple backup versions (daily, weekly, monthly)
  • Document: Keep disaster recovery runbook up to date
  • Monitor: Alert on backup failures
