Production Deployment Best Practices
Guidelines for deploying Aurora in production environments with security, reliability, and scalability.
Security
Secrets Management
Critical Security Requirements:
Never commit secrets to version control
Use strong, randomly generated passwords
Rotate credentials regularly
Use managed secrets services when available
Generate Strong Secrets
# Generate a random secret (32 random bytes, base64-encoded)
openssl rand -base64 32
# Run once for each required secret:
# - POSTGRES_PASSWORD
# - FLASK_SECRET_KEY
# - AUTH_SECRET
# - SEARXNG_SECRET
# VAULT_TOKEN is not generated this way; it comes from `vault operator init`.
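A minimal sketch of how the secrets listed above could be generated in one pass and written to an owner-only env file. The function name `write_secrets_env` and the output path are illustrative assumptions, not part of Aurora; `VAULT_TOKEN` is left out because it comes from Vault initialization rather than a random generator.

```shell
# write_secrets_env FILE
# Writes one freshly generated secret per variable to FILE (mode 600).
# Hypothetical helper for illustration only.
write_secrets_env() {
  out="$1"
  (
    umask 077            # a newly created file is readable by the owner only
    : > "$out"           # create or truncate the file
    chmod 600 "$out"     # also tighten a pre-existing file
    for name in POSTGRES_PASSWORD FLASK_SECRET_KEY AUTH_SECRET SEARXNG_SECRET; do
      value=$(openssl rand -base64 32) || exit 1
      printf '%s=%s\n' "$name" "$value" >> "$out"
    done
  )
}
```

Calling `write_secrets_env .env` overwrites any existing file, so it is better suited to first-time setup than to rotation. Base64 output can contain `+`, `/`, and `=`; if a value ends up inside a connection URL, `openssl rand -hex 32` may be the safer choice.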
Kubernetes Secrets
For Kubernetes deployments, consider using:
External Secrets Operator:
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: aurora-secrets
spec:
  secretStoreRef:
    name: aws-secrets-manager
  target:
    name: aurora-app-secrets
  data:
    - secretKey: FLASK_SECRET_KEY
      remoteRef:
        key: aurora/flask-secret
Sealed Secrets:
# Encrypt secrets for git
kubeseal --format yaml < secret.yaml > sealed-secret.yaml
git add sealed-secret.yaml
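The `secret.yaml` fed to `kubeseal` above has to come from somewhere, and ideally the plaintext manifest never touches disk. A hedged sketch of one way to do that: render the Secret client-side and pipe it straight into `kubeseal`. The helper name `seal_file_secret` and its argument order are assumptions made for illustration; `kubeseal` also needs access to the controller's certificate (from the cluster or via `--cert`).

```shell
# seal_file_secret NAME NAMESPACE KEY VALUE_FILE
# Renders a Secret locally (nothing is sent to the API server) and prints the
# sealed manifest on stdout. Hypothetical helper for illustration only.
seal_file_secret() {
  name="$1"; ns="$2"; key="$3"; value_file="$4"
  kubectl create secret generic "$name" -n "$ns" \
    --from-file="$key=$value_file" \
    --dry-run=client -o yaml |
    kubeseal --format yaml
}
```

For example, `seal_file_secret aurora-app-secrets aurora FLASK_SECRET_KEY ./flask_secret.txt > sealed-secret.yaml` would produce a file that is safe to commit. Reading the value from a file rather than passing it as a literal keeps it out of the process list and shell history.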
Docker Compose Secrets
For Docker Compose, use a .env file with restricted permissions:
chmod 600 .env
chown root:root .env # Or service account user
Or use Docker secrets:
secrets:
  postgres_password:
    file: ./secrets/postgres_password.txt
services:
  postgres:
    secrets:
      - postgres_password
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/postgres_password
Vault Configuration
Auto-Unseal with Cloud KMS
For production, configure Vault auto-unseal:
AWS KMS:
vault:
  seal:
    type: "awskms"
    awskms:
      region: "us-east-1"
      kms_key_id: "alias/aurora-vault-unseal"
GCP Cloud KMS:
vault:
  seal:
    type: "gcpckms"
    gcpckms:
      project: "your-project-id"
      region: "us-central1"
      key_ring: "vault-keyring"
      crypto_key: "vault-unseal-key"
Vault High Availability
For HA Vault:
replicaCounts:
  vault: 3
vault:
  ha:
    enabled: true
    raft:
      enabled: true
Network Security
Kubernetes NetworkPolicies
Restrict pod-to-pod communication:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: aurora-server-policy
  namespace: aurora
spec:
  podSelector:
    matchLabels:
      app: aurora-server
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: aurora-frontend
      ports:
        - protocol: TCP
          port: 5080
  egress:
    # Allow DNS. kubernetes.io/metadata.name is set automatically on every
    # namespace; a plain "name" label exists only if someone added it.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        # DNS falls back to TCP for large responses
        - protocol: TCP
          port: 53
    # Allow database
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
Pod Isolation for Untrusted Code
Enable pod isolation for terminal commands:
config:
  ENABLE_POD_ISOLATION: "true"
  TERMINAL_NAMESPACE: "untrusted"
  TERMINAL_RUNTIME_CLASS: "gvisor" # Sandbox runtime
The chart creates NetworkPolicies that:
Block terminal pods from accessing cluster services (Vault, DB, etc.)
Allow internet access for cloud API calls
Isolate untrusted workloads
TLS/HTTPS Configuration
Ingress TLS with cert-manager
ingress:
  enabled: true
  tls:
    enabled: true
    certManager:
      enabled: true
      issuer: "letsencrypt-prod"
      email: "admin@example.com"
  hosts:
    frontend: "aurora.example.com"
    api: "api.aurora.example.com"
    ws: "ws.aurora.example.com"
Install cert-manager:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml
# Create ClusterIssuer
kubectl apply -f - << EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
Internal TLS (Service Mesh)
For encrypted internal traffic, use a service mesh:
Istio:
istioctl install --set profile=default
kubectl label namespace aurora istio-injection=enabled
Linkerd:
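# Linkerd 2.12 and later install the CRDs as a separate first step
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
kubectl annotate namespace aurora linkerd.io/inject=enabled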
Access Control
Kubernetes RBAC
Limit who can access Aurora resources:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: aurora-admin
  namespace: aurora
rules:
  - apiGroups: ["", "apps", "batch"]
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: aurora-admin-binding
  namespace: aurora
subjects:
  - kind: User
    name: admin@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: aurora-admin
  apiGroup: rbac.authorization.k8s.io
Rate Limiting
Enable API rate limiting:
config:
  RATE_LIMITING_ENABLED: "true"
  RATE_LIMIT_HEADERS_ENABLED: "true"
secrets:
  app:
    RATE_LIMIT_BYPASS_TOKEN: "<secure-token-for-automation>"
Reliability
High Availability
Replica Configuration
replicaCounts:
  # Scalable services (3+ for HA)
  server: 3
  celeryWorker: 5
  chatbot: 2
  frontend: 2
  # Single instance (requires additional config for HA)
  celeryBeat: 1 # DO NOT scale (causes duplicate tasks)
  postgres: 1 # Use managed DB (RDS, Cloud SQL) for HA
  redis: 1 # Use managed Redis (ElastiCache) for HA
  vault: 1 # Configure Raft storage for HA
Pod Disruption Budgets
Prevent simultaneous pod evictions:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: aurora-server-pdb
  namespace: aurora
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: aurora-server
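How many pods a node drain may evict at once follows directly from the budget above: it is the number of currently healthy pods minus `minAvailable`, never less than zero. The small sketch below just spells out that arithmetic; the function name is an illustrative assumption.

```shell
# pdb_allowed_disruptions HEALTHY_PODS MIN_AVAILABLE
# Prints how many voluntary evictions the budget permits right now.
pdb_allowed_disruptions() {
  healthy="$1"; min_available="$2"
  allowed=$(( healthy - min_available ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  printf '%s\n' "$allowed"
}
```

With three healthy server pods and `minAvailable: 2`, one pod can be evicted at a time. If only two pods are healthy, the result is zero and a drain waits until a replacement becomes ready, which is why the replica count should stay above `minAvailable`.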
Health Checks
Ensure proper health check configuration:
# Kubernetes
livenessProbe:
  httpGet:
    path: /health
    port: 5080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 5080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 2
Resource Management
Resource Requests and Limits
Set appropriate resource limits:
resources:
  server:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "2000m"
      memory: "4Gi"
  celeryWorker:
    requests:
      cpu: "200m"
      memory: "2Gi"
    limits:
      cpu: "1000m"
      memory: "8Gi"
  postgres:
    requests:
      cpu: "1000m"
      memory: "2Gi"
    limits:
      cpu: "4000m"
      memory: "8Gi"
Horizontal Pod Autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aurora-server-hpa
  namespace: aurora
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aurora-oss-server
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
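To reason about what the autoscaler above will do, it helps to know the rule Kubernetes documents for it: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to `minReplicas` and `maxReplicas`. The sketch below reproduces that arithmetic for a single metric. It is a simplification that ignores the controller's tolerance band and stabilization windows, and the function name is an illustrative assumption.

```shell
# hpa_desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Utilization values are whole percentages (for example 95 and 70).
hpa_desired_replicas() {
  current="$1"; util="$2"; target="$3"; min="$4"; max="$5"
  # ceil(current * util / target) using integer arithmetic
  desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then
    desired="$min"
  fi
  if [ "$desired" -gt "$max" ]; then
    desired="$max"
  fi
  printf '%s\n' "$desired"
}
```

For instance, `hpa_desired_replicas 3 95 70 3 10` prints 5: three pods averaging 95% CPU against a 70% target scale out to five. When several metrics are configured, as with CPU and memory here, the controller computes a count per metric and uses the largest.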
Backup and Recovery
PostgreSQL Backups
Automated backups with CronJob:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: aurora
spec:
  schedule: "0 2 * * *" # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              # NOTE: postgres:15-alpine does not include the AWS CLI. The upload
              # step needs an image that provides both pg_dump and aws.
              image: postgres:15-alpine
              env:
                - name: PGHOST
                  value: aurora-oss-postgres
                - name: PGUSER
                  value: aurora
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: aurora-db-secret
                      key: POSTGRES_PASSWORD
              command:
                - /bin/sh
                - -c
                - |
                  # Stop before uploading if pg_dump fails
                  set -eu
                  file=/backup/aurora_$(date +%Y%m%d_%H%M%S).dump
                  pg_dump -Fc aurora_db > "$file"
                  # Upload only this run's dump; aws s3 cp takes a single source
                  aws s3 cp "$file" s3://aurora-backups/postgres/
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              emptyDir: {}
          restartPolicy: OnFailure
Managed Database Backups:
Use cloud provider automated backups:
AWS RDS: Automated snapshots, point-in-time recovery
GCP Cloud SQL: Automated backups, replicas
Azure Database: Geo-redundant backups
Volume Snapshots
# Create VolumeSnapshot
kubectl apply -f - << EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-snapshot-$(date +%Y%m%d)
  namespace: aurora
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: data-aurora-oss-postgres-0
EOF
Disaster Recovery Plan
Regular backups: Daily PostgreSQL dumps, hourly volume snapshots
Multi-region replication: Replicate backups to a separate region
Test restores: Monthly restore tests to a staging environment
Documentation: Maintain a runbook for recovery procedures
Monitoring: Alert on backup failures
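Alerting on backup failures needs some check that a recent dump actually exists. As a rough sketch, assuming dumps follow the `aurora_YYYYMMDD_HHMMSS.dump` naming used by the backup job and are visible in a local directory, the function below succeeds only when the newest dump carries the expected date. The function name and the directory argument are illustrative assumptions; for dumps stored in S3, the file list would have to come from a bucket listing instead.

```shell
# latest_backup_is_fresh DIR DATE
# Succeeds if the newest aurora_*.dump in DIR is stamped DATE (YYYYMMDD).
latest_backup_is_fresh() {
  dir="$1"; expected="$2"
  latest=""
  for f in "$dir"/aurora_*.dump; do
    [ -e "$f" ] || continue   # the glob matched nothing
    latest="$f"               # globs expand in sorted order, so the last match is the newest
  done
  [ -n "$latest" ] || return 1
  stamp=${latest##*/aurora_}  # drop directory and prefix: YYYYMMDD_HHMMSS.dump
  stamp=${stamp%%_*}          # keep the date part: YYYYMMDD
  [ "$stamp" = "$expected" ]
}
```

A monitoring job could run `latest_backup_is_fresh /backup "$(date +%Y%m%d)"` and raise an alert on a non-zero exit status. This only proves that a file exists; the monthly restore test remains the real check that the dump is usable.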
Monitoring and Observability
Prometheus Metrics
Enable Prometheus monitoring:
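config:
  OTEL_SERVICE_NAME: "aurora-production"
  # The endpoint must be an OTLP receiver. Port 9090 is Prometheus's regular
  # HTTP port, which accepts OTLP only when Prometheus's OTLP receiver is
  # enabled (under /api/v1/otlp). Otherwise point this at an OpenTelemetry
  # Collector (OTLP/HTTP defaults to port 4318) that forwards to Prometheus.
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://prometheus:9090"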
Logging
Centralized logging with ELK or Loki:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: aurora
data:
  fluent-bit.conf: |
    [INPUT]
        Name    tail
        Path    /var/log/containers/aurora-*.log
        Parser  docker
        Tag     aurora.*
    [OUTPUT]
        Name    es
        Match   aurora.*
        Host    elasticsearch.logging.svc.cluster.local
        Port    9200
        Index   aurora
        Type    _doc
Alerting
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: aurora-alerts
  namespace: aurora
spec:
  groups:
    - name: aurora
      interval: 30s
      rules:
        - alert: AuroraPodDown
          expr: up{job="aurora-server"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Aurora server pod is down"
        - alert: HighMemoryUsage
          expr: container_memory_usage_bytes{pod=~"aurora-.*"} / container_spec_memory_limit_bytes > 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is using > 90% memory"
Operations
Deployment Strategy
Rolling Updates
apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
Blue-Green Deployment
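# Deploy the new version as a second release.
# An Ingress backend can only reference a Service in the Ingress's own
# namespace, so the release is installed alongside the old one in "aurora"
# for the patch below to resolve aurora-v2-server.
helm install aurora-v2 ./deploy/helm/aurora \
  --namespace aurora \
  -f values.generated.yaml
# Switch traffic via ingress.
# This patch replaces the entire spec.rules list, so include every host the
# ingress serves (frontend, ws), not only the API host shown here.
kubectl patch ingress aurora-oss -n aurora -p '{"spec":{"rules":[{"host":"api.aurora.example.com","http":{"paths":[{"path":"/","pathType":"Prefix","backend":{"service":{"name":"aurora-v2-server","port":{"number":5080}}}}]}}]}}'
# Clean up the old version once traffic is verified.
# If the aurora-oss ingress belongs to the old release, uninstalling removes
# it as well; make sure the new release serves traffic through its own ingress first.
helm uninstall aurora-oss -n aurora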
Maintenance Windows
Database Migrations
# Run migrations before deployment
kubectl exec -it deployment/aurora-oss-server -n aurora -- \
python -m flask db upgrade
# Verify schema version
kubectl exec -it statefulset/aurora-oss-postgres -n aurora -- \
psql -U aurora -d aurora_db -c "SELECT version_num FROM alembic_version;"
Scaling Down for Maintenance
# Scale to 0
kubectl scale deployment aurora-oss-server --replicas=0 -n aurora
# Perform maintenance
# ...
# Scale back up
kubectl scale deployment aurora-oss-server --replicas=3 -n aurora
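The scale-back step above hard-codes three replicas, which may not match what was running before maintenance. One possible refinement, sketched under the assumption that `kubectl` is configured for the cluster, is to record the replica count before scaling to zero. The helper name is hypothetical.

```shell
# scale_down_saving_replicas DEPLOYMENT NAMESPACE
# Prints the deployment's current replica count, then scales it to 0.
scale_down_saving_replicas() {
  deploy="$1"; ns="$2"
  current=$(kubectl get deployment "$deploy" -n "$ns" \
    -o jsonpath='{.spec.replicas}') || return 1
  kubectl scale deployment "$deploy" --replicas=0 -n "$ns" >/dev/null || return 1
  printf '%s\n' "$current"
}
```

Usage might look like `saved=$(scale_down_saving_replicas aurora-oss-server aurora)`, followed after maintenance by `kubectl scale deployment aurora-oss-server --replicas="$saved" -n aurora`.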
Cost Optimization
Use Managed Services
Replace in-cluster stateful services with managed alternatives:
Database: RDS, Cloud SQL, Azure Database (automated backups, HA)
Redis: ElastiCache, Memorystore, Azure Cache (managed persistence)
Object Storage: S3, GCS, Azure Blob (eliminate SeaweedFS)
Secrets: AWS Secrets Manager, GCP Secret Manager, Azure Key Vault
Resource Right-Sizing
Monitor actual usage and adjust:
# Check resource usage
kubectl top pods -n aurora
kubectl top nodes
# Use VPA recommendations
kubectl get vpa -n aurora
Node Autoscaling
# Cluster Autoscaler (cloud providers)
# Scales nodes based on pending pods
Checklist
Before going to production:
Next Steps
Scaling Guide: Scale Aurora for growing workloads
Monitoring Setup: Set up comprehensive monitoring
Backup & Recovery: Implement backup strategies
Troubleshooting: Common issues and solutions