Datamailer deploys to AWS using a single CloudFormation template that provisions all infrastructure for a given environment (staging or production) from one parameter file. The template is intentionally practical: definitions are validated locally without requiring AWS credentials in CI, while account-specific IDs, DNS records, SES production access, alarm subscribers, and restore drills remain human-verified checks.
Infrastructure Shape
Django Web
The Django application runs on a single ARM EC2 instance (t4g.nano for staging, t4g.micro for production by default). Caddy terminates HTTPS via Let’s Encrypt and reverse-proxies to Gunicorn, which is managed by systemd as the datamailer.service unit. Static files are served by WhiteNoise directly from Django. CloudWatch agent ships logs and system metrics; SSM Session Manager is the preferred admin access method.
Postgres
RDS Postgres runs in private subnets with encrypted storage, deletion snapshots, and automated backups. The web application and Lambda workers access the database only through security groups. Separate database credentials are used where practical: datamailer_app for the web host and datamailer_worker for Lambda.
SQS Queues
Four SQS standard queues, each with its own dead-letter queue, are provisioned:

| Queue | DLQ | Purpose |
|---|---|---|
| transactional-email | transactional-email-dlq | High-priority account, verification, and password reset email |
| campaign-email | campaign-email-dlq | Campaign recipient batch send jobs |
| ses-webhooks | ses-webhooks-dlq | Asynchronous SES provider event notifications |
| email-events | email-events-dlq | Optional async tracking and event ingest |
Lambda Workers
Workers run on Python 3.12 on arm64. Each worker has a dedicated IAM role scoped to its own source queue, DLQ, and log group. Conservative reserved concurrency limits are applied at launch: transactional 4, campaign 2, SES webhooks 2, email events 1.
SES
A per-environment SES configuration set and sender identity are parameterized in the CloudFormation template. DNS verification, sandbox exit, and production send quota approval are human checks.
Monitoring
CloudWatch log groups are created for each worker with configurable retention (14 days staging, 30 days production by default). Alarms and a dashboard cover queue age, DLQ depth, Lambda errors/throttles/duration, SES bounces/complaints, DB CPU/storage/connections, web health, stuck campaigns, and transactional queue latency.
CloudFormation Files
Deploy Flow
Build Lambda Artifact
Build and upload a Lambda artifact zip containing the Django project and its dependencies to the environment artifact bucket. The zip is referenced by LambdaArtifactKey in the parameter file.
Bake AMI
Bake or select an ARM64 AMI that includes Python 3.12, uv, Caddy, the CloudWatch agent, and a datamailer.service systemd unit. Alternatively, reuse an existing baked AMI and pass its ID as WebAmiId.
Fill Parameters
Copy the example parameter file to a private file and replace every REPLACE placeholder with real account-specific values: VPC/subnet IDs, certificate ARN, artifact bucket, SES identity, alarm SNS topic, and database credentials.
Validate Locally
Run local validation before touching AWS. This script checks the CloudFormation template structure and parameter completeness without making any AWS API calls.
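To illustrate the kind of parameter-completeness check described here, the sketch below scans a parameter file for values that still contain a REPLACE placeholder. It is not the project's validation script. It assumes the parameter file uses the common CloudFormation JSON layout (a list of ParameterKey/ParameterValue objects); the function name and the sample values are hypothetical.

```python
import json


def find_unfilled_parameters(parameter_json: str, marker: str = "REPLACE") -> list[str]:
    """Return the keys whose values still contain the placeholder marker.

    Assumes a JSON list of {"ParameterKey": ..., "ParameterValue": ...} objects.
    """
    entries = json.loads(parameter_json)
    return [
        entry["ParameterKey"]
        for entry in entries
        if marker in str(entry.get("ParameterValue", ""))
    ]


# Hypothetical sample content; real files hold account-specific values.
SAMPLE_PARAMETERS = json.dumps(
    [
        {"ParameterKey": "WebAmiId", "ParameterValue": "REPLACE_WITH_AMI_ID"},
        {"ParameterKey": "LambdaArtifactKey", "ParameterValue": "releases/lambda.zip"},
    ]
)

if __name__ == "__main__":
    missing = find_unfilled_parameters(SAMPLE_PARAMETERS)
    if missing:
        print("Unfilled parameters:", ", ".join(missing))
```

A check like this needs no AWS credentials, which is what makes it suitable for CI.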
Deploy CloudFormation Stack
Deploy or update the stack. CAPABILITY_NAMED_IAM is required because the template creates named IAM roles for each Lambda worker.
Set Up the Web Host
On the web host, render /etc/datamailer/environment from Secrets Manager and SSM Parameter Store values plus the CloudFormation stack outputs. Then run the release steps.
Run Smoke Tests
Run the smoke test script against the staging environment. HTTP health checks are automated; AWS queue round-trip checks run when queue URLs and credentials are provided; remaining promotion checks are printed as human tasks.
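The automated HTTP portion of a smoke test can be as small as the sketch below. It is an illustration, not the project's smoke test script: the function name, the injectable opener, and any health URL passed in are assumptions. Any connection or HTTP error is treated as an unhealthy result.

```python
import urllib.request
from typing import Callable


def check_health(
    url: str,
    opener: Callable = urllib.request.urlopen,
    timeout: float = 5.0,
) -> bool:
    """Return True when the URL answers with HTTP 200, False on any error."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError are both OSError subclasses.
        return False
```

The opener argument exists only so the check can be exercised without network access; in normal use the default urllib opener is enough.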
Lambda IAM Roles
Each Lambda worker has a dedicated runtime role with the minimum permissions required for its function. Worker roles use inline runtime permissions rather than broad Lambda execution managed policies so log writes stay scoped to the worker’s own log group.

| Role | Queue Access | SES | Secrets |
|---|---|---|---|
| TransactionalEmailWorkerRole | Read/delete transactional-email; write transactional-email-dlq | Send email | Read DB secret |
| CampaignEmailWorkerRole | Read/delete campaign-email; write campaign-email-dlq | Send email | Read DB secret |
| SesWebhooksWorkerRole | Read/delete ses-webhooks; write ses-webhooks-dlq | None | Read DB secret |
| EmailEventsWorkerRole | Read/delete email-events; write email-events-dlq | None | Read DB secret |
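One way to picture a row of this table is as an inline policy document. The sketch below builds such a document in Python; it is an assumption about how the permissions might be expressed, not a copy of the template. The function name, statement IDs, and the exact action lists are illustrative, and the ARNs are supplied by the caller.

```python
from typing import Optional


def worker_inline_policy(
    queue_arn: str,
    dlq_arn: str,
    log_group_arn: str,
    db_secret_arn: str,
    ses_identity_arn: Optional[str] = None,
) -> dict:
    """Build a least-privilege policy document for one worker (illustrative)."""
    statements = [
        {   # Read/delete on the worker's own source queue only.
            "Sid": "ConsumeSourceQueue",
            "Effect": "Allow",
            "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage", "sqs:GetQueueAttributes"],
            "Resource": queue_arn,
        },
        {   # Write access to the matching dead-letter queue.
            "Sid": "WriteDeadLetterQueue",
            "Effect": "Allow",
            "Action": ["sqs:SendMessage"],
            "Resource": dlq_arn,
        },
        {   # Log writes scoped to streams in the worker's own log group.
            "Sid": "WriteOwnLogs",
            "Effect": "Allow",
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": f"{log_group_arn}:*",
        },
        {
            "Sid": "ReadDatabaseSecret",
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": db_secret_arn,
        },
    ]
    if ses_identity_arn is not None:
        # Only the transactional and campaign workers send email.
        statements.append(
            {
                "Sid": "SendEmail",
                "Effect": "Allow",
                "Action": ["ses:SendEmail", "ses:SendRawEmail"],
                "Resource": ses_identity_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}
```

Leaving ses_identity_arn unset yields a policy with no SES statement at all, which corresponds to the "None" entries in the SES column.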
The EmailEventsWorkerRole event-source mapping is intentionally disabled at launch. Enable it only when optional async event processing is turned on.
Postgres Connection Management
Lambda concurrency can exhaust Postgres connections if left unchecked. The following controls are applied from launch:

- Conservative reserved concurrency per worker (transactional 4, campaign 2, SES webhooks 2, email events 1).
- Short database transactions in all worker code paths.
- Small send batch sizes to limit per-invocation connection hold time.
- Separate database user for workers (datamailer_worker) with least privilege.

If DatabaseConnections alarms fire or connection wait errors appear, first lower the Lambda event-source maximum concurrency. Add RDS Proxy when sustained worker pressure requires it.
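The reserved concurrency limits put a ceiling on how many connections the workers can hold at once. The sketch below works through that arithmetic, assuming each concurrent invocation holds exactly one Postgres connection. The function names, the max_connections value, and the headroom kept for the web host and admin sessions are hypothetical and should be replaced with the real figures for the RDS instance in use.

```python
# Reserved concurrency per worker, as listed in this section.
RESERVED_CONCURRENCY = {
    "transactional": 4,
    "campaign": 2,
    "ses_webhooks": 2,
    "email_events": 1,
}


def worst_case_worker_connections(
    reserved: dict[str, int], connections_per_invocation: int = 1
) -> int:
    """Upper bound on connections held by workers at full concurrency."""
    return sum(reserved.values()) * connections_per_invocation


def fits_connection_budget(
    reserved: dict[str, int], max_connections: int, headroom_for_web_and_admin: int
) -> bool:
    """True when the worker ceiling fits in what is left after headroom."""
    available = max_connections - headroom_for_web_and_admin
    return worst_case_worker_connections(reserved) <= available


if __name__ == "__main__":
    # 4 + 2 + 2 + 1 = 9 connections at most under the one-per-invocation assumption.
    print(worst_case_worker_connections(RESERVED_CONCURRENCY))
    # Hypothetical instance limits, for illustration only.
    print(fits_connection_budget(RESERVED_CONCURRENCY, max_connections=50, headroom_for_web_and_admin=20))
```

Lowering any one worker's concurrency reduces the ceiling directly, which is why that is the first lever to pull before adding RDS Proxy.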
SES Requirements
Before routing any production traffic through SES, verify all of the following:
- Verified sender identity: SESSenderIdentity is verified in the SES console for the target region.
- DNS records: DKIM, SPF, DMARC, and optional custom MAIL FROM records are published and validate in SES.
- Sandbox exit: SES production access (sandbox exit) and send quota are approved for the account.
- Bounce and complaint routing: the ses-webhooks queue drains, the worker logs notifications, the ses-webhooks-dlq stays empty, and alarms route to the on-call channel. Verify this end to end in staging before any production sends.
- Alarm routing: CloudWatch alarm notifications reach the expected on-call destination.
Rollback Procedure
If a deployment causes delivery failures or data issues:

- Pause campaign sends first if email delivery is affected, to avoid duplicate sends or corrupted recipient state.
- Revert the Lambda artifact: update LambdaArtifactKey in the parameter file to point to the previous release zip and redeploy the CloudFormation stack.
- Revert the web host: point the datamailer.service unit to the previous release artifact, then restart with sudo systemctl restart datamailer caddy.
- Disable failing Lambda event-source mappings if workers are causing retries that could worsen data state before the fix lands.
- Postgres restore: a database restore from an RDS snapshot is a human decision. Follow the Postgres restore drill in the operations runbook: restore to a new staging instance first, run manage.py migrate --check, verify data, and record start/end time before considering a production restore.
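For the Lambda artifact step, the parameter-file change can be scripted so the previous key is swapped in without hand-editing. The sketch below assumes the parameter file is a JSON list of ParameterKey/ParameterValue objects; the function name and the artifact paths are hypothetical, and redeploying the stack remains a separate step.

```python
import json


def with_parameter(parameter_json: str, key: str, value: str) -> str:
    """Return a copy of the parameter file with one parameter value replaced.

    Raises KeyError when the key is absent, so a typo cannot silently no-op.
    """
    entries = json.loads(parameter_json)
    found = False
    for entry in entries:
        if entry["ParameterKey"] == key:
            entry["ParameterValue"] = value
            found = True
    if not found:
        raise KeyError(key)
    return json.dumps(entries, indent=2)


# Hypothetical current parameters and previous release key.
CURRENT_PARAMETERS = json.dumps(
    [{"ParameterKey": "LambdaArtifactKey", "ParameterValue": "releases/current.zip"}]
)

if __name__ == "__main__":
    print(with_parameter(CURRENT_PARAMETERS, "LambdaArtifactKey", "releases/previous.zip"))
```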