GSM Infrastructure exposes several monitoring surfaces out of the box: CloudWatch Log Groups collect structured output from all ECS containers and scheduler Lambda functions, ECS service health checks continuously validate each backend endpoint, a deployment circuit breaker automatically rolls back failed deploys, and an AWS Budget alert notifies you by email when monthly spend crosses your configured threshold. Together these mechanisms give you end-to-end visibility into both application health and infrastructure cost without requiring any additional tooling.
CloudWatch Log Groups
Three log groups are created by the CloudFormation stacks. Retention policies are set at the stack level to control storage costs.
/ecs/{env}-{appName}-backend
Retention: 7 days. Aggregated logs for all four ECS services running on the EC2 instance. Each container writes to this shared group using a unique stream prefix, so you can filter by service:
| Stream prefix | Service |
|---|---|
| gmsgateway | API Gateway container (port 80) |
| gmsauth | Authentication service (port 8081) |
| gsmapplication | Application service (port 8082) |
| gsmoperations | Operations service (port 8083) |
To filter by service in CloudWatch Logs Insights, match the stream prefix against the @logStream field.
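As a sketch of filtering by stream prefix, the helper below builds a Logs Insights query string for one service. The function names, the selected fields, and the default limit are illustrative assumptions, not part of the stacks; the prefixes are copied from the table above.

```python
# Stream prefixes exactly as listed in the table above.
STREAM_PREFIXES = ("gmsgateway", "gmsauth", "gsmapplication", "gsmoperations")


def backend_log_group(env: str, app_name: str) -> str:
    """Return the shared ECS log group name, /ecs/{env}-{appName}-backend."""
    return f"/ecs/{env}-{app_name}-backend"


def build_insights_query(stream_prefix: str, limit: int = 50) -> str:
    """Build a Logs Insights query that keeps only one service's log streams.

    The field list and the limit are illustrative choices (assumptions).
    """
    if stream_prefix not in STREAM_PREFIXES:
        raise ValueError(f"unknown stream prefix: {stream_prefix!r}")
    return (
        "fields @timestamp, @logStream, @message\n"
        f"| filter @logStream like /^{stream_prefix}/\n"
        "| sort @timestamp desc\n"
        f"| limit {limit}"
    )


if __name__ == "__main__":
    # "dev" and "gsm" are placeholder values for env and appName.
    print(backend_log_group("dev", "gsm"))
    print(build_insights_query("gmsauth"))
```

The resulting string can be pasted into the Logs Insights console with the backend log group selected.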
/aws/lambda/{env}-{appName}-ec2-stop
Retention: 14 days. Execution logs for the Stop Lambda. Each invocation logs ECS scale-down confirmations, EIP disassociation status, and the EC2 stop command result. Check this group whenever you suspect the nightly stop did not complete successfully.
/aws/lambda/{env}-{appName}-ec2-start
Retention: 14 days. Execution logs for the Start Lambda. Each invocation logs EC2 start progress (including the instance_running waiter), EIP reassociation, and ECS scale-up confirmations.
ECS Service Health Checks
Each container in the ECS task definitions includes a Docker-level health check that polls the service’s own health endpoint. ECS uses this check to determine whether a task is healthy before routing traffic to it and before marking a deployment as successful. All four services share the same timing parameters:
| Parameter | Value |
|---|---|
| Interval | 30 seconds |
| Timeout | 10 seconds |
| Retries | 3 |
| StartPeriod | 120 seconds |
StartPeriod gives each container 120 seconds to initialize before health check failures begin counting against the retry limit. This accommodates JVM warm-up or database connection pool initialization.
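To make the timing concrete, here is a back-of-envelope estimate of how long a failing container can run before it is marked unhealthy. It assumes each check takes its full timeout and that the next check starts one interval after the previous one finishes; the class and method names are hypothetical, and the real scheduling may differ slightly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthCheckTiming:
    """Timing parameters from the table above, in seconds."""

    interval_s: int = 30
    timeout_s: int = 10
    retries: int = 3
    start_period_s: int = 120

    def detection_window_s(self) -> int:
        """Rough upper bound for a running container that starts failing:
        `retries` consecutive checks, each waiting one interval and then
        running for up to the full timeout."""
        return self.retries * (self.interval_s + self.timeout_s)

    def worst_case_unhealthy_after_s(self) -> int:
        """Rough upper bound, measured from container start, for a container
        that never becomes healthy: failures only count once the start
        period has elapsed."""
        return self.start_period_s + self.detection_window_s()


if __name__ == "__main__":
    timing = HealthCheckTiming()
    print(timing.detection_window_s())            # 3 * (30 + 10) = 120
    print(timing.worst_case_unhealthy_after_s())  # 120 + 120 = 240
```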
| Service | Health check command |
|---|---|
| gsmgateway | wget -qO- http://localhost:80/api/health \|\| exit 1 |
| gmsauth | wget -qO- http://localhost:8081/health \|\| exit 1 |
| gsmapplication | wget -qO- http://localhost:8082/health \|\| exit 1 |
| gsmoperations | wget -qO- http://localhost:8083/health \|\| exit 1 |
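For illustration, the commands and timing values above could be expressed as a container health check block in the shape the ECS API uses (lowercase keys; CloudFormation uses the capitalized equivalents such as StartPeriod). Running the command through CMD-SHELL is an assumption made here because the command uses the shell operator ||; the function name is hypothetical.

```python
# Service names and health endpoints as listed in the table above.
HEALTH_ENDPOINTS = {
    "gsmgateway": "http://localhost:80/api/health",
    "gmsauth": "http://localhost:8081/health",
    "gsmapplication": "http://localhost:8082/health",
    "gsmoperations": "http://localhost:8083/health",
}


def container_health_check(service: str) -> dict:
    """Return a healthCheck block for one container definition.

    Assumption: the command is run via CMD-SHELL so that `|| exit 1`
    is interpreted by the container's default shell.
    """
    url = HEALTH_ENDPOINTS[service]
    return {
        "command": ["CMD-SHELL", f"wget -qO- {url} || exit 1"],
        "interval": 30,      # seconds between checks
        "timeout": 10,       # seconds before a single check counts as failed
        "retries": 3,        # consecutive failures before "unhealthy"
        "startPeriod": 120,  # seconds during which failures are not counted
    }


if __name__ == "__main__":
    print(container_health_check("gmsauth"))
```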
The HealthCheckGracePeriodSeconds: 60 setting on each ECS service adds a 60-second grace period at the service level (separate from the container-level StartPeriod) before ECS considers health check results when deciding whether to replace tasks. This prevents premature task replacement during deployment rollouts.
ECS Deployment Circuit Breaker
All four ECS services enable the deployment circuit breaker with automatic rollback. If a deployment fails, ECS records a SERVICE_DEPLOYMENT_FAILED event in the ECS service event stream, and the service returns to the last known healthy state without manual intervention.
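As a hedged sketch, the circuit breaker setting corresponds to the following block in the ECS API's deploymentConfiguration shape. The helper names are hypothetical, and the second function only inspects a service description that you have already fetched (for example from the ECS describe-services call).

```python
def deployment_configuration(enable: bool = True, rollback: bool = True) -> dict:
    """Return a deploymentConfiguration block with the circuit breaker set."""
    return {"deploymentCircuitBreaker": {"enable": enable, "rollback": rollback}}


def circuit_breaker_enabled(service: dict) -> bool:
    """Report whether a service description has both the circuit breaker
    and automatic rollback turned on. Missing keys count as disabled."""
    breaker = service.get("deploymentConfiguration", {}).get(
        "deploymentCircuitBreaker", {}
    )
    return bool(breaker.get("enable")) and bool(breaker.get("rollback"))


if __name__ == "__main__":
    service = {"deploymentConfiguration": deployment_configuration()}
    print(circuit_breaker_enabled(service))
```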
Check the ECS service events
In the AWS Console, navigate to ECS → Clusters → your cluster → Services → your service → Events tab to see a timestamped list of deployment and health check events.
Inspect the failed task logs
Go to CloudWatch → Log groups → /ecs/{env}-{appName}-backend and filter by the stream prefix for the affected service. Look for startup errors or connection failures during the deployment window.
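The same service events can also be read programmatically. The sketch below narrows a service's events to a deployment window; it assumes boto3 is installed and credentials are configured for the fetch step, and the function names and window bounds are hypothetical.

```python
from datetime import datetime


def events_in_window(service: dict, start: datetime, end: datetime) -> list:
    """Return the service events created between start and end (inclusive),
    oldest first. `service` is one entry from a describe_services response."""
    events = [e for e in service.get("events", []) if start <= e["createdAt"] <= end]
    return sorted(events, key=lambda e: e["createdAt"])


def fetch_service(cluster: str, service_name: str) -> dict:
    """Fetch one service description from ECS.

    Assumption: boto3 is available; it is imported lazily so the pure
    helper above can be used without it.
    """
    import boto3

    response = boto3.client("ecs").describe_services(
        cluster=cluster, services=[service_name]
    )
    return response["services"][0]
```

Each returned event carries a createdAt timestamp and a message, which you can compare with the container logs from the same period.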
AWS Budget Alerts
A monthly COST budget is provisioned by the infrastructure stack and sends an SNS email alert when actual spend crosses 100% of the BudgetLimitUSD threshold (default: $30 USD).
| Budget attribute | Value |
|---|---|
| Budget name | {env}-{appName}-monthly-budget |
| Type | COST |
| Period | MONTHLY |
| Notification type | ACTUAL |
| Alert threshold | 100% of BudgetLimitUSD |
| Notification channel | SNS email ({env}-{appName}-notification-alerts) |
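A minimal sketch of the budget attributes above, written in the shape the AWS Budgets API uses for a budget and its notification. The GREATER_THAN comparison operator and the function names are assumptions; the source only states that the alert fires when actual spend crosses 100% of the limit.

```python
def budget_definition(env: str, app_name: str, limit_usd: float = 30.0):
    """Return (budget, notification) dicts mirroring the table above."""
    budget = {
        "BudgetName": f"{env}-{app_name}-monthly-budget",
        "BudgetType": "COST",
        "TimeUnit": "MONTHLY",
        "BudgetLimit": {"Amount": f"{limit_usd:.2f}", "Unit": "USD"},
    }
    notification = {
        "NotificationType": "ACTUAL",
        "ComparisonOperator": "GREATER_THAN",  # assumption, see lead-in
        "Threshold": 100.0,
        "ThresholdType": "PERCENTAGE",
    }
    return budget, notification


def alert_fires(actual_spend_usd: float, limit_usd: float = 30.0,
                threshold_pct: float = 100.0) -> bool:
    """Return True when actual spend exceeds the threshold share of the limit."""
    return actual_spend_usd > limit_usd * threshold_pct / 100.0


if __name__ == "__main__":
    # "dev" and "gsm" are placeholder values for env and appName.
    print(budget_definition("dev", "gsm"))
    print(alert_fires(31.50))
```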
EC2 Instance Monitoring
The EC2 instance is enrolled in AWS Systems Manager (SSM) via the AmazonSSMManagedInstanceCore managed policy attached to its IAM instance role. This enables:
- Session Manager: open an interactive shell session to the instance from the AWS Console or CLI without opening SSH port 22 or managing key pairs for day-to-day access.
- SSM Parameter Store access: the instance role also includes a scoped SSMParameterStoreRead inline policy that allows ssm:GetParameter, ssm:GetParameters, ssm:GetParameterHistory, ssm:GetParametersByPath, and kms:Decrypt for the /{env}/* parameter path.
The EC2 instance lives in a private subnet with no inbound SSH rule in the security group. Session Manager is the recommended way to access the instance for debugging. If you need to use the EC2 key pair (stored as Ec2PenKeyName) for emergency access, you will need a bastion host or an EC2 Instance Connect Endpoint in the same VPC.
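To show what a policy scoped to the /{env}/* parameter path might look like, the sketch below assembles an IAM policy document from the actions listed for SSMParameterStoreRead. This is not the stack's actual policy: the function name, the region and account ID arguments, and the wildcard resource on kms:Decrypt are assumptions made for illustration.

```python
import json

# Actions as listed for the SSMParameterStoreRead inline policy.
SSM_READ_ACTIONS = [
    "ssm:GetParameter",
    "ssm:GetParameters",
    "ssm:GetParameterHistory",
    "ssm:GetParametersByPath",
]


def parameter_store_read_policy(env: str, region: str, account_id: str) -> dict:
    """Build an IAM policy document limited to parameters under /{env}/."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": SSM_READ_ACTIONS,
                "Resource": f"arn:aws:ssm:{region}:{account_id}:parameter/{env}/*",
            },
            {
                # Assumption: the real policy may restrict this to one KMS key.
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder environment, region, and account ID.
    policy = parameter_store_read_policy("dev", "us-east-1", "123456789012")
    print(json.dumps(policy, indent=2))
```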