350+ teams generate 2,500+ alerts every week.
Running EC2 instances inside an Auto Scaling Group fronted by a Load Balancer is a popular pattern on AWS. CloudWatch alarms can keep an eye on network usage, load balancer latency, HTTP 5XX errors, and much more.
RDS provides fully managed relational database engines on AWS. Define CloudWatch alarms to observe free storage space, available memory, and CPU usage to keep your database up and running.
Modern HTTP APIs can be built with Lambda and API Gateway. Both send metrics to CloudWatch. Define alarms to watch failed or throttled Lambda invocations, HTTP 5XX errors, or API Gateway latency.
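As a rough illustration of the RDS alarms mentioned above, the sketch below builds parameter sets for CloudWatch's PutMetricAlarm API. The helper name `rds_alarm`, the instance identifier, the thresholds, the evaluation settings, and the SNS topic ARN are all placeholders chosen for this example, not recommended values.

```python
def rds_alarm(db_instance_id, metric_name, threshold, comparison, topic_arn):
    """Build keyword arguments for CloudWatch's PutMetricAlarm API (hypothetical helper)."""
    return {
        "AlarmName": f"{db_instance_id}-{metric_name}",
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint (assumed)
        "EvaluationPeriods": 3,   # consecutive breaching periods (assumed)
        "Threshold": threshold,
        "ComparisonOperator": comparison,
        "AlarmActions": [topic_arn],
    }


# Placeholder SNS topic ARN; replace with your own.
TOPIC = "arn:aws:sns:us-east-1:123456789012:alerts"

alarms = [
    # FreeStorageSpace and FreeableMemory are reported in bytes.
    rds_alarm("mydb", "FreeStorageSpace", 2 * 1024**3, "LessThanThreshold", TOPIC),
    rds_alarm("mydb", "FreeableMemory", 256 * 1024**2, "LessThanThreshold", TOPIC),
    # CPUUtilization is reported in percent.
    rds_alarm("mydb", "CPUUtilization", 80.0, "GreaterThanThreshold", TOPIC),
]
```

With boto3 installed, each dictionary could be passed on as `boto3.client("cloudwatch").put_metric_alarm(**alarm)`.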
You create a CloudWatch alarm to watch a metric. If the metric crosses a configured threshold, an alert is sent to me via HTTPS.
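CloudWatch alarms typically reach an HTTPS endpoint through an SNS topic with an HTTPS subscription. Assuming that setup (the text above does not spell out the transport), a receiver might unpack the alert roughly like this. The function name `parse_alarm_notification` and the returned field names are hypothetical.

```python
import json


def parse_alarm_notification(body):
    """Extract the alarm details from an SNS notification body (hypothetical helper).

    Returns None for non-notification messages such as subscription confirmations.
    """
    envelope = json.loads(body)
    if envelope.get("Type") != "Notification":
        return None
    # SNS delivers the CloudWatch alarm as a JSON string inside "Message".
    alarm = json.loads(envelope["Message"])
    return {
        "name": alarm["AlarmName"],
        "state": alarm["NewStateValue"],
        "reason": alarm["NewStateReason"],
        "region": alarm.get("Region"),
    }


# Made-up sample payload, trimmed to the fields used above.
sample = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "AlarmName": "mydb-CPUUtilization",
        "NewStateValue": "ALARM",
        "NewStateReason": "Threshold Crossed",
        "Region": "US East (N. Virginia)",
    }),
})
alert = parse_alarm_notification(sample)
```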
Once I receive an alert, I check whether it is a duplicate to reduce noise. I also enrich the alert with Quick Links to the AWS Management Console. Finally, I send a Slack message with the alert to a single engineer on your team.
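The duplicate check and the Quick Links can be pictured with a minimal sketch. It assumes a simple time-window rule for duplicates and an assumed console URL shape. Neither is confirmed as the actual implementation, and `Deduplicator` and `console_link` are hypothetical names.

```python
import time
from urllib.parse import quote


class Deduplicator:
    """Treat a repeat of the same key within `window_seconds` as a duplicate."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self._last_seen = {}

    def is_duplicate(self, key, now=None):
        now = time.time() if now is None else now
        last = self._last_seen.get(key)
        self._last_seen[key] = now  # remember the most recent occurrence
        return last is not None and now - last < self.window


def console_link(region, alarm_name):
    """Build a link to the alarm in the AWS Management Console (assumed URL shape)."""
    return (
        f"https://{region}.console.aws.amazon.com/cloudwatch/home"
        f"?region={region}#alarmsV2:alarm/{quote(alarm_name, safe='')}"
    )


dedup = Deduplicator(window_seconds=300)
first = dedup.is_duplicate("mydb-CPUUtilization", now=0)     # first sighting
repeat = dedup.is_duplicate("mydb-CPUUtilization", now=10)   # inside the window
later = dedup.is_duplicate("mydb-CPUUtilization", now=1000)  # window has passed
link = console_link("us-east-1", "cpu high")
```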
A member of your team now has to acknowledge the alert. I escalate unnoticed alerts to another team member or the whole crew if necessary. Once acknowledged, I will wait until you have fixed the issue.
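The escalation path described above (one engineer, then another team member, then the whole crew) could be modeled as follows. This is a sketch under assumptions: the function name `escalation_targets`, the numeric levels, and the "next person in the list" rule are illustrative choices.

```python
def escalation_targets(team, first_responder, level):
    """Return who gets notified at a given escalation level (hypothetical helper).

    level 0: the single engineer who received the alert first
    level 1: another team member (here: the next one in the list)
    level 2 and above: the whole team
    """
    if level == 0:
        return [first_responder]
    if level == 1:
        i = team.index(first_responder)
        return [team[(i + 1) % len(team)]]
    return list(team)


team = ["ana", "ben", "cho"]
level0 = escalation_targets(team, "ana", 0)
level1 = escalation_targets(team, "ana", 1)
level2 = escalation_targets(team, "ana", 2)
```

Escalation stops as soon as someone acknowledges the alert, so later levels are only reached when earlier notifications go unanswered.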