How To Monitor a Serverless Application

Andreas Wittig – 17 Aug 2018

Compared to a typical web application deployed to EC2, a Serverless Application (consisting of an API Gateway and a Lambda function) needs less monitoring because you are outsourcing most of the operations to AWS. However, there are still some metrics you should keep an eye on. This article shows you how to monitor a Serverless Application with CloudWatch.

Typically a Serverless Application consists of an API Gateway forwarding incoming requests to Lambda. Lambda executes your business logic and makes use of S3 to store objects, DynamoDB to store and query data, and SES to send emails, for example. The following figure illustrates the architecture and shows which CloudWatch metrics need your attention.

[Figure: architecture of a Serverless Application and the CloudWatch metrics that need your attention]

You should create a CloudWatch dashboard showing all metrics and define CloudWatch alarms for all highlighted metrics.

Namespace       Metric Name               Description
AWS/ApiGateway  5XXError                  Number of requests with status code 5XX (server-side error).
AWS/ApiGateway  Latency                   Time between incoming request and response on API Gateway.
AWS/Lambda      Errors                    Number of failed function invocations (e.g. timeout, exception, …).
AWS/Lambda      Throttles                 Number of throttled function invocations.
AWS/DynamoDB    ReadThrottleEvents        Number of throttled read requests.
AWS/DynamoDB    WriteThrottleEvents       Number of throttled write requests.
AWS/DynamoDB    SystemErrors              Number of server-side errors.
AWS/SES         Reputation.BounceRate     Fraction of bounced messages (multiply by 100 to get a percentage).
AWS/SES         Reputation.ComplaintRate  Fraction of messages reported as spam (multiply by 100 to get a percentage).

Start with the following configuration for your CloudWatch alarms. Don’t forget to refine the thresholds after a few days. To get notified about server-side errors, which typically result in error messages for your users, create the following alarm:

  • Metric namespace: AWS/ApiGateway
  • Metric name: 5XXError
  • Metric dimension: ApiName and optional Stage
  • Metric period: 60 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: Sum
  • Alarm condition: > 1
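The settings above map onto the parameters of CloudWatch's PutMetricAlarm API. The following is a minimal sketch that only builds the parameter dictionary. The function name, the alarm name, and the example API name `my-api` are hypothetical. Reading "1 out of 5" as `DatapointsToAlarm=1` with `EvaluationPeriods=5`, and treating missing data as not breaching, are assumptions rather than part of the original configuration.

```python
def api_5xx_alarm_params(api_name, stage=None):
    """Build keyword arguments for a CloudWatch alarm on API Gateway 5XXError."""
    # ApiName is always set; Stage is the optional second dimension.
    dimensions = [{"Name": "ApiName", "Value": api_name}]
    if stage is not None:
        dimensions.append({"Name": "Stage", "Value": stage})
    return {
        "AlarmName": f"{api_name}-5xx-errors",  # hypothetical naming scheme
        "Namespace": "AWS/ApiGateway",
        "MetricName": "5XXError",
        "Dimensions": dimensions,
        "Period": 60,                 # seconds
        "EvaluationPeriods": 5,       # look at the last 5 periods
        "DatapointsToAlarm": 1,       # assumption: "1 out of 5"
        "Statistic": "Sum",
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # assumption: no traffic is not an error
    }


params = api_5xx_alarm_params("my-api", stage="prod")
```

With boto3 installed and credentials configured, the dictionary could be passed on as `boto3.client("cloudwatch").put_metric_alarm(**params)`.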

Additionally, make sure you notice when your users experience long waiting times caused by high latency of your Serverless Application.

  • Metric namespace: AWS/ApiGateway
  • Metric name: Latency
  • Metric dimension: ApiName and optional Stage
  • Metric period: 60 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: p90, p95, or p99 (depending on number of requests)
  • Alarm condition: > 500 ms
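Percentile statistics are passed to PutMetricAlarm through `ExtendedStatistic` instead of `Statistic`; the two parameters are mutually exclusive. The sketch below builds the parameters for the latency alarm under the same assumptions as before: the function and alarm names are hypothetical, and "1 out of 5" is read as `DatapointsToAlarm=1` with `EvaluationPeriods=5`.

```python
ALLOWED_PERCENTILES = ("p90", "p95", "p99")


def api_latency_alarm_params(api_name, percentile="p95", threshold_ms=500, stage=None):
    """Build keyword arguments for a CloudWatch alarm on API Gateway Latency."""
    if percentile not in ALLOWED_PERCENTILES:
        raise ValueError(f"percentile must be one of {ALLOWED_PERCENTILES}")
    dimensions = [{"Name": "ApiName", "Value": api_name}]
    if stage is not None:
        dimensions.append({"Name": "Stage", "Value": stage})
    return {
        "AlarmName": f"{api_name}-latency-{percentile}",  # hypothetical naming scheme
        "Namespace": "AWS/ApiGateway",
        "MetricName": "Latency",
        "Dimensions": dimensions,
        "Period": 60,                      # seconds
        "EvaluationPeriods": 5,
        "DatapointsToAlarm": 1,            # assumption: "1 out of 5"
        "ExtendedStatistic": percentile,   # used instead of "Statistic"
        "Threshold": threshold_ms,         # Latency is reported in milliseconds
        "ComparisonOperator": "GreaterThanThreshold",
    }


params = api_latency_alarm_params("my-api", percentile="p99")
```

A higher percentile such as p99 is only meaningful when each period contains enough requests, which is why the percentile is left as a parameter here.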

You don’t need to create CloudWatch alarms for the metrics of Lambda, S3, and DynamoDB, as problems with any of these components result in a 5XX error or high latency at the API Gateway. Instead of creating CloudWatch alarms for these metrics, put them on a CloudWatch dashboard to simplify investigating issues.
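A dashboard is defined by a JSON document listing its widgets. The following sketch generates such a document for a subset of the Lambda and DynamoDB metrics from the table. The function name, the widget layout, and the resource names `my-function` and `my-table` are placeholders, not part of the original setup.

```python
import json


def dashboard_body(region, function_name, table_name):
    """Return a CloudWatch dashboard body (JSON string) with two metric widgets."""

    def widget(title, metrics, y):
        # One full-row graph widget; each metric is [namespace, name, dim, value].
        return {
            "type": "metric",
            "x": 0, "y": y, "width": 12, "height": 6,
            "properties": {
                "title": title,
                "metrics": metrics,
                "stat": "Sum",
                "period": 60,
                "region": region,
            },
        }

    widgets = [
        widget("Lambda errors and throttles", [
            ["AWS/Lambda", "Errors", "FunctionName", function_name],
            ["AWS/Lambda", "Throttles", "FunctionName", function_name],
        ], y=0),
        widget("DynamoDB throttle events", [
            ["AWS/DynamoDB", "ReadThrottleEvents", "TableName", table_name],
            ["AWS/DynamoDB", "WriteThrottleEvents", "TableName", table_name],
        ], y=6),
    ]
    return json.dumps({"widgets": widgets})


body = dashboard_body("eu-west-1", "my-function", "my-table")
```

With boto3, the result could be uploaded via `boto3.client("cloudwatch").put_dashboard(DashboardName="serverless", DashboardBody=body)`, where the dashboard name is again a placeholder.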

When sending emails with SES, you should create the following CloudWatch alarms to make sure you get notified when there is a problem with your reputation as a sender. Create the following alarm to get notified when the bounce rate is too high.

  • Metric namespace: AWS/SES
  • Metric name: Reputation.BounceRate
  • Metric dimension: depends on your event destination configuration
  • Metric period: 900 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: Maximum
  • Alarm condition: > 0.05

Next, add one more alarm to get notified about spam complaints as well.

  • Metric namespace: AWS/SES
  • Metric name: Reputation.ComplaintRate
  • Metric dimension: depends on your event destination configuration
  • Metric period: 900 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: Maximum
  • Alarm condition: > 0.05
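Both SES alarms share every setting except the metric name, so one helper can produce the parameters for each. This is a sketch with hypothetical function and alarm names. Because the dimensions depend on your event destination configuration, they are left as an optional argument and omitted when not given.

```python
SES_REPUTATION_METRICS = ("Reputation.BounceRate", "Reputation.ComplaintRate")


def ses_reputation_alarm_params(metric_name, threshold=0.05, dimensions=None):
    """Build keyword arguments for a CloudWatch alarm on an SES reputation metric."""
    if metric_name not in SES_REPUTATION_METRICS:
        raise ValueError(f"metric_name must be one of {SES_REPUTATION_METRICS}")
    short_name = metric_name.split(".")[1].lower()  # e.g. "bouncerate"
    params = {
        "AlarmName": f"ses-{short_name}",  # hypothetical naming scheme
        "Namespace": "AWS/SES",
        "MetricName": metric_name,
        "Period": 900,                 # seconds
        "EvaluationPeriods": 5,
        "DatapointsToAlarm": 1,        # assumption: "1 out of 5"
        "Statistic": "Maximum",
        "Threshold": threshold,        # a fraction, 0.05 means 5 %
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if dimensions:
        # Only set when your event destination configuration defines dimensions.
        params["Dimensions"] = dimensions
    return params


alarms = [ses_reputation_alarm_params(name) for name in SES_REPUTATION_METRICS]
```

Keeping the threshold as a parameter makes it easy to tighten either alarm separately when refining the thresholds later.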

That’s it. You are monitoring your Serverless Application closely and are ready to investigate potential issues.

Andreas Wittig

I’m the author of Amazon Web Services in Action. I work as a software engineer and independent consultant focused on AWS and DevOps.
