How To Monitor a Serverless Application

Compared to a typical web application deployed to EC2, a Serverless Application, consisting of an API Gateway and a Lambda function, needs less monitoring because you are outsourcing most of the operations to AWS. However, there are still some metrics you should keep an eye on. This article shows you how to monitor a Serverless Application with CloudWatch.

Typically a Serverless Application consists of an API Gateway forwarding incoming requests to Lambda. Lambda executes your business logic and makes use of S3 to store objects, DynamoDB to store and query data, and SES to send emails, for example. The following figure illustrates the architecture and shows which CloudWatch metrics need your attention.

[Figure: architecture of a Serverless Application and the CloudWatch metrics that need your attention]

You should create a CloudWatch dashboard showing all metrics and define CloudWatch alarms for all highlighted metrics.

Namespace        Metric Name                Description
AWS/ApiGateway   5XXError                   Number of requests with status code 5XX (server-side error).
AWS/ApiGateway   Latency                    Time between incoming request and response on API Gateway.
AWS/Lambda       Errors                     Number of failed function invocations (e.g. timeout, exception, …).
AWS/Lambda       Throttles                  Number of throttled function invocations.
AWS/DynamoDB     ReadThrottleEvents         Number of throttled read requests.
AWS/DynamoDB     WriteThrottleEvents        Number of throttled write requests.
AWS/DynamoDB     SystemErrors               Number of server-side errors.
AWS/SES          Reputation.BounceRate      Share of bounced messages as a fraction (multiply by 100 to get a percentage).
AWS/SES          Reputation.ComplaintRate   Share of messages reported as spam as a fraction (multiply by 100 to get a percentage).

Start with the following configuration for your CloudWatch alarms. Don’t forget to refine the thresholds after a few days. To get notified about server-side errors, which typically result in error messages for your users, create the following alarm:

  • Metric namespace: AWS/ApiGateway
  • Metric name: 5XXError
  • Metric dimension: ApiName and optional Stage
  • Metric period: 60 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: Sum
  • Alarm condition: > 1
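
The settings above can be sketched as parameters for CloudWatch's PutMetricAlarm API. This is a minimal sketch, not a definitive setup: the function names, the alarm name, and the `orders-api` example are hypothetical, "1 out of 5" is read here as 1 breaching datapoint within 5 evaluation periods, and treating missing data as not breaching is an added assumption.

```python
def api_5xx_alarm_params(api_name, stage=None, topic_arn=None):
    """Build keyword arguments for CloudWatch's PutMetricAlarm API."""
    dimensions = [{"Name": "ApiName", "Value": api_name}]
    if stage is not None:
        # Stage is optional, as noted in the list above.
        dimensions.append({"Name": "Stage", "Value": stage})
    params = {
        "AlarmName": f"{api_name}-5xx-errors",  # hypothetical naming scheme
        "Namespace": "AWS/ApiGateway",
        "MetricName": "5XXError",
        "Dimensions": dimensions,
        "Period": 60,                 # seconds
        "EvaluationPeriods": 5,       # look at the last 5 periods ...
        "DatapointsToAlarm": 1,       # ... and alarm if 1 of them breaches (assumed reading)
        "Statistic": "Sum",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 1.0,
        "TreatMissingData": "notBreaching",  # assumption: no traffic is not an error
    }
    if topic_arn:
        # Optional SNS topic that receives the alarm notification.
        params["AlarmActions"] = [topic_arn]
    return params


def create_alarm(params):
    """Send the parameters to CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3
    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(api_5xx_alarm_params("orders-api", stage="prod"))` would create the alarm; keeping the builder separate makes it easy to inspect the parameters before anything is sent to AWS.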

Additionally, you should be alerted when your users experience long waiting times caused by high latency in your Serverless Application. Create the following alarm:

  • Metric namespace: AWS/ApiGateway
  • Metric name: Latency
  • Metric dimension: ApiName and optional Stage
  • Metric period: 60 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: p90, p95, or p99 (depending on number of requests)
  • Alarm condition: > 500 ms
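
A latency alarm on a percentile differs from the error alarm in one detail: PutMetricAlarm takes percentiles through `ExtendedStatistic` instead of `Statistic`. The following is a rough sketch under the same assumptions as before ("1 out of 5" read as 1 breaching datapoint in 5 periods); the function name, alarm name, and the default of `p95` are illustrative choices, not part of the original configuration.

```python
def api_latency_alarm_params(api_name, stage=None, percentile="p95"):
    """Build PutMetricAlarm keyword arguments for a percentile latency alarm."""
    dimensions = [{"Name": "ApiName", "Value": api_name}]
    if stage is not None:
        dimensions.append({"Name": "Stage", "Value": stage})
    return {
        "AlarmName": f"{api_name}-latency-{percentile}",  # hypothetical naming scheme
        "Namespace": "AWS/ApiGateway",
        "MetricName": "Latency",
        "Dimensions": dimensions,
        "Period": 60,                      # seconds
        "EvaluationPeriods": 5,
        "DatapointsToAlarm": 1,            # assumed reading of "1 out of 5"
        "ExtendedStatistic": percentile,   # percentiles go here, not in "Statistic"
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 500.0,                # the Latency metric is reported in milliseconds
    }
```

The resulting dictionary can be passed to boto3 as `boto3.client("cloudwatch").put_metric_alarm(**params)`. With little traffic, a lower percentile such as `p90` tends to be less noisy than `p99`.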

You don’t need to create CloudWatch alarms for the metrics of Lambda, S3, and DynamoDB, as problems with any of these components result in 5XX errors or high latencies at the API Gateway. Instead of creating CloudWatch alarms for these metrics, put them on a CloudWatch dashboard to simplify investigating issues.
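
Such a dashboard can be described as a JSON document and uploaded with CloudWatch's PutDashboard API. The sketch below builds a dashboard body with two widgets for the Lambda and DynamoDB metrics from the table; the function name, widget titles, layout, default region, and the resource names in the usage note are assumptions made for illustration.

```python
import json


def dashboard_body(function_name, table_name, region="us-east-1"):
    """Return a CloudWatch dashboard body (JSON string) with two metric widgets."""

    def widget(title, metrics, x, y):
        # Each widget occupies half of the 24-column dashboard grid.
        return {
            "type": "metric",
            "x": x,
            "y": y,
            "width": 12,
            "height": 6,
            "properties": {
                "title": title,
                "metrics": metrics,
                "stat": "Sum",
                "period": 60,
                "region": region,
            },
        }

    widgets = [
        widget(
            "Lambda errors and throttles",
            [
                ["AWS/Lambda", "Errors", "FunctionName", function_name],
                ["AWS/Lambda", "Throttles", "FunctionName", function_name],
            ],
            0,
            0,
        ),
        widget(
            "DynamoDB throttled requests",
            [
                ["AWS/DynamoDB", "ReadThrottleEvents", "TableName", table_name],
                ["AWS/DynamoDB", "WriteThrottleEvents", "TableName", table_name],
            ],
            12,
            0,
        ),
    ]
    return json.dumps({"widgets": widgets})
```

With boto3, the body could be uploaded via `boto3.client("cloudwatch").put_dashboard(DashboardName="serverless-app", DashboardBody=dashboard_body("my-function", "my-table"))`, where all three names are placeholders.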

When sending emails with SES, you should create the following CloudWatch alarms to make sure you get notified when there is a problem with your reputation as a sender. Create the following alarm to get notified when the bounce rate is too high.

  • Metric namespace: AWS/SES
  • Metric name: Reputation.BounceRate
  • Metric dimension: depends on your event destination configuration
  • Metric period: 900 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: Maximum
  • Alarm condition: > 0.05

Next, add one more alarm to get notified about spam complaints as well.

  • Metric namespace: AWS/SES
  • Metric name: Reputation.ComplaintRate
  • Metric dimension: depends on your event destination configuration
  • Metric period: 900 seconds
  • Number of periods: 5 or 1 out of 5
  • Statistic: Maximum
  • Alarm condition: > 0.05
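
Both SES alarms share the same shape, so one helper can produce the parameters for either metric. This is a sketch only: the function name and alarm names are hypothetical, "1 out of 5" is again read as 1 breaching datapoint in 5 periods, and the dimensions are left empty by default because they depend on your event destination configuration.

```python
def ses_reputation_alarm_params(metric_name, threshold=0.05, dimensions=None):
    """Build PutMetricAlarm keyword arguments for an SES reputation metric."""
    return {
        "AlarmName": f"ses-{metric_name.lower()}",  # hypothetical naming scheme
        "Namespace": "AWS/SES",
        "MetricName": metric_name,
        # Dimensions depend on the event destination configuration;
        # an empty list is an assumption for illustration.
        "Dimensions": list(dimensions or []),
        "Period": 900,               # seconds
        "EvaluationPeriods": 5,
        "DatapointsToAlarm": 1,      # assumed reading of "1 out of 5"
        "Statistic": "Maximum",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold,      # a fraction, so 0.05 corresponds to 5 %
    }


# One parameter set per reputation metric described above.
ses_alarms = [
    ses_reputation_alarm_params("Reputation.BounceRate"),
    ses_reputation_alarm_params("Reputation.ComplaintRate"),
]
```

Each entry in `ses_alarms` can be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`. The threshold is a parameter so that it can be refined separately for each metric after a few days of data.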

That’s it. You are monitoring your Serverless Application closely and are ready to investigate potential issues.
