Monitor VPC NAT gateways with CloudWatch metrics and alarms
Michael Wittig – 15 Aug 2022
Many VPC designs make use of public and private subnets. Resources in a private subnet need a NAT gateway to communicate with the Internet.
A VPC NAT gateway is a finite resource that can be exhausted. That’s why you need monitoring that alerts you when the NAT gateway becomes a bottleneck.
CloudWatch metrics
Each NAT gateway sends metrics to CloudWatch that we can monitor with CloudWatch alarms. We recommend creating alarms for the following metrics:
- ErrorPortAllocation: The number of times the NAT gateway could not allocate a source port.
- PacketsDropCount: The number of packets dropped by the NAT gateway.
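The two alarms recommended above could be defined along the following lines. This is a minimal sketch, not a definitive setup: the helper name nat_gateway_alarm, the NAT gateway ID, the threshold of 0, and the single 60-second evaluation period are assumptions chosen for illustration. The resulting dictionaries are shaped as keyword arguments for boto3's put_metric_alarm; the actual API call is left commented out because it needs boto3 and AWS credentials.

```python
def nat_gateway_alarm(nat_gateway_id, metric_name, threshold=0, topic_arn=None):
    """Build keyword arguments for CloudWatch's PutMetricAlarm API.

    The threshold and evaluation settings are illustrative assumptions.
    """
    alarm = {
        "AlarmName": f"{nat_gateway_id}-{metric_name}",
        "Namespace": "AWS/NATGateway",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # A NAT gateway without traffic reports no data; do not alarm on that.
        "TreatMissingData": "notBreaching",
    }
    if topic_arn:
        # Optional SNS topic that receives the alarm notification.
        alarm["AlarmActions"] = [topic_arn]
    return alarm


# Hypothetical NAT gateway ID, used only for this example.
alarms = [
    nat_gateway_alarm("nat-0123456789abcdef0", name)
    for name in ("ErrorPortAllocation", "PacketsDropCount")
]

# To create the alarms (requires boto3 and AWS credentials):
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# for alarm in alarms:
#     cloudwatch.put_metric_alarm(**alarm)
```

Alarming on any value above zero is a strict choice; a higher threshold or more evaluation periods may suit workloads that tolerate occasional drops.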
Monitoring throughput utilization
Unfortunately, NAT gateways do not report a single metric for throughput utilization, neither for bandwidth nor for packets. The limits are 100 Gbit/s of bandwidth and 10,000,000 packets/s. Luckily, we can calculate the utilization ourselves by using CloudWatch metric math.
To calculate the bandwidth utilization, we use the following metrics:
ID | metric | statistic | period (seconds) |
---|---|---|---|
in1 | BytesInFromDestination | Sum | 60 |
in2 | BytesInFromSource | Sum | 60 |
out1 | BytesOutToDestination | Sum | 60 |
out2 | BytesOutToSource | Sum | 60 |
And the following expressions:
ID | expression | comment |
---|---|---|
bandwidth | (in1+in2+out1+out2)/60*8/1000/1000/1000 | Bytes/min to Gbit/s |
utilization | bandwidth/100*100 | to %; 100 Gbit/s is the hard limit |
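To sanity-check the two expressions, the same arithmetic can be reproduced locally, and the metrics and expressions can be written in the query format that CloudWatch accepts for metric math. The sketch below is illustrative only: the function names, the NAT gateway ID, and the sample byte counts are assumptions, and the query list is merely shaped for the Metrics parameter of boto3's put_metric_alarm or the MetricDataQueries parameter of get_metric_data.

```python
NAMESPACE = "AWS/NATGateway"
BYTE_METRICS = {
    "in1": "BytesInFromDestination",
    "in2": "BytesInFromSource",
    "out1": "BytesOutToDestination",
    "out2": "BytesOutToSource",
}


def bandwidth_gbit_per_s(in1, in2, out1, out2, period=60):
    """Convert byte sums over one period (seconds) to Gbit/s."""
    return (in1 + in2 + out1 + out2) / period * 8 / 1000 / 1000 / 1000


def utilization_percent(bandwidth, limit_gbit_per_s=100):
    """Express the bandwidth as a percentage of the 100 Gbit/s limit."""
    return bandwidth / limit_gbit_per_s * 100


def metric_math_queries(nat_gateway_id):
    """Build the four metric queries plus the two expressions from the tables."""
    queries = [
        {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": NAMESPACE,
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
                },
                "Period": 60,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }
        for query_id, metric_name in BYTE_METRICS.items()
    ]
    queries.append({
        "Id": "bandwidth",
        "Expression": "(in1+in2+out1+out2)/60*8/1000/1000/1000",
        "ReturnData": False,
    })
    # Only the final expression returns data, as an alarm watches one series.
    queries.append({
        "Id": "utilization",
        "Expression": "bandwidth/100*100",
        "ReturnData": True,
    })
    return queries


# Made-up sample: 75 GB moved through the gateway within one minute.
sample_bandwidth = bandwidth_gbit_per_s(20_000_000_000, 25_000_000_000,
                                        25_000_000_000, 5_000_000_000)
sample_utilization = utilization_percent(sample_bandwidth)
queries = metric_math_queries("nat-0123456789abcdef0")  # hypothetical ID
```

With the sample numbers, 75 GB per minute works out to about 10 Gbit/s, or roughly 10% of the limit. An alarm on the utilization expression would then only need a threshold such as 80, which is again a judgment call rather than a fixed recommendation.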
Set up instructions
Monitoring Assistant
Does CloudWatch metric math sound complicated? We have you covered! Monitor NAT gateways and receive alerts in Slack or Microsoft Teams! It couldn't be easier!
- Add marbot to Slack or Microsoft Teams.
- Invite marbot to a channel.
- Follow the setup wizard.