RDS Performance Insights: monitor and debug database performance

Michael Wittig – 07 Sep 2020

When users complain about a web or enterprise application’s slow response times, the root cause is often high load on the database. Usually, a high load is caused by an increased number of queries or many costly database queries.

Amazon RDS is one of the most valuable services for classic three-tier architectures. The fully managed service covers most of the tricky parts of operating a MySQL/PostgreSQL/MS SQL/Oracle database - replication to a 2nd data center, snapshots, and patch management, for example.

However, RDS is a black box. That’s fine from the one hand but makes it challenging to debug performance problems on the other hand. The optional feature called Performance Insights tackles that problem.

RDS Performance Insights: Overview

What is RDS Performance Insights?

RDS Performance Insights is an optional feature that you can enable for most types of RDS databases.¹ The goals of the tool are:

Debug performance issues, for example, by identifying queries that are causing high load.
Tuning performance by identifying bottlenecks like CPU or I/O throughput.
Monitor the performance by aggregating the core performance indicators of your database.

After enabling RDS Performance Insights, data is recorded in the background. There are three ways to access performance data:

AWS Management Console
REST API³
CloudWatch metrics²

The following screenshot shows RDS Performance Insights in the AWS Management Console.

RDS Performance Insights: Overview

Pricing

The basic version of RDS Performance Insights is free. The free tier includes seven days of performance data history and one million API requests per month. Therefore, you should enable RDS Performance Insights by default. Keep in mind that not all database engines support RDS Performance Insights for the small instance types.¹

Enabling long term data retention (up to two years) is quite expensive. Monthly costs vary from $5 to $497, depending on the instance type.⁴

Counter Metrics

RDS Performance Insights records counter metrics helping you to understand what is going on in your database. The tracked metrics differ between the different database engine types. A few examples for a MySQL database:

Metric	Description
os.cpuUtilization.total	The total percentage of the CPU in use. This value includes the nice value.
os.diskIO.await	The number of milliseconds required to respond to requests, including queue time and service time.
db.SQL.Slow_queries	The number of queries that have taken more than `long_query_time` seconds.
db.SQL.Innodb_rows_inserted	The number of rows inserted into InnoDB tables.
db.SQL.Innodb_rows_read	The number of rows read from InnoDB tables.
db.SQL.Innodb_rows_updated	The number of rows updated in InnoDB tables.
db.SQL.Innodb_rows_deleted	The number of rows deleted from InnoDB tables.

Open RDS Performance Insights in the AWS Management Console and use the button Manage metrics to select the counter metrics you are interested in.

RDS Performance Insights: Counter Metrics

Check out Customizing the Performance Insights Dashboard to learn about all the available counter metrics. The counter metrics are accessible via the AWS Management Console and the REST API.

Analyze Queries

Besides that, RDS Performance Insights shows the metrics with the highest impact on the database load. This overview allows you to indicate expensive queries. Modifying the data schema, adding indexes, or reworking the queries could solve your performance issues.

RDS Performance Insights: Queries

Analyze Bottlenecks

On top of that, RDS Performance Insights enables you to identify resource bottlenecks. For example, a database waiting for I/O a lot would benefit from increasing the maximum I/O throughput by switching from gp2to io1 storage.

RDS Performance Insights: Wait

Monitoring

RDS Performance Insights publishes data to CloudWatch as well. Those metrics can be used to monitor the load of your database with the help of CloudWatch alarms.

DBLoad the number of average active sessions.
DBLoadCPU the number of average active sessions waiting for the CPU.
DBLoadNonCPU the number of average active sessions waiting for something else than the CPU.

Whenever the number of average active sessions exceeds the number of vCPUs of your database instance, the connections compete for resources, which increases query latencies.

Therefore, I do recommend to create a CloudWatch alarm to monitor the DBLoad metric.

Enable RDS Performance Insights for your database.
Add marbot to Slack or Microsoft Teams.
Invite marbot to a channel.
Send @marbot monitor rds to the channel.
marbot will help you to configure CloudWatch alarms with the help of CloudFormation.
When creating the CloudFormation stack …
1. Set the DBInstanceIdentifier parameter to the identifier of your RDS instance.
2. Make sure to set the DBLoadThreshold to the number of vCPUs of your database instance.
3. Review the threshold parameters.
4. Save by clicking the Create button.
marbot will notify you about issues with your RDS database.

Summary

If you haven’t done so already, you should enable RDS Performance Insights for your RDS database instances. The free version with a retention period of 7 days is sufficient for most scenarios.

Michael Wittig

Consultant focusing on Amazon Web Services (AWS). Entrepreneur building marbot.io. Author of Amazon Web Services in Action, Rapid Docker on AWS, and cloudonaut.io.

You can contact me via Email, Twitter, and LinkedIn.

Published on 07 Sep 2020

Chatbot for AWS Monitoring

Configure monitoring for Amazon Web Services: CloudWatch, EC2, RDS, EB, Lambda, and more. Receive and manage alerts via Slack. Solve incidents as a team.

Add to Slack

Add to Teams