Monitor Elasticsearch with CloudWatch metrics, alarms and EventBridge
Michael Wittig – 30 Jan 2018 (updated 17 Aug 2021)
The search layer is an important component of many systems, and it needs monitoring. Amazon Elasticsearch Service provides Elasticsearch as a service. Monitoring Elasticsearch is more involved than monitoring most other AWS services, but we have you covered and provide all the details!
Each domain sends metrics to CloudWatch that we can monitor with CloudWatch alarms. We recommend creating alarms for the following metrics:
ClusterStatus.yellow: A value of 1 indicates that the primary shards for all indices are allocated to nodes in the cluster, but replica shards for at least one index are not.
ClusterStatus.red: A value of 1 indicates that the primary and replica shards for at least one index are not allocated to nodes in the cluster.
CPUUtilization: The percentage of CPU usage for data nodes in the cluster.
CPUCreditBalance: The remaining CPU credits available for data nodes in the cluster (only for the t* instance types).
MasterCPUUtilization: The maximum percentage of CPU resources used by the dedicated master nodes.
MasterCPUCreditBalance: The remaining CPU credits available for dedicated master nodes in the cluster (only for the t* instance types).
FreeStorageSpace: The free space for data nodes in the cluster.
ClusterIndexWritesBlocked: Indicates whether your cluster is accepting or blocking incoming write requests. A value of 1 means that it is blocking requests.
JVMMemoryPressure: The maximum percentage of the Java heap used for all data nodes in the cluster.
MasterJVMMemoryPressure: The maximum percentage of the Java heap used for all dedicated master nodes in the cluster.
MasterReachableFromNode: A health check for MasterNotDiscovered exceptions. A value of 0 indicates that /_cluster/health/ is failing.
AutomatedSnapshotFailure: The number of failed automated snapshots for the cluster.
KibanaHealthyNodes: A health check for Kibana.
KMSKeyError: A value of 1 indicates that the AWS KMS key used to encrypt data at rest has been disabled.
KMSKeyInaccessible: A value of 1 indicates that the AWS KMS key used to encrypt data at rest has been deleted or has had its grants to Amazon ES revoked.
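To make the list above more concrete, here is a minimal sketch of how alarm definitions for some of these metrics could be generated and passed to CloudWatch's PutMetricAlarm API via boto3. The thresholds, statistics, period, and the helper names `ALARM_SPECS`, `build_alarms`, and `create_alarms` are illustrative assumptions, not recommendations from this article; tune them for your own domain. The sketch assumes the metrics live in the `AWS/ES` namespace with the `DomainName` and `ClientId` (your AWS account ID) dimensions.

```python
from typing import Any, Dict, List, Tuple

# Illustrative starting points (assumptions, not official recommendations).
# Each entry: (metric name, statistic, comparison operator, threshold)
ALARM_SPECS: List[Tuple[str, str, str, float]] = [
    ("ClusterStatus.red", "Maximum", "GreaterThanOrEqualToThreshold", 1),
    ("ClusterStatus.yellow", "Maximum", "GreaterThanOrEqualToThreshold", 1),
    ("ClusterIndexWritesBlocked", "Maximum", "GreaterThanOrEqualToThreshold", 1),
    ("AutomatedSnapshotFailure", "Maximum", "GreaterThanOrEqualToThreshold", 1),
    ("KMSKeyError", "Maximum", "GreaterThanOrEqualToThreshold", 1),
    ("KMSKeyInaccessible", "Maximum", "GreaterThanOrEqualToThreshold", 1),
    ("CPUUtilization", "Maximum", "GreaterThanOrEqualToThreshold", 80),
    ("JVMMemoryPressure", "Maximum", "GreaterThanOrEqualToThreshold", 80),
    ("MasterReachableFromNode", "Minimum", "LessThanThreshold", 1),
    ("KibanaHealthyNodes", "Minimum", "LessThanThreshold", 1),
]


def build_alarms(domain_name: str, account_id: str, topic_arn: str) -> List[Dict[str, Any]]:
    """Build one PutMetricAlarm parameter dict per entry in ALARM_SPECS."""
    alarms = []
    for metric, statistic, operator, threshold in ALARM_SPECS:
        alarms.append({
            "AlarmName": f"{domain_name}-{metric}",
            "Namespace": "AWS/ES",
            "MetricName": metric,
            "Dimensions": [
                {"Name": "DomainName", "Value": domain_name},
                {"Name": "ClientId", "Value": account_id},
            ],
            "Statistic": statistic,
            "Period": 60,            # seconds; assumed evaluation window
            "EvaluationPeriods": 1,  # assumed; raise it to reduce noise
            "Threshold": threshold,
            "ComparisonOperator": operator,
            "AlarmActions": [topic_arn],  # e.g. an SNS topic that notifies you
        })
    return alarms


def create_alarms(alarms: List[Dict[str, Any]]) -> None:
    """Send the definitions to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so build_alarms works without it

    cloudwatch = boto3.client("cloudwatch")
    for alarm in alarms:
        cloudwatch.put_metric_alarm(**alarm)
```

Calling `create_alarms(build_alarms("my-domain", "123456789012", "arn:aws:sns:us-east-1:123456789012:alerts"))` with your own values would create the alarms. Metrics such as FreeStorageSpace and the CPU credit balances are left out here because sensible thresholds depend on your instance type and storage size.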
You can subscribe to the Amazon ES Service Software Update Notification event type by using EventBridge to receive notifications about upcoming, running, failed, or completed updates to your ES domains.
The Amazon ES Auto-Tune Notification event type informs you about activity by Auto-Tune.
Learn more about Monitoring Elasticsearch events with EventBridge.
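As a rough sketch, an EventBridge rule that matches the two event types named above could be set up as follows. The event pattern assumes the events are published with the source `aws.es`; the function names `build_event_pattern` and `create_rule`, the rule name, and the target ID are hypothetical and only serve as illustration.

```python
import json
from typing import Any, Dict


def build_event_pattern() -> Dict[str, Any]:
    """Event pattern matching software update and Auto-Tune notifications."""
    return {
        "source": ["aws.es"],  # assumed event source for Amazon ES
        "detail-type": [
            "Amazon ES Service Software Update Notification",
            "Amazon ES Auto-Tune Notification",
        ],
    }


def create_rule(rule_name: str, topic_arn: str) -> None:
    """Create the rule and route matches to a target (needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the pattern can be built without it

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_event_pattern()),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "notify", "Arn": topic_arn}],  # hypothetical target ID
    )
```

If the target is an SNS topic, its access policy also has to allow EventBridge to publish to it, otherwise matched events will not be delivered.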