How to fix delayed alarms?

Andreas Wittig – 24 Sep 2020

Some of our integrations depend on Amazon SNS. For example, our jump start templates configure SNS topics and subscriptions within your AWS account. Unfortunately, there is an issue with SNS: messages to HTTPS endpoints are delayed by more than 30 minutes under certain circumstances. Therefore, a CloudWatch alarm from your AWS infrastructure could appear on Slack or Microsoft Teams with a significant delay.

We reported the problem to AWS on September 1st. AWS has not fixed it in the 24 days since and, even worse, has not shared an ETA for resolving the issue. Therefore, we have decided to roll out a workaround.

Want to learn more about the issue with SNS? I’ve written a blog post about our experience with delayed SNS messages.

Jump Starts

If you are using one of our monitoring jump starts, you will soon receive a message from @marbot with instructions to update your CloudFormation stacks or Terraform modules. If you don’t want to wait for the update message:

Other SNS topics and subscriptions

Did you create SNS topics and subscriptions for use with marbot, either manually or through your own infrastructure-as-code templates? If so, please make sure to remove the optional throttling configuration from those SNS topics and subscriptions.

The following screenshot illustrates how to remove the throttling configuration.

Workaround: modify the delivery retry policy

The following snippet shows the old delivery retry policy.

{
  "healthyRetryPolicy": {
    "minDelayTarget": 1,
    "maxDelayTarget": 60,
    "numRetries": 100,
    "numMaxDelayRetries": null,
    "numNoDelayRetries": 0,
    "numMinDelayRetries": null,
    "backoffFunction": "exponential"
  },
  "sicklyRetryPolicy": null,
  "throttlePolicy": {
    "maxReceivesPerSecond": 1
  },
  "guaranteed": false
}

Remove the throttlePolicy attribute from the policy. The result is a JSON object like the following.

{
  "healthyRetryPolicy": {
    "minDelayTarget": 1,
    "maxDelayTarget": 60,
    "numRetries": 100,
    "numMaxDelayRetries": null,
    "numNoDelayRetries": 0,
    "numMinDelayRetries": null,
    "backoffFunction": "exponential"
  },
  "sicklyRetryPolicy": null,
  "guaranteed": false
}
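
If you keep delivery policies as JSON strings, the same edit can be scripted. The following Python sketch is only an illustration: the function name remove_throttle_policy is hypothetical, and it assumes the policy has the shape shown above, with throttlePolicy as a top-level key.

```python
import json


def remove_throttle_policy(policy_json: str) -> str:
    """Return the delivery policy without its top-level throttlePolicy key."""
    policy = json.loads(policy_json)
    # pop() with a default also works for policies that were never throttled.
    policy.pop("throttlePolicy", None)
    return json.dumps(policy, indent=2)


# The old delivery retry policy from this post.
old_policy = """
{
  "healthyRetryPolicy": {
    "minDelayTarget": 1,
    "maxDelayTarget": 60,
    "numRetries": 100,
    "numMaxDelayRetries": null,
    "numNoDelayRetries": 0,
    "numMinDelayRetries": null,
    "backoffFunction": "exponential"
  },
  "sicklyRetryPolicy": null,
  "throttlePolicy": {
    "maxReceivesPerSecond": 1
  },
  "guaranteed": false
}
"""

new_policy = remove_throttle_policy(old_policy)
print(new_policy)
```

All other attributes, including the null values, are passed through unchanged.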

The approach is the same if you provision your infrastructure with CloudFormation, Terraform, or any other tool: if you specified a delivery retry policy for the SNS topic or subscription, make sure to remove the throttlePolicy attribute from it.
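
For subscriptions that were created by hand, the change could also be applied through the SNS API. The following Python sketch is not an official marbot tool, and the function name drop_subscription_throttle is hypothetical. It assumes a client object that offers get_subscription_attributes and set_subscription_attributes, as the boto3 SNS client does, and a subscription-level DeliveryPolicy with the shape shown in this post. Delivery policies set on topics are not covered here.

```python
import json


def drop_subscription_throttle(sns_client, subscription_arn: str) -> bool:
    """Remove throttlePolicy from one subscription's delivery policy.

    Returns True if the policy was updated, False if there was nothing to do.
    """
    response = sns_client.get_subscription_attributes(
        SubscriptionArn=subscription_arn
    )
    raw_policy = response["Attributes"].get("DeliveryPolicy")
    if not raw_policy:
        # No custom delivery policy is set on this subscription.
        return False

    policy = json.loads(raw_policy)
    if "throttlePolicy" not in policy:
        return False

    del policy["throttlePolicy"]
    # Write back the policy with every other attribute left untouched.
    sns_client.set_subscription_attributes(
        SubscriptionArn=subscription_arn,
        AttributeName="DeliveryPolicy",
        AttributeValue=json.dumps(policy),
    )
    return True
```

With boto3 installed and AWS credentials configured, you could call it as drop_subscription_throttle(boto3.client("sns"), subscription_arn), where subscription_arn is the ARN of your own subscription. If the subscription is managed by CloudFormation or Terraform, change the template instead, so that the next deployment does not restore the throttling configuration.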

Questions

We are sorry about the trouble caused by this issue. Do you have any questions? Please send a message to hello@marbot.io. We are happy to help!

Andreas Wittig

Consultant focusing on Amazon Web Services (AWS). Entrepreneur building marbot.io. Author of Amazon Web Services in Action, Rapid Docker on AWS, and cloudonaut.io.

You can contact me via Email, Twitter, and LinkedIn.
