How to fix delayed alarms?
Andreas Wittig – 24 Sep 2020
Some of our integrations depend on Amazon SNS. For example, our CloudFormation templates configure SNS topics and subscriptions within your AWS account. Unfortunately, there is an issue with SNS: messages to HTTPS endpoints are delayed by more than 30 minutes under certain circumstances. Therefore, a CloudWatch alarm from your AWS infrastructure could appear on Slack or Microsoft Teams with a significant delay.
We informed AWS about the problem on September 1st. Unfortunately, AWS has not fixed it in the 24 days since. Even worse, AWS has not shared an ETA for resolving the issue. Therefore, we have decided to roll out a workaround.
Want to learn more about the issue with SNS? I’ve written a blog post about our experience with delayed SNS messages.
In case you are using our CloudFormation templates or Terraform modules, you will soon receive a message from @marbot with instructions to update your CloudFormation stacks or Terraform modules. If you don’t want to wait for the update message:
Did you create SNS topics and subscriptions for use with marbot manually or with your own infrastructure as code templates? Please make sure to remove the optional throttling configuration from your SNS topics and subscriptions.
The following screenshot illustrates how to remove the throttling configuration.
In case you are using CloudFormation, Terraform, or any other tool to provision your infrastructure, the approach is the same. If you specified a delivery retry policy for the SNS topic or subscription, make sure to remove the throttlePolicy attribute from the policy and leave the rest of the JSON object unchanged.
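To illustrate the change, here is a minimal Python sketch that removes the throttling configuration from a delivery retry policy before you paste it back into your template. The policy values are made up for illustration and are not the policy from our templates. The helper name strip_throttle_policy is hypothetical, and treating defaultThrottlePolicy as the topic-level counterpart of throttlePolicy is an assumption you should check against your own policy.

```python
import json

# "throttlePolicy" is the attribute named in this post. "defaultThrottlePolicy"
# is assumed here to be its topic-level counterpart; adjust if your policy differs.
THROTTLE_KEYS = frozenset({"throttlePolicy", "defaultThrottlePolicy"})


def strip_throttle_policy(policy, keys=THROTTLE_KEYS):
    """Return a copy of a delivery policy with all throttle attributes removed."""
    if isinstance(policy, dict):
        return {
            name: strip_throttle_policy(value, keys)
            for name, value in policy.items()
            if name not in keys
        }
    if isinstance(policy, list):
        return [strip_throttle_policy(item, keys) for item in policy]
    return policy


# Illustrative subscription-level policy (values are assumptions, not defaults).
old_policy = {
    "healthyRetryPolicy": {
        "minDelayTarget": 1,
        "maxDelayTarget": 60,
        "numRetries": 100,
        "numNoDelayRetries": 0,
        "backoffFunction": "exponential",
    },
    "throttlePolicy": {"maxReceivesPerSecond": 1},
}

new_policy = strip_throttle_policy(old_policy)
print(json.dumps(new_policy, indent=2))
```

The function returns a new object instead of modifying the input, so you can compare the old and new policy before you deploy the change.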
We are sorry about the trouble caused by this issue. Do you have any questions? Please send a message to email@example.com. We are happy to help!