Monday 7 December 2020

Azure Service Health Alerts

One of the first things to do once you have created an Azure subscription is to set up a service health alert. These alerts tell you if Microsoft are doing any maintenance or having trouble with a particular service. They are not designed to tell you specifically whether your resources will be impacted; they give you an overall health status for services so that you are aware of issues on the platform.

Setting up ...

First, log in to your subscription via https://portal.azure.com. Once you have logged in, use the search bar to locate "Service Health".

You will be taken to the Service Health blade, which shows you any current issues within Azure. From here you can drill down to the data centres on the world map or see what issues have been triggered in the last seven days. To add a service health alert, click on "Add service health alert".

You will be taken to the "create alert rule" page. 

The rule can only target one subscription, so under "subscription" select the subscription you would like to create this rule in.
Next, select which "Services" you would like to be alerted on. I decided that I only want to get alerts on virtual machines, as that's all I have in the subscription.
As my virtual machines are only in North Europe, I have selected just "North Europe".
For the service health criteria you have four possible event types, and we are interested in two of them: "Service issue" and "Planned maintenance".
The next section defines what action you would like Azure to take when this alert rule is triggered. I am not going to go into how to create an action group or what options you have. This article will guide you through it (https://docs.microsoft.com/en-gb/azure/azure-monitor/platform/action-groups?WT.mc_id=Portal-Microsoft_Azure_Monitoring)

Click on "Select Action Group".

Select the action group to associate with this alert rule and click "select".
Give your rule a name and any description that you want to add. Select which resource group you wish to save this rule to. Ensure the "Enable alert rule upon creation" box is checked.

Once you have double-checked that all the settings are how you want them to be, click on "Create alert rule".
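
If you would rather keep the rule in code than click through the portal, the choices above map roughly onto an activity log alert resource. The sketch below builds such a definition as a Python dictionary and prints it as JSON. The field names follow my understanding of the Microsoft.Insights/activityLogAlerts template schema and should be checked against the current documentation before use. The function name, the rule name, the subscription ID and the action group ID are all placeholders made up for illustration.

```python
import json

# Portal event type labels mapped to the values I believe the activity log
# uses for them (assumption: verify against the current Azure documentation).
EVENT_TYPES = {
    "Service issue": "Incident",
    "Planned maintenance": "Maintenance",
}


def build_service_health_alert(name, subscription_id, services, regions,
                               event_types, action_group_id, description=""):
    """Build a service health alert definition (hypothetical helper)."""
    conditions = [
        # Only look at service health events in the activity log.
        {"field": "category", "equals": "ServiceHealth"},
        # Match any one of the selected event types.
        {"anyOf": [
            {"field": "properties.incidentType", "equals": EVENT_TYPES[label]}
            for label in event_types
        ]},
        # Restrict to the services and regions chosen in the portal steps.
        {"field": "properties.impactedServices[*].ServiceName",
         "containsAny": list(services)},
        {"field": "properties.impactedServices[*].ImpactedRegions[*].RegionName",
         "containsAny": list(regions)},
    ]
    return {
        "type": "Microsoft.Insights/activityLogAlerts",
        "name": name,
        "location": "Global",
        "properties": {
            "description": description,
            "enabled": True,  # same as ticking the enable box in the portal
            "scopes": [f"/subscriptions/{subscription_id}"],  # one subscription per rule
            "condition": {"allOf": conditions},
            "actions": {"actionGroups": [{"actionGroupId": action_group_id}]},
        },
    }


rule = build_service_health_alert(
    name="vm-north-europe-health",  # placeholder rule name
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    services=["Virtual Machines"],
    regions=["North Europe"],
    event_types=["Service issue", "Planned maintenance"],
    action_group_id="<resource ID of your action group>",  # placeholder
)
print(json.dumps(rule, indent=2))
```

Keeping the definition in a file like this makes it easier to review which services and regions a rule covers, which is the part most likely to go stale as you deploy new resources.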

Once the rule is created, select "Health alerts" on the Service Health blade and drill down to your subscription, where you should see the alert we have just created. You will see the "details" of the alert and the settings you have defined, and if you click on "history" you will also see when this alert was last fired.

Below is a sample alert that you may receive. It notifies us that there is a problem with Log Analytics and Application Insights in North Europe. Microsoft has assigned a tracking ID, which you can search for within the portal to get further updates on the issue.



As an example of how I have used these alerts, my team and the BCDR team receive all the service outage alerts so that we can see what the trend could be. It could be that we are seeing a particular service failing globally, slowly affecting one location/region at a time. We can then anticipate the impact and start to see what we can do to prevent our services from being affected, for example by starting failovers, contacting our users to let them know we could be impacted, or switching to manual procedures. We also set up a separate alert which monitors outages and maintenance in the regions where we have deployed our resources. When we receive alerts from this rule, we notify the support teams whose resources are deployed in those specific locations to let them know that they could be affected.
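
The second rule's notification step could be sketched as a simple lookup from impacted regions to the teams that have resources there. This is only an illustration of the idea, not how we actually implemented it: the team names, the region mapping and the function name are all hypothetical.

```python
# Hypothetical mapping of Azure regions to the support teams with resources there.
TEAMS_BY_REGION = {
    "North Europe": ["vm-support"],
    "West Europe": ["web-support", "data-support"],
}


def teams_to_notify(impacted_regions, teams_by_region=TEAMS_BY_REGION):
    """Return the sorted, de-duplicated teams with resources in any impacted region."""
    teams = set()
    for region in impacted_regions:
        # Regions where we have nothing deployed contribute no teams.
        teams.update(teams_by_region.get(region, []))
    return sorted(teams)


print(teams_to_notify(["North Europe", "East US"]))  # prints ['vm-support']
```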

As you can see, setting this up is a useful way to receive alerts about any outages or planned maintenance by Microsoft which could affect the services you have deployed. Just remember to select the services and regions where you have deployed your resources, otherwise you may miss an alert.
