Sunday 28 February 2021

Align your managed disks to Microsoft's standard tiering to optimise your spending

“Managed” disk has been around in Azure for a few years now and is the standard for deploying VMs. I started deploying VMs when “unmanaged” disk was the standard and you had to plan carefully how many disks and how much IOPS you needed, then strategically place those disks in the right storage accounts to make sure they could deliver that performance. We were using spreadsheets to track which disk was in which storage account and to make sure we weren’t hitting the maximum IOPS limits that each storage account could deliver. Over time it just got very, very messy and complex to manage.

With the introduction of “managed” disk all those pains went away, and Azure dealt with them in the backend. We could change from SSD to HDD and vice versa easily, snapshots were just a few clicks, and we no longer needed to search storage accounts for orphaned disks that we were paying for because we had forgotten to decommission them along with the VM. With managed disk we just had to work out how much disk space we needed and the amount of IOPS required, and then select the disk that matched those requirements as closely as possible.

Previously, for standard HDD, Azure charged you based on the amount of data you had written to the disk and not the size you had assigned to it. For example, if you had a 100GiB disk with only 40GiB of data written, you would be charged for just 40GiB. If your disk was SSD, you were charged on the size you had assigned, so with the example above you would be charged for 100GiB even though you had written just 40GiB of data. If you were using standard HDD this pricing method worked well, as you could provision the disk size to be quite big and not worry about the cost, just like you would on-premise on systems like VMware. With SSD you had to be careful, as the charge depended on the provisioned size and not on the space consumed.

Microsoft has been using standard tiered pricing across all the types of disk they offer for a while now. Below is a sample of the fixed disk sizes and the tier name for each type of managed disk.

From the table you can see that the disk sizes are not the usual sizes we might assign to our VMs (10GiB, 50GiB, 100GiB, etc.), but they do seem to align to the physical disk sizes you would normally buy for SSDs, especially from 128GiB upwards, which may be a coincidence. With this standard tiering you might assume that you can no longer create a disk with your own custom size, but you are still allowed to. I am assuming Azure permits this because some systems have to have specific disk sizes.

Microsoft highlights that “Azure Managed Disks are priced to the closest tier that accommodates the specific disk size and are billed on an hourly basis.” As an example, if your SSD disk was sized at 20GiB with only 12GiB of data written, you would be charged at the 32GiB tier, because that is the closest tier that accommodates 20GiB.
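To make that rounding concrete, here is a minimal Python sketch that rounds a requested size up to the premium SSD (P-series) tier it would be billed at. The tier ladder below reflects the published P1-P50 sizes, but treat it as indicative and check the current Azure documentation before relying on it.

```python
# Round a requested managed disk size up to the premium SSD tier it is
# billed at. Tier sizes are indicative of the P1-P50 ladder; confirm
# against the current Azure documentation.
PREMIUM_TIERS = [
    ("P1", 4), ("P2", 8), ("P3", 16), ("P4", 32), ("P6", 64),
    ("P10", 128), ("P15", 256), ("P20", 512), ("P30", 1024),
    ("P40", 2048), ("P50", 4096),
]

def billing_tier(requested_gib):
    """Return the smallest tier that accommodates the requested size."""
    for name, size in PREMIUM_TIERS:
        if requested_gib <= size:
            return name, size
    raise ValueError(f"{requested_gib} GiB is larger than the tiers listed here")

print(billing_tier(20))  # ('P4', 32) -> a 20GiB disk is billed as 32GiB
print(billing_tier(16))  # ('P3', 16) -> resizing to 16GiB drops a tier
```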

I have included the cost of the 16GiB tier because my example only had 12GiB of data written, and I wanted to highlight the cost difference if the disk were resized to 16GiB instead. You can see that you would be paying around 54% more per month if you stuck with the 20GiB disk size. Imagine we had 100 of these disks: what would the cost be, and how much would we save per year?

From the table you can see that we could save around £2,210 per year if we moved to the 16GiB tier, which is quite a substantial amount.
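The arithmetic behind that figure is simple enough to script. The sketch below uses placeholder monthly prices (not Azure's actual rates, which the table above is based on) just to show the shape of the calculation for a fleet of disks.

```python
# Saving from moving a fleet of disks down one tier (e.g. P4 -> P3).
# The monthly prices are placeholders for illustration only; plug in
# the per-tier prices from the Azure pricing page for real numbers.
DISK_COUNT = 100
MONTHLY_PRICE_GBP = {"P3": 1.20, "P4": 1.85}  # hypothetical values

monthly_saving = (MONTHLY_PRICE_GBP["P4"] - MONTHLY_PRICE_GBP["P3"]) * DISK_COUNT
print(f"Monthly saving: £{monthly_saving:,.2f}")
print(f"Yearly saving:  £{monthly_saving * 12:,.2f}")
```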

So why did I bring this topic up?

I started my journey into Azure on a project with a “lift and shift” phase, where we were just migrating our existing workload into Azure. As it was the first phase, you would want the performance of your on-premise VMs to be matched in Azure: CPU, memory, disk size, disk IOPS. During this phase, if you couldn’t find the right size match then you would most likely size up. As I stated before, managed disk had not been introduced when I first started this journey, so we sized disks exactly as they were on-premise. We didn’t want to do any optimisation at this stage, as we wanted users and key business stakeholders to build up their confidence in the cloud.

As most of our workload was migrated from on-premise, we had always “over-provisioned” disk sizes, knowing that our storage systems (SAN/NAS) would do their magic, like thin provisioning, deduplication and compression, to ensure that we got value from the SAN/NAS and could store more data. We never really needed to think about performance, as overall the system would balance out over time with the type of workload we were running. As we had already purchased the system, the more of the storage we utilised the better the return on investment became, and as the cost was shared across all the applications/systems it got cheaper over time.

We started to work through our disk sizes with the service/application owners to see whether the migrated disks could be aligned to the tiers Microsoft offers, to try and save some money, or in other words optimise our spend and add value for money. You can see from the example above how a disk sized at 20GiB but using just 12GiB of space could be resized to P3 (16GiB). Obviously, there are some key factors to take into consideration before just steamrolling ahead with the disk work:
  • There is no way to resize a disk to be smaller without using a third-party tool, so you would need a longer downtime window to perform this action and possibly have to purchase a tool. Another option would be to provision a new disk at the size you want and copy the data over, but again this would require a longer downtime. With OS disks this gets more complicated, so you really have to think about whether it is worth the effort, especially if the server will be decommissioned soon.
  • Although you might want to size down a disk, each disk size offering has a maximum IOPS and throughput it can deliver, so you will need to review the disk’s performance metrics to ensure that sizing down won’t hamper the performance of the server (see the sketch after this list). If you are cross-charging departments, they will need to understand why you sized a disk at 512GiB instead of the 256GiB they requested: only the 512GiB disk could deliver the required IOPS or throughput.
  • Size up, but why? For example, a disk was sized at 100GiB and already had around 80GiB used. Given Microsoft’s statement that “Azure Managed Disks are priced to the closest tier that accommodates the specific disk size and are billed on an hourly basis”, we were already paying for the higher tier (128GiB), so why not make that disk space available for the VM to use, as you are already paying for it?
  • Most of the disks provisioned on-premise were sized for growth over a period of 3-5 years. As we resized some of the disks smaller, the show-back cost appeared lower and departments were thinking “great, we’re saving money” BUT you need to let them know that the cost will go up as the data grows and the disk size changes. You are only trying to optimise the overall cost of running the server over a period of time. You need to remind them that they can’t cut the budget, as it will be needed eventually if the data does grow!
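As a rough illustration of the performance check mentioned above, here is a small Python sketch that picks the cheapest tier satisfying both the space actually used and the peak IOPS/throughput seen in the disk's metrics. The capacity, IOPS and throughput figures are indicative of the premium SSD ladder, so confirm them against the current Azure documentation before acting on them.

```python
# Pick the cheapest premium SSD tier that covers both the space used and
# the observed peak IOPS/throughput, so a downsize never starves the
# workload. Figures are indicative only; check the Azure documentation.
TIERS = [  # (tier, GiB, max IOPS, max MB/s)
    ("P3", 16, 120, 25), ("P4", 32, 120, 25), ("P6", 64, 240, 50),
    ("P10", 128, 500, 100), ("P15", 256, 1100, 125),
    ("P20", 512, 2300, 150), ("P30", 1024, 5000, 200),
]

def smallest_safe_tier(used_gib, peak_iops, peak_mbps):
    for name, gib, iops, mbps in TIERS:
        if used_gib <= gib and peak_iops <= iops and peak_mbps <= mbps:
            return name, gib
    return None  # nothing in this list fits; stay where you are

# A disk using 200GiB but peaking at 2,000 IOPS cannot drop to P15
# (1,100 IOPS cap); the cheapest safe tier is still P20 (512GiB).
print(smallest_safe_tier(used_gib=200, peak_iops=2000, peak_mbps=120))
```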
By aligning to Microsoft’s size tiering, we were able to start standardising our disk offering and to give costings more easily and accurately. Service/application owners started to think more carefully about their disk size and IOPS/throughput requirements, as these now affected the overall cost of their server over x number of years. Any disk size increase was always double-checked to ensure that anything that could be deleted was deleted first before increasing.

One thing to be clear about is that you are looking to optimise your spend; there is no guarantee that you will save money. What you’re hoping to get out of this exercise is that you are not overspending on capacity or performance that you don’t need from day one. A simple example: a system expects year-on-year data growth over the next five years, and by year 5 it will need 256GiB. On a traditional SAN/NAS system you may have “thin provisioned” the disk, setting a maximum that you will allow it to grow to, and when you come to upgrade/refresh the SAN/NAS you know you need to buy capacity to cater for that growth. But in the cloud, if we were to provision that space upfront, we would end up paying a lot for capacity that we have no use for.

As you can see from the table, if I were to go up a disk tier each year rather than provisioning the final size at day 0, I would save around £1,019.52 over 5 years.
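The comparison is easy to script. The sketch below contrasts an assumed growth path of one tier per year with provisioning the 256GiB (P15) tier on day 0; the monthly prices are placeholders rather than Azure's actual rates, and the £1,019.52 figure above comes from the pricing table in this post.

```python
# Compare growing a disk one tier per year against provisioning the
# final 256GiB (P15) tier from day 0. Prices and the growth path are
# assumed for illustration; substitute real per-tier prices.
MONTHLY_PRICE_GBP = {"P4": 1.85, "P6": 3.50, "P10": 6.50, "P15": 12.50}

growth_plan = ["P4", "P6", "P10", "P15", "P15"]  # tier used in years 1-5
day_zero_plan = ["P15"] * 5                      # final size from day 0

cost_growth = sum(MONTHLY_PRICE_GBP[t] * 12 for t in growth_plan)
cost_day_zero = sum(MONTHLY_PRICE_GBP[t] * 12 for t in day_zero_plan)

print(f"Grow a tier each year: £{cost_growth:,.2f} over 5 years")
print(f"P15 from day 0:        £{cost_day_zero:,.2f} over 5 years")
print(f"Difference:            £{cost_day_zero - cost_growth:,.2f}")
```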
Go ahead and have a look at your disk sizes today and see if you can optimise your spending in Azure. With Microsoft now starting to offer disk reservations for 1TB+ disks, I don’t think it will be long before smaller disk sizes have disk reservations too.
