Moving workloads to Azure to achieve cost savings and performance gains is something you may have heard Microsoft evangelizing in the last few months. Well, here at Cireson, we actually made the jump with our production System Center 2012 R2 Service Manager (SCSM) + Cireson environment. Yes, this is the same one we use for all our internal and customer-facing support requests.
I wanted to share a few tips from our move for anyone considering Azure, and even for those who have already made the leap. Either way, our lessons from the field can help you achieve maximum benefit for minimum expense. Today, I’ll share three tips that help deliver optimal performance at lower cost than simply duplicating your on-premises data center strategy and accepting default Azure VM settings.
Tip #1: Right-size your VMs, avoid overallocation of VM resources
Azure is a pay-as-you-go environment, so it never pays to allocate more than you need. Allocate what Microsoft (or your 3rd party software vendor) recommends from the start. If experience shows you need more resources, shut down the VM and choose a larger Azure VM size in the VM properties. It only takes a few moments.
For a complete list of Azure VM sizes, see https://msdn.microsoft.com/en-us/library/azure/dn197896.aspx.
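As a rough illustration of right-sizing, the Python sketch below picks the smallest size that still meets a recommended core and memory requirement. The size table is an illustrative subset of the A-series Standard tier as I recall it, and ranking by memory is only a stand-in for price; both are assumptions, so check the figures against the list linked above before relying on them. The names `VmSize` and `smallest_fit` are hypothetical, not part of any Azure tooling.

```python
from typing import NamedTuple, Optional, Sequence


class VmSize(NamedTuple):
    name: str
    cores: int
    memory_gb: float


# Illustrative subset of A-series Standard sizes (assumed figures;
# verify against the official VM size list before using them).
SIZES = [
    VmSize("A1", 1, 1.75),
    VmSize("A2", 2, 3.5),
    VmSize("A3", 4, 7),
    VmSize("A4", 8, 14),
    VmSize("A6", 4, 28),
    VmSize("A7", 8, 56),
]


def smallest_fit(cores_needed: int, memory_gb_needed: float,
                 sizes: Sequence[VmSize] = SIZES) -> Optional[VmSize]:
    """Return the smallest size meeting both requirements, or None.

    "Smallest" is ranked by memory, then cores, as a proxy for cost.
    """
    candidates = [s for s in sizes
                  if s.cores >= cores_needed and s.memory_gb >= memory_gb_needed]
    return min(candidates, key=lambda s: (s.memory_gb, s.cores), default=None)


if __name__ == "__main__":
    # A vendor recommendation of 4 cores and 8 GB of memory.
    print(smallest_fit(4, 8))
```

If the workload later proves to need more, rerun the selection with the higher requirement and resize, rather than starting with the largest size "just in case".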
Tip #2: Whenever possible, choose scale out over scale up
When you move into a pay-as-you-go environment, there is one very important principle people tend to overlook, and that is the value of scale-out over scale-up. Some roles (like SQL servers) don’t necessarily have scale-out options, so this is not always possible. But in a few System Center scenarios, as with SCSM management servers, you have an opportunity to reduce risk, improve performance and save money all at the same time. Let’s look at a simple scenario.
For example, in an on-premises scenario, we might build a pair of big SCSM management servers with 32 GB of memory each, because we know that from 8 to 10 am we are going to have a burst of traffic as analysts and support staff log in to begin their day. On-premises, building a pair of big VMs like this makes sense. It ensures we have capacity and acceptable redundancy.
When we move to Azure, the equation changes. If we know we need 2 x 32 GB of RAM at peak, but maybe only half of that off-peak and maybe only a quarter of that capacity overnight, we now have all the information we need to reduce our spend dramatically. Instead of building two big VMs, I can take that 64 GB RAM requirement and cut it into 4 parts, and build 4 x 16 GB VMs (an A4 VM, Standard tier, would closely match the need in this example).
I can use the Azure Autoscale feature to watch CPU utilization and the request queue to turn on VMs 3 and 4 when I need them at peak time, and shut them down when the spike in demand passes. The cost savings there can be massive.
From a capacity perspective, I have reduced my risk at the same time! If I lose a single VM (due to app failure, reboot because of VM or Azure fabric patching, etc.), I only lose 25% of my total capacity instead of 50%. This increases my flexibility in when I can take corrective actions.
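To put rough numbers on this, here is a small Python sketch of the arithmetic. It rests on several assumptions that are mine, not measurements: the day splits into 2 peak hours, 10 off-peak hours and 12 overnight hours; at least 2 VMs always stay on for redundancy; and cost is proportional to memory-hours. The function names are hypothetical.

```python
import math

# (hours per day, GB of RAM needed) for each period. The 2-hour peak and
# the 64/32/16 GB demand come from the scenario above; the 10-hour and
# 12-hour split of the rest of the day is an assumption.
SCHEDULE = [(2, 64), (10, 32), (12, 16)]


def vms_needed(demand_gb: float, vm_gb: float, min_vms: int) -> int:
    """VMs required to cover demand, never dropping below min_vms."""
    return max(min_vms, math.ceil(demand_gb / vm_gb))


def daily_gb_hours(schedule, vm_gb: float, min_vms: int) -> float:
    """Total memory-hours billed per day (assumed proxy for cost)."""
    return sum(hours * vms_needed(demand, vm_gb, min_vms) * vm_gb
               for hours, demand in schedule)


def capacity_lost_on_failure(vm_count: int) -> float:
    """Fraction of total capacity lost when one of vm_count VMs fails."""
    return 1 / vm_count


# Scale-up: two 32 GB VMs running around the clock.
scale_up_gb_hours = 2 * 32 * 24
# Scale-out: 16 GB VMs, autoscaled, with a floor of 2 running VMs.
scale_out_gb_hours = daily_gb_hours(SCHEDULE, vm_gb=16, min_vms=2)
savings = 1 - scale_out_gb_hours / scale_up_gb_hours

if __name__ == "__main__":
    print(f"scale-up:  {scale_up_gb_hours} GB-hours/day")
    print(f"scale-out: {scale_out_gb_hours} GB-hours/day")
    print(f"savings:   {savings:.0%}")
    print(f"capacity lost per failed VM: "
          f"{capacity_lost_on_failure(2):.0%} vs {capacity_lost_on_failure(4):.0%}")
```

Under these assumptions the scale-out design bills 832 memory-hours a day against 1,536 for the always-on pair, roughly 46% less. Actual savings depend on real VM pricing and on how long your own peak lasts.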
What’s more, load balancing these VMs in Azure is both easier and less expensive than in my on-premises data center. My NLB farm on-premises is replaced in Azure by an invisible logical construct in the form of an Azure load balancer, created in 4 clicks to form a load balanced set. This is great for your Service Manager management servers running the Cireson Portal. As Charlie Sheen would say – “winning”!
I described and illustrated this strategy in depth in a recent webinar. Go to the recording at https://vimeo.com/120201818 and fast forward to the 34:50 (34 min, 50 sec) point.
Tip #3: Disable Geo-replication on your VM storage
Geo-replication is enabled by default for every VM you create, resulting in 3 additional copies of your VM disks in a second (different) Azure regional data center. There are two reasons you generally do not want to leave this setting at its default.
First, it is typically unnecessary, because even locally redundant storage (what you get when you disable geo-replication) gives you 3 copies of the VM within the Azure regional data center. This is more than you have in your average on-premises environment. Couple this with a good backup strategy and your environment is protected and performance is equal or better…at a lower cost!
Second, you should never enable this feature for stripe sets, because geo-replicated data is replicated asynchronously and Azure cannot guarantee the write order across the striped disks. This could result in data loss in the secondary data center.
You can read more about storage redundancy options at https://msdn.microsoft.com/en-us/library/azure/dn727290.aspx
The Cireson Azure Story
Chris Ross and I delivered a webcast a few days ago, walking through the process step-by-step, providing tips for maximizing performance and availability while minimizing costs along the way. Catch the replay below.