With the pace of change in Microsoft Azure over the last 24 months, and the volume of technical, marketing and other content about Microsoft and the cloud, a lot of interesting information gets lost in the shuffle. I wanted to share a few facts about the core and history of Microsoft Azure that I think you’ll find interesting and that you’ve probably never heard. I’ll follow with links to a resource or two where you can hear more of the back story, as well as a session we recently delivered on Cireson’s journey into Microsoft Azure.

1) What was the original Azure project name?

What ultimately became Microsoft Azure began as a project code-named “Project Red Dog”.

2) Does Azure run on Windows?

The original Azure host operating system was a fork of the Windows OS called the ‘Red Dog OS’. The fork existed because Azure was pioneering functionality important to data centers everywhere that Windows did not yet offer. Running a fork of an OS is not ideal (in terms of the additional cost and complexity), so the Azure team talked to the Windows team. Windows eventually caught up, and now Azure runs on Windows.

3) What does an Azure stamp look like?

While this info will no doubt change over time, Mark Russinovich shared some notes about Azure stamps in an April 2014 Azure Friday session. Each scale unit (stamp) includes 5 nodes that can serve as fabric controllers, with 1 acting as primary. This allows 2 of them to fail while still maintaining a quorum. There are many stamps in an Azure regional data center. This approach offers two very significant advantages. First, it makes scale-out very easy. Second, it reduces the potential scope of impact of an issue (the “blast radius”, as Mr. Russinovich calls it), as in the Leap Day bug back in 2012.
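The "5 nodes, 2 can fail" arithmetic is easy to check. The sketch below assumes a simple strict-majority quorum; the session does not say which consensus protocol the fabric controllers actually use, and the function names here are made up for illustration.

```python
def majority_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can fail while a majority is still reachable."""
    return replicas - majority_quorum(replicas)


def has_quorum(replicas: int, failed: int) -> bool:
    """True if the surviving replicas still form a strict majority."""
    return replicas - failed >= majority_quorum(replicas)


if __name__ == "__main__":
    # Assumption: 5 fabric controller replicas per stamp, as described above.
    print(majority_quorum(5), tolerated_failures(5), has_quorum(5, 2))
```

Under that assumption, 5 replicas need 3 to agree, so losing 2 is survivable and losing a third is not. It also shows why odd counts are popular: a 6th replica would raise the quorum to 4 without tolerating any more failures.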

4) How does Microsoft patch Azure hosts?

Azure hosts are image-based (hosts boot from VHD). This offers a major advantage in host maintenance, as the volume itself can be replaced, enabling quick rollback. Host updates roll out every 4-6 weeks, and they are well-tested before they are rolled out broadly to the data centers.

5) How does Azure monitor health?

There were a few interesting facts shared by Mark Russinovich in this regard:

  • The service health polling mechanisms in Azure have a very aggressive interval of 15 seconds, enabling rapid mean-time-to-detection (MTTD) and mean-time-to-recovery (MTTR).
  • There are layers of monitors from the fabric on up to the endpoints. A controller monitors host health at the host agent level, and the host agent polls the guest agents in Azure VMs.
  • Self-healing actions have a max count threshold within a given window of time to avoid continuous retries that cascade throughout a data center, touching many hosts.
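The last point, capping self-healing actions within a time window, can be sketched as a sliding-window throttle. This is only an illustration of the general idea, not Azure's implementation: the class name, the limit of 2 actions, and the 60-second window are all assumptions chosen for the example.

```python
from collections import deque


class RepairThrottle:
    """Allow at most max_actions repair attempts per sliding time window.

    Hypothetical helper for illustration; not an Azure API.
    """

    def __init__(self, max_actions: int, window_seconds: float) -> None:
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of recently allowed actions

    def allow(self, now: float) -> bool:
        # Forget actions that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        # Refuse the action if the window is already full.
        if len(self._timestamps) >= self.max_actions:
            return False
        self._timestamps.append(now)
        return True


if __name__ == "__main__":
    # Assumed values: 2 repairs per 60 seconds, checked on a 15-second poll.
    throttle = RepairThrottle(max_actions=2, window_seconds=60)
    for tick in (0, 15, 30, 45, 60):
        print(tick, throttle.allow(now=tick))
```

Once the cap is hit, further repair attempts are refused until older ones age out of the window, which is what keeps a misbehaving health check from triggering repairs across many hosts in quick succession.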

Background Info

Most of the info above came straight from a short Azure Friday session from mid-2014, delivered by two of my favorite Microsoft experts, Scott Hanselman and Mark Russinovich. Catch the episode at http://azure.microsoft.com/en-gb/documentation/videos/mark-russinovich-windows-on-azure/.

The Cireson Azure Story

Chris Ross and I delivered a webcast a few days ago, walking step by step through Cireson’s journey into Microsoft Azure and providing tips for maximizing performance and availability while minimizing costs along the way. Catch the replay below.

SERVICE MANAGER + CIRESON + AZURE from Team Cireson on Vimeo.