Redundancy in the Cloud

Strange as it might be to contemplate, but imagine what would happen if AWS went into receivership and was shut down to liquidate assets? What would that mean for your infrastructure? Project? Or even startup?

It would be pretty bad.

Startups have been deploying preferentially on AWS or other Cloud services for some time now, in part due to venture-capitalist push to not have physical infrastructure to liquidate should the startup go *pop* and to scale fast should a much desired rocket-launch happen. If AWS shut down fully for, say, a week, the impact to pretty much everything would be tremendous.

Or what if it was Azure? Fully debilitating for those that are on it, but the wide impacts would be less.

Cloud vendors are big things. In the old physical days we used to deal with the all-our-eggs-in-one-basket problem by putting eggs in multiple places. If you're on AWS, Amazon is very big about making sure you deploy across multiple Availability Zones and helping you become multi-region in the process if that's important to you. See? More than one basket for your eggs. I have to presume Azure and the others are similar, since I haven't used them.

Do you put your product on multiple cloud-vendors as your more-than-one-basket approach?

It isn't as easy as it was with datacenters, that's for sure.

This approach can work if you treat the Cloud vendors as nothing but Virtualization and block-storage vendors. The multiple-datacenter approach worked in large part because colos sell only a few things that impact the technology (power, space, network connectivity, physical access controls), though pricing and policies may differ wildly. Cloud vendors are not like that, they differentiate in areas that are technically relevant.

Do you deploy your own MySQL servers, or do you use RDS?
Do you deploy your now MongoDB servers, or do you use DynamoDB?
Do you deploy your own CDN, or do you use CloudFront?
Do you deploy your own Redis group, or do you use SQS?
Do you deploy your own Chef, or do you use OpsWorks?

The deeper down the hole of Managed Services you dive, and Amazon is very invested in pushing people to use them, the harder it is to take your toys and go elsewhere. Or run your toys on multiple Cloud infrastructures. Azure and the other vendors are building up their own managed service offerings because AWS is successfully differentiating from everyone else by having the widest offering. The end-game here is to have enough managed services offerings that virtual private servers don't need to be used at all.

Deploying your product on multiple cloud vendors requires either eschewing managed-services entirely, or accepting greater management overhead due to very significant differences in how certain parts of your stack are managed. Cloud vendors are very much Infrastructure-as-Code, and deploying on both AWS and Azure is like deploying the same application in Java and .NET; it takes a lot of work, the dialect differences can be insurmountable, and the expertise required means different people are going to be working on each environment which creates organizational challenges. Deploying on multiple cloud-vendors is far harder than deploying in multiple physical datacenters, and this is very much intentional.

It can be done, it just takes drive.

  • New features will be deployed on one infrastructure before the others, and the others will follow on as the integration teams figure out how to port it.
  • Some features may only ever live on one infrastructure as they're not deemed important enough to go to all of the effort to port to another infrastructure. Even if policy says everything must be multi-infrastructure, because that's how people work.
  • The extra overhead of running in multiple infrastructures is guaranteed to become a target during cost-cutting drives.

The ChannelRegister article's assertion that AWS is now in "too big to fail" territory, and thus requiring governmental support to prevent wide-spread industry collapse, is a reasonable assertion. It just plain costs too much to plan for that kind of disaster in corporate disaster-response planning.