S3 Outage - Talk is Cheap...
Mass Cloud Provider Causes Mass Confusion in the Mass Market!
Yes, on Tuesday 28th February, Amazon suffered an outage in one of its busiest regions, US-EAST-1. Human error has now been blamed, which is not surprising: had they blamed the underlying technology, product confidence would have diminished considerably. Humans are an excellent scapegoat, given we all make mistakes!
There are many articles talking about how to stop it happening again, the impact on the market and just how damn influential AWS is on the wider internet. However, there is nothing about what it was like for the customers. The businesses. The ISPs that resell these services.
Echoing the thoughts of some other sensible posts around this topic, the AWS S3 outage seems to be this miraculous wake-up call for the entire industry about the "pitfalls" of "The Cloud". That is not rubbish, as you cannot deny the coverage, but it is focusing on the wrong thing.
Multi-region replication, using more than one availability zone and using multiple vendors are all fair counters to stop history repeating itself, and are worth considering from a technical point of view. My interest, however, lies in communication:
- Who did people call when all of this was occurring?
- How did AWS keep its users updated?
- What messages were businesses sending out to their customers?
Although I will go into this in more detail in a future post, there seem to be three distinct models of engagement with Mass Cloud Providers:
1. Buy the service, survive on your own (no support or management)
2. Buy the service with first party support (optional management)
3. Buy a packaged service from an ISP or similar (usually with both)
I can imagine that for those in camp 1, Tuesday was a very scary experience.
Imagine being at a concert, having a great time, and suddenly everything shuts down, the lights go off and nobody is willing to tell you what is going on!
Option 2 still has limits, as you must log in to a portal and submit a request to be called back, which is like sending an email for an ambulance when you have an emergency.
For magic number 3, things should have been more manageable; at the very least you could communicate in some form. Full-service cloud providers like Rackspace, IOMart, etc. provide direct lines of communication to their teams who, in turn, have direct lines to AWS, Azure, etc. For me, the knowledge that you have the power to pick up the phone and get instant "relief" in this type of emergency is the real message we should be taking from this outage.
The world, the internet and business will always face crises - it is how they are managed that is the true test of your setup.
Talk may be cheap; however, when you lose the ability to talk, you quickly realise its worth.
For any help, advice or guidance on how to set up a more communicative cloud setup, feel free to contact Root Provider -