Notable Cloud Outages of 2020
Periodically we talk to someone who says something along the lines of: “we don’t need Total Uptime since we moved to the cloud.” It is an interesting statement, and it usually comes from someone who has been misinformed about the benefits and capabilities of the cloud.
Does the cloud provide incredible flexibility? Yes! Does it let you provision services on-demand at a moment’s notice? Yes! Does it help you save money due to its extreme affordability? Um, no! Are clouds 100% available and uber resilient? Not even in your dreams.
To prove our point on availability, we have compiled our annual list of the most notable outages of 2020. COVID made the year beyond horrible for the entire planet, and these cloud outages didn’t help whatsoever, especially considering that more users worked remotely in 2020 than ever before.
- January 27th – Google G Suite suffers an outage of just under an hour, affecting many users on a Monday morning.
- February 3rd – Microsoft Teams suffers an outage caused by an expired authentication certificate.
- February 25th – Google has a 16-hour outage for Nest video streams and recording that left all users of the security cameras without… security.
- March 3rd – Microsoft Azure was struck with a 6-hour outage in the US East data center, caused by a cooling failure and affecting almost all Azure services.
- March 15th – Microsoft Azure experiences a power event in the West Central US region affecting virtual machines, SQL Server and many other services.
- March 17th – IBM Cloud was hit with a mystery outage affecting services in the United States as a result of a connectivity issue in a Dallas data center.
- March 26th – Google Cloud Platform has a 14-hour outage caused by a lack of memory in the company’s cache servers. It affected services in multiple regions, including Dataflow, BigQuery, Dialogflow, Kubernetes Engine, Cloud Firestore, App Engine and Cloud Console.
- March 26th – Google services including G Suite, Gmail, Google Drive, Hangouts and Classroom went offline as a result of a significant router failure at a data center in the southeastern US.
- March 30th – European cloud giant OVH suffered a 40-minute cloud outage in France affecting dedicated and virtual servers, domain names, the cloud platform, its anti-DDoS protection and support.
- April 8th – Google Cloud Platform blames a sweeping outage affecting a vast array of services on IAM API issues.
- April 16th – Cloudflare has a 4-hour outage caused by someone pulling out cables incorrectly when decommissioning hardware at a data center.
- May 12th – Slack suffers a 48-minute outage affecting the entire platform. Not a big deal unless you’re WFH these days.
- May 18th – Microsoft Azure has an outage in the Central India region impacting compute and storage resources.
- May 27th – Adobe Cloud has a significant outage affecting use of their software worldwide.
- June 9th – IBM Cloud suffers a two-hour outage of its entire global cloud, blamed on an external network provider.
- June 15th – Microsoft 365 and Azure suffer an outage in Australia and New Zealand.
- June 15th – T-Mobile suffers a massive nationwide voice and data outage and blames a third-party leased fiber network.
- June 24th – 30 services on IBM’s Cloud suffer outages ranging from 100 minutes (for Continuous Delivery) to 19 hours, due to a power outage.
- June 29th – Google Cloud suffers an outage on its Kubernetes platform and networking services for several hours in their us-east region.
- July 13th – GitHub started the week with more than 4 hours of downtime.
- July 17th – Cloudflare takes out a chunk of the web for about 20 minutes when one of their global backbone routers announced bad routes, affecting websites and their popular free DNS resolver service.
- August 18th – Equinix suffered a major outage at their LD8 data center in London affecting numerous customers in the hosting, cloud and telecommunications sectors, including the London Internet Exchange (LINX), one of the world’s largest.
- August 20th – Google Cloud services including App Engine, Cloud Storage, Cloud Logging and BigQuery suffer an outage lasting a few hours.
- August 24th – Video conferencing provider Zoom had a three-hour outage affecting many of their 115 million daily active users.
- August 30th – CenturyLink / Lumen / Level3, or whatever they are called today, knocks out web giants and 3.5% of all internet traffic.
- September 9th – IBM Cloud suffers an outage in their Sydney data center after it loses power in “multiple racks”.
- September 14th – Microsoft Azure suffers a 4+ hour outage in one of its southern UK zones caused by a cooling system failure.
- September 28th – Microsoft Azure Active Directory suffers a 3-hour outage affecting Office, Outlook and Teams.
- October 1st – Microsoft’s Exchange Online service suffers another global outage.
- October 7th – Microsoft Office 365 and Azure suffer a 4+ hour outage affecting Teams, Outlook, SharePoint and OneDrive.
- November 5th – Microsoft Exchange Online suffers a 12-hour outage for many users around the globe.
- November 5th – GoDaddy-owned 123 Reg has a six-day DNS record-edit outage.
- November 11th – Microsoft’s online game services hit by an outage on Xbox debut.
- November 25th – An Amazon Web Services (AWS) outage in its Kinesis data streaming service impacted major customers including Roku, Adobe, Flickr, Glassdoor, Autodesk, The Wall Street Journal, 1Password and others, including Amazon’s own home security camera company Ring.
- December 9th – Google has an 84-minute outage in its europe-west2-a zone. A bad ACL caused BGP routes to be withdrawn, isolating the zone and leaving 60% of its virtual machines unreachable from the outside world.
- December 14th – Google suffers a 50-minute outage caused by a capacity issue in their central identity management system. This resulted in outages affecting Cloud Console, Cloud Storage, Google Kubernetes Engine, Gmail and more.
- December 16th – Google has a full-on Gmail outage that went beyond delayed delivery or access to mail: messages were permanently bounced.
- December 17th – Google has another outage affecting Nest, much like the one in February, except significantly shorter at just over two hours.
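To put those durations in perspective, here is a rough sketch of how a single outage translates into a yearly availability figure. It uses three durations from the list above and assumes, purely for illustration, that each was the only downtime that service had in 2020 and that every user was affected for the full duration. The function and variable names are our own.

```python
# Durations (in minutes) taken from a few of the outages listed above.
OUTAGE_MINUTES = {
    "Google Nest (February 25th)": 16 * 60,
    "Slack (May 12th)": 48,
    "Google identity management (December 14th)": 50,
}

MINUTES_IN_2020 = 366 * 24 * 60  # 2020 was a leap year


def availability_percent(downtime_minutes, period_minutes=MINUTES_IN_2020):
    """Return availability over the period as a percentage.

    Illustrative assumption: downtime_minutes is the ONLY downtime
    in the period, which is optimistic for any real service.
    """
    return 100.0 * (1 - downtime_minutes / period_minutes)


for name, minutes in OUTAGE_MINUTES.items():
    print(f"{name}: {availability_percent(minutes):.4f}%")
```

Under these assumptions, the 16-hour Nest outage alone pulls the yearly figure below 99.9%, while the sub-hour outages keep it above 99.99%. Real availability would be lower once the smaller, unreported events are counted.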
Our 2020 Conclusion
What can we really say to sum up a long list of major cloud outages that transpired in 2020? We haven’t even touched on the hundreds or perhaps thousands of smaller events that impact enterprise application availability every day on a smaller scale, insufficient to make the news but sufficient to injure brands significantly.
The bottom line is that outages happen continually, and we can assure you that they will never cease. Cloud services in general may be more reliable, on average, than on-premises services, but the impact on availability when they fail is enormous. As more and more organizations continue to put everything in the cloud (all eggs in one basket, from our perspective), we would recommend a contingency plan. Perhaps that’s multicloud? Or better yet, maybe it is an ADC-as-a-Service layer that provides control and resiliency at the right time.
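As a minimal sketch of what such a contingency plan can look like in practice, the snippet below picks the first healthy endpoint from an ordered list, for example a primary cloud followed by a backup in a second cloud. It is an illustration only, not a description of any particular product: the function name `pick_endpoint`, the endpoint names, and the health check are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def pick_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint that passes its health check, or None.

    `endpoints` is ordered by preference (primary first). `is_healthy`
    is a caller-supplied check; here it is a stand-in for a real probe.
    """
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that raises is treated as "unhealthy";
            # move on to the next candidate.
            continue
    return None


# Hypothetical example: the primary cloud is down, the backup is up.
status = {"primary.example.com": False, "backup.example.com": True}
chosen = pick_endpoint(status.keys(), lambda host: status[host])
print(chosen)
```

A production setup would run the check continuously and from several vantage points, but the decision logic stays this simple: prefer the primary, and fail over when it stops answering.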