What Went Down in 2017
The internet is replete with Top Ten lists and other rankings. But the criteria for distinguishing between #1 and #10 are often no more than personal whim. So with the caveat that these are not necessarily the worst or the biggest, we’ve decided to list and describe some of the most interesting outages in the world of cyberspace in 2017. We comment on outages that stood out (according to our own whims) and offer a larger list at the bottom that you can look at more closely on your own.
Before we begin, we should credit our sources. CloudEndure has provided a description of 7 notable outages for each quarter of 2017: Q1, Q2, Q3, Q4. CRN compiled “The 10 Biggest Cloud Outages Of 2017”. Plus we have links available for each of the outages mentioned. And of course, we scoured the internet with good ole Google to find more sources about 2017 outages. The list is not meant to be comprehensive, and someone else might come up with a much different one. That said, let’s get started -- and we’ll go chronologically just to be fair.
January 2, 2017 -- Customs Systems Down at U.S. Airports
Systems used by border control agents at major U.S. airports suffered a temporary outage from 5:00 p.m. to 9:00 p.m. on the second day of the new year. The NY Daily News reported that the technology disruption created havoc and extensive delays. The agency, U.S. Customs and Border Protection, didn’t even report the incident until 10:30 p.m.
January 22, 2017 -- United Airlines Grounds Flights
This two-hour flight stoppage was put down to bandwidth issues. The Aircraft Communications Addressing and Reporting System, or ACARS, reportedly suffered “low bandwidth” from 7:00 p.m. to 9:00 p.m. Those are all the technical details we have about the problem. It was a nationwide outage.
January 26, 2017 -- IBM SoftLayer Issue
The Register called it a “meltdown”, and quoted a user as saying that it had been a mess ever since BlueMix took over. "Since IBM came along there have been loads of outages, planned and otherwise." The user also said, "In the three years of service prior to this we had only one outage, in the six months after they took over we have had one outage that knocked out their AMS [Amsterdam] data center for four hours….” An IBM spokesman said the problem was due to a “planned update to extend our feature set”.
January 31, 2017 -- GitLab Loses Data
Chalk this one up to human error. A company systems administrator meant to delete the backup database to make some space. Instead he deleted the primary database. One good thing about it is that the software company was given credit for being transparent about the issue.
February 16, 2017 -- CDBaby Gets Glitchy
Billboard magazine reported that the music distributor suffered from “rolling outages” throughout the weekend due to “database issues”. The database serves 500,000 clients and includes seven million music tracks.
February 24, 2017 -- Facebook Login Issue
At some time in the morning, a wide swath of Facebook users got this message: “It looks like someone may have accessed your Facebook account. To secure your account, you’ll need to answer a few questions and change your password the next time you go to Facebook.” The company said that their system inadvertently sent “a small set of people” to their “account recovery flow”. Check out Down Detector’s outage map for this “small set”.
February 28, 2017 -- Storage Problems at AWS
Another case of human error affected Amazon Web Services’ Simple Storage Service (S3). The company apologized after an employee removed a much larger number of storage servers than intended. “The servers that were inadvertently removed supported two other S3 subsystems,” AWS said in their postmortem.
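One common safeguard against this kind of slip is for the tooling itself to refuse a removal request that takes out too large a share of a fleet or drops it below a minimum size. The sketch below is purely illustrative: the function name, the 5% limit and the capacity figures are our own assumptions, not details from the AWS postmortem.

```python
def removal_problems(fleet_size, requested, min_capacity, max_fraction=0.05):
    """Return a list of reasons why a removal request looks unsafe.

    Hypothetical guard: all thresholds here are illustrative assumptions.
    An empty list means the request passes both checks.
    """
    problems = []
    # Check 1: cap how much of the fleet one command may remove.
    if requested > fleet_size * max_fraction:
        problems.append("request exceeds the per-operation removal limit")
    # Check 2: never let the fleet fall below its minimum capacity.
    if fleet_size - requested < min_capacity:
        problems.append("fleet would fall below minimum capacity")
    return problems


if __name__ == "__main__":
    # A small request passes; a mistyped, much larger one is rejected.
    print(removal_problems(1000, 10, 800))
    print(removal_problems(1000, 300, 800))
```

A check like this does not prevent every mistake, but it turns a typo into a rejected command rather than an outage.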
March 15-16, 2017 -- Microsoft Azure Down for Seven Hours
Winbuzzer reported: “The outage was almost a blanket loss of service across all Azure regions. 26 out of 28 data center regions were affected by storage issues.” Microsoft said that the problem started in one region and spread from there.
March 16, 2017 -- Square Outage Means No Dinner
Restaurants had to turn away customers for two hours because Square’s point-of-sale software was offline. Patrons with good old-fashioned cash could eat, but those paying with cards were out of luck.
May 12, 2017 -- WannaCry Ransomware Hits the Web
A hack and an outage may not be the same thing, but if a hacker holds your computer hostage so that you can’t use it, you may as well call that an outage. This hack had a pretty big impact. According to a May 19th report from CNET, “So far, more than 200,000 computers in 150 countries have been affected, with victims including hospitals, banks, telecommunications companies and warehouses.”
May 15, 2017 -- Starbucks Has Register Problems
The coffee giant is not immune to point-of-sale failure. Fortune reports that the “caffeine crisis”, which started at 2:00 p.m., was due to a technology update. Some stores gave out free coffee to lucky customers.
May 27, 2017 -- British Airways Grounded
The airline’s CEO blamed the IT failure on a catastrophic power surge -- and not on recent job cuts. He refused to resign, saying that the problems “have all been local issues around a local data centre.”
June 19, 2017 -- Skype Has Issues
The title of TechCrunch’s article, “After three days, Skype’s outage is resolved”, is enough to raise eyebrows. The Down Detector map shows a big blot all over Europe for the outage. Was it a problem with the Heartbeat virus? Skype was mum for days. Another article states that the hacking group CyberTeam later claimed responsibility.
June 27, 2017 -- Petya Attacks
Speaking of viruses, Forbes reports that Petya is worse than WannaCry, “with pharmaceutical companies, Chernobyl radiation detection systems, the Kiev metro, an airport and banks all affected”. "This is going to be a big one. Real big one," says former NSA analyst David Kennedy.
June 28, 2017 -- Apple iCloud Backup Problems
This one had to do with Apple iCloud storage. Apple device owners were unable to make backups or restore backups for several days. The cause is unclear.
July 17, 2017 -- Game of Thrones is Too Popular for the Web
The demand for the season 7 premiere of Game of Thrones was just too high for global streaming services to handle. Imagine how mad people became with this one! A statement from Foxtel explained, “We are devastated that due to unprecedented demand, we are experiencing problems with our online services this evening.”
July 25, 2017 -- A Big Oops by Marketo
Did you ever forget to renew your domain? That’s what happened to the web marketing company Marketo. Actually the automatic domain renewal failed. There was plenty of mockery on the web after that one.
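A failed auto-renewal is easier to catch if something independent of the registrar keeps an eye on the expiry date. Here is a minimal sketch of that idea. The function names, the 30-day warning window and the dates are hypothetical, and a real check would need to fetch the expiry date from the registrar or a WHOIS lookup, which is not shown.

```python
from datetime import date


def days_until_expiry(expires_on, today):
    """Number of days from today until the domain expires (negative if past)."""
    return (expires_on - today).days


def renewal_status(expires_on, today, warn_days=30):
    """Classify a domain as 'expired', 'renew soon' or 'ok'.

    warn_days is an assumed warning window, not an industry standard.
    """
    remaining = days_until_expiry(expires_on, today)
    if remaining < 0:
        return "expired"
    if remaining <= warn_days:
        return "renew soon"
    return "ok"


if __name__ == "__main__":
    # Hypothetical dates for illustration only.
    print(renewal_status(date(2017, 7, 25), date(2017, 7, 1)))
```

Run on a daily schedule, a check like this would raise a flag weeks before the domain lapsed, whether or not the automatic renewal went through.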
August 18, 2017 -- State Department Email Down
This one might be important. Politico reported that the U.S. State Department suffered a worldwide email outage. Officials said it was not due to "any external action or interference”.
August 31, 2017 -- WhatsApp Outage Affects Thousands
Nearly 5,000 users had problems with the chat app, which was bought by Facebook in 2014. It has over a billion users. The outage meant that some users couldn’t connect or send and receive messages.
September 12, 2017 -- Florida Power Company’s Website Down
It’s bad enough when the power goes out. But you would hope that you could get status updates from the power company’s website. Adding to customer frustration after Hurricane Irma, Florida Power & Light’s website crashed. However, the IT problem didn’t keep faithful electrical workers from continuing with their repairs.
September 16, 2017 -- Payout Delays at Graton Casino
There was confusion on the gaming floor after network problems at the Graton Resort and Casino. Employees had to hand-deliver payouts. As many as 1,000 patrons had to wait for hours when automatic payments were down.
October 31, 2017 -- Worldwide Outage at Slack
What do you do if your collaboration portal is down? Fast Company says you take the night off. The messaging app has 9 million weekly active users, per CNBC.
November 15, 2017 -- Google Docs Down for an Hour
An hour may not seem long, but for a platform like Google Docs, the impact could be huge. And Google said that the outage affected a “significant subset of users”. That means a lot. The Verge pointed out that it happened in the middle of the U.S. work and school day.
November 24, 2017 -- Black Friday Panic at Macy’s
It seems that Macy’s must have lost a lot of sales because its credit card machines were down on Black Friday. America’s biggest shopping day is already chaotic enough. One Twitter user wrote, “I have $300+ items in my hand. They're only accepting cash or Macy's credit card... Let the riots begin.”
December 6, 2017 -- Reddit Down
Users of Reddit, one of the world’s biggest websites, were mysteriously unable to load pages. Instead of the usual content, desktop users saw a sad version of the Reddit logo and a cat that had pulled the computer cable out.
December 7, 2017 -- Bitcoin Exchange Crashes
Bitcoin is all the rage these days. It must have been disappointing when the cryptocurrency exchange Coinbase crashed. The reason given by the company was “record traffic”.
December 25, 2017 -- Nintendo eShop Offline
Gamers who opened their new Nintendo Switch gaming consoles on Christmas morning were likely upset if they didn’t come with a few games. Those wishing to buy games online through the company’s eShop were met with a message that started, “An error has occurred….” So much for a Merry Christmas.
There were plenty more outages to report on for 2017, and you may know of many more yourself. You may be wondering where the outages were for April. So were we. Surely something went down during April. If you’re really curious, you’ll have to dig a little bit on your own.