If you heard about or experienced Internet outages today, it wasn’t just you. Many end-users were filing complaints on downdetector.com against a large number of unrelated ISPs. From our global vantage point, we could see that the Internet as a whole was acting strangely. A number of small, unrelated events may have contributed to the chaos, but what really seems to have triggered most of the routing issues we saw was the IPv4 BGP routing table growing beyond 512k routes today, as seen here.
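For context, the "512k" figure matters because many older routers shipped with a default TCAM allocation of 2^19 = 524,288 IPv4 forwarding entries; once the global table exceeds that allocation, excess routes spill into slower software paths or are dropped, which is what drove much of the instability. The sketch below is purely illustrative and assumes you already have a route count from your own router or a public BGP statistics source; the sample table_size value is hypothetical.

```python
# Minimal sketch: compare a global IPv4 BGP table size against the "512k"
# (2^19 = 524,288) TCAM slot allocation that many older routers used by default.
# table_size below is a hypothetical snapshot; in practice you would pull the
# prefix count from your router's routing-table summary or a BGP statistics feed.

TCAM_IPV4_SLOTS = 2 ** 19  # 524,288 -- the "512k" default on affected hardware


def headroom(table_size: int, slots: int = TCAM_IPV4_SLOTS) -> int:
    """Return how many more routes fit before the TCAM allocation overflows."""
    return slots - table_size


if __name__ == "__main__":
    table_size = 512_500  # hypothetical count near the August 2014 peak
    remaining = headroom(table_size)
    if remaining <= 0:
        print(f"Table size {table_size:,} exceeds the {TCAM_IPV4_SLOTS:,}-slot "
              "allocation; excess routes may be software-switched or dropped.")
    else:
        print(f"{remaining:,} slots remaining before the 512k limit.")
```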
At Total Uptime, our infrastructure sits in many datacenters around the world and traverses most major networks, giving us a vantage point into Tier-1 and peering networks that many do not have. We first noticed strange behavior shortly before 5:00 AM EST, when a Level3 link in Washington, DC appeared to dead-end traffic. This did not impact us, because we connect to many providers and can seamlessly reroute traffic, but it was followed by a steady stream of routing changes throughout the morning, and in the early afternoon we saw similar behavior on Level3 out of San Jose.
Overall, routes changed significantly throughout the day as the various network providers worked to resolve the issue as quickly as possible. Steven Vaughan-Nichols at ZDNet has written the best article we’ve seen thus far, and it is worth the read. Hopefully today’s events were significant enough to flush out the issue, but insignificant enough to keep the pain to a minimum. Fortunately for the customers we protect with our Cloud Load Balancer, it was business as usual.