Probably everyone noticed what happened to Facebook, Instagram, and WhatsApp this week. Because of a human error, the platforms were unreachable for hours, and millions of people opened their favourite app every half an hour to check whether they could connect with their friends again.
Now, what causes something like this? What actually happened when Facebook ‘was down’? I’ll try to shed a little light on this topic and share some implications.
The internet is basically a network of networks. Each of those networks can be an internet service provider (ISP), a university’s computer network, a government agency’s computer system, or any other set of machines managed as a single unit; such a network is called an Autonomous System (AS). All of those networks need to be connected so that you can access a file stored somewhere else on the internet. That is the job of the Border Gateway Protocol (BGP): it figures out which route through the networks your traffic should take to reach the file you want. It does this by listening to ‘prefix announcements’. With these announcements, an AS tells the others which ranges of IP addresses (for example, those belonging to Facebook) it can deliver traffic to. The AS either hosts these IP addresses itself or is connected to an AS that does.
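To make the idea of prefix announcements a bit more concrete, here is a minimal Python sketch. The prefixes, the AS numbers and the function name `find_origin` are made up for illustration (they come from ranges reserved for documentation), and real BGP carries far more information than this. The sketch only shows how an address is matched against announced ranges, with the most specific range winning.

```python
import ipaddress

# Hypothetical announcements: IP prefix -> number of the AS announcing it.
# These are documentation-only prefixes and AS numbers, not real routing data.
ANNOUNCEMENTS = {
    "203.0.113.0/24": 64500,
    "198.51.100.0/24": 64501,
    "198.51.100.128/25": 64502,
}


def find_origin(ip, announcements):
    """Return the AS announcing the most specific prefix that covers ip, or None."""
    address = ipaddress.ip_address(ip)
    best = None  # (network, asn) of the most specific match found so far
    for prefix, asn in announcements.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            # A longer prefix length means a smaller, more specific range.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, asn)
    return best[1] if best else None
```

Calling `find_origin("198.51.100.200", ANNOUNCEMENTS)` picks the AS behind the /25 range, because that range is more specific than the /24 that also covers the address. An address that nobody announces simply returns `None`.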
All ASes are constantly shouting out the IP addresses they can deliver traffic to. Routers running BGP pick up these signals and find the best way to get there, hopping from AS to AS until the traffic arrives at the IP address holding the right data. What happened to Facebook was that someone updated the company’s BGP records, which withdrew those announcements and took away ‘the map’ that made Facebook visible on the network. Whenever someone tried to connect to Facebook, no AS could tell where to go. You can compare this with your friends sharing their live location so you can go and meet them, but suddenly you can’t see them on the map. It doesn’t mean they have evaporated or disappeared from the earth, but you no longer know where they are.
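The effect of a withdrawn announcement can be sketched in a few lines of Python. This is a deliberately simplified toy: the AS names, the topology and the function `find_path` are all hypothetical, and real BGP does not run a global search like this (each AS makes its own decisions based on the paths and policies it learns from its neighbours). It only illustrates that once nobody announces a prefix, there is no route to find, even though the destination network itself still exists.

```python
from collections import deque

# Hypothetical topology: which ASes are directly connected to each other.
LINKS = {
    "AS-home-isp": ["AS-transit-1"],
    "AS-transit-1": ["AS-home-isp", "AS-transit-2"],
    "AS-transit-2": ["AS-transit-1", "AS-social"],
    "AS-social": ["AS-transit-2"],
}


def find_path(start, prefix, links, announced):
    """Breadth-first search for the shortest chain of ASes from start
    to any AS that currently announces prefix. Returns None if there is none."""
    origins = announced.get(prefix, set())
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] in origins:
            return path
        for neighbour in links.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# The (made-up) social network announces its documentation-range prefix.
announced = {"203.0.113.0/24": {"AS-social"}}
before = find_path("AS-home-isp", "203.0.113.0/24", LINKS, announced)

# The announcement is withdrawn; the links between the ASes stay untouched.
announced["203.0.113.0/24"].discard("AS-social")
after = find_path("AS-home-isp", "203.0.113.0/24", LINKS, announced)
```

Here `before` is the chain of four ASes from the home ISP to the social network, while `after` is `None`: the network is still physically connected, but nobody says where the prefix lives any more.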
The outage has huge implications
Not being able to share a meme on Facebook for a few hours probably won’t do much harm, and neither will missing the chance to like a cat video. But communication channels like WhatsApp are commonly used on the work floor (or by students working on a project), and are increasingly used for contact with businesses. Now imagine an outage of Google’s public DNS service, which millions of people rely on. That would not only affect consumers, but would also disrupt commerce and critical infrastructure, factory production, fleet transport, and basically any industry you can think of.
The Facebook outage has drawn attention to how vulnerable the world is to failures of this nature. The internet is a complex system of systems in which much can go wrong and, following Murphy’s law, “anything that can go wrong, will go wrong”. It also leaves room for criminals to hijack BGP routes and redirect traffic to a malicious website; with 80,000 autonomous systems, it is not surprising that some would be untrustworthy.
Hopefully we can continue to share our social lives online without any interruptions for a long time, but if it happens again, you will know why and what the consequences are.
Sources:
https://www.wired.com/story/why-facebook-instagram-whatsapp-went-down-outage/
https://krebsonsecurity.com/2021/10/what-happened-to-facebook-instagram-whatsapp/