The October 2016 cyberattack on Dyn should have been an object lesson in how to build Domain Name System (DNS) infrastructure that can withstand a distributed denial-of-service (DDoS) attack. Unfortunately, I believe we have yet to incorporate the fundamental lesson from this attack.
The DDoS attack on Dyn occurred just over two years ago. The Mirai botnet, which consisted of hundreds of thousands of compromised “internet of things” devices, was used to flood Dyn’s authoritative DNS servers with traffic, rendering them incapable of responding to legitimate queries. Major organizations that relied on Dyn for their authoritative DNS service, including Twitter, CNN, Netflix and The New York Times, were unreachable for hours.
I believe one of the central takeaways from the Dyn attack was, as simple as it seems, that you shouldn’t put all your eggs in one basket. In DNS terms, this means that you shouldn’t rely exclusively on a single DNS provider to host your internet-facing DNS data. Organizations that relied solely on Dyn were unreachable for hours during the attack, whereas organizations that had taken the precaution of using multiple providers weathered it with minimal downtime.
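As a rough illustration of the “one basket” check, the sketch below groups a zone’s nameserver hostnames by provider and flags a delegation that depends on a single one. The function names and hostnames are hypothetical, and treating the last two DNS labels as the provider is a simplifying assumption that breaks for suffixes such as co.uk; a real check would consult the Public Suffix List.

```python
from collections import defaultdict


def providers_for(nameservers):
    """Group nameserver hostnames by a naive provider key.

    Assumption: the last two DNS labels identify the provider. This is
    only an approximation (it is wrong for suffixes such as co.uk).
    """
    groups = defaultdict(list)
    for host in nameservers:
        labels = host.rstrip(".").lower().split(".")
        groups[".".join(labels[-2:])].append(host)
    return dict(groups)


def single_basket(nameservers):
    """Return True when all nameservers fall under one provider key."""
    return len(providers_for(nameservers)) <= 1


# Hypothetical delegations, for illustration only.
one_provider = ["ns1.provider-a.example.", "ns2.provider-a.example."]
two_providers = one_provider + ["ns1.provider-b.example."]
```

Here `single_basket(one_provider)` is true and `single_basket(two_providers)` is false. The input list would come from the zone’s NS records, which this sketch does not look up.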
I gave a talk in London the month after the attack in which I reminded listeners of the “Multiple Egg-Basket” rule, something I had stopped mentioning years before because it struck me as too obvious to need saying. One of the attendees caught me during the next break and told me that his company happened to have exactly the setup I’d recommended: They were a Dyn customer, but they also used a handful of their own external DNS servers. As they relied heavily on their online presence, they used a third-party service to monitor the availability of their website 24 hours a day. During the hours-long attack on Dyn, they were only briefly unreachable.
It seems like a simple precaution, right? Unfortunately, it’s not always as simple as you might think. Synchronizing basic DNS data among multiple providers is the easy part. If, for example, you want to use Dyn and one of its competitors to host your internet-facing DNS data, you generally use one provider to manage that data and tell the other provider to get its copy of the data from the first; the second provider’s servers then act as what we refer to in the business as “secondary DNS servers.”
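One way to confirm that a secondary provider is keeping up is to compare the serial number in each server’s SOA record, since a zone’s serial changes whenever its data does. The sketch below assumes the serials have already been collected (for example, by querying each server for the zone’s SOA record); the function name, server names and serial values are hypothetical, and it only tests for equality rather than working out which copy is newer.

```python
def out_of_sync(primary_serial, secondary_serials):
    """Return the secondaries whose SOA serial differs from the primary's.

    secondary_serials maps a server name to the serial that server reported.
    """
    return sorted(
        name
        for name, serial in secondary_serials.items()
        if serial != primary_serial
    )


# Hypothetical readings, for illustration only.
primary = 2018110502
secondaries = {
    "ns1.provider-b.example.": 2018110502,
    "ns2.provider-b.example.": 2018110501,  # has not picked up the latest change
}
stale = out_of_sync(primary, secondaries)
```

With these readings, `stale` contains only `ns2.provider-b.example.`. A brief mismatch right after a change is normal, so a monitor built on this idea would want to alert only when the difference persists.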
However, many DNS providers now offer, and some customers use, value-added services, such as traffic distribution based on a querier’s location, which lets a customer direct a querier to the closest web server. In my experience, it’s those value-added services that pose a problem, because there’s no standard way to synchronize their configuration among providers. If you laboriously configure Provider A’s system with rules to send all of your customers to the closest web or application servers you offer, you have to do the same with Provider B, using its own proprietary interface. And if you change Provider A’s configuration in real time in response to conditions, for example when one of your web or application servers fails or is taken down for maintenance, there’s no standard way for that provider to inform the other.
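To make the duplication concrete, here is a minimal sketch in which one provider-neutral routing policy has to be rendered into two different configuration formats. Everything in it is an assumption made for illustration: the policy layout, the region codes, the two output formats and all field names are invented and do not correspond to any real provider’s interface. The IP addresses come from ranges reserved for documentation.

```python
# One neutral policy: which server answers queries from which region.
POLICY = {
    "record": "www.example.com.",
    "regions": {"EU": "192.0.2.10", "US": "198.51.100.20"},
    "default": "198.51.100.20",
}


def to_provider_a(policy):
    """Render the policy as an ordered rule list (invented format A)."""
    rules = [
        {"match_region": region, "answer": ip}
        for region, ip in sorted(policy["regions"].items())
    ]
    rules.append({"match_region": "*", "answer": policy["default"]})
    return {"name": policy["record"], "rules": rules}


def to_provider_b(policy):
    """Render the same policy as per-address pools (invented format B)."""
    pools = {}
    for region, ip in policy["regions"].items():
        pools.setdefault(ip, []).append(region)
    return {
        "fqdn": policy["record"],
        "pools": [
            {"address": ip, "geo": sorted(geo)}
            for ip, geo in sorted(pools.items())
        ],
        "fallback": policy["default"],
    }


def mark_down(policy, failed_ip, replacement_ip):
    """Return a copy of the policy with a failed server swapped out."""
    def swap(ip):
        return replacement_ip if ip == failed_ip else ip

    return {
        "record": policy["record"],
        "regions": {r: swap(ip) for r, ip in policy["regions"].items()},
        "default": swap(policy["default"]),
    }


# A server fails: the policy changes once, but both renderings must be
# regenerated and pushed to each provider separately.
updated = mark_down(POLICY, "192.0.2.10", "198.51.100.20")
config_a = to_provider_a(updated)
config_b = to_provider_b(updated)
```

The point of the sketch is the last three lines: a single change to the policy produces two unrelated payloads, and without a shared format or API, keeping them aligned is left to the customer.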
There have been discussions within the Internet Engineering Task Force, the organization responsible for developing and enhancing internet protocols, about a standard means of specifying and synchronizing these value-added services, but there is still progress to be made. Even if such a mechanism existed, there wouldn’t be much incentive for providers to support it: Most providers charge customers based on the volume of queries they receive, so when you make it easy for another provider to serve one of your customers, you’re making it just as easy for them to take some of your revenue.
But the benefits of using multiple DNS providers, in my opinion, are important enough for customers to insist that their providers offer some mechanism — perhaps based on the transfer of well-documented metadata or the use of a well-designed API — to synchronize these value-added services. Only then can we implement the lessons that the attack on Dyn should have taught us.