How Fast is DNS Failover?

We’re often asked how quickly an ‘A’ or ‘AAAA’ record controlled by a DNS Failover pool can update across the internet.

TLDR: 2.5 minutes is about as fast as it can go.

If you’re curious how we came up with this number, here’s the backstory on how it all works.

DNS Failover update speed is determined by three things:

  1. The monitor attached to the failover pool entry
  2. The propagation of the DNS change across the Total Uptime network once the monitor detects a change
  3. The speed at which DNS caches clear the old record across the internet

The Monitor Attached to the Pool Entry

DNS Failover entries must be monitored to determine the availability of the device behind the IP. The health check interval and retry count will determine how frequently the device is checked, and how many consecutive checks are required to confirm a state change for a device.

The fastest monitors we offer are the built-in 10-second checks. These monitors check once every 10 seconds and, for reliability, require 3 consecutive checks to alter the state. This means that one of these built-in monitors can detect a change in the device state within 30 seconds. If you choose to create your own monitor, the shortest interval you can configure is one check per minute, with a minimum of 1 additional retry.
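The detection-time arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and reading "a minimum of 1 additional retry" for a custom monitor as two checks in total is an assumption, not a documented behavior.

```python
def detection_time_seconds(interval_s: int, consecutive_checks: int) -> int:
    """Approximate time for one monitor to confirm a state change.

    Assumes the monitor needs `consecutive_checks` checks in a row,
    spaced `interval_s` seconds apart, before it alters the state.
    """
    return interval_s * consecutive_checks


# Built-in monitor: one check every 10 seconds, 3 consecutive checks.
builtin_seconds = detection_time_seconds(interval_s=10, consecutive_checks=3)

# Custom monitor at its fastest setting (assumption: the first check
# plus 1 additional retry counts as 2 consecutive checks).
custom_seconds = detection_time_seconds(interval_s=60, consecutive_checks=2)

print(builtin_seconds)  # 30
print(custom_seconds)   # 120
```

Under these assumptions, a custom monitor at its fastest setting takes roughly four times as long as a built-in 10-second monitor to confirm a change.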

It is also important to note that DNS failover triggering requires confirmation in order to minimize false failures. We monitor your device(s) in the DNS Failover Pool entry from 8 regions around the world at all times, each checking at the interval of the monitor. Just because one of our monitoring regions detects a device state change doesn’t necessarily mean the device is down. There is always the possibility that it is an ISP issue somewhere between our monitoring endpoint and your device.

In the panel, you control how many of our 8 global monitoring regions must agree on a device state change before making a DNS change. This is what we call the “Failover After” value. We set it to 5 by default, meaning 5 of the 8 regions must see a state change, but you can increase or decrease this to suit your requirements. Obviously if you decrease this value, you may get into false failover territory. And if you increase it, reliability definitely increases, but it may take a few seconds longer before enough regions report in with their status. So it’s a careful balance.
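The “Failover After” rule can be pictured as a simple quorum check. The sketch below is an illustration only, not the platform's actual implementation: the function and constant names are hypothetical, and it assumes each region reports a plain yes/no on whether it has confirmed the state change.

```python
REGION_COUNT = 8  # number of global monitoring regions described above


def should_fail_over(region_reports: list[bool], failover_after: int = 5) -> bool:
    """Return True when enough regions agree on a device state change.

    region_reports[i] is True when region i has confirmed the change.
    failover_after mirrors the "Failover After" value (default 5).
    """
    if len(region_reports) != REGION_COUNT:
        raise ValueError(f"expected {REGION_COUNT} region reports")
    if not 1 <= failover_after <= REGION_COUNT:
        raise ValueError("failover_after must be between 1 and 8")
    # sum() counts the True entries, i.e. the regions that agree.
    return sum(region_reports) >= failover_after


# 4 of 8 regions see the change: below the default threshold, no DNS change.
four_agree = should_fail_over([True] * 4 + [False] * 4)

# 5 of 8 regions see the change: the threshold is met.
five_agree = should_fail_over([True] * 5 + [False] * 3)

print(four_agree, five_agree)  # False True
```

Lowering `failover_after` makes the check pass with fewer agreeing regions, which is where the risk of false failover comes from.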

The propagation of the DNS change across our network

Once the monitor detects a device status change and the right number of regions agree, the DNS ‘A’ or ‘AAAA’ record change(s) in your zone(s) are queued across our platform. Pushing the DNS change out generally only takes a few seconds. While that’s not very significant, it is part of the overall process and is worth mentioning. Our SLA guarantees that pushing a DNS change out (manually or even automatically via DNS Failover automation) will not take more than 60 seconds, but this is often significantly faster.

The speed at which caches clear old DNS records

The last part of the process is waiting for DNS caches to clear across the internet. When you first attach your failover pool to your ‘A’ or ‘AAAA’ record, we automatically change the TTL of that record to 60 seconds for optimal update time.

Why not lower? Savvy customers know that DNS supports a TTL (time to live) down to 0 seconds, meaning “do not cache”. So why not set the TTL to the lowest possible setting? The answer has to do with ISP behavior we’ve studied over the years. We have found that when the TTL is lower than 60 seconds, ISPs tend to ignore these low values and replace them with their own arbitrary setting, often ranging from 300 to 3600 seconds. We assume they do this to prevent unnecessary queries from their resolvers to the authoritative servers. What we have found is that a TTL of 60 seconds generally does not get overwritten by ISPs and is reliably accepted.

This is why at Total Uptime the lowest TTL value we allow is 60 seconds. Who wouldn’t want near 100% assurance that a DNS change was going to take effect within 60 seconds, rather than a lower setting that might be ignored or overwritten by 25% or more of the ISP networks out there?

Once DNS resolvers expire their cache, the next DNS query that comes along will force a fresh query to the authoritative name servers of the zone to retrieve the new IP that is waiting there. And that’s it!
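To make the cache-expiry behavior concrete, here is a minimal simulation of a caching resolver that honors a 60-second TTL. It is a sketch under stated assumptions: the class, the host name and the IP addresses (taken from documentation ranges) are all hypothetical, and real resolvers are considerably more involved.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CachedRecord:
    ip: str
    expires_at: float  # time (in seconds) at which the entry goes stale


class ResolverCache:
    """Toy caching resolver: answers from cache until the TTL runs out."""

    def __init__(self, authoritative_lookup: Callable[[str], str], ttl_s: float = 60):
        self._lookup = authoritative_lookup
        self._ttl_s = ttl_s
        self._cache: dict[str, CachedRecord] = {}

    def resolve(self, name: str, now: float) -> str:
        entry = self._cache.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.ip  # still fresh: serve the cached (possibly old) IP
        ip = self._lookup(name)  # expired or missing: ask the authoritative side
        self._cache[name] = CachedRecord(ip, now + self._ttl_s)
        return ip


# Stand-in for the authoritative name servers of the zone.
authoritative = {"www.example.com": "192.0.2.10"}
cache = ResolverCache(authoritative.__getitem__)

first = cache.resolve("www.example.com", now=0)          # cached until t=60
authoritative["www.example.com"] = "192.0.2.20"          # failover happens
still_cached = cache.resolve("www.example.com", now=59)  # old IP still served
refreshed = cache.resolve("www.example.com", now=60)     # TTL expired, new IP

print(first, still_cached, refreshed)
```

The old address keeps being served until the 60 seconds are up, which is why the TTL contributes up to a minute to the overall failover time.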

Adding it all up

How did we come up with the 2.5-minute change time we mentioned at the beginning of this post?

1. 30 seconds for our fastest monitor to detect a state change
2. 60 seconds max for our DNS network to be fully updated
3. 60 seconds (or so) for DNS resolver caches to clear around the world and refresh

In practice, we find that step 1 might take 40 seconds, step 2 about 30 seconds, and step 3 a little additional time. So all things considered, we reliably see changes in about 2.5 minutes, regardless of how the time is split across the three steps above.
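The three budget figures listed above add up as follows. The dictionary keys are hypothetical labels chosen for this sketch; the numbers are the ones from the list.

```python
# Time budget for each step, in seconds, as listed above.
BUDGET_SECONDS = {
    "monitor_detection": 30,    # fastest built-in monitor: 3 checks x 10 s
    "network_propagation": 60,  # SLA maximum for pushing the DNS change
    "resolver_cache_ttl": 60,   # TTL applied to the failover record
}

total_seconds = sum(BUDGET_SECONDS.values())
total_minutes = total_seconds / 60

print(f"{total_seconds} s = {total_minutes} min")  # 150 s = 2.5 min
```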

If you have any questions about DNS Failover and how we can help you achieve automated IP changes, feel free to contact us. We’re pretty confident we can help your organization increase availability online.
