How Does DNS Failover Work?

Total Uptime’s DNS Failover automation is a powerful way of increasing availability for any type of web-based service or application that is accessible via multiple IP addresses. Some examples include:

An application behind two or more different ISP links
Two or more servers at the same or different sites
Two or more cloud or hosting provider virtual machines
(Essentially any number of static IPs anywhere!)

The failover functionality supports any type of application with an IP address, including:

Web Servers
APIs or Web Services
Mail Servers
VoIP Infrastructure
FTP/SFTP or other file transfer servers
Remote Desktop Services like RDP, VMware View, etc.
Anything that has an IP address and is accessible via DNS!

How DNS Failover Works in a Nutshell

Traditionally, to make an application available online, you publish its IP address in DNS. For example, vpn.example.com would point to the IP address 203.0.113.113, so when a user accesses the domain, they are routed to the correct server or device.

The idea behind DNS Failover is to increase service or application availability by automatically changing the IP address(es) given out in public DNS based on the availability of the device behind them. Without DNS Failover automation, if a device goes down, you must log into your DNS provider to change the address. Only after you make that change, and after the TTL expires in caches across the internet, will users be directed to the newly entered IP.

The DNS Failover system automates this entire process. It monitors every IP address the application is available on, using criteria you set, to determine whether the device is responding properly. If a device is responding to the monitors, it is considered healthy and DNS gives out the corresponding IP address when requested. If it is not healthy, DNS stops giving out that IP address.
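The monitoring idea described above can be sketched in a few lines of Python. This is only an illustration, not Total Uptime's implementation: the function names, the TCP-connect check, and the timeout are assumptions, and the article does not say what DNS returns if every address fails, so the sketch simply returns an empty list in that case.

```python
import socket


def tcp_check(ip, port, timeout=3.0):
    """Hypothetical monitor: True if a TCP connection to ip:port succeeds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the device as unhealthy.
        return False


def healthy_ips(ips, check):
    """Return only the addresses whose health check passes, in their original order."""
    return [ip for ip in ips if check(ip)]


if __name__ == "__main__":
    # A fake check stands in for real monitoring so the example runs offline.
    def fake_check(ip):
        return ip != "198.51.100.7"

    print(healthy_ips(["203.0.113.113", "198.51.100.7"], fake_check))
```

In a real deployment the check would be something like `lambda ip: tcp_check(ip, 443)`, run on a schedule, with the resulting list deciding which addresses DNS gives out.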

This automatic change takes about 2 minutes at minimum, but for applications that can tolerate a small amount of downtime before a downed device is switched or removed, this is a very cost-effective and easy-to-configure solution that requires minimal technical expertise.
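As a rough illustration of the timing, the sketch below adds the failover time to the record's TTL to estimate the longest a user with a freshly cached answer might keep using the old IP. The 120-second figure is the approximate minimum mentioned above. The TTL values and the function name are assumptions chosen for illustration, and the model is simplified: real resolvers and clients do not always honor TTLs exactly.

```python
FAILOVER_SECONDS = 120  # approximate minimum failover time from the text above


def worst_case_switch_seconds(ttl_seconds, failover_seconds=FAILOVER_SECONDS):
    """Estimate the longest wait before a client is directed to the new IP.

    Simplified model: a resolver cached the old answer just before DNS
    changed, so it keeps serving that answer for one full TTL afterwards.
    """
    return failover_seconds + ttl_seconds


if __name__ == "__main__":
    for ttl in (30, 60, 300):  # hypothetical TTL settings, in seconds
        print(f"TTL {ttl:>3}s: up to {worst_case_switch_seconds(ttl)}s")
```

Under this model, a shorter TTL on failover records shortens the tail of users still holding the old address.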

A DNS Failover example based on multiple ISPs

In the example of a single device behind two ISPs, each ISP provides a static IP address that is routed to the device. How you route it to the server is up to you, but the most common approach is NAT at the firewall.

Both of the public static IP addresses are added into Total Uptime’s DNS Failover system for monitoring. Even though they land on the same server, they get there via different ISPs, so this strategy essentially monitors each ISP path. If the ISPs are of equal capacity and quality, you can have both active at the same time (load balanced). In this scenario, DNS gives out both IP addresses at all times to distribute traffic between the two. The system continues to monitor each ISP's IP address, and as soon as one fails, it stops giving that address out until it starts working again.

A DNS Failover example based on two cloud servers

For the example of two devices serving the same web application at two different cloud providers, each cloud provider has assigned a static IP address necessary to reach the device.

Again, both of those public IPs are added into the Total Uptime DNS Failover system for availability monitoring. If both devices serve the same content and don’t require any special synchronization, then, as in the ISP example, both IPs can be given out in DNS at all times using the load balancing method, and the system can simply remove the one that fails.

If the servers do require special synchronization (e.g. a transactional ecommerce site), one of the IPs can be configured as the primary device and the other as the backup device. DNS will give out only the primary device IP until it fails health checks. At that point, the primary IP is replaced with the secondary IP.

In this scenario, when the primary device comes back online, it can resume receiving traffic, or it can remain suppressed until the organization has synchronized the servers, at which time the organization can manually fail back from the secondary to the primary.
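The primary/backup behavior, including the choice between automatic and manual failback, can be modeled as a small state machine. The sketch below is an assumption-laden illustration rather than the service's actual logic: the class name, the `auto_failback` flag, and the method names are hypothetical, and for brevity it tracks only the primary's health.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverPair:
    """Hypothetical model of a primary/backup pair of IP addresses."""

    primary: str
    backup: str
    auto_failback: bool = True
    active: str = field(init=False)

    def __post_init__(self):
        # DNS starts out giving the primary address.
        self.active = self.primary

    def update(self, primary_healthy):
        """Apply one health-check result for the primary; return the IP to publish."""
        if not primary_healthy:
            self.active = self.backup
        elif self.auto_failback:
            self.active = self.primary
        # Otherwise the primary is healthy but stays suppressed
        # until fail_back() is called.
        return self.active

    def fail_back(self):
        """Manual failback, e.g. after the servers have been synchronized."""
        self.active = self.primary
        return self.active


if __name__ == "__main__":
    pair = FailoverPair("203.0.113.1", "198.51.100.1", auto_failback=False)
    for healthy in (True, False, True):
        print(healthy, pair.update(healthy))
    print("manual failback:", pair.fail_back())
```

With `auto_failback=False`, the backup address keeps being published after the primary recovers, which gives the organization time to synchronize data before switching back.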

Of course, there are many more examples that could be demonstrated, but the above two reflect the most common use cases. Here is another article that goes into further detail on health checks: How does the DNS Failover Service determine if a server is down, thus requiring a failover?

How to Point DNS to Total Uptime

To give Total Uptime the authority to answer DNS requests for the domain, the domain must be routed to our platform. There are three ways to accomplish that:

Move the entire domain to Total Uptime
This method takes a little extra work, but the convenience and security of managing your entire domain at a single provider through a single panel (or API) outweigh the extra effort.
Point just the critical records to Total Uptime with NS records
If you can’t (or don’t want to) move your domain to Total Uptime, your DNS provider might allow you to create NS records that point just the critical DNS record(s) to us. This is not possible for the root domain (often called the apex or naked domain), but it does work for sub-domains. You do this by creating NS records for the sub-domain (vpn.example.com, for example) that point to our name servers. That way, when a request comes in for that specific sub-domain, it is sent to Total Uptime's DNS servers for the response.
Point just the critical records to Total Uptime with CNAME records
This option is much like the NS record option above and is useful when your DNS provider doesn’t allow you to create NS records. In that case, you can request a special DNS record from Total Uptime, which might be yourcompany.totaluptime.net. Within that domain, you can create your failover pools, and then at your existing DNS provider you can create CNAME records that point to the Total Uptime record. As with NS records, this does not work for the root or apex domain of the zone.
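To make the second and third options concrete, the sketch below builds the zone-file style lines you might enter at your existing DNS provider, and rejects the apex, which neither option supports. Everything here is illustrative: the helper name, the name server hostnames (ns1.placeholder.example, ns2.placeholder.example), and the CNAME target under yourcompany.totaluptime.net are hypothetical placeholders, not actual Total Uptime values.

```python
def delegation_records(zone, name, method, targets):
    """Return zone-file style lines pointing one sub-domain elsewhere.

    method is "NS" (one line per name server) or "CNAME" (exactly one target).
    """
    zone = zone.rstrip(".").lower()
    name = name.rstrip(".").lower()
    targets = [t.rstrip(".").lower() for t in targets]

    if name == zone:
        # The root/apex cannot be pointed this way; move the whole domain instead.
        raise ValueError("NS/CNAME pointing does not work for the apex domain")
    if not name.endswith("." + zone):
        raise ValueError(f"{name} is not a sub-domain of {zone}")

    if method == "NS":
        return [f"{name}. IN NS {t}." for t in targets]
    if method == "CNAME":
        if len(targets) != 1:
            raise ValueError("a CNAME record has exactly one target")
        return [f"{name}. IN CNAME {targets[0]}."]
    raise ValueError(f"unsupported method: {method}")


if __name__ == "__main__":
    # Placeholder name servers; use the ones your provider assigns.
    for line in delegation_records(
        "example.com", "vpn.example.com", "NS",
        ["ns1.placeholder.example", "ns2.placeholder.example"],
    ):
        print(line)
    # Hypothetical CNAME target inside the special Total Uptime domain.
    print(delegation_records(
        "example.com", "vpn.example.com", "CNAME",
        ["vpn.yourcompany.totaluptime.net"],
    )[0])
```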

Here is an article that goes into further detail: Can I use your DNS Failover without switching my DNS to you?

Frequently Asked Questions about DNS Failover

How much does it cost? DNS Failover is a feature included in all of our DNS plans. Our smallest 10 Domain plan includes 10 DNS Failover Pools.

What is a DNS Failover Pool? A pool is a collection or group of IP addresses. The IP addresses can be load balanced (all that pass the test are given out in DNS), or they can be configured in a primary, secondary (tertiary, etc.) order. Each group uses one pool, but a pool can be assigned to multiple “A” records (or “AAAA” records in the case of IPv6). For example, if www.example.com has 203.0.113.1 as the primary IP address and 198.0.2.100 as the secondary, and other records like vpn.example.com and mail.example.com use exactly the same primary and secondary IP addresses, they can all be triggered by the same failover pool. If each of these records (www, vpn and mail) uses its own primary and secondary addresses, you would need 3 failover pools.
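A quick way to estimate how many pools a set of records consumes is to count the distinct ordered groups of IP addresses. The helper below is a hypothetical sketch: it compares only the addresses and their order, and assumes records that share a group also share the same mode and monitor settings. The addresses in the second example are made up for illustration.

```python
def pools_needed(records):
    """Count distinct ordered IP groups across records (name -> list of IPs)."""
    return len({tuple(ips) for ips in records.values()})


if __name__ == "__main__":
    shared = {
        "www.example.com": ["203.0.113.1", "198.0.2.100"],
        "vpn.example.com": ["203.0.113.1", "198.0.2.100"],
        "mail.example.com": ["203.0.113.1", "198.0.2.100"],
    }
    separate = {
        "www.example.com": ["203.0.113.1", "198.0.2.100"],
        "vpn.example.com": ["203.0.113.2", "198.0.2.101"],
        "mail.example.com": ["203.0.113.3", "198.0.2.102"],
    }
    print(pools_needed(shared), pools_needed(separate))
```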

How many IPs can a Failover Pool have? 128 IP addresses can be added to a pool for use in load balancing or a cascading failover.

Can I fail from one load balanced set of IPs to another? No, not at this time. You can fail from a single IP to another single IP, or, if you use the load balancing method, we simply remove IPs that no longer pass the monitoring test.

Can I weight the IPs if I load balance them so one device gets more traffic than another? No, we do not support weighting at this time.

Do you have more detailed help on all of the settings in the panel? We sure do. We have a fairly comprehensive online DNS manual which is always undergoing improvement.

What type of monitors do you have? We have over a dozen different monitor types. You can read about them in the online monitors manual or watch a video on how to create and manage monitors.

What if DNS Failover isn’t fast enough? Some organizations find that the 2-minute minimum failover time and the potential for longer DNS cache times are unacceptable. In this case, we highly recommend using our ADC-as-a-Service platform. It is similar, except we give you a static IP to publish in DNS that never needs to change. We then proxy traffic to the active IPs/devices, and the moment one fails, we stop sending traffic to it. Failover can happen in as little as 30 seconds, without relying on a DNS change to propagate across the internet.