A Primer on the Domain Name System (DNS)
In the early days of the Internet, humans and other computers located the few massive interconnected computers the same way: by their numeric Internet Protocol (IP) addresses. Soon, the impracticality of memorizing all these numbers became obvious and a rudimentary naming scheme was developed. A central repository of names and their associated IP addresses was created and maintained in the form of a plain list. Periodically, administrators would connect to this central repository and download the current list of computer names. As the number of computers (hosts) on the Internet increased, this file began to grow exponentially and keeping the hosts file up-to-date became a much more daunting task. The early pioneers of the Internet realized that this system would prove to be very difficult to scale.
With that, the host-naming process went under the knife in order to develop a more scalable system with distributed management. And so, the Domain Name System was born. The proud parents proceeded to document every detail of their creation in the DNS Internet standards, RFC 1034 and RFC 1035.
DNS fulfilled its goal of becoming an efficient, distributed and scalable system for resolving human-readable hostnames to network-usable IP addresses. In fact, DNS even included support for other classes of addresses (“CHAOS” and “Hesiod”) but these are not in wide usage today.
The designers also decided on a particular hierarchy. They created a naming format in which the computer would be known by its name, followed by a hierarchical list of domains, which were simply logical zones that the computer fell into for management purposes. These names are separated by dots and written in reverse order (broadest domain last). For example, a typical name looks like this:
www.totaluptimetech.com
Since the domain name tree is read from right to left, we can see that the broadest domain is “com”. The “com” domain represents the portion of the name space that is set aside for commercial enterprise, and most typically for commercial entities within the United States. This portion of the name is known as the top-level domain.
“totaluptimetech” represents what is called the second-level domain name and generally represents the firm that actually owns or controls the domain and all the hostnames beneath it.
This leaves “www”, which is the hostname or the actual computer name as assigned by local administration. Incidentally, the combination of the second-level and top-level domain is what is commonly referred to as the domain name.
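The breakdown above can be sketched in a few lines of Python. This is only an illustration: the function name `split_hostname` is made up for this example, and it assumes the simple case described here, where the top-level domain is a single label (it does not handle registries that delegate beneath two labels).

```python
def split_hostname(fqdn: str) -> dict:
    """Split a name such as www.totaluptimetech.com into the parts named above.

    Assumes a single-label top-level domain; illustrative only.
    """
    labels = fqdn.rstrip(".").lower().split(".")
    if len(labels) < 3:
        raise ValueError("expected at least host.second-level.top-level")
    return {
        "top_level": labels[-1],               # broadest domain, written last
        "second_level": labels[-2],            # the registered name
        "domain_name": ".".join(labels[-2:]),  # second-level + top-level
        "hostname": ".".join(labels[:-2]),     # assigned by local administration
    }


parts = split_hostname("www.totaluptimetech.com")
print(parts["hostname"], parts["domain_name"])  # www totaluptimetech.com
```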
Entities who wish to set up and control their own domains must choose a top-level name space to operate in, and decide on a second-level name. Then, they must contact one of a group of central authorities that oversee the top-level name space and register this domain name. These central authorities are appropriately deemed registrars.
While selecting a second-level domain name that directly represents your firm’s name is a convention, there is no rule requiring it. Anyone can register any name they choose providing that the standard naming rules are adhered to and the name is available.
Now that we know why a naming system exists and have seen what a name looks like, let’s take a look at how a name actually gets translated back into an address that computers can use. The easiest way to understand this process is with an example.
Let’s imagine that an Internet user wants to look up information on a Web site. We’ll use a fictitious site, “https://www.gwlg.com”.
First, the user types the above URL into his Web browser and the computer begins processing the Web request. The user’s computer must find the IP address for “www.gwlg.com” before it can contact the correct server. (The initial characters “https://” simply tell the Web browser what protocol to use to contact the remote computer once it has an address.)
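As a rough illustration of this first step, the sketch below separates the protocol from the hostname using `urlsplit` from Python's standard library. The helper name `hostname_from_url` is hypothetical; a real browser does considerably more work than this.

```python
from urllib.parse import urlsplit


def hostname_from_url(url: str) -> str:
    """Return the hostname that must be resolved before connecting."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no hostname found in {url!r}")
    return parts.hostname


url = "https://www.gwlg.com"
print(urlsplit(url).scheme)    # https  (the protocol to use afterwards)
print(hostname_from_url(url))  # www.gwlg.com  (the name handed to the resolver)
```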
The Web browser then turns this hostname over to the resolver. A resolver is just a computer program or process that runs silently on any computer connected to the Internet that needs to be able to translate names to IP addresses. Its sole purpose is to perform this task.
The resolver checks its own internal tables to see if it has any information stored or cached containing the IP address for the requested hostname. If not, the resolver checks its configuration for the IP address of a name server to which it can pass the query. In most cases, the client resolver will have to connect to this name server to answer the query, unless the name has been looked up recently and is in the resolver cache or has been manually entered in the local resolver’s host table.
The next server in the chain for the average user is usually a DNS server that an Internet Service Provider owns and maintains. Corporate entities may maintain their own DNS servers. The location of the server is not important, so long as the resolver can connect to it reliably. This server is usually very near the user’s Internet connection point and is referred to as the recursive server.
It is at this step in the resolution chain that we must introduce the concept of authority. Every registered domain name is required to select two or more name servers that will pass on “official” data for that domain name to the rest of the Internet when asked. Domain owners can’t just pick any name server. They must either maintain their own name server or coordinate with the administrators of an existing server to provide DNS services.
The recursive server, upon receiving a query from the resolver, checks to see if it is authoritative and has the information for the zone requested in its local configuration or cache. If a server is supposed to be authoritative for a domain but does not have any information for the domain or isn’t configured to handle it, it causes a “lame delegation,” where no queries are properly answered.
If the recursive server can’t resolve the query on its own, it too will need a “next step” to take. But, what servers can it ask, and how does it know about them? A name server that responds to queries should have a file containing a listing of names and addresses of Internet root servers.
A root server is a particular type of domain name server on the Internet that stores the delegations for the top-level domains. Simply, root servers tell the recursive servers where to go to find out more information about the domain they are querying: the root refers the recursive server to the name servers for the top-level domain (here, “com”), and those servers hold the second-level delegations and refer it onward to the name servers for the domain itself.
Now, our query has made it from the client resolver to the recursive server to the root server, and this chain of referrals has identified the server that should have the necessary information to respond to the query. We will call this the authoritative server. Its address is passed back to the recursive server.
Finally, the recursive server must contact the authoritative server and issue the original query. The authoritative server returns the result to the recursive server, which delivers it to the client and typically caches it for later queries.
The client resolver hands off the IP address in question to the actual network protocols to locate the IP address and establish a connection, and the resolution process is complete.
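The whole chain can be imitated with a small in-memory simulation. Everything here is an assumption made for illustration: the class names, the server name “ns1.gwlg.com” and the address “192.0.2.10” (taken from a range reserved for documentation) are invented, the referral step is collapsed into a single lookup table, and no real network traffic is involved.

```python
# Hypothetical data standing in for real servers.
REFERRALS = {"gwlg.com": "ns1.gwlg.com"}  # domain -> its authoritative server
AUTHORITATIVE_DATA = {"ns1.gwlg.com": {"www.gwlg.com": "192.0.2.10"}}


class RecursiveServer:
    """Follows referrals on behalf of clients and caches the answers."""

    def __init__(self):
        self.cache = {}
        self.log = []  # records which upstream steps were actually taken

    def resolve(self, hostname):
        if hostname in self.cache:
            return self.cache[hostname]
        domain = ".".join(hostname.split(".")[-2:])
        self.log.append(f"referral lookup for {domain}")
        authoritative = REFERRALS[domain]
        self.log.append(f"query {authoritative} for {hostname}")
        address = AUTHORITATIVE_DATA[authoritative][hostname]
        self.cache[hostname] = address
        return address


class Resolver:
    """Client-side resolver: host table first, then cache, then the server."""

    def __init__(self, recursive, hosts_table=None):
        self.recursive = recursive
        self.hosts_table = dict(hosts_table or {})
        self.cache = {}

    def lookup(self, hostname):
        for table in (self.hosts_table, self.cache):
            if hostname in table:
                return table[hostname]
        address = self.recursive.resolve(hostname)
        self.cache[hostname] = address
        return address


server = RecursiveServer()
client = Resolver(server)
print(client.lookup("www.gwlg.com"))  # walks the full chain
print(client.lookup("www.gwlg.com"))  # answered from the resolver cache
print(server.log)                     # only the first lookup left a trace
```

A name that is missing from the tables simply raises `KeyError` in this sketch; a real server would return an error response instead.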
Resource records are the foundation of DNS. Every piece of information that DNS can provide about a host or domain is stored as a resource record (RR). Dozens of different resource record types exist to help define the types of DNS information available. We’ll take a look at a few common types.
ADDRESS (A)
Address (A) records are the “meat” of DNS. The A record stores the IP address associated with a given hostname. Most DNS operations are queries for A records. A few things to note about A records:
- An A record must always point to a single IP address. No other form of entry is acceptable.
- Multiple A records can be entered with the same name (called a label). The DNS server will return all the IP addresses listed. Clients will generally try the first address listed, so order is important. Depending on implementation, this order can be round robin or selected based on topological proximity.
- Multiple labels can be assigned the same IP address. In this case, querying any one of the labels will return the IP address.
- It is possible to have an A record for a label that has the same name as your domain. For example, “gwlg.com” is a domain but an A record can also be created to make “gwlg.com” directly resolve to an IP address.
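To make the round-robin behavior concrete, here is a minimal sketch of a set of A records that share one label. The class name `ARecordSet` and the addresses (from a documentation range) are assumptions for this example; real servers differ in how, and whether, they rotate answers.

```python
from collections import deque


class ARecordSet:
    """All A records that share one label, answered in round-robin order."""

    def __init__(self, label, addresses):
        self.label = label
        self._addresses = deque(addresses)

    def answer(self):
        """Return every address, then rotate so the next answer starts later."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current


records = ARecordSet("www.gwlg.com", ["192.0.2.1", "192.0.2.2", "192.0.2.3"])
print(records.answer())  # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
print(records.answer())  # ['192.0.2.2', '192.0.2.3', '192.0.2.1']
```

Because clients generally try the first address, rotating the list spreads connections across all three hosts.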
CANONICAL NAME (CNAME)
Canonical Names (CNAMEs) are the DNS equivalent of aliases or symbolic links. This record’s function is to point a hostname to another hostname. For this to be useful the “destination” hostname must have an A record which points to some IP address. A few things to note about CNAMEs:
- CNAMEs can point to any hostname on any domain anywhere in the world regardless of who owns the domain or where it is located.
- CNAMEs require that both the destination host and the destination host’s A record (IP address) be returned in order to properly resolve. As such, CNAMEs are generally slower than A records and should be used sparingly.
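- A domain name cannot be used as a CNAME label, because the domain name itself must carry other records (such as SOA and NS) and a CNAME cannot coexist with other records for the same label. For example: setting up “gwlgeeks.com” to resolve as a CNAME to “www.someprovider.net” will not work.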
- CNAMEs cannot point to URLs, nor can they point to specific directories on your Web server. A CNAME can only point to a hostname with a valid A record.
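The extra step that a CNAME introduces can be sketched as follows. The zone contents, the target name “web.someprovider.net”, the address and the function name `resolve_with_cnames` are all hypothetical, and the hop limit is an arbitrary safeguard chosen for this example.

```python
# Hypothetical records keyed by (name, record type).
ZONE = {
    ("www.gwlg.com", "CNAME"): "web.someprovider.net",
    ("web.someprovider.net", "A"): "192.0.2.20",
}


def resolve_with_cnames(name, zone, max_hops=8):
    """Follow CNAMEs until an A record is found.

    Returns (chain_of_names, ip_address). Raises LookupError if the chain
    ends without an A record or does not finish within max_hops.
    """
    chain = [name]
    for _ in range(max_hops):
        if (name, "A") in zone:
            return chain, zone[(name, "A")]
        if (name, "CNAME") not in zone:
            raise LookupError(f"no A or CNAME record for {name}")
        name = zone[(name, "CNAME")]
        chain.append(name)
    raise LookupError("CNAME chain too long (possible loop)")


chain, address = resolve_with_cnames("www.gwlg.com", ZONE)
print(" -> ".join(chain), "->", address)
```

Each alias adds one more lookup before an address is available, which is the cost the notes above refer to.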
POINTER (PTR)
Pointer (PTR) records are essentially the opposite of A records, in that they resolve IP addresses back to hostnames. Although it is not a required function of DNS, some applications use a reverse lookup to authenticate or provide more information about a connected or connecting host. A few things to note about PTRs:
- Just as with a domain name, a name server must be configured to be authoritative for the block of IP addresses, and the servers above it in the reverse-lookup hierarchy must be aware of this delegation.
- The smallest standard block that can be authoritatively delegated is currently 256 IP addresses (otherwise known as a “/24” or, somewhat incorrectly, as a “Class C”). There are mechanisms, such as the one described in RFC 2317, to allow sub-delegation of PTR responsibility to even smaller blocks.
- You do not have to set up PTRs for your hostnames and your domains to resolve correctly. Some applications may call for it, but it is not required by any Internet standard.
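For IPv4, a reverse lookup is an ordinary query for a PTR record under the special “in-addr.arpa” domain, with the address octets written in reverse. The sketch below builds those names with Python's standard `ipaddress` module; the two function names and the sample address (from a documentation range) are assumptions for this example.

```python
import ipaddress


def ptr_query_name(address: str) -> str:
    """Name to query for the PTR record of an IP address."""
    return ipaddress.ip_address(address).reverse_pointer


def reverse_zone_for_slash24(address: str) -> str:
    """Reverse zone covering the /24 block that contains an IPv4 address."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    octets = str(network.network_address).split(".")[:3]
    return ".".join(reversed(octets)) + ".in-addr.arpa"


print(ptr_query_name("192.0.2.10"))            # 10.2.0.192.in-addr.arpa
print(reverse_zone_for_slash24("192.0.2.10"))  # 2.0.192.in-addr.arpa
```

The second name is the zone that a name server would have to be authoritative for in order to answer PTR queries for that block of 256 addresses.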
START OF AUTHORITY (SOA)
The SOA record defines the given name server’s authority for the domain. In addition to authority, the SOA record contains several configuration parameters for the domain, as follows:
- Person In Charge – Email address of the person responsible for the domain’s administration.
- Serial Number – This number must be incremented each time a change is made to the records for a domain/zone. If a zone is changed but the serial number is not updated, the secondary server will not acquire the new data when it refreshes its zone information.
- Refresh – How often, in seconds, a secondary name server is to check with the primary name server to see if an update is required.
- Retry – If a secondary server tries to poll the primary server and fails, the secondary should wait this number of seconds before trying again.
- Expire – If the secondary server is not able to update its data by contacting the primary server for this number of seconds, it will stop using the data it has for queries, in case the data is outdated or inaccurate.
- Minimum TTL – The TTL (time to live) is a per-record value that tells any querying name server how long it should keep that particular record in its cache. The Minimum TTL field was originally a zone-wide default used when a record does not explicitly specify its own TTL; RFC 2308 later redefined it as the length of time for which negative (“no such name”) answers may be cached.
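The way a secondary server might use these fields can be sketched as a small decision function. This is a simplified model, not how any particular name server is implemented: the names `SOA` and `secondary_action`, the sample values and the contact address are all hypothetical, and serial numbers are compared as plain integers, whereas real servers use the wrap-around serial arithmetic of RFC 1982.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SOA:
    person_in_charge: str
    serial: int
    refresh: int      # seconds between checks with the primary
    retry: int        # seconds to wait after a failed check
    expire: int       # seconds after which unrefreshed data is discarded
    minimum_ttl: int


def secondary_action(local: SOA, primary_serial: Optional[int],
                     seconds_since_last_contact: int) -> str:
    """Decide what a secondary does after trying to poll the primary.

    primary_serial is None when the primary could not be reached.
    """
    if primary_serial is None:
        if seconds_since_last_contact >= local.expire:
            return "stop answering"
        return f"retry in {local.retry}s"
    if primary_serial > local.serial:
        return "transfer zone"
    return f"up to date; check again in {local.refresh}s"


soa = SOA("hostmaster@gwlg.com", serial=2024010101, refresh=3600,
          retry=600, expire=604800, minimum_ttl=3600)
print(secondary_action(soa, 2024010102, 0))  # transfer zone
print(secondary_action(soa, 2024010101, 0))  # up to date; check again in 3600s
```

The second call shows why the serial number matters: if the zone changed but the serial did not, the secondary sees nothing to transfer.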
NAME SERVER (NS)
NS records supply the hostname of the authoritative name server(s) for the domain. Every domain must have an NS record and current RFC guidelines specify no fewer than two. Domains can also be divided into sub-domains as specified by local administrators and each sub-domain can have its own NS records.
MAIL EXCHANGER (MX)
MX records specify the hostname of the server that will handle mail for this domain. When you send mail to an address at “gwlg.com”, your local mail server has to contact the server that handles mail for “gwlg.com” and pass the email on to it. But which server should it contact? mail.gwlg.com? postoffice.gwlg.com? gwlg.com? Or, perhaps some server belonging to an entirely different domain?
Since the server that handles mail for a domain could feasibly be any server on the Internet, the host attempting to deliver mail must have some way to find out the address of the server to contact. This is precisely the role of the MX record.
The MX record has three parts: a domain name, a hostname and a preference value. The domain name for the above example would be “gwlg.com”. The hostname is the name of the server to which mail for this domain will be delivered. Incidentally, this server must also be configured to accept and handle mail for the given domain.
The preference value is a number (usually between 0 and 100) to indicate which MX record to try to use first if more than one exists. A lower number will always be used before a higher number. This allows for some redundancy if the preferred mail-handling host loses connectivity or the ability to accept mail for delivery.
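The selection rule can be sketched like this. The record values, the hostname “backup.gwlg.com” and both function names are assumptions for illustration; `is_reachable` stands in for an actual connection attempt, and records with equal preference are simply ordered by hostname here, which real mail servers need not do.

```python
def delivery_order(mx_records):
    """Order (preference, hostname) pairs so the lowest preference comes first."""
    return [host for _preference, host in sorted(mx_records)]


def pick_mail_host(mx_records, is_reachable):
    """Return the most preferred host that accepts a connection, or None."""
    for host in delivery_order(mx_records):
        if is_reachable(host):
            return host
    return None


# Hypothetical MX records for gwlg.com.
records = [(20, "backup.gwlg.com"), (10, "mail.gwlg.com")]

print(delivery_order(records))
# ['mail.gwlg.com', 'backup.gwlg.com']

# Simulate the preferred host being down: the backup is chosen instead.
print(pick_mail_host(records, lambda host: host != "mail.gwlg.com"))
# backup.gwlg.com
```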
A Few Caveats about MX records:
- MX records are not equivalent to email addresses. They cannot contain a user name, only a hostname. The mail server for your domain handles “everything before the ‘@’” on its own.
- MX records should never point to a CNAME record, only a host that has a valid A record.
- MX records cannot point to an IP address.
- The server you are pointing to will not begin handling mail for you until you let the server’s administrator know and that administrator configures the server to accept email for your domain.
WILDCARDS
The character used as a wildcard in most DNS implementations is the asterisk or “*” character. You may use this character in certain resource record types to match any hostname beneath your domain.
So what happens when you set up an A record for “*.gwlg.com” to point to “127.0.0.1”? Every hostname beneath the domain that does not have a record of its own will resolve to that IP address. So, with this entry in place and no other records defined, “www.gwlg.com” would resolve to “127.0.0.1”, “mail.gwlg.com” would resolve to “127.0.0.1” and even “we.love.gwlg.com” would resolve to “127.0.0.1”, all because of a single wildcard record that matches everything.
You can also do this for MX records. Suppose someone sends mail to an address at “gwlg.com”. We’ve just read that an MX record lookup for “gwlg.com” would be performed and that mail would be delivered to the host returned in the MX query. But what if someone sends mail to an address at a hostname beneath the domain, for example “www.gwlg.com” or “mail.gwlg.com”? The answer is that the mail will not be delivered, since an MX record for these hostnames does not exist. Rather than adding every possible hostname as an MX record, DNS allows you to specify “*.gwlg.com” as an MX record label, to catch all possible hosts.
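Wildcard matching can be approximated with a short lookup function. This is a deliberately simplified sketch: the function name `lookup_with_wildcard` and the sample records are invented, an explicit record (if one exists) is preferred over the wildcard, and the finer points of the real matching rules in RFC 4592 are ignored.

```python
def lookup_with_wildcard(name, records):
    """Return the data for name, falling back to the nearest wildcard label."""
    name = name.lower().rstrip(".")
    if name in records:
        return records[name]  # an explicit record always wins
    labels = name.split(".")
    # Replace leading labels with "*" one at a time: for we.love.gwlg.com,
    # try *.love.gwlg.com, then *.gwlg.com, then *.com.
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in records:
            return records[candidate]
    return None


# Hypothetical A records for the zone.
records = {
    "gwlg.com": "192.0.2.1",
    "*.gwlg.com": "127.0.0.1",
}

for host in ("www.gwlg.com", "mail.gwlg.com", "we.love.gwlg.com", "gwlg.com"):
    print(host, "->", lookup_with_wildcard(host, records))
```

The same function works for a table of MX data, since only the label matching is modeled here.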
We hope that this guide has given you a basic understanding of the Domain Name System. If you should desire a more in-depth tutorial, we recommend the book DNS and BIND, published by O’Reilly. There’s no substitute for a good book and hands-on experience, but the information contained here and in the other sections of our Website should be enough to get you started.