The Domain Name System (DNS): How Computers Find Each Other On Networks By Name

Introduction

Though you may not realise it, you have used DNS every time you have used the Internet. Almost every time you click in a web browser, send mail, or use FTP or telnet, you are using DNS. Computers are good at remembering numbers. Humans are much better at remembering names. Every computer on the Internet has a unique number and a name associated with it. If there were no names then each time you wanted to send mail to someone you would have to remember the number of their computer. There are millions of computers connected to the Internet, with more coming online every day. That’s a lot of numbers to remember. DNS provides the service whereby you don’t even have to know that the numbers exist.

DNS is the Domain Name System, the service used by computers to determine the addresses of other computers from their names. Computers on the Internet use this service to communicate with other computers in order to send mail, find web pages and perform anything else that requires contacting another computer.

Some Terminology

This would not be a real computer topic if there were not a lot of associated terminology. Firstly, a computer is often called a host or a node. The name of a computer is called a hostname. If the name of a web site is www.interaus.net, then the hostname is "www", the name up to the first dot. The section after the first dot, "interaus.net", is called the domain name. The host "www" is said to be in the "interaus.net" domain. Together the name "www.interaus.net" is the fully qualified domain name of the host, or FQDN.
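The terminology above can be illustrated with a few lines of Python. This is only a sketch: the function name split_fqdn is made up for the example, and it assumes the simple rule given in the text (the hostname is everything up to the first dot).

```python
def split_fqdn(fqdn):
    """Split a fully qualified domain name into (hostname, domain name)."""
    # Some tools print a trailing dot to mark the root; drop it if present.
    name = fqdn.rstrip(".")
    # The hostname is the part before the first dot, the domain is the rest.
    hostname, _, domain = name.partition(".")
    return hostname, domain


print(split_fqdn("www.interaus.net"))  # ('www', 'interaus.net')
```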

Network Structure

There can be domains within domains. In this example the domain "interaus" exists within the domain "net". There can be more than one domain within a domain. For example within the au domain there are the domains com, gov, edu, oz, csiro, org, net among others.

 

Within each of these domains there are more domains. In the edu.au domain there is "unsw" (University of NSW) and "mu" (Melbourne University). As can be seen, the domain name system represents a tree structure:

 

 



        |
        +-------------------+-------------------+
        |                   |                   |
        au (Australia)      nz (New Zealand)    fr (France)
        |
        +-----+-----+-----+-----+-----+
        |     |     |     |     |     |
        gov   edu   com   oz    net   csiro
              |
              +-----+
              |     |
              unsw  mu
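One way to picture the tree is as nested dictionaries, walked from the top down. The sketch below holds only the handful of domains mentioned in the text; the names TREE, path_from_root and exists are invented for the illustration.

```python
# An illustrative subset of the domain tree described in the text.
TREE = {
    "au": {
        "gov": {},
        "edu": {"unsw": {}, "mu": {}},
        "com": {},
        "oz": {},
        "net": {},
        "csiro": {},
    },
    "nz": {},
    "fr": {},
}


def path_from_root(domain):
    """Return the labels of a domain name ordered from the top of the tree down."""
    return list(reversed(domain.rstrip(".").split(".")))


def exists(domain):
    """Walk the tree from the top and report whether every label is present."""
    node = TREE
    for label in path_from_root(domain):
        if label not in node:
            return False
        node = node[label]
    return True


print(path_from_root("unsw.edu.au"))  # ['au', 'edu', 'unsw']
print(exists("unsw.edu.au"))          # True
print(exists("unsw.gov.au"))          # False
```

Reading a name from right to left is the same as walking the tree from the top down.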


      

History

This domain hierarchy has been in existence for less than 10 years. Until then, host names on the Internet were administered by the U.S. Network Information Center (NIC). The NIC maintained a single file containing the names of all computers connected to the Internet and their corresponding addresses. This file (called HOSTS.TXT) was routinely distributed (via FTP and email) to all computers connected to the Internet. Naturally, as the Internet grew, the bandwidth consumed by the distribution became unbearable. The load on the NIC alone was unsustainable. The NIC soon distributed HOSTS.TXT only to a smaller number of nominated hosts. All other computers on the net downloaded it from the nearest nominated host. Even then the load on the NIC was considerable and there was still too much bandwidth being used.

Another major problem with this system was that no two hosts on the net could have the same name. While the NIC allocated addresses it had no control over the names of hosts. If some host on the net appeared with the same name as another host then traffic would be disrupted.

The face of the Internet was changing as well. Many smaller single-user computers were being connected to the net, normally within companies. The administrators there wanted the flexibility to be able to change the addresses and host names around as needed. They could not afford to wait on the latest distribution of HOSTS.TXT from the NIC, which itself might be out of date by the time it arrived.

The domain name system was put to the Internet community at large in 1987 by way of the documents known as RFC 1034 and RFC 1035, which replaced earlier proposals from 1983.

Addresses

Every node on the Internet has a unique IP (Internet Protocol) address (or IP number). The IP address for www.interaus.net is 198.142.12.202. Addresses are all of the same format; they are made up of four sections of numbers, each from 0 to 255, separated by dots.


          1111.2222.3333.4444
          1st  2nd  3rd  4th
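The format can be checked mechanically. The following is a minimal sketch rather than a complete validator; parse_ipv4 is a name chosen for this example.

```python
def parse_ipv4(address):
    """Return the four sections of a dotted IP address as a list of integers."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four sections, got {len(parts)}")
    sections = []
    for part in parts:
        # Each section must be a plain decimal number in the range 0 to 255.
        if not (part.isascii() and part.isdigit()) or int(part) > 255:
            raise ValueError(f"invalid section: {part!r}")
        sections.append(int(part))
    return sections


print(parse_ipv4("198.142.12.202"))  # [198, 142, 12, 202]
```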

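Every IP address contains a network part and a host part, but the split is not in the same place in every IP address. It varies according to the class of the address. There are five classes of address, A to E. Classes A, B and C are used for ordinary hosts and differ in how many of the four sections identify the network; class D is used for multicast and class E is reserved.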


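          Class A (first section 0 to 127):

          network section
          | |
          109.xxx.xxx.xxx
              |         |
              host section

          Class B (first section 128 to 191):

          network section
          |     |
          163.xxx.xxx.xxx
                  |     |
                  host section

          Class C (first section 192 to 223):

          network section
          |         |
          202.xxx.xxx.xxx
                      | |
                      host section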
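The class rules can be written down in a few lines. The sketch below assumes the class alone decides where the network part ends, as in the diagrams; networks today are usually divided with explicit prefix lengths instead, so treat it as an illustration of the scheme described here. The function names are invented for the example.

```python
def address_class(address):
    """Return the class letter (A to E) from the first section of an IP address."""
    first = int(address.split(".")[0])
    if first <= 127:
        return "A"
    if first <= 191:
        return "B"
    if first <= 223:
        return "C"
    if first <= 239:
        return "D"  # multicast
    return "E"      # reserved


def split_network_host(address):
    """Split a class A, B or C address into (network part, host part)."""
    # Number of leading sections that identify the network in each class.
    network_sections = {"A": 1, "B": 2, "C": 3}
    cls = address_class(address)
    if cls not in network_sections:
        raise ValueError(f"class {cls} addresses have no network/host split")
    sections = address.split(".")
    n = network_sections[cls]
    return ".".join(sections[:n]), ".".join(sections[n:])


print(address_class("198.142.12.202"))       # C
print(split_network_host("198.142.12.202"))  # ('198.142.12', '202')
print(split_network_host("163.10.20.30"))    # ('163.10', '20.30')
```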

    

As can be seen, there is a limit to the number of hosts and networks available. One of the shortcomings of the current Internet Protocol (IPv4) is that the address space is running out. There is a new version of the Internet Protocol (IPv6, or IPng) which is not yet in common use. This new version addresses this and other problems in the current version.

Name Servers, Domains and Zones

The software that holds and serves information about particular domains is called the name server software. The host running the name server software is usually known as the name server. The name server contains the addresses of the hosts for which it is responsible. Sometimes this is every host in the domain. However, a name server will often contain information about only part of a domain. This part is called a zone. The name server with the information about a zone is said to have authority over that zone. It may not know about every host in the domain, but it knows which name servers hold the information for the parts it does not cover.

For example, a large company may have several subdomains set up for different activities, each subdomain containing several hosts. If the domain of the company is "company.com.au" the subdomains may be "finance.company.com.au", "marketing.company.com.au" and "research.company.com.au". There need only be one name server maintaining these subdomains. This name server will have knowledge of all hosts in all of these subdomains. As the company grows it may find that the research department wants to have control of its own hosts. The subdomain "research.company.com.au" can be delegated to the research department and research will run its own name server. The name server at "company.com.au" will then have knowledge of all hosts in "marketing.company.com.au" and "finance.company.com.au" and will know about the name server at "research.company.com.au". It is said to have authority over the zone "company.com.au". The name server at "research.company.com.au" has authority over the zone "research.company.com.au".
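The company example can be sketched as a lookup that finds the most specific zone containing a host. The name server names (ns.company.com.au and so on) and the function name are hypothetical; only the zone names come from the example.

```python
# Hypothetical zone data: each zone maps to the name server with authority over it.
ZONES = {
    "company.com.au": "ns.company.com.au",
    "research.company.com.au": "ns.research.company.com.au",
}


def authoritative_server(hostname):
    """Return (zone, name server) for the most specific zone containing hostname."""
    labels = hostname.rstrip(".").split(".")
    # Try the longest suffix first so a delegated subdomain wins over its parent.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in ZONES:
            return candidate, ZONES[candidate]
    raise LookupError(f"no known zone contains {hostname}")


print(authoritative_server("apollo.research.company.com.au"))
# ('research.company.com.au', 'ns.research.company.com.au')
print(authoritative_server("pc1.finance.company.com.au"))
# ('company.com.au', 'ns.company.com.au')
```

Hosts in the finance and marketing subdomains fall through to the parent zone because those subdomains were never delegated.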

How It All Works

All Internet clients, such as web browsers and FTP clients, use the name server to resolve names to addresses. The client uses what is called resolver software, or just the resolver, to achieve this. The resolver is usually a set of library routines linked into the client program. The resolver sends a query to the local name server. If the name server already knows the answer it responds immediately. Otherwise the name server must query a root name server (one at the top of the tree) for the information. There are actually several root name servers to ease the load.

The root name server will not usually hold the answer itself, but it knows which name servers have authority for the top of the name in question and refers the local name server to them. Those name servers may not know an authoritative answer either, but they will know a name server that does. The request works its way down the tree until an authoritative name server is found and a real response can be sent back to the resolver. From the point of view of the resolver this is the recursive method of resolving a name: it asks once and the local name server does the rest. The other method, called the iterative method, involves the querier doing more work. Using the iterative method the name server will respond with either the address of the host in question or the address of another name server which has better knowledge, and the querier must follow that referral itself. This is normally how the local name server talks to the root and the other name servers.
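The walk down the tree can be simulated without any network traffic. The sketch below uses a toy, in-memory set of name servers; the server names and the SERVERS table are invented for the illustration, and real name servers exchange DNS messages rather than Python dictionaries.

```python
# Hypothetical name servers. Each holds the answers it is authoritative for
# and referrals to servers that know more about a given domain.
SERVERS = {
    "root": {"answers": {}, "referrals": {"au": "ns.au"}},
    "ns.au": {"answers": {}, "referrals": {"com.au": "ns.com.au"}},
    "ns.com.au": {"answers": {}, "referrals": {"company.com.au": "ns.company.com.au"}},
    "ns.company.com.au": {
        "answers": {"www.company.com.au": "194.18.66.2"},
        "referrals": {},
    },
}


def query(server, name):
    """Ask one server. Returns ("answer", address) or ("referral", next server)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return "answer", data["answers"][name]
    # Pick the most specific referral whose domain contains the name.
    matches = [d for d in data["referrals"] if name == d or name.endswith("." + d)]
    if not matches:
        raise LookupError(f"{server} knows nothing about {name}")
    return "referral", data["referrals"][max(matches, key=len)]


def resolve(name, start="root", max_hops=10):
    """Follow referrals from the root until a server gives a real answer."""
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value  # a referral: ask the better-informed server next
    raise LookupError("too many referrals")


address, path = resolve("www.company.com.au")
print(address)  # 194.18.66.2
print(path)     # ['root', 'ns.au', 'ns.com.au', 'ns.company.com.au']
```

Here resolve plays the part of the querier following referrals, while query stands in for a single name server replying with either an answer or a pointer to a server with better knowledge.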

Caching

Talking to one of the root name servers for each query is an expensive process. It takes time and consumes bandwidth. When web browsing it would be annoying to wait for a query response for every single page, especially if one is following links on the same host or going to a popular search page. To speed up the process, name servers cache information. Whenever a name server receives information about another host it keeps that information for a period of time. If another client needs the same address the name server does not need to query the root name server. Furthermore, it is clever enough to know that a query for the address of host "sun.research.company.com.au" can be answered by the same name server as a query for the address of host "apollo.research.company.com.au". The length of time this data is kept is at the discretion of the administrator of the zone the data came from. It is transmitted in the answer to the query in the TTL (time to live) field and can vary from seconds to weeks. As it is, the root name servers receive about 6 queries per second.
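A cache that honours the TTL can be sketched in a few lines. The class name DnsCache, the address 194.18.66.7 and the 60 second TTL are all made up for the example, and the clock is passed in so the expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Keep answers until their TTL (in seconds) runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, time at which the entry expires)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self._clock() >= expires:
            # The TTL has run out, so the name must be looked up again.
            del self._entries[name]
            return None
        return address


# A fake clock makes the example repeatable.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("sun.research.company.com.au", "194.18.66.7", ttl=60)
print(cache.get("sun.research.company.com.au"))  # 194.18.66.7
now[0] = 61.0
print(cache.get("sun.research.company.com.au"))  # None
```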

Aliases

There can be more than one name associated with a host. It is often desirable to have the name of a host associated with the service it is providing. For example, web servers tend to run on hosts called "www", ftp sites are often on hosts called "ftp" and mail goes to a host called "mail", "mailer", or similar. DNS allows aliases to be specified for hostnames. A host may have more than one alias. Very often the same host in an organisation will run the news server, the web server and the anonymous ftp server and may go by the aliases "news", "ftp" and "www". This host will usually have another name as well which is not related to any service but is bound to the IP address of that host. This name is known as the canonical name of the host. A very simplified example of this concept is:

          server.company.com.au    194.18.66.2              <- canonical name
          news.company.com.au      server.company.com.au    <- news alias
          mail.company.com.au      server.company.com.au    <- mail alias
          www.company.com.au       server.company.com.au    <- web server alias
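Looking up an alias means following it to the canonical name and then to the address. A minimal sketch using the records from the example (the table and function names are invented for the illustration):

```python
# Records taken from the simplified example.
ADDRESSES = {"server.company.com.au": "194.18.66.2"}
ALIASES = {
    "news.company.com.au": "server.company.com.au",
    "mail.company.com.au": "server.company.com.au",
    "www.company.com.au": "server.company.com.au",
}


def resolve(name, max_hops=8):
    """Follow aliases to the canonical name, then return (canonical name, address)."""
    for _ in range(max_hops):
        if name in ADDRESSES:
            return name, ADDRESSES[name]
        if name not in ALIASES:
            raise LookupError(f"unknown host: {name}")
        name = ALIASES[name]
    raise LookupError("too many aliases in a row")


print(resolve("www.company.com.au"))  # ('server.company.com.au', '194.18.66.2')
```

The hop limit guards against two aliases that point at each other.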

Electronic Mail and MX Records

Mail is treated differently to other Internet services. One reason is that mail is a non-interactive service and many sites have mail without having a live IP connection. Name servers have special entries for mail. These are called MX records. MX stands for Mail eXchanger. An MX record specifies which host processes and forwards mail for a domain. This is why email addresses can usually be simplified to the form name@domain. The mail server program running on the mail exchanger host will know what to do when it receives mail.
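How a mailer might pick the mail exchanger for an address can be sketched as follows. The records and the address robi@company.com.au are hypothetical. One detail not covered above: each MX record also carries a preference number, and the host with the lowest number is tried first.

```python
# Hypothetical MX records: domain -> list of (preference, mail exchanger host).
MX_RECORDS = {
    "company.com.au": [
        (20, "backup.company.com.au"),
        (10, "mail.company.com.au"),
    ],
}


def mail_exchanger(email_address):
    """Return the preferred mail exchanger for the domain part of an address."""
    _, _, domain = email_address.rpartition("@")
    records = MX_RECORDS.get(domain)
    if not records:
        raise LookupError(f"no MX record for {domain}")
    # A lower preference value means the host should be tried first.
    return min(records)[1]


print(mail_exchanger("robi@company.com.au"))  # mail.company.com.au
```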

Some Tools

Two programs that let you manually look up addresses are "nslookup" and "dig". These both run on many Unix systems. They allow you to query name servers about fully qualified domain names. These programs can be used to retrieve all information available about an FQDN from the authoritative name server.
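Lookups can also be made from a program through the resolver library of the system. The sketch below uses the getaddrinfo call from the standard Python socket module; the wrapper name lookup is made up for the example. It asks for IPv4 addresses only.

```python
import socket


def lookup(name):
    """Return the sorted IPv4 addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the first element of sockaddr is the address as a string.
    return sorted({info[4][0] for info in infos})


# A numeric address is returned as-is, without consulting a name server.
print(lookup("127.0.0.1"))  # ['127.0.0.1']
```

Passing a hostname such as "www.interaus.net" instead sends the query through the resolver to the local name server; the result then depends on the network the program runs on.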

 

In Conclusion

What I have presented here barely scratches the surface of DNS. It presents some of the concepts and the basic way name servers work. It is useful to have some sort of understanding of what domain names are and how they are used. DNS provides a service to the services so that humans can get on with doing human things and not have to worry about the intricate details of computer addresses.

 

Byline

By Robi Karp. Robi is a software engineer with about 10 years industry experience. He is technical director of Fluffy Spider Technologies Pty. Ltd. and is currently on contract. He can be contacted via email: robi@fluffyspider.com.au