As Internet users, we are accustomed to navigating between websites using their unique address names, known as domain names (a URL, strictly speaking, is the complete address, of which the domain name is only one part). Computers, however, communicate with one another through a different addressing system: a series of numbers called an Internet Protocol (IP) address. The massive distributed database which links easily remembered domain names to their numeric IP addresses is called the Domain Name System (DNS), and it has been the basis for a usable Internet since it was devised in the early 1980s.
Under the current Internet Protocol, known as IPv4, all computers connected to the Internet identify themselves by means of a unique number or address, written as four numbers, each between 0 and 255, separated by periods. For example, the IP address of Helium.com, according to the publicly searchable Network Solutions database, is 67.59.146.30. This means that, whenever a person types "www.helium.com" (known as a domain name) into a Web browser, the computer contacts a separate DNS server, learns that it needs to communicate with the computer at that IP address, and then does so, all without making this process visible to the end user.
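To make the four-number format concrete, here is a minimal Python sketch that takes the address quoted above and shows that the dotted form is simply a human-friendly way of writing one 32-bit number. The variable names are illustrative only, and the sketch uses Python's standard ipaddress module rather than anything specific to this article.

```python
import ipaddress

# The address quoted above for Helium.com.
addr = ipaddress.IPv4Address("67.59.146.30")

# Split the dotted form into its four parts, each between 0 and 255.
octets = [int(part) for part in str(addr).split(".")]

# The same address as a single 32-bit integer, which is how machines see it.
as_integer = int(addr)

# Rebuild the integer by hand to show how the four parts combine.
rebuilt = (octets[0] << 24) + (octets[1] << 16) + (octets[2] << 8) + octets[3]

print(octets)
print(as_integer == rebuilt)
```

Each of the four parts occupies eight bits, which is why none of them can exceed 255.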
- ARPANET and Hosts.txt -
When the first version of the Internet (known as the ARPANET) was brought online by the U.S. military and its university partners beginning in 1969, the fact that computers were identified by lengthy number sequences was not a problem. Only a limited number of technical specialists actually needed to use the system, and each needed to communicate with only a small number of other computers. However, as servers proliferated and the userbase gradually shifted from a small group of computer scientists to a large group of non-specialized users, communicating by numeric address quickly became untenable.
Even before the current-day DNS was created, steps had already been taken to solve this problem. Computers could download a Hosts file from the Network Information Center (NIC), a project run by the Stanford Research Institute (a non-profit corporation now known as SRI International). The Hosts.txt file contained an index of all known host names and their numeric addresses. This approach, too, was overtaken by the rapidly growing Internet. SRI's resources and bandwidth simply could not keep up with mounting demand for its Hosts list. In addition, although SRI maintained the Hosts list, it had no authority over the assignment of names, meaning there were no checks in place to prevent duplicate names from being used and creating enormous confusion.
Finally, the SRI system required every computer user to download an entire index of Internet servers often enough to stay up to date. Given the limited memory and storage space of the computers of the day, this would foreseeably become a serious problem as the number of servers grew into the hundreds, then the thousands, and (much later) the millions.
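The approach described above can be sketched in a few lines of Python. This is a simplified illustration, not the historical Hosts.txt format: the two-column layout, the host names, and the addresses are all hypothetical. It shows the two weaknesses just discussed: every machine holds the whole table, and a duplicate name can only be detected after the fact, when the file is read.

```python
# Hypothetical entries in a simplified "address name" layout (an assumption,
# not the real Hosts.txt syntax).
HOSTS_TEXT = """\
# address    name
10.0.0.1     host-a
10.0.0.2     host-b
"""


def parse_hosts(text):
    """Load the whole table into memory, as every machine had to do."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, name = line.split()
        key = name.lower()
        if key in table:
            # Nothing in a flat file prevents two sites choosing the same name.
            raise ValueError(f"duplicate host name: {name}")
        table[key] = address
    return table


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["host-a"])
```

Looking up a name is a simple dictionary access, but the cost is that the full file must be fetched again whenever any single entry changes.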
- The Invention of DNS -
The solution, set out in a series of technical papers between 1981 and 1983, was to devise a replacement system, known as the Domain Name System (DNS). In the DNS, authority over all new domain names would be established in a hierarchical fashion. Software called the Berkeley Internet Name Domain (BIND) was written at the University of California at Berkeley to implement the new system. Two innovations made the system workable: the creation of hierarchical domain names, and the creation of central servers to track the growing number of new names.
A hierarchical domain system meant that the critical decisions about how to parcel out new names would be delegated to a defined authority at each level. Jon Postel and Joyce Reynolds, who set out the requirements for the new domains, created a short list of so-called top-level domains, which appear at the end of a domain name: .com, .org, .net, .edu, .gov, and so on. A separate list of national top-level domains was subsequently created, such as .ca (for Canada), .uk (for the United Kingdom), and the always-popular .tv (for Tuvalu). Initially, the Department of Defense instructed SRI, maintainers of the original Hosts.txt, to handle registration of new domains under these top-level domains (domains like Helium.com, for instance). However, each person or group who registered a domain name was then authorized to create and maintain, under their own authority, any number of subdomains (like en.wikipedia.org, a subdomain of wikipedia.org). As long as each level of the hierarchy controlled the creation of the level directly below it, there would be no unwanted duplication. This means that "google.ca" and "google.com" could refer to entirely different websites, although in practice those particular two belong to the same company.
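The hierarchy is visible in the name itself when it is read from right to left. The short Python sketch below (the function name is an illustrative choice) splits a domain name into its labels and lists each level of authority, starting from the top-level domain.

```python
def hierarchy(domain):
    """Return each level of a domain name, from the top-level domain down."""
    labels = domain.rstrip(".").lower().split(".")
    # Walk from the last label (the top-level domain) back to the first.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


print(hierarchy("en.wikipedia.org"))
# ['org', 'wikipedia.org', 'en.wikipedia.org']
```

Each entry in the list is controlled by whoever holds the entry before it, which is what prevents two parties from creating the same name.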
The second important innovation was the creation of centralized servers, or Domain Name servers, to maintain the new databases. End users would no longer have to download an entire list of domain names and IP addresses in order to use the Internet. Instead, each time we attempt to connect to a new website, our computer simply contacts a DNS server and requests the one address that is needed.
These policies have provided the framework for the Internet ever since, with only minor modifications. Later in the 1980s, ARPANET and SRI were replaced by successors: the NSFNet (the last version of the Internet prior to the current one), administered by Merit Network. Finally, as the Internet became a public resource rather than a military and academic one, control over domain registrations passed from Merit to a new contractor, GSI, and from there to a company with a much more recognizable name: Network Solutions. Network Solutions charged an annual fee of $50 to register a domain name. Ultimately, domain name registration was opened up again, so that new registrations no longer have to go directly through Network Solutions. Ultimate responsibility for domain names and IP addresses now rests with a non-profit corporation created in 1998 under an agreement with the U.S. government, the Internet Corporation for Assigned Names and Numbers (ICANN).
- DNS Today -
For the most part, however, the system for naming unique website addresses remains along the lines set down in the 1980s and early 1990s. Thirteen central DNS servers, known as root nameservers - formerly single computers, but now clusters of computers - maintain comprehensive, up-to-date lists of the name servers for the top-level domains. Currently, two of the root nameservers are at VeriSign (VRSN), two at universities, two within the U.S. Department of Defense, one at ICANN, one at NASA, one in Japan (at the WIDE Project), two in Europe (one at RIPE NCC, the other in Sweden), one at the Internet Systems Consortium, and one at Cogent Communications (CCOI).
These central servers, however, are only responsible for maintaining links to the subordinate name servers for each top-level domain (e.g. .com). There are now hundreds of such servers. When an individual connects to the Internet, his or her computer sends its queries to the Domain Name Server designated by the Internet service provider. That server keeps a store of recent answers and, when it lacks one, obtains it by asking the servers higher in the hierarchy, starting from the root. Although the system is now somewhat more complex than the one described here, these are its essential components.
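The chain of referrals can be illustrated with a small in-memory simulation in Python. Nothing here touches a real network: the three dictionaries stand in for the root, top-level, and lower-level servers, the server labels such as "tld-com" are hypothetical, and the only real-world value is the Helium.com address quoted earlier. It is a sketch of the idea under those assumptions, not of the actual DNS protocol.

```python
# Root level: knows only which server handles each top-level domain.
ROOT = {"com": "tld-com", "org": "tld-org"}

# Top-level servers: know which server is responsible for each registered domain.
TLD_SERVERS = {
    "tld-com": {"helium.com": "auth-helium"},
    "tld-org": {},
}

# Lowest level: the servers that actually hold the addresses.
AUTHORITATIVE = {
    "auth-helium": {"www.helium.com": "67.59.146.30"},
}


def resolve(name, cache):
    """Follow referrals from the root downward, remembering the answer."""
    name = name.lower().rstrip(".")
    if name in cache:
        return cache[name]  # answered locally, no referrals needed
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]                   # step 1: ask the root
    registered = ".".join(labels[-2:])
    auth_server = TLD_SERVERS[tld_server][registered]  # step 2: ask the TLD server
    address = AUTHORITATIVE[auth_server][name]      # step 3: ask the domain's server
    cache[name] = address
    return address


cache = {}
print(resolve("www.helium.com", cache))
```

A second call with the same cache returns immediately from the stored answer, which is how the ISP's server avoids troubling the central servers for every request.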
Two challenges now confront the system we have inherited from the 1980s. The first is that the available IP addresses under the IPv4 system (about 4 billion, all told) have nearly all been parcelled out. A new system of IP address numbers, called IPv6, is now being implemented, which should solve this problem with minimal inconvenience to end users. The second is the expansion of the name space itself: ICANN has recently begun to open up the top-level domain range for the creation of new domains. In theory, the system allows for the creation of as many unique website addresses as human users can demand.
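The figure of about 4 billion follows directly from the address format: four numbers of eight bits each give 32 bits in total. The brief Python calculation below checks this, and compares it with IPv6, whose addresses are 128 bits long. The variable names are illustrative.

```python
import ipaddress

# Four 8-bit numbers give 32 bits, so IPv4 has 2**32 possible addresses.
ipv4_total = 2 ** 32

# IPv6 addresses are 128 bits long.
ipv6_total = 2 ** 128

# Cross-check against the standard library's view of the whole IPv4 range.
whole_ipv4_range = ipaddress.ip_network("0.0.0.0/0").num_addresses

print(f"{ipv4_total:,}")  # 4,294,967,296
print(ipv4_total == whole_ipv4_range)
print(ipv6_total // ipv4_total == 2 ** 96)
```

In other words, IPv6 provides 2 to the power of 96 times as many addresses as IPv4, which is why exhaustion is not expected to recur.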