Hacking the Infrastructure – How DNS works – Part 2

Welcome back. In part 1, I discussed the technical details of how DNS works. In this part, I’ll introduce you to some of the more common DNS server packages. In a future post I will cover some of the common problems with DNS as well as proposed solutions. So let’s dive right in.

The most popular DNS server, by far, is BIND, the Berkley Internet Name Domain. BIND has long and storied past. On the one hand, it’s one of the oldest packages for serving DNS, dating back to the early 1980’s, and on the other, it has a reputation for being one of the most insecure. BIND started out as a graduate student project at the University of California at Berkley, and was maintained by the Computer Systems Research Group. In the late 1980’s, the Digital Equipment Corporation helped with development. Shortly after that, Paul Vixie became the primary developer and eventually formed the Internet Systems Consortium which maintains BIND to this day.

Being the most popular DNS software out there, BIND suffers from the same malady that affects Microsoft Windows. It’s the most popular, most widely installed, and, as a result, hackers can gain the most by breaking it. In short, it’s the most targeted of DNS server softwares. Unlike Windows, however, BIND is open source and should benefit from the extra scrutiny that usually entails, but, alas, it appears that BIND is pretty tightly controlled by the ISC. From the ISC site, I do not see any publicly accessible software repository, no open discussion of code changes, and nothing else that really marks a truly open source application. The only open-source bits I see are a users mailing list and source code downloads. Beyond that, it appears that you either need to be a member of the “Bind Forum,” or wait for new releases with little or no input.

Not being an active user of BIND, I cannot comment too much on the current state of BIND other than what I can find publicly available. I do know that BIND supports just about every DNS convention there is out there. That includes standard DNS, DNSSEC, TSIG, and IPv6. The latter three of these are relatively new. In fact, the current major version of BIND, version 9, was written from the ground up specifically for DNSSEC support.

In late 1999, Daniel J. Bernstein, a professor at the University of Illinois, wrote a suite of DNS tools known as djbdns. Bernstein is a mathematician, cryptographer, and a security expert. He used all of these skills to produce a complete DNS server that he claimed had no security holes in it. He went as far as offering a security guarantee, promising to pay $1000 to the first person to identify a verifiable security hole in djbdns. To date, no one has been able to claim that money. As recently as 2004, djbdns was the second most popular DNS server software.

The primary reason for the existence of djbdns is Bernstein’s dissatisfaction with BIND and the numerous security problems therein. Having both security and simplicity in mind, Bernstein was able to make djbdns extremely stable and secure. In fact, djbdns was unaffected by the recent Kaminsky vulnerability, which affected both BIND and Microsoft DNS. Additionally, configuration and maintenance are both simple, straightforward processes.

On the other hand, the simplicity of djbdns may become its eventual downfall. Bernstein is critical of both DNSSEC and IPv6 and has offered no support for either of these. While some semblance of IPv6 support was added via a patch provided by a third party, I am unaware of any third-party DNSSEC support. Let me be clear, however, while the IPv6 patch does add additional support for IPv6, djbdns itself can already handle serving the AAAA records required for IPv6. The difference is that djbdns only talks over IPv4 transport while the patch adds support for IPv6 transport.

Currently, it is unclear at to whether Bernstein will ever release a new version of djbdns with support for any type of “secure” DNS.

The Microsoft DNS server has existed since Windows NT 3.51 was shipped back in 1995. It was included as part of the Microsoft BackOffice, a collection of software intended for use by small businesses. As of 2004, it was the third most popular DNS server software. According to Wikipedia, Microsoft DNS is based on BIND 4.3 with, of course, lots of Microsoft extensions. Microsoft DNS has become more and more important with new releases of Windows Server. Microsoft’s Active Directory relies heavily on Microsoft DNS and the dynamic DNS capabilities included. Active Directory uses a number of special DNS entries to identify services and allow machines to locate them. It’s an acceptable use of DNS, to be sure, but really makes things quite messy and somewhat difficult to understand.

I used Microsoft DNS for a period of time after Windows 2000 was released. At the time, I was managing a small dial-up network and we used Active
Directory and Steel-Belted RADIUS for authentication. Active Directory integration allowed us to easily synchronize data between the two sites we had, or so I thought. Because we were using Active Directory, the easiest thing to do was to use Microsoft DNS for our domain data and as a cache for customers. As we found out, however, Microsoft DNS suffered from some sort of cache problem that caused it to stop answering DNS queries after a while. We suffered with that problem for a short period of time and eventually switched over to djbdns.

There are a number of other DNS servers out there, both good and bad. I have no experience with any of them other than to know some of them by reputation. Depending on what happens in the future with the security of DNS, however, I predict that a lot of the smaller DNS packages will fall by the wayside. And while I have no practical experience with BIND beyond using it as a simple caching nameserver, I can only wonder why such a package claiming to be open source, but so guarded as it is, maintains its dominance. Perhaps I’m mistaken, but thus far I have found nothing that contradicts my current beliefs.

Next time we’ll discuss some of the more prevalent problems with DNS and DNS security. This will lead into a discussion of DNSSEC and how it works (or, perhaps, doesn’t work) and possible alternatives to DNSSEC. If you have questions and/or comments, please feel free to leave them in the comment section.

Hacking the Infrastructure – How DNS works – Part 1

Education time… I want to learn a bit more about DNS and DNSSEC in particular, so I’m going to write a series of articles about DNS and how it all works. So, let’s start at the beginning. What is DNS, and why do we need it?

DNS, the Domain Name Service, is a hierarchical naming system used primarily on the Internet. In simple terms, DNS is a mechanism by which the numeric addresses assigned to the various computers, routers, etc. are mapped to alphanumeric names, known as domain names. As it turns out, humans tend to be able to remember words a bit easier than numbers. So, for instance, it is easier to remember blog.godshell.com as opposed to 204.10.167.1.

But, I think I’m getting a bit ahead of myself. Let’s start back closer to the beginning. Back when ARPANet was first developed, the developers decided that it would be easier to name the various computers connected to ARPANet, rather than identifying them by number. So, they created a very simplistic mapping system that consisted of name and address pairs written to a text file. Each line of the text file identified a different system. This file became known as the hosts file.

Initially, each system on the network was responsible for their own hosts file, which naturally resulted in a lot of systems either unaware of others, or unable to contact them easily. To remedy this, it was decided to make an “official” version of the hosts file and store it in a central location. Each node on ARPANet then downloaded the hosts file at a fairly regular interval, keeping the entire network mostly in-sync with new additions. As ARPANet began to grow and expand, the hosts file grew larger. Eventually, the rapid growth of ARPANet made updating and distributing the hosts file a difficult endeavor. A new system was needed.

In 1983, Paul Mockapetris, one of the early ARPANet pioneers, worked to develop the first implementation of DNS, called Jeeves. Paul wrote RFC 882 and RFC 883, the original RFCs describing DNS and how it should work. RFC 882 describes DNS itself and what it aims to achieve. It describes the hierarchical structure of DNS as well as the various identifiers used. RFC 883 describes the initial implementation details of DNS. These details include items such as message formats, field formats, and timeout values. Jeeves was based on these two initial RFCs.

So now that we know what DNS is and why it was developed, let’s learn a bit about how it works.

DNS is a hierarchical system. This means that the names are assigned in an ordered, logical manner. As you are likely aware, domain names are generally strings of words, known as labels, connected by a period, such as blog.godshell.com. The rightmost label is known as the top-level domain. Each label to the left is a sub-domain of the label to the right. For the domain name blog.godshell.com, com is the top-level domain, godshell is a sub-domain of com, and blog is a sub-domain of godshell.com. Information about domain names is stored in the name server in a structure called a resource record.

Each domain, be it a top level domain, or a sub-domain, is controlled by a name server. Some name servers control a series of domains, while others control a single domain. These various areas of control are called zones. A name server that is ultimately responsible for a given zone is known as an authoritative name server. Note, multiple zones can be handled by a single name server, and multiple name servers can be authoritative for the same zone, though they should be in primary and backup roles.

Using our blog.godshell.com example, the com top-level domain is in one zone, while godshell.com and blog.godshell.com are in another. There is another zone as well, though you likely don’t see it. That zone is the root-zone, usually represented by a single period after the full domain name, though almost all modern internet programs automatically append the period at the end, making it unnecessary to specify it explicitly. The root-zone is pretty important, too, as it essentially ties together all of the various domains. You’ll see what I mean in a moment.

Ok, so we have domains and zones. We know that zones are handled individually by different name servers, so we can infer that the name servers talk to each other somehow. If we infer further, we can guess that a single name resolution probably involves more than two name servers. So how exactly does all of this work? Well, that process depends on the type of query being used to perform the name resolution.

There are two types of queries, recursive and non-recursive. The query type is negotiated by the resolver, the software responsible for performing the name resolution. The simpler of the two queries is the non-recursive query. Simply put, the resolver asks the name server for non-recursive resolution and gets an immediate answer back. That answer is generally the best answer the name server can give. If, for instance, the name server queried was a caching name server, it is possible that the domain you requested was resolved before. If so, then the correct answer can be given. If not, then you will get the best information the name server can provide which is usually a pointer to a name server that will know more about that domain. I’ll cover caching more a little later.

Recursive queries are probably the most common type of query. A recursive query aims to completely resolve a given domain name. It does this by following a few simple steps. Resolution begins with the rightmost label and moves left.

  1. The resolver asks one of the root name servers (that handle the root-zone) for resolution of the rightmost label. The root server responds with the address of a server who can provide more information about that domain label.
  2. Query the next server about the next label to the left. Again, the server will respond with the address of a server that will know more about that domain label, or, possibly, an authoritative answer for the domain.
  3. Repeat step 2 until the final answer is given.

These steps are rather simplistic, but give a general idea of how DNS works. Let’s look at an example of how this works. For this example, I will be using the dig command, a standard Linux command commonly used to debug DNS. To simplify things, I’m going to use the +trace option which does a complete recursive lookup, printing the responses along the way.

$ dig +trace blog.godshell.com

; <<>> DiG 9.4.2-P2 <<>> +trace blog.godshell.com
;; global options: printcmd
. 82502 IN NS i.root-servers.net.
. 82502 IN NS e.root-servers.net.
. 82502 IN NS h.root-servers.net.
. 82502 IN NS g.root-servers.net.
. 82502 IN NS m.root-servers.net.
. 82502 IN NS a.root-servers.net.
. 82502 IN NS k.root-servers.net.
. 82502 IN NS c.root-servers.net.
. 82502 IN NS j.root-servers.net.
. 82502 IN NS d.root-servers.net.
. 82502 IN NS f.root-servers.net.
. 82502 IN NS l.root-servers.net.
. 82502 IN NS b.root-servers.net.
;; Received 401 bytes from 192.168.1.1#53(192.168.1.1) in 5 ms

This first snippet shows the very first query sent to the local name server (192.168.1.1) which is defined on the system I’m querying from. This is often configured automatically via DHCP, or hand-entered when setting up the computer for the first time. This output has a number of fields, so let’s take a quick look at them. First, any line preceded by a semicolon is a comment. Comments generally contain useful information on what was queried, what options were used, and even what type of information is being returned.

The rest of the lines above are responses from the name server. As can be seen from the output, the name server responded with numerous results, 13 in all. Multiple results is common and means the same information is duplicated on multiple servers, commonly for load balancing and redundancy. The fields, from left to right, are as follows : domain, TTL, class, record type, answer. The domain field is the current domain being looked up. In the example above, we’re starting at the far right of our domain with the root domain (defined by a single period).

TTL stands for Time To Live. This field defines the number of seconds this data is good for. This information is mostly intended for caching name servers. It lets the cache know how much time has to pass before the cache must look up the answer again. This greatly reduces DNS load on the Internet as a whole, as well as decreasing the time it takes to obtain name resolution.

The class field defines the query class used. Query classes can be IN (Internet), CH (Chaos), HS (Hesiod), or a few others. Generally speaking, most queries are of the Internet class. Other classes are used for other purposes such as databases.

Record type defines the type of record you’re looking at. There are a number of these, the most common being A, PTR, CNAME, MX, and NS. An A record is ultimately what most name resolution is after. It defines a mapping from a domain name to an IP address. A PTR record is the opposite of an A record. It defines the mapping of an IP Address to a domain name. CNAME is a Canonical name record, essentially an alias for another record. MX is a mail exchanger record which defines the name of a server responsible for mail for the domain being queried. And finally, an NS record is a name server record. These records generally define the name server responsible for a given domain.

com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
;; Received 495 bytes from 199.7.83.42#53(l.root-servers.net) in 45 ms

Our local resolver has randomly chosen an answer from the previous response and queried that name server (l.root-servers.net) for the com domain. Again, we received 13 responses. This time, we are pointed to the gtld servers, owned by Network Solutions. The gtld servers are responsible for the .com and .net top-level domains. These are two of the most popular TLDs available.

godshell.com. 172800 IN NS ns1.emcyber.com.
godshell.com. 172800 IN NS ns2.incyberspace.com.
;; Received 124 bytes from 192.55.83.30#53(m.gtld-servers.net) in 149 ms

Again, our local resolver has chosen a random answer (m.gtld-servers.net) and queried for the next part of the domain, godshell.com. This time, we are told that there are only two servers responsible for that domain.

blog.godshell.com. 3600 IN A 204.10.167.1
godshell.com. 3600 IN NS ns1.godshell.com.
godshell.com. 3600 IN NS ns2.godshell.com.
;; Received 119 bytes from 204.10.167.61#53(ns2.incyberspace.com) in 23 ms

Finally, we randomly choose a response from before and query again. This time we receive three records in response, an A record and two NS records. The A record is the answer we were ultimately looking for. The two NS records are authority records, I believe. Authority records define which name servers are authoritative for a given domain. They are ultimately responsible for giving the “right” answer.

That’s really DNS in a nutshell. There’s a lot more, of course, and we’ll cover more in the future. Next time, I’ll cover the major flavors of name server software and delve into some of the problems with DNS today. So, thanks for stickin’ around! Hopefully you found this informative and useful. If you have questions and/or comments, please feel free to leave them in the comment section.