DNS Cache Poisoning: Testing & Verifying the Patch

There seems to be some confusion about how to test for the DNS flaw and how to confirm that a patch is working.

Unfortunately, a patched DNS server is not necessarily safe. NAT may be interfering: many NAT devices reduce the randomness of the source UDP port of outgoing queries. More about the NAT problem can be found in my other posting.

Here are my suggestions for testing your DNS. Section 1 (“browser”) and Section 2 (“commandline”) are good tests for checking

  1. that recursive queries are not affected as they leave your network (e.g. via NAT), and
  2. that an upstream DNS server (e.g. one that your DNS server may forward all requests to) is not vulnerable.

1. From a Web Browser

You can test the chain of DNS servers between your browser and a test server using one of the following web pages.

2. From a Unix/Cygwin Commandline

If you have access to a Unix or Cygwin commandline on a computer whose DNS path you want to test, you can perform a special DNS query. You’ll need either “dig” or “nslookup” installed.

If you have dig available, use the following command, which is taken from this page at DNS-OARC.

dig +short porttest.dns-oarc.net TXT

If you want to be more specific about the DNS server you wish to test, specify it on the dig commandline as follows. For example, if you run this command on the same server that runs your DNS software, you should use “@localhost” or “@127.0.0.1” for the query. Otherwise, specify the hostname or the IP address of the server you want to test (e.g. “@dns.example.com” or “@192.168.1.1”).

dig +short @127.0.0.1 porttest.dns-oarc.net TXT

If you have only nslookup available (e.g. at a Windows DOS prompt), then you can try the following. The lines you type are the ones following the C:\> and > prompts.

C:\Documents and Settings\gitm> nslookup
Default Server: dns.yourdomain.com
Address: 192.168.1.1
> set type=txt
> porttest.dns-oarc.net
Default Server: dns.yourdomain.com
Address: 192.168.1.1
[your results show up here]
> exit

To interpret your results, check this page at DNS-OARC. Essentially, you are looking for a high standard deviation of source ports, which is reported as a “GREAT” result. A result of “POOR” means the source ports are predictable, which is not good.
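For reference, a “GREAT” answer comes back as a TXT record along these lines (the wording and numbers below are only illustrative, and the IP address is a placeholder; the actual text from the test server may differ):

"192.0.2.10 is GREAT: 26 queries in 2.1 seconds from 26 ports with std dev 17746"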

3. TCP Dump

This section will help you verify that your DNS server is patched. I’ll use “tcpdump” on Linux as an example, but you can also use “snoop” on Solaris. Other Unix operating systems will have the same or a similar tool. The “tcpdump” command must be run by a system administrator who has root user access on the DNS server.

First, we’ll initiate a tcpdump session on the server that is running the DNS software. In my case, I am using ISC BIND. Additionally, my DNS server receives about 10,000 queries per second, so I want to view only the queries relevant to my test. For that, I’ll filter the output on the doxdns5.com destination server I will be querying. Note: keep the 149.20.56.5 IP address in the command.

tcpdump -nn host 149.20.56.5
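On Solaris, the equivalent capture with snoop would look something like the following (a sketch; your interface and options may vary):

snoop host 149.20.56.5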

Now, in another terminal window, type the following command, which is similar to the one used by the DOXPARA web page listed above. If you run the command on the same server that runs your DNS software, you should use “@localhost” or “@127.0.0.1” for the query. Otherwise, specify the hostname or the IP address of the server you want to test (e.g. “@dns.example.com” or “@192.168.1.1”). The “date” command substitution generates a unique hostname, which prevents the answer from being served out of a cache on the DNS server you are testing.

dig @127.0.0.1 $(date +%s).doxdns5.com

The “dig” command should output a series of recursive CNAME lookups.

The “tcpdump” output should look similar to the following. In this example, 192.168.1.1 is the IP address of the DNS server that I am testing, and 149.20.56.5 is the DNS server for the doxdns5.com domain.

22:49:42.870301 IP 192.168.1.1.45399 > 149.20.56.5.53:  28554 [1au] A? 1217396982.doxdns5.com. (50)
22:49:42.992076 IP 149.20.56.5.53    > 192.168.1.1.45399:  28554*- 1/0/0 CNAME[|domain]
22:49:42.992635 IP 192.168.1.1.16585 > 149.20.56.5.53:  13098 [1au][|domain]
22:49:43.101637 IP 149.20.56.5.53    > 192.168.1.1.16585:  13098*-[|domain]
22:49:43.102151 IP 192.168.1.1.41503 > 149.20.56.5.53:  10725 [1au][|domain]
22:49:43.216196 IP 149.20.56.5.53    > 192.168.1.1.41503:  10725*-[|domain]
22:49:43.216671 IP 192.168.1.1.8699  > 149.20.56.5.53:  21053 [1au][|domain]
22:49:43.327506 IP 149.20.56.5.53    > 192.168.1.1.8699:   21053*-[|domain]
22:49:43.327997 IP 192.168.1.1.55354 > 149.20.56.5.53:  59611 [1au][|domain]
22:49:43.436943 IP 149.20.56.5.53    > 192.168.1.1.55354:  59611*-[|domain]

Each pair of lines in the output represents a query and a response: a query goes to the doxdns5.com DNS server, and a reply comes back to my DNS server. The number following my DNS server’s IP address on each outgoing line is the source UDP port of the query leaving my DNS server. The number following the doxdns5.com server’s IP address is the standard UDP port (53) for the dns/domain service.

Note the randomness of the source UDP ports (45399, 16585, 41503, 8699, 55354). This indicates that my DNS server is patched. If the source port were the same for every query, that would indicate an unpatched server.
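If you just want to eyeball the spread of source ports, a rough one-liner like the following can pull them out of the capture (a sketch that assumes IPv4 traffic and tcpdump’s default dotted address format):

tcpdump -nn -l 'host 149.20.56.5 and dst port 53' | awk '{n=split($3,a,"."); print a[n]}'

The “dst port 53” clause limits the capture to the outgoing queries, and the awk expression prints the last dot-separated field of the source address, which is the UDP source port.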

But wait! There’s more.

Just because my DNS server is patched and sending out queries from random UDP source ports, that does not mean I am out of the woods yet. I still need to verify that the source ports look randomized when they arrive at the destination DNS server at doxdns5.com. For that, I will need to use one of the test methods (web-based or commandline) described in sections 1 or 2 above.

It’s entirely plausible that a NAT device (e.g. your DSL/cable router, a Cisco router, etc.) is rewriting the random source ports to a not-so-random sequence. Many NAT products I have tried will rewrite them to a predictable, sequential series of port numbers.

For example, after I clicked the “Check My DNS” link on the DOXPARA web page, here is what it reported to me (slightly altered to protect my DNS server’s identity).

Your name server, at 1.2.3.4, may be safe, but the
NAT/Firewall in front of it appears to be interfering with
its port selection policy.  The difference between largest
port and smallest port was only 5.
Please talk to your firewall or gateway vendor -- all are
working on patches, mitigations, and workarounds.
1.2.3.4:16121 TXID=21918
1.2.3.4:16123 TXID=55556
1.2.3.4:16124 TXID=45625
1.2.3.4:16125 TXID=8942
1.2.3.4:16126 TXID=359

Note how the source UDP ports (16121, 16123, 16124, 16125, 16126) are nearly sequential. In this case, I will need to get a patch or firmware update from the manufacturer of my NAT device. Unfortunately, they do not yet have one available.

The Demise of the Schwab Yield Plus Fund

In hindsight, it’s interesting to occasionally look back on some of my investment decisions.  Sometimes I make good ones, sometimes bad ones.

Here’s an example of a bad one.

About 10 years ago, I left my then employer with about $1500 in company stock I had earned as part of a retirement benefit.  I considered selling it and buying a more diversified S&P 500 index fund.  But I ultimately decided to just keep the stock.  It would give me a reason to stay abreast of the company and of how my former coworkers were faring.  Over the subsequent 2 years, I watched as the stock dropped from $20 per share, to $10, then $5.  Ultimately, the company went into Chapter 11 and the stock was declared worthless.  In hindsight, I should have diversified.  Oh well.

Here’s another example.  A somewhat good decision.

The Schwab YieldPlus Investor Fund (SWYPX) was touted as an ultra-short bond fund which offered slightly higher yields with minimal price fluctuation.  (As of today, it still claims the same.)  I felt it was a decent position for money that I would otherwise put into a money market fund.  We all know that risk is compensated with higher return.  The yield on SWYPX was indeed slightly better, and the risk must only have been minimally higher.  Right?  Wrong.

I had known for years that the housing market was a bubble in the making.  I’m not just saying that from a backward-looking perspective.  Trust me.  I was talking about a housing bubble in 2001, and probably as early as 1999.  Friends and some family members got tired of me talking about the “bubble,” to the point that I was asked to no longer speak of it.  To this day, I still cannot talk about “housing” with at least one of these friends.  Well, we know how that housing thing all turned out.

July 2007.  That was the month when things really hit the fan.  The signs had been there for months.  No one should have been caught by surprise.  But as is often the case, many were blindsided.  Little mini quakes had already been felt in the MBS/CDO/ABS markets for months before that.

I had heard about the “toxic waste” found in many bond portfolios, but I had not really stopped to consider what that meant.  I just knew that I should avoid it.  For at least the prior 2 or 3 years, I had avoided any bond funds that carried mortgage-backed securities (MBS), except for one.  I had started investing in the Schwab Yield Plus fund one year earlier, in July 2006.  It was diversified across mortgage-backed securities (MBS), collateralized debt obligations (CDO), corporate bonds, commercial paper, and asset-backed securities (ABS).  Looking at the ratings of the underlying securities, I felt fairly confident that I was not exposed to any significant level of toxic waste.  Many of the securities were highly rated (AAA, AA, A).  And this was an ultra-short bond fund.  Ultra-short is supposed to be less risky, and I did not have to worry about the interest rate risk normally inherent in longer-term bond funds.

Anyway, from July 2007 to November 2007, I watched the slow and steady decline of the fund’s NAV from 9.67 to 9.17.  I was concerned.  That was a decline of just over 5% in 4 months, for a fund that was supposed to be safe.  Something was not right, and at the end of November 2007, I sold a significant portion of my holdings in the fund.  I sold off the remainder in January 2008 at a price of 8.93.

So, why did I sell?  Two reasons.

  1. The collapse of the two Bear Stearns hedge funds that were heavily invested in subprime securities.
  2. The re-rating of debt obligations by the rating agencies (Moody’s, S&P, etc).

At some point in 2007, Bear Stearns suspended redemptions on two of its subprime hedge funds.  Investors in those funds were running for the exits with their money, and fast.  Those funds suffered major losses.  The problem with redemptions is that they force the fund managers to sell securities they may not want to sell.  To keep the fund’s NAV stable, they may initially sell off their better assets.  But at some point, they are forced to sell the bad stuff.  And that is exactly what happened in March 2008 to the Schwab Yield Plus fund.  They had to sell off assets at fire-sale prices, because the market for those securities had dried up.  There were very few willing buyers, so they sold at very deep discounts.

Toward the end of 2007, the rating agencies had begun to re-rate debt securities.  Apparently, obligations that had previously been considered safe were now significantly riskier.  Despite Schwab’s reassurances that the Yield Plus fund’s exposure to subprime was no more than 5%, it was enough for me to sell.  With word of further re-ratings to come from the agencies, it was anyone’s guess what the true quality of the SWYPX fund was, or what its true percentage exposure to subprime really was.

I don’t make rash decisions about investing.  I’ll stay the course, … to a point.  I weighed my position in the fund and the relatively small loss I had already incurred.  But ultimately, I decided to bail on the Schwab Yield Plus fund in January 2008.

It turns out that I picked a relatively good time to bail out of the fund.

Within the months that followed, other investors did the same.  An estimated $11 billion has left the fund in the past year, or about 80% of the fund’s assets.

Just in the month of March 2008 alone, the fund was down 17.96%.  For the past year, it’s down 29.18%.  Ouch!  So much for a slightly better return with minimal price risk.  No wonder Schwab is now facing dozens of lawsuits from unhappy investors in the fund.

The lyrics of Kenny Rogers’ “The Gambler” come to mind:

You got to know when to hold em, know when to fold em,
Know when to walk away and know when to run.

I look forward to reading the fund’s annual report next month.  That should be interesting.

DNS Cache Poisoning: NAT Interference

After patching my DNS servers, I went about testing that the patch was working. I confirmed it with a tcpdump of DNS traffic: the source ports of queries leaving the DNS servers were indeed randomized.

However, I then tested my DNS server using the Check My DNS tool on Kaminsky’s blog site. I was surprised to see that the source port was no longer randomized. As the DNS queries leave our private corporate network for the public Internet, our NAT gateway is rewriting the random UDP source ports to a predictable sequential series of source port numbers: 14756, 14757, 14758, etc. Apparently, the NAT gateway is reducing the effectiveness of the additional entropy introduced by the random source ports.

More detail about how I performed the tests can be found in this blog posting.

So, are we at risk? Yes, and no.

We’re definitely better off than we were before the patch. As pointed out in a previous post, anyone who can make a recursive query can poison an unpatched DNS server. In our case, the DNS server is patched. That’s good. If an attacker is operating entirely within our private network against our private DNS server, then we’re safe. It will be very difficult for an attacker to guess which random port the DNS server is using for a given query.

However, if the attack is being coordinated from both within the private network and from the public Internet, then we are still at a somewhat high risk of poisoning. The attacker can take advantage of the reduced source port entropy of DNS queries passing through NAT. The attacker need only initiate a recursive query from within the private network, while simultaneously attempting to poison the response from the public Internet. Because the query’s source port is no longer randomized following NAT translation, the attacker would need to send bogus replies to a small series of sequentially numbered ports on the NAT’s external IP address.

Unfortunately, our NAT gateway vendor does not yet have a patch available.

In the meantime, I will install a new DNS server in our corporate DMZ network, to which all internal DNS servers will forward recursive queries.
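For the internal servers, that amounts to a forward-only configuration. A minimal BIND sketch of the relevant options might look like this (the DMZ resolver address 10.0.0.53 is a placeholder):

options {
    forward only;               // never recurse directly; always use the forwarders
    forwarders { 10.0.0.53; };  // patched DMZ resolver whose source ports are not rewritten by NAT
};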

DNS Cache Poisoning: Risks

Today, I had a conversation with a friend who wanted to know how the DNS cache poisoning attack works. I regurgitated what I had already read in Kaminsky’s blog post on the subject, as well as the deleted Matasano blog posting (which is now available elsewhere on the Internet).

The next question that came to mind: “Who is at risk?”

My simple answer: “Any unpatched server that provides DNS query recursion service is vulnerable to this new form of cache poisoning.”

In short, “unpatched recursive servers.”

But what does that really mean in terms of specific network configurations?

To understand this, I prefer to think of this vulnerability in terms of who/what/where the poisoning threat might come from.

As pointed out in a previous post, the full details of the problem will not be disclosed until August 7. However, enough has been revealed following the accidental leak by Matasano Security that I will try to shed some light on the risk.

Unfortunately, due to the nature of how a DNS infrastructure works, it’s not necessarily obvious whether someone is at risk or not. There can be many layers of DNS query recursion, forwarding, and referring. A server is only as safe as the weakest link in that chain. To help explain what the impact might be in different situations, here are some scenarios I came up with.

Scenario 1: Authoritative-Only DNS Server

If a server does not perform any recursion on behalf of other computers, then it is not affected. Such a server is often referred to as an authoritative-only server for a particular domain. It provides answers to queries of hostnames contained within the domain it is authoritative for. If a query is made for a hostname outside of that domain, the server will tell the asker to look elsewhere. For example, I can ask the Yahoo DNS server for the IP address of www.yahoo.com, and it will respond with 209.131.36.158. But if I ask the Yahoo DNS server for the IP address of www.google.com, it will tell me to look elsewhere.

It is important to note that most publicly available authoritative DNS servers are authoritative-only, and thus they do not provide any recursion services. However, a server can be authoritative for a domain and provide recursion services, but such a server is usually not publicly available.

An authoritative-only DNS server does not need to be patched.
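If you run BIND and want to be sure a server behaves as authoritative-only, the relevant configuration looks roughly like the following (a sketch, not a complete named.conf; the zone name is only an example):

options {
    recursion no;               // refuse to recurse on behalf of clients
    allow-recursion { none; };  // belt and suspenders
};

zone "example.com" {
    type master;
    file "example.com.zone";
};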

Scenario 2: Publicly Available Recursive DNS Server

Some DNS servers that are reachable from the whole Internet are also configured to allow recursion for anyone on the Internet. This is bad. In terms of the recent DNS flaw, these servers are the most vulnerable, because a poisoning attack can come from anywhere on the Internet. Put simply, anyone who is allowed to perform a recursive query on a DNS server can also poison it.

Such a server must be patched.
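One quick way to check whether a server recurses for outsiders is to ask it, from outside its network, for a name it is not authoritative for (the server name below is a placeholder). If it returns an answer and the flags line of the response includes “ra”, it is an open recursive server:

dig @ns1.example.com www.google.com A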

Scenario 3: Private Recursive DNS Server

An unpatched recursive server is vulnerable and can be exploited. Period.

Even if the recursive server is available only to a small group of computers, that small group can be used to exploit an unpatched DNS server. Remember, it takes only one computer capable of making recursive queries to poison the cache.

But what are the chances this could actually happen? Your chances will depend on a number of factors.

  1. The level of trust attributed to the computers capable of making recursive queries.
  2. The level of trust attributed to the users capable of making recursive queries, as well as the knowledge, skill, and motivations of those users.
  3. The number of computers capable of making recursive queries. The greater the number of computers, the greater the number of threat sources.
  4. The nature of the DNS service providing access to recursive queries. Some DNS services do not maintain a cache. In such cases, the cache-poisoning attack might affect only a single query, but it cannot poison a non-existent cache.

Let’s examine some specific examples.

Private Home Network

If you have a private home network, then chances are that you know who your users are, and you know what level of trust you have in those users. In many cases, the users are your spouse, your children, your roommates. Generally, you probably can trust these people.

But you are probably not running your own DNS server in such a network. You’re probably using a connection-sharing device. See below.

Broadband Connection Sharing Devices

These are usually consumer-grade DSL/cable Internet sharing devices, such as those made by Linksys, Netgear, D-Link, etc. Many of these devices are configured to use your ISP’s DNS servers. I do not believe the device itself maintains a cache of its own, so it alone may not be vulnerable. (Apparently, some of these devices do maintain a cache. A commenter on this blog page reports that his Linksys WRT54G router uses dnsmasq to maintain a small cache.)

Nonetheless, I suspect that many of these devices are configured to forward queries to the ISP’s DNS server, which may or may not be vulnerable. Refer to the ISP section below.

Your ISP

If you have an Internet connection, especially a home DSL or cable connection, then chances are high that you are using your Internet service provider’s DNS servers to answer hostname queries for the web sites you visit (www.google.com, www.yahoo.com, www.hotmail.com, etc). Often, the DNS server performs recursive queries on behalf of all customers of that ISP.

While not as high a risk as a wide open recursive server available to anyone on the Internet, the risk is still pretty high. The larger your ISP, the greater the risk that a poisoning attack can come from any one of the ISP’s customers.

The largest ISPs in the United States are SBC, Comcast, Verizon, and Time Warner. Collectively, these large ISPs represent more than 50% of the market. With so many customers able to issue recursive queries, they are especially vulnerable.

Your ISP’s servers must be patched.

Open Wi-Fi Home Network

If you run an open wi-fi network, you are at risk if your network hosts its own caching, recursive DNS server. The trust level of your users is unknown: if anyone can access the open network, then anyone can poison it.

Note that many open wi-fi networks are created using a consumer broadband sharing device (see above), so the device itself might not be vulnerable. However, you are exposing your ISP’s unpatched DNS servers to attack.

School or University / Private Corporation

Many organizations provide private recursive DNS servers, which are completely hidden from the Internet. The private servers are configured to provide recursive service to the organization’s private networks only.

This is identical to the ISP example above.

I would be especially concerned on a university network. University students are usually smart, and some may be highly tempted to test their campus DNS servers.

Scenario 4: Forwarding-Only DNS Server

Some DNS servers do not do any recursion themselves. They are configured to forward all queries to another DNS server. I am uncertain whether a forwarding-only server is vulnerable. It probably depends on whether the forwarding server maintains its own cache.

Nonetheless, you are only as safe as the weakest link. So, if your forwarding-only server forwards to a vulnerable DNS server, then the forwarding-only server is affected, albeit indirectly.
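As an aside, dnsmasq (mentioned above in the context of home routers) is a good illustration of a forwarder that can run without a cache at all. An illustrative configuration might look like this (the upstream address is a placeholder):

# /etc/dnsmasq.conf
no-resolv           # ignore /etc/resolv.conf
server=192.0.2.53   # forward all queries to this upstream resolver
cache-size=0        # disable the local cache entirely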

DNS Cache Poisoning: Exploit Revealed

The Announcement

Since Dan Kaminsky’s public announcement on July 8, 2008, regarding a serious DNS vulnerability, the online press has given this topic much publicity. Many key people in the industry reinforced the notion that this was serious. The message to system administrators was clear: “patch, trust us.”

This announcement was different from the typical announcement of a software security vulnerability. Many vendors were already on board months prior, and patches for many DNS software applications were ready to be made public at the same time as the announcement. The unusual nature of the coordination effort and its subsequent announcement implied that this vulnerability was serious.

The good news: organizations had 30 days to patch their vulnerable software. On August 7, Dan will present the full details about the vulnerability at the Black Hat conference in Las Vegas.

30 Days Becomes 13 Days: Exploit Revealed

Unfortunately, that original window of 30 days turned out to be only 13. Last week, researcher Halvar Flake began speculating about the cache-poisoning nature of the vulnerability. While he did not get the exact details of the exploit correct, he apparently got close enough. Shortly thereafter, Matasano Security published a blog entry filling in some of the details that Halvar missed. The blog post was not intended to appear before August 7, and it has since been deleted, but plenty of copies still exist on the Internet.

Too late. That was it. The cat was now out of the bag. Within a day or two, exploit code began appearing on the Internet.

As Kaminsky noted on his blog that day, “13>0 … Patch. Today. Now. Yes, stay late.” And that’s just what I did. Like a lot of people, I had dragged my feet on this one. Partly because I felt safe that I would have 30 days to patch.

Note to self: In the future, assume 3 days instead of 30. Better yet, assume 0 days.
