[ratelimits] Analysis of BIND RRL patch + question

ratelimits at elsif.net
Mon Feb 11 18:59:02 UTC 2013


I wanted to do some analysis on how well RRL was protecting us.

First, I want to point out that I made bad assumptions early.

I assumed...

..this was a successful query/response...
11-Feb-2013 12:04:08.235 queries: info: client 192.228.XX.YY#4130 (thisisatestbyjake0.ca): query: thisisatestbyjake0.ca IN A + (192.228.28.9)

..this was a TCP reset response...
11-Feb-2013 12:04:08.374 queries: info: client 192.228.XX.YY#4130 (thisisatestbyjake6.ca): slip NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)

..this was an entirely dropped response...
11-Feb-2013 12:04:08.305 queries: info: client 192.228.XX.YY#4130 (thisisatestbyjake5.ca): drop NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)

My assumption was bad because the first entry, the "query:" line, is doing exactly what it always has: logging the query only. It does not mean a response was sent.

It took running a "repeat 10 dig ...", with the source ports randomizing, to see that every "slip" and "drop" line has a matching "query" line.

I had missed this in just watching real-world attacks scroll by, because more often than not, the attacker uses the same port ~25 times.
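
A rough sketch of that check in Python, for anyone who wants to repeat it on their own log. The regex and the function names (LINE_RE, classify, correlate, unmatched) are my own and only assume the line layout shown in the samples in this message; other logging configurations may need a different pattern.

```python
import re
from collections import defaultdict

# Matches the "client ADDR#PORT (QNAME): REST" part of a query-log line.
# Assumption: lines look like the samples quoted in this message.
LINE_RE = re.compile(r"client (\S+)#(\d+) \(([^)]+)\): (.*)$")


def classify(rest):
    """Return 'query', 'slip' or 'drop' for the text after the qname, else None."""
    for kind in ("query", "slip", "drop"):
        if rest.startswith(kind + ":") or rest.startswith(kind + " "):
            return kind
    return None


def correlate(lines):
    """Group log events by (client address, source port, qname), in log order."""
    events = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        addr, port, qname, rest = m.groups()
        kind = classify(rest)
        if kind:
            events[(addr, port, qname)].append(kind)
    return dict(events)


def unmatched(events):
    """Keys that have a slip or drop line but no query line."""
    return [k for k, kinds in events.items()
            if "query" not in kinds and ("slip" in kinds or "drop" in kinds)]


SAMPLE = [
    "11-Feb-2013 12:14:10.616 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake0.ca): query: thisisatestbyjake0.ca IN A + (192.228.28.9)",
    "11-Feb-2013 12:14:10.686 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake3.ca): query: thisisatestbyjake3.ca IN A + (192.228.28.9)",
    "11-Feb-2013 12:14:10.686 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake3.ca): drop NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)",
    "11-Feb-2013 12:14:10.754 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake6.ca): query: thisisatestbyjake6.ca IN A + (192.228.28.9)",
    "11-Feb-2013 12:14:10.754 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake6.ca): slip NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)",
]

events = correlate(SAMPLE)
for key, kinds in events.items():
    print(key[2], kinds)
print("slip/drop lines without a query line:", unmatched(events))
```

Keying on the qname as well as the address and port is what makes this work when an attacker reuses one source port for many queries.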



Next, I wanted to see how effective RRL was with the default/recommended settings of 5/5, like so:
        rate-limit {
                responses-per-second 5;
                window 5;
        };

Using "dnsperf" to send 10 queries that would all generate the same NXDOMAIN response:
[Timeout] Query timed out: msg id 3
[Timeout] Query timed out: msg id 8
[Timeout] Query timed out: msg id 9
  Percentage completed:  70.00%
  Percentage lost:       30.00%

11-Feb-2013 12:14:10.616 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake0.ca): query: thisisatestbyjake0.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.616 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake1.ca): query: thisisatestbyjake1.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.685 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake2.ca): query: thisisatestbyjake2.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.685 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake4.ca): query: thisisatestbyjake4.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.685 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake5.ca): query: thisisatestbyjake5.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.686 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake3.ca): query: thisisatestbyjake3.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.686 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake3.ca): drop NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)
11-Feb-2013 12:14:10.754 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake6.ca): query: thisisatestbyjake6.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.754 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake6.ca): slip NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)
11-Feb-2013 12:14:10.754 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake8.ca): query: thisisatestbyjake8.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.754 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake7.ca): query: thisisatestbyjake7.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.755 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake9.ca): query: thisisatestbyjake9.ca IN A + (192.228.28.9)
11-Feb-2013 12:14:10.755 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake8.ca): drop NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)
11-Feb-2013 12:14:10.755 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake7.ca): slip NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)
11-Feb-2013 12:14:10.755 queries: info: client 192.228.XX.YY#17531 (thisisatestbyjake9.ca): drop NXDOMAIN response to 192.228.XX.0/24 for ca IN A  (0000047e)

So 7 out of 10 queries were responded to, according to both dnsperf and the BIND query log.
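
To make the bookkeeping explicit, here is a small Python tally of the 10-query run. The sets are read by hand off the log and the dnsperf output above; the one assumption is that dnsperf's "msg id N" refers to the query for thisisatestbyjakeN.ca, which the matching numbers suggest but which I have not verified.

```python
# Outcomes read off the query log, keyed by the digit in thisisatestbyjakeN.ca.
sent = set(range(10))
dropped = {3, 8, 9}      # "drop NXDOMAIN response" lines
slipped = {6, 7}         # "slip NXDOMAIN response" lines
timed_out = {3, 8, 9}    # msg ids from dnsperf's [Timeout] lines (assumed to map to N)

answered_normally = sent - dropped - slipped
got_some_response = sent - dropped
completed_pct = 100 * len(got_some_response) / len(sent)

print("answered normally:", sorted(answered_normally))   # [0, 1, 2, 4, 5]
print("slipped:", sorted(slipped))                        # [6, 7]
print("dropped:", sorted(dropped))                        # [3, 8, 9]
print("completed: %.2f%%" % completed_pct)                # 70.00%
print("timeouts match drops:", timed_out == dropped)      # True
```

So the 7 "completed" queries are 5 normal responses plus the 2 "slip" responses, and the 3 timeouts are exactly the 3 "drop" lines.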

That 7/10 figure is very consistent.  What is not consistent is which of the responses are dropped.

This appears to be because these queries, sent at a very high rate, do not always arrive in the order they were sent, and so neither do their responses.

The numbers improve considerably when a larger number of queries is sent.  In this example, I send 100 queries.

Using "dnsperf" to send 100 queries that would all generate the same NXDOMAIN response:
  Percentage completed:  52.00%
  Percentage lost:       48.00%

So 52 out of 100 queries were responded to, according to both dnsperf and the BIND query log.  This figure is also very consistent.

When I send 1000 queries, it takes 11 seconds.  So ~90 queries/second.
  Percentage completed:  50.20%
  Percentage lost:       49.80%

This means that I've responded to ~45 queries/second.
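
The arithmetic behind those two rates, as a few lines of Python. The inputs are the figures quoted above; the variable names are just for illustration.

```python
sent = 1000            # queries sent by dnsperf
elapsed_s = 11         # wall-clock seconds for the run
completed_pct = 50.20  # "Percentage completed" from dnsperf
configured_rps = 5     # responses-per-second in the rate-limit block

query_rate = sent / elapsed_s                 # queries per second sent
responses = round(sent * completed_pct / 100) # responses dnsperf saw
response_rate = responses / elapsed_s         # responses per second seen

print("query rate:    %.1f/s" % query_rate)      # 90.9/s
print("responses:     %d" % responses)           # 502
print("response rate: %.1f/s" % response_rate)   # 45.6/s
print("vs configured: %.1fx" % (response_rate / configured_rps))  # 9.1x
```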

This raises the question, then...

Why did I respond to 45 queries/second when I'm configured to do:
        responses-per-second 5;
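
One possible reading, offered as a guess and not checked against the patch source: the figure dnsperf reports includes the "slip" responses. The toy model below assumes (a) slip is at its default of 2, since I did not set it, (b) only the first 5 responses in a burst are answered normally and the per-second allowance buys no further normal answers while the flood continues, and (c) dnsperf counts a "slip" response as completed. The function name and parameters are made up for this sketch.

```python
def completed(sent, rate=5, slip=2):
    """Responses a client would see under the toy model described above.

    The first `rate` queries are answered normally; of the rest, every
    `slip`-th over-limit response is sent as a "slip" and the others are
    dropped. This is an assumption, not the patch's actual algorithm.
    """
    full = min(sent, rate)
    over_limit = sent - full
    slipped = over_limit // slip if slip else 0
    return full + slipped


for n in (10, 100, 1000):
    c = completed(n)
    print("%4d sent -> %3d completed (%.2f%%)" % (n, c, 100 * c / n))
# Prints:
#   10 sent ->   7 completed (70.00%)
#  100 sent ->  52 completed (52.00%)
# 1000 sent -> 502 completed (50.20%)
```

Those three lines match the three dnsperf results above, which suggests that the normally answered responses stayed at or below 5/second and the rest of the ~45/second were "slip" responses. I would still like confirmation that this is how the limiter is meant to behave.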

-Jake

