[ratelimits] Remarks regarding the Knot DNS 1.2.0 RRL implementation

Mon Mar 4 14:40:57 UTC 2013

> From: =?UTF-8?Q?Marek_Vavru=C5=A1a?= <marek.vavrusa at nic.cz>

> I believe you store qname hash in the buckets (+other data of course),
> so it's still
> possible to have a collision, it's just that you use more bits
> (qtype,rtype etc. stored separately)
> and allow buckets with different netblocks under a same hash.

Of course the BIND9 RRL hash function has collisions; collisions are
fundamental to hashing.  What can differ is what happens when there
is a collision.  What the BIND9 RRL code does not have is collisions
between different client IP addresses on the same bucket of counters,
timers, etc.  In the BIND9 RRL code, the hash function is used to find
a chain of buckets of counters, timers, etc.  That chain is searched
for the bucket with the right (QTYPE,QNAME,client IP).  I understand
your code and the NSD code to have a single bucket of counters, timers,
etc. for all (QTYPE,QNAME,client IP) tuples with a given hash value.

I think the cost of chaining to prevent collisions on counters, timers,
etc. is necessary, because the number of legitimate clients at which
the probability that at least 2 clients at different IP addresses will
collide on the same bucket of counters, timers, etc. without chaining
is approximately the square root of the number of buckets.  If you
have 1,000,000 buckets, then the probability is more than 0.5 that at
least 2 clients among 1000 IP addresses making the same request will
use the same bucket.  See https://www.google.com/search?q=birthday+paradox

At authoritative servers where RRL makes sense, different clients
are very likely to request the same (QTYPE,QNAME).

I understand that the NSD RRL code deals with those collisions on
counters by recommended very much larger thresholds for dropping
responses than the thresholds of 10 or even lower recommended for
the BIND9 code.  As long as the bad guys do not distribute
reflection attacks, thresholds of 100 or 1000 are ok.

>                                         is there any finite limit as
> to how many buckets will it allocate?

There are both a minimum or initial number of buckets and a maximum
number of buckets.  When the maximum would be exceeded, the oldest
bucket is recycled.  Unless the number of attacking IP addresses is
larger than the number of buckets, it is safe to recycle the oldest
bucket.

> this is not trivial nor time efficient and from my point of view, not
> worth the effort. Or am I missing something?

I'm not sure if we're talking about collisions.  If so, I think you
are missing the effects of the Birthday Paradox that make the
probability of collision between legitimate clients high.

You might notice that I wrote about collisions between clients instead
of collisions between requests.  That is because each bucket of counters,
timers, etc. in the BIND9 RRL code has a key consisting of
(IP,class,QTYPE,hash(QNAME),ok/NXDOMAIN/ERROR)
Using a 4 byte hash of the QNAME instead of variable space or a fixed
space of 255 bytes for the QNAME lets the buckets be a fixed, modest
size.  As long as a single DNS client fewer than than several thousand
requests/second, it is unlikely to suffer a false positive.  Besides,
a client making 1000s of requests/second for different QNAMEs probably
should be rate limited.

Vernon Schryver    vjs at rhyolite.com