[ratelimits] Remarks regarding the Knot DNS 1.2.0 RRL implementation

Marek Vavruša marek.vavrusa at nic.cz
Tue Mar 5 18:43:39 UTC 2013


On 5 March 2013 18:41, Vernon Schryver <vjs at rhyolite.com> wrote:
>> From: =?UTF-8?Q?Marek_Vavru=C5=A1a?= <marek.vavrusa at nic.cz>
>
>>                      By the way, did you evaluate properties of BIND9
>> chosen hash function?
>
> Which hash function, the main hash function for finding buckets or the
> hash for compressing qnames?  The main hash function is a classic
> division hash.  I like multiplicative hashes because they work well
> with varying hash table sizes, but in various applications I've found
> division hashes.  I'm proud of the lazy evaluation that makes growing
> the hash table very cheap.  It uses the fact that most buckets are
> touched only once and any bucket untouched for a second is garbage.

This one. Fair enough, I didn't see the trick behind it.
I also quite like how you calculate the divisor instead of just the
table size; clever.
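For readers following along, the lazy-evaluation trick can be sketched
roughly like this. This is a hypothetical illustration, not the BIND9
code: the struct layout, names, and the "stale after 1 second" reuse
rule are my assumptions based on the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NBUCKETS 1024u  /* illustrative table size */

typedef struct {
    uint64_t key;        /* full hash of (IP, qtype, qname, seed) */
    time_t   last_used;  /* last touch; considered stale after 1 s */
    int      tokens;     /* remaining responses this window */
} bucket_t;

static bucket_t table[NBUCKETS];

/* Classic division hash: reduce the key modulo the table size.
 * Because almost every bucket is touched at most once per second,
 * growing the table needs no rehash pass: any bucket untouched for
 * a second is garbage and is simply reclaimed for the new key on
 * first use. */
static bucket_t *rrl_bucket(uint64_t key, time_t now, int rate)
{
    bucket_t *b = &table[key % NBUCKETS];
    if (b->key != key || now - b->last_used > 1) {
        b->key = key;      /* claim (or reclaim) the stale bucket */
        b->tokens = rate;  /* fresh allowance for the new owner */
    }
    b->last_used = now;
    return b;
}
```

The point of the sketch is that staleness is checked per bucket on
access, so resizing costs nothing up front.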

> For compressing qnames, I use one of the domain name hashing
> functions already present in BIND9 that fits the need for compressing
> the qname.  That existing hash function looks sub-optimal to me, but
> obvious alternatives had more collisions on domain names from real
> queries.
>
>
>> > At authoritative servers where RRL makes sense, different clients
>> > are very likely to request the same (QTYPE,QNAME).
>>
>> Ok, then 2 hashes among sqrt(num_buckets) will collide at p~0.5.
>> That is not the same, as the hash is calculated from (IP,class,qname,seed),
>> so you can't narrow it down to "different IPs", only that at least
>> one element of the tuple changed.
>
> For estimating collision rates, the seed, class, and qtype have a
> total of less than 3 bits of entropy, and so must be ignored along
> with the constants in the hash function.  At authoritative servers,
> the qname also has limited entropy.  At a server authoritative for 1000
> domains, the entropy of the qname is only 10 bits.

That is a good point - how much does it affect the resulting hash?
I'm not familiar with the "3 bit" threshold for ignoring it; I'm
worried mostly about the information content of the class. I'll do
some measurements later on.

>> If that happens and two different IPs with the same (qtype,qname)
>> hit the same bucket, they take two tokens from it. If they issue the
>> same request over and over, well then, they should be rate limited.
>> But we're playing nice: if the second IP causes the collision, part
>> of the tokens will be restored (but not multiple times per window),
>> so it will still get some kind of service.
>
> I do not understand.  If you have a threshold of X and if two clients
> at 192.168.0.1 and 10.2.3.4 collide in a bucket, then you will drop
> responses when their combined rate is X.  Why should they be rate
> limited at X/2?  Or worse when the other IP address is closer to X?

No, maybe I explained it wrong. Let's say the bucket is assigned to
192.168.0.1 and has a remaining rate X for this time window. Now
10.2.3.4 hits the same bucket: the collision is detected, the bucket
is reassigned and marked, and an extra portion of the rate is given to
it. Now 192.168.0.1 hits the bucket, which belongs to a different IP
by now, but the bucket has already collided in this time window
(1 sec), so it won't be reassigned this time. Next second, it can be
reassigned again. All in all, this just limits the number of rate
resets per bucket, and a reset doesn't restore the rate to full
capacity.
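The reassignment rule described above could look roughly like this.
This is my reading of the description, not the actual Knot DNS code;
the field names and the RATE/FLIP_BONUS constants are assumptions made
for the sake of the example.

```c
#include <stdint.h>
#include <time.h>

typedef struct {
    uint32_t owner;      /* e.g. hash of the owning source IP */
    time_t   last_flip;  /* when the bucket last changed owner */
    int      tokens;     /* remaining responses this window */
} rrl_bucket_t;

enum { RATE = 20, FLIP_BONUS = 5 };  /* illustrative constants */

/* Returns the tokens available to 'owner' after applying the rule:
 * a bucket may change owner at most once per 1-second window, and a
 * reassignment grants only a partial allowance, never a full reset. */
static int rrl_touch(rrl_bucket_t *b, uint32_t owner, time_t now)
{
    if (b->owner != owner && now - b->last_flip >= 1) {
        /* First collision in this window: reassign the bucket and
         * top the rate up partially rather than resetting it. */
        b->owner = owner;
        b->last_flip = now;
        if (b->tokens < FLIP_BONUS)
            b->tokens = FLIP_BONUS;
    }
    /* If the bucket already flipped this second, the colliding
     * source simply shares whatever tokens are left. */
    return b->tokens;
}
```

So a colliding source gets some service, but repeated collisions
within one window cannot be used to refill the bucket over and over.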

>> IF they exhaust all the tokens together, then there's SLIP. I don't
>> see a point of exploiting this, if you want to deny most of the
>> service to victim, then you just spoof the address and that's it.
>> It may happen with a legitimate traffic yes, but will it make a
>> difference? Not much.
>
> As in the other thread, it is important to only drop bad packets.
> It's easy to detect and mitigate evil; just turn off the server.
> It's hard to detect and mitigate only evil without affecting good
> packets.

This is the ideal, but it cannot be achieved, as you have no way of
knowing which request is legitimate in a flow of forged ones. But I
agree that we should try hard to keep the number of forged requests
to an absolute minimum.

>
>
>
>> > the BIND9 code.  As long as the bad guys do not distribute
>> > reflection attacks, thresholds of 100 or 1000 are ok.
>>
>> This worries me a lot. RRL will work "ok", but only until the bad
>> guys realize that they can use multiple servers at once for the
>> attack. And it's not something we could fix at this level.
>
> By checking the full (IP,qtype,qname) and using chaining, you could
> fix that problem for attacks distributed to fewer than 10,000 or
> perhaps 100,000 reflectors depending on TTLs.

Not sure if we mean the same "distributed" attack. I mean a bad guy
sending spoofed queries to a large number of authoritative servers,
each at a very low rate, but with the combined responses flooding the
victim. This is not something that chaining could fix; it requires a
proper solution at a lower level.

>
>
> Vernon Schryver    vjs at rhyolite.com
> _______________________________________________
> ratelimits mailing list
> ratelimits at lists.redbarn.org
> http://lists.redbarn.org/mailman/listinfo/ratelimits

