[ratelimits] Remarks regarding the Knot DNS 1.2.0 RRL implementation
vjs at rhyolite.com
Wed Mar 6 16:23:40 UTC 2013
> From: =?UTF-8?Q?Marek_Vavru=C5=A1a?= <marek.vavrusa at nic.cz>
> But back to the original scenario, I have made a quick test to measure
> number of collisions using the exact same query (QNAME=test.rrl.) from
> N=100 000 random IPs (uniform dist.) and checked the number of
> collisions. N is chosen as an expected response rate, since collision
> in a >=1sec. inactive bin isn't really a collision just bucket reuse.
> There were ~3400 collisions in 100 000 different IPs (SD~52.2, 1000
> test runs). That is about p=0.034 which is reasonable for me
> (distribution attached).
> Yes, there can be other cases, but if they are unpredictable and seed
> changes over time,
Some random comments of no particular importance:
- IP addresses are not uniformly distributed in [0,4294967295]
so a uniform selection from [0,4294967295] is not entirely
- That the seed varies and so the collisions are not reproducable
is a problem. Unreproducable failures are the worst kind.
- You don't mention the size of the hash table. Collisions in good
hash schemes depend strongly on load factor.
- You didn't mention the query rate modeled by 100,000 random IPs.
- Instead of a probabilty of collision for 100,000 IPs, we need
the false positive or negative probability. I think that is
related to the expected value of the number of collisions per
second given a query rate of Q and a hash table size of N.
> we would be willing to
> implement the internal chaining you proposed, similar to open
> addressing. It is a reasonable compromise, yes.
I honestly do not understand why any RRL implementation does not use
some sort of chaining. It both saves memory and prevents failures.
Vernon Schryver vjs at rhyolite.com
More information about the ratelimits