[ratelimits] Why not use chaining?

Vernon Schryver vjs at rhyolite.com
Tue Mar 5 23:41:54 UTC 2013


I've seen neither the Knot nor the NSD RRL code, but I really wonder
why neither uses keys and chaining.


Consider that the NSD RRL code is said to use a fixed size hash
table of default 1,000,000 entries in 20 MByte.

If each entry were increased to contain 4 bytes of compressed (hashed)
qname and four 20-bit hash table indeces, then the code could easily
use hashing with internal chaining.  That would only increase the
default size of the table to 34 MBytes.

However, by using chaining, you could significantly reduce the number
of entries, because collisions would not matter except to lookup speed.
For a target query rate of 100K qps and the familiar hash table load
factor of 0.9 to get an average lookup rate of 1.1 or about 10%
collisions, you could use a table of only 110K or 120 entries or less
than 4 MBytes.  That is only 20% of the current default 20 MBytes and
it escapes all of the the false positive or false negative problems
of collisions in the current implementations.

Two of the 20-bit indeces would be used to double link entries into a
hash chain.  The other two 20-bit indeces would be used for an LRU
list of entries.  It would use 20-bit indeces instead of pointers to
save space, especially on 64-bit systems where pointers want 64 bits.

I would be happy to outline a suitable variation of the internal
chaining hashing I've used in many applications including a commercial
database decades ago and DCC in this century.  It's simple and only
about as messy as the BIND9 scheme.  It would differ from the BIND9
RRL code, which uses external chaining.  The BIND9 RRL code is
(perhaps prematurely and unnecessarily) optimized to dynamically
size its hash table instead of using a fixed size as in the NSD RRL
code.  I think internal chaining would require temporarily almost
doubling the memory footprint during expansion.  I can't guarantee
the idea is free of patent trolls, but the first time I saw and
used the idea was in the late 1970's.


Vernon Schryver    vjs at rhyolite.com


More information about the ratelimits mailing list