[ratelimits] rate limiting recursive server

Fri May 10 17:01:57 UTC 2013

Bob Harold and I have talked in private, and he suggested that our
conclusions belong in the mailing list.

An open issue is mentioned at the end.

> To: rharolde at umich.edu

> >  Here is a test I ran, and perhaps you could run it also

> I think I now understand.
> I now also seem to recall talking about this scenario before.
> Your test does not involve setting the RD ("recursion desired") bit
> to 0, but is a burst of ordinary stub resolver RD=1 requests for a
> bunch of subdomains of hp.com.
> The recursive resolver's cache is empty, and each request requires the
> recursive resolver to send its own request to an authority for hp.com.
> Requests that require recursion are rate limited before the recursion
> and logged as if RD=0.
>
> That protects the recursive resolver from being bogged down trying
> to send a zillion recursive requests for <random>.example.com, wait
> for answers, retry, etc.,
> and from itself being rate limited by the authority for example.com.
>
> The RRL log message text misleadingly talks about rate limiting
> referrals instead of or in addition to recursive requests made by
> the recursive resolver.
>
> The next version of the RRL patch will have a separate referrals_per_second
> setting that could be set high, but I think using it for this case
> would be wrong.
> I don't think there is a problem here.  A browser (or pack of
> browsers in a /24) that sends a burst of 50 requests for a1.hp.com,
> a2.hp.com, a3.hp.com, ... a50.hp.com is naughty because it demands
> that the recursive resolver create 50 threads to recursively resolve
> the 50 names.  If the recursive resolver does not discard some of
> the requests early, then it will probably discard them later as it
> runs out of threads to handle recursion (or tries to avoid devoting
> all of its threads to one client).
>
> The best thing that can happen for the whole system is that the browser's
> excess requests are discarded and that the browser times out and retries.

(I didn't mean "threads" in the POSIX sense, but the state that a
recursive resolver must maintain while recursing for a query.)
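
To make the "discard early" idea concrete, here is a minimal sketch of
a limiter consulted before any recursion state is created. It is not
the RRL patch's code. The class name, the credit scheme, and the fixed
/24 grouping (IPv4 only) are assumptions chosen for illustration.

```python
import ipaddress


class PreRecursionLimiter:
    """Hypothetical per-/24 credit limiter checked before recursion starts."""

    def __init__(self, per_second):
        self.per_second = float(per_second)
        self.buckets = {}  # /24 network -> (credits, time of last update)

    def allow(self, client_ip, now):
        # Group clients by /24 so a pack of browsers shares one budget.
        net = ipaddress.ip_network(f"{client_ip}/24", strict=False)
        credits, last = self.buckets.get(net, (self.per_second, now))
        # Refill in proportion to elapsed time, capped at one second's worth.
        credits = min(self.per_second, credits + (now - last) * self.per_second)
        if credits >= 1.0:
            self.buckets[net] = (credits - 1.0, now)
            return True   # go ahead and recurse
        self.buckets[net] = (credits, now)
        return False      # drop now; the stub will time out and retry


# A burst of 50 names from one client, all arriving at the same instant.
limiter = PreRecursionLimiter(per_second=10)
burst = [f"a{i}.hp.com" for i in range(1, 51)]
allowed = [name for name in burst if limiter.allow("192.0.2.7", now=0.0)]
print(len(allowed), "recursed;", len(burst) - len(allowed), "dropped early")
```

With these made-up numbers, 10 of the 50 requests reach recursion and
the other 40 are dropped before any state is allocated for them.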

} To: rharolde at umich.edu

} > Thanks for explaining.  I think that discarding requests, and having the
} > browser retry, only increases the load on my DNS server and slows down a
} > valid client.  I cannot control how web pages are designed, so if my DNS
} > server cannot handle this type of load, then I need a better DNS server.
}
} My theory differs and says that the increased load is less significant
} than letting a single client use too much of a recursive resolver's
} recursive powers,
} and that thanks to DNS caches, delays will not be noticed by users.
} Experiments are needed to see which theory is more accurate.
}
}
} > I understand that attackers could create <random>.domain.com queries, but
} > most of the current attacks don't, so I would like to block the current
} > attacks more, and block random attacks less, because they look too much
} > like valid traffic, and thus limiting them would have too many false
} > positives.
}
} I think dealing with future attacks is at least as important as
} dealing with past attacks.
}
}
} > If I set referrals_per_second high, to let valid clients through, but the
} > random attack requests all get the same (nxdomain) response, will those get
} > limited at a lower rate?  That really is what I would want.

} The current code considers rate limiting each request only once.
} That makes sense for an authoritative server, but might be a bug
} on a recursive server.  Perhaps a recursive server should count
} a request once if it provokes one or more recursive requests and
} separately, a second time, when it produces a response.
} I need to think about it and talk to others.

Opinions about that last paragraph are needed.
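
As an aid to that discussion, here is a rough sketch of the two-stage
counting idea. It is not what the current code does. The fixed
one-second window, the class names, and the (client block, rcode) key
for responses are all assumptions made up for this example.

```python
class WindowCounter:
    """Hypothetical fixed one-second window counter, keyed arbitrarily."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = {}  # key -> (window start in whole seconds, count)

    def allow(self, key, now):
        window = int(now)
        start, count = self.counts.get(key, (window, 0))
        if start != window:
            count = 0  # a new second has begun, so start counting afresh
        if count >= self.limit:
            self.counts[key] = (window, count)
            return False
        self.counts[key] = (window, count + 1)
        return True


class TwoStageLimiter:
    """Count a request once before recursion and again at response time."""

    def __init__(self, recursion_limit, response_limit):
        self.recursion = WindowCounter(recursion_limit)
        self.response = WindowCounter(response_limit)

    def allow_recursion(self, client_block, now):
        # Stage 1: keyed only by client block, before any recursion.
        return self.recursion.allow(client_block, now)

    def allow_response(self, client_block, rcode, now):
        # Stage 2: keyed by client block and response kind, so identical
        # NXDOMAIN answers share one (possibly much lower) budget.
        return self.response.allow((client_block, rcode), now)


# High recursion limit, low response limit, 50 random names that all
# turn out to be NXDOMAIN.
limiter = TwoStageLimiter(recursion_limit=100, response_limit=5)
recursed = sum(limiter.allow_recursion("192.0.2.0/24", now=0.0)
               for _ in range(50))
answered = sum(limiter.allow_response("192.0.2.0/24", "NXDOMAIN", now=0.0)
               for _ in range(recursed))
print(recursed, "recursed;", answered, "NXDOMAIN responses sent")
```

In this toy run all 50 requests are allowed to recurse, but only 5 of
the identical NXDOMAIN responses are sent. That is roughly the
behavior asked for in the quoted question, at the cost of still doing
the recursion for every request.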

Vernon Schryver    vjs at rhyolite.com