<font size=2 face="sans-serif"> We implemented
RPZ with a purchased feed about a month ago on our production DNS servers.
As expected from our testing and pilot, there were a few immediate
issues, which we have taken care of. However, we are still getting
a trickle of complaints about slowness and failures that appear to be related
to RPZ and the time it takes to complete all the extra queries
for the NSDNAME checks. When we research these issues, they seem
to fit into two groups.</font><font size=3> <br>
<br>
</font><font size=2 face="sans-serif"><br>
1. DNS zones with "slightly"
broken infrastructure. These are domains where one or more name servers
respond slowly or not at all. A recursive
resolver without an RPZ loaded can work through the issues and provide a
timely response to the client. However, the extra lookups required,
primarily for the NSDNAME checks, amplify what would be a "minor"
DNS issue and increase the query time to the point where DNS times out
from the client's perspective. I can't really see a fix here:
the issue does reside with the domain owner; we are simply more susceptible
to it because of the RPZs. </font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
2. DNS zones with a large number of NS
records whose name servers have FQDNs in several different DNS zones.</font><font size=3>
I found some where the 2nd- and 3rd-level domains have different lists
of NS records in various unrelated domains. These have primarily
been non-business sites that I don't care about; however, here
is a simple real-world example: <br>
</font><font size=3 color=blue><br>
</font><font size=3>;; QUESTION SECTION:</font>
<br><font size=3>;banque-france.org.
IN NS</font>
<br>
<br><font size=3>;; ANSWER SECTION:</font>
<br><font size=3>banque-france.org. 600
IN NS indom80.indomco.hk.</font>
<br><font size=3>banque-france.org. 600
IN NS indom30.indomco.fr.</font>
<br><font size=3>banque-france.org. 600
IN NS indom20.indomco.net.</font>
<br><font size=3>banque-france.org. 600
IN NS indom10.indomco.com.<br>
</font><font size=2 face="sans-serif"><br>
These cases are the most frustrating because, in my opinion, there is
really nothing wrong with this setup. By design, it
is just going to generate a large number of DNS lookups for a full NSDNAME
check. They are hard to explain away to users because the sites "work from home"
and "work from my phone". They are also difficult because they
are region-specific. For example:</font>
<br>
<br><font size=2 face="sans-serif">These times are from recursive resolvers,
physically located around the world, set up with root hints only, an empty
cache, and an RPZ loaded that includes an NSDNAME check. </font><font size=3>I
asked each of them for </font><a href="www.banque-france.org"><font size=3>www.banque-france.org</font></a><font size=3>.
This lookup requires ~30 individual DNS lookups to complete the
NSDNAME checks.</font>
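<br>
<br><font size=2 face="sans-serif">A rough sketch of why this delegation is expensive with an empty cache: each NS name in the answer above sits under a different parent domain and TLD, so very little of the resolution work can be shared between them. The helper names parent_domain and group_by_parent are hypothetical, and the "last two labels" rule is a naive assumption that does not hold for every TLD. This only groups the names; it does not reproduce how a resolver performs its NSDNAME checks.</font>
<pre>
```python
from collections import defaultdict

# NS targets copied from the ANSWER SECTION above.
NS_TARGETS = [
    "indom80.indomco.hk.",
    "indom30.indomco.fr.",
    "indom20.indomco.net.",
    "indom10.indomco.com.",
]


def parent_domain(hostname, labels=2):
    """Return the last `labels` labels of a hostname.

    Assumption: this is only a naive stand-in for the zone the name lives in.
    """
    parts = hostname.rstrip(".").lower().split(".")
    return ".".join(parts[-labels:])


def group_by_parent(ns_targets):
    """Map each naive parent domain to the NS hostnames found under it."""
    groups = defaultdict(list)
    for name in ns_targets:
        groups[parent_domain(name)].append(name)
    return dict(groups)


groups = group_by_parent(NS_TARGETS)
tlds = sorted({domain.split(".")[-1] for domain in groups})
print(len(groups), "distinct parent domains across TLDs:", ", ".join(tlds))
```
</pre>
<font size=2 face="sans-serif">For this zone every name server lands in its own group, which is consistent with the large number of lookups needed when nothing is cached.</font>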
<br>
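<br><table>
<tr><th><font size=3>Region</font></th><th><font size=3>no RPZ</font></th><th><font size=3>RPZ</font></th></tr>
<tr><td><font size=3>Europe</font></td><td><font size=3>20ms</font></td><td><font size=3>70ms</font></td></tr>
<tr><td><font size=3>US</font></td><td><font size=3>30ms</font></td><td><font size=3>350ms</font></td></tr>
<tr><td><font size=3>China</font></td><td><font size=3>90ms</font></td><td><font size=3>900ms</font></td></tr>
<tr><td><font size=3>Australia</font></td><td><font size=3>110ms</font></td><td><font size=3>1400ms</font></td></tr>
</table>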
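<br>
<br><font size=2 face="sans-serif">To put the amplification in numbers, here is a small sketch that computes the added delay and the slowdown factor for each region from the timings above. The helper name slowdown is hypothetical; the figures are the ones listed, nothing else is measured.</font>
<pre>
```python
# (region, no-RPZ ms, RPZ ms) copied from the timings above.
TIMINGS_MS = [
    ("Europe", 20, 70),
    ("US", 30, 350),
    ("China", 90, 900),
    ("Australia", 110, 1400),
]


def slowdown(no_rpz_ms, rpz_ms):
    """Return (added milliseconds, slowdown factor) for one measurement."""
    return rpz_ms - no_rpz_ms, rpz_ms / no_rpz_ms


for region, base, with_rpz in TIMINGS_MS:
    added, factor = slowdown(base, with_rpz)
    print(f"{region:10s} +{added:5d} ms  x{factor:.1f}")
```
</pre>
<font size=2 face="sans-serif">The factor is smallest in Europe (3.5x) and roughly 10x or more everywhere else, which matches the complaints being region-specific.</font>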
<br>
<br><font size=2 face="sans-serif"> I
understand the query and latency amplification behind these times. But
due to poorly written web applications, anycast / load-balanced DNS servers
that do not share a cache, and generally short TTLs on nearly every hop
in this particular lookup, the NSDNAME checks turn a web site that is very usable globally
into one that is only usable in Europe. </font>
<br>
<br><font size=2 face="sans-serif"> Have others
found similar issues when implementing RPZs?</font>
<br><font size=3> What have you done to mitigate them?</font>
<br><font size=3> Is there an RPZ log event that says
"It took over (X) seconds to complete this query because of RPZ"?
Basically, I got a good answer back for the 'real' query, but I did
not provide it to the client within X seconds because the RPZ check was
still ongoing. I can imagine there would be a huge amount of noise
in those messages, but they could conceivably be acted on before the client
calls with an issue.<br>
<br>
<br>
</font>
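<br>
<br><font size=2 face="sans-serif">For illustration only, this is the kind of "took over X seconds" signal I have in mind, sketched from the client side rather than as a resolver log event. The names timed_lookup and report are hypothetical, and the resolve argument is any callable that performs the lookup. It measures total lookup time, so by itself it cannot attribute a delay to RPZ; that would need a comparison against a resolver without the RPZ loaded.</font>
<pre>
```python
import time


def timed_lookup(resolve, name, threshold_s, clock=time.monotonic):
    """Call resolve(name) and report whether it took longer than threshold_s.

    Returns (result, elapsed_seconds, is_slow).
    """
    start = clock()
    result = resolve(name)
    elapsed = clock() - start
    return result, elapsed, elapsed > threshold_s


def report(name, elapsed, is_slow):
    """Format a one-line message suitable for a log or an alert."""
    status = "SLOW" if is_slow else "ok"
    return f"{status} {name} {elapsed * 1000:.0f} ms"
```
</pre>
<font size=2 face="sans-serif">One way to use it would be to pass socket.gethostbyname as the resolve callable, for example timed_lookup(socket.gethostbyname, "www.banque-france.org", 1.0), and feed the result to report.</font>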
<br>
<br>
<br><font size=5 color=blue><b>David A. Evans</b></font>
<br><font size=3><b>Enterprise IP/DNS Management</b></font>
<br><font size=3><b>Network Infrastructure Tools and Services</b></font>
<br>