<font size=2 face="sans-serif">        We implemented

RPZ with a purchased feed about a month ago on our production DNS servers.

  As expected from our testing and pilot there were a few immediate

issues which we have taken care of.   However, we are still getting

a trickle of complaints about slowness and failures that appear to be related

to the RPZ and the amount of time it takes to complete all the extra queries

for the NSDNAME checks.   When we research these issues they seem

to fit into 2 groups.</font><font size=3> <br>

<br>

</font><font size=2 face="sans-serif"><br>

        1.   DNS zones with "slightly"

broken infrastructure.  These would be domains with either slow response

from one or more name servers or name servers that do not respond at all.  A recursive

resolver without an RPZ loaded can work through the issues and provide a

timely response to the client.  However, the extra lookups required,

primarily for the NSDNAME checks, amplify what would be a "minor"

DNS issue and increase the query time to the point where DNS times out

from the client perspective.    I can't really see a fix here;

 the issue does reside with the domain owner, and we are simply more susceptible

to it because of the RPZs.  </font><font size=3> <br>

</font><font size=2 face="sans-serif"><br>

        2.  DNS zones with a large number of NS

records whose name servers have FQDNs in several different DNS zones.</font><font size=3>

I found some where the 2nd- and 3rd-level domains have different lists

of NS records in various unrelated domains.  These have primarily

been non-business-related sites that I don't care about; however, here

is a simple real-world example: <br>

</font><font size=3 color=blue><br>

</font><font size=3>;; QUESTION SECTION:</font>

<br><font size=3>;banque-france.org.          

  IN      NS</font>

<br>

<br><font size=3>;; ANSWER SECTION:</font>

<br><font size=3>banque-france.org.      600    

IN      NS      indom80.indomco.hk.</font>

<br><font size=3>banque-france.org.      600    

IN      NS      indom30.indomco.fr.</font>

<br><font size=3>banque-france.org.      600    

IN      NS      indom20.indomco.net.</font>

<br><font size=3>banque-france.org.      600    

IN      NS      indom10.indomco.com.<br>

</font><font size=2 face="sans-serif"><br>

        These are the most frustrating as there is

really nothing wrong with this setup in my opinion.   This, by design,

is just going to generate a large number of DNS lookups to do a full NSDNAME

check.  These are hard to explain away to users because the sites "work from home"

and "work from my phone".  They are also difficult because the symptoms

are region-specific.  For example:</font>

<br>

<br><font size=2 face="sans-serif">These times are from recursive resolvers,

physically located around the world, set up with root hints only, an empty

cache, and an RPZ loaded that includes an NSDNAME check.  </font><font size=3>I

asked each of them for </font><a href="www.banque-france.org"><font size=3>www.banque-france.org</font></a><font size=3>.

  This lookup requires ~30 individual DNS lookups to complete the

NSDNAME checks.</font>

<br>

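<br><table border=1 cellpadding=4>
<tr><th align=left>Resolver location</th><th>No RPZ</th><th>With RPZ</th></tr>
<tr><td>Europe</td><td align=right>20 ms</td><td align=right>70 ms</td></tr>
<tr><td>US</td><td align=right>30 ms</td><td align=right>350 ms</td></tr>
<tr><td>China</td><td align=right>90 ms</td><td align=right>900 ms</td></tr>
<tr><td>Australia</td><td align=right>110 ms</td><td align=right>1400 ms</td></tr>
</table>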
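<br>
<br><font size=2 face="sans-serif">To put the timings above in proportion, here is a small Python sketch that computes the added delay and the slowdown factor per region. The figures are copied from the timings above; the per-lookup number is only a crude average that assumes the ~30 lookups are sequential and equally expensive, which is a simplification. The names <code>TIMINGS_MS</code>, <code>APPROX_LOOKUPS</code> and <code>summarize</code> are made up for this illustration.</font>

```python
# Timings copied from the post, in milliseconds: (no RPZ, with RPZ).
TIMINGS_MS = {
    "Europe": (20, 70),
    "US": (30, 350),
    "China": (90, 900),
    "Australia": (110, 1400),
}

# Approximate number of lookups needed for the NSDNAME checks (from the post).
APPROX_LOOKUPS = 30


def summarize(timings=TIMINGS_MS, lookups=APPROX_LOOKUPS):
    """Return added delay, slowdown factor and a crude per-lookup average."""
    rows = {}
    for region, (no_rpz, rpz) in timings.items():
        added = rpz - no_rpz
        rows[region] = {
            "added_ms": added,
            "slowdown": round(rpz / no_rpz, 1),
            # Assumption: lookups are sequential and equally expensive.
            "added_per_lookup_ms": round(added / lookups, 1),
        }
    return rows


if __name__ == "__main__":
    for region, row in summarize().items():
        print(
            f"{region:10} +{row['added_ms']} ms  "
            f"x{row['slowdown']}  "
            f"~{row['added_per_lookup_ms']} ms per lookup"
        )
```

<br><font size=2 face="sans-serif">The point of the per-lookup figure is only that the penalty grows with the distance to the authoritative servers: the same ~30 lookups cost far more from Australia than from Europe.</font>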

<br>

<br><font size=2 face="sans-serif">        I

understand the query and latency amplification behind these times.  But

due to poorly written web applications, anycast / load-balanced DNS servers

that do not share a cache, and generally short TTLs on nearly every hop

in this particular lookup, the NSDNAME checks turn a web site that is very usable globally

into one that is only usable in Europe.   </font>

<br>

<br><font size=2 face="sans-serif">        Have others

found similar issues when implementing RPZs?</font>

<br><font size=3>     What have you done to mitigate them?</font>

<br><font size=3>     Is there an RPZ log event that says

"It took over (X) seconds to complete this query because of RPZ"?

 Basically, I got a good answer back for the 'real' query but I did

not provide it to the client within X seconds because the RPZ check was

still ongoing.  I can imagine there would be a huge amount of noise

in those messages but they could conceivably be acted on before the client

calls with an issue.<br>

<br>

<br>

</font>
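<br><font size=2 face="sans-serif">In the absence of such a log event, one stopgap I can imagine is an external probe that times lookups from the client side and flags the slow ones. The Python sketch below is only an illustration: <code>timed_lookup</code>, its parameters and the 1-second threshold are hypothetical, it uses the system resolver via <code>socket.gethostbyname</code>, and it measures total client-visible time, so it cannot by itself attribute the delay to the RPZ checks.</font>

```python
import socket
import time


def timed_lookup(name, resolve=socket.gethostbyname,
                 threshold_s=1.0, clock=time.monotonic):
    """Resolve `name` and report how long the client had to wait.

    Returns (answer_or_None, elapsed_seconds, is_slow).
    `resolve` and `clock` are injectable so the probe can be tested
    without touching the network.
    """
    start = clock()
    try:
        answer = resolve(name)
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        answer = None
    elapsed = clock() - start
    return answer, elapsed, elapsed > threshold_s


if __name__ == "__main__":
    # Name taken from the example in this post; threshold is arbitrary.
    answer, elapsed, is_slow = timed_lookup("www.banque-france.org")
    status = "SLOW" if is_slow else "ok"
    print(f"{status}: {answer} in {elapsed * 1000:.0f} ms")
```

<br><font size=2 face="sans-serif">Running the same probe against a resolver without the RPZ loaded would give a baseline to compare against, which is roughly how the timings earlier in this message were gathered.</font>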

<br>

<br>

<br><font size=5 color=blue><b>David A. Evans</b></font>

<br><font size=3><b>Enterprise IP/DNS Management</b></font>

<br><font size=3><b>Network Infrastructure Tools and Services</b></font>

<br>