[dnstap] suggested optional fields for DNSTAP

Robert Edmonds edmonds at mycre.ws
Fri Feb 27 22:52:08 UTC 2015


Joseph Gersch wrote:
> Hello all,
> 
>    I would like to suggest two optional fields for the DNSTAP schema.
> 
>    The first one has already been discussed, but I don’t see it in the schema yet:  a boolean for CACHE-HIT/CACHE-MISS.

Hi, Joe:

Yes, adding a new field to indicate cache status is pretty easy.  I
held off on adding it to the schema for now because I wanted to get
feedback from implementers and operators on whether it should be a
boolean or an enum.

In the protobuf wire encoding scheme, booleans and enums use the same
"variable-length integer" primitive:

    https://developers.google.com/protocol-buffers/docs/encoding#structure
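
For instance, here is a minimal sketch in Python (hand-rolling the
varint encoding for illustration, rather than using a protobuf library)
showing that a bool field and a small enum value in the same field
position occupy the same two bytes on the wire:

    # Illustrative-only encoder for protobuf varint fields.
    def encode_varint(value):
        out = bytearray()
        while True:
            byte = value & 0x7F
            value >>= 7
            if value:
                out.append(byte | 0x80)
            else:
                out.append(byte)
                return bytes(out)

    def encode_field(field_number, value):
        # Wire type 0 (varint): tag = (field_number << 3) | 0.
        return encode_varint(field_number << 3) + encode_varint(value)

    # Field 15 as a bool set to true vs. as an enum set to 3:
    print(len(encode_field(15, 1)))  # -> 2 bytes
    print(len(encode_field(15, 3)))  # -> 2 bytes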

So from an efficiency perspective, there's no downside to upgrading the
field to an enum and encoding a "richer" set of information; they
consume the same number of bytes on the wire.  In particular, I notice
Google says that their Public DNS service generates the following in
their "permanent logs":

    https://developers.google.com/speed/public-dns/privacy

    [...]
    Finally, if you're interested in knowing what else we log when you
    use Google Public DNS, here is the full list of items that are
    included in our permanent logs:
        [...]
        - Whether the request hit our frontend cache
        - Whether the request hit a cache elsewhere in the system (but
          not in the frontend)
        [...]

So, that's two different types of cache hits.

I know that Unbound has both a "message cache" and an "RRset cache", and
IIRC PowerDNS Recursor also has a "packet cache", which I believe may be
equivalent to Unbound's message cache.  (I think BIND 10 implemented, or
at least discussed implementing, a PowerDNS-style packet cache, too.)
So maybe it makes sense to distinguish between different types of cache
hits, if there's some consensus among implementations on cache types.

What do you think about adding a new enum type to the dnstap.proto
definitions:

    enum CacheStatus {
        // The response message could not be generated entirely from
        // cached information.
        CACHE_MISS = 1;

        // The response message was generated by consulting a
        // whole-message cache of DNS responses.
        CACHE_HIT_MESSAGE = 2;

        // The response message was generated entirely by consulting a
        // cache of DNS records.
        CACHE_HIT_RECORD = 3;
    }

and then adding a new optional field of that type to the "Message" type:

    message Message {
        // [...elided...]

        optional CacheStatus        cache_status = 15;
    }

This would allow extending CacheStatus to new types of cache hits in the
future, too.

>    The second one is to generate  a unique GUID  for and store it for each CLIENT_QUERY.  This GUID would also be stored with each RESOLVER_QUERY and RESOLVER_RESPONSE.   This would allow an analysis of a DNS TRACE to determine operational issues with long recursive resolutions.  It is insufficient to just have bailiwick or domain name, because once the recursive resolution starts chasing a CNAME or chain of NS delegations, the domain name changes.  Some recursions can take 10-70 lookups to get full resolution.  Having a GUID to tie them all together would be very useful.

This is a great use case, but it sounds like it might be a bit hard to
implement, at least in the recursive DNS server I'm most familiar with
(Unbound).

Aside: GUIDs/UUIDs specifically would probably be a bit problematic
(e.g., version 1 UUIDs embed a timestamp and MAC address, and even
random version 4 UUIDs reserve bits for version/variant fields)
compared to just a simple 128-bit random nonce from a seeded CSPRNG
(which would be more convenient, since recursive DNS servers already
have CSPRNGs built in to securely generate random source ports and
IDs), but that's a relatively minor issue.
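
As a quick sketch of how cheap such a nonce is to produce (using
Python's interface to the OS CSPRNG, purely for illustration):

    import os

    # Draw 16 bytes (128 bits) from the operating system's CSPRNG; a
    # hex string is convenient for logging and later correlation.
    query_tag = os.urandom(16).hex()
    print(query_tag)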

The big problem, though, is that a single "query tag" is probably not
enough in the general case.  IIUC, the mitigation for VU#457875
("Various DNS service implementations generate multiple simultaneous
queries for the same resource record") [0] in most recursive DNS
implementations was to aggressively combine outbound queries for the
same record.

That is, there may not be just a one-to-one mapping

    RESOLVER_RESPONSE -> CLIENT_QUERY

but potentially a mapping from one response to a set of multiple
unrelated outstanding CLIENT_QUERYs:

    RESOLVER_RESPONSE -> { CLIENT_QUERY, CLIENT_QUERY, CLIENT_QUERY }
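
To illustrate the fan-out, here is a purely hypothetical sketch in
Python; the names and data structures are illustrative, not Unbound's
(or anyone's) actual internals:

    # Map an in-flight upstream question to the set of client query
    # tags waiting on its answer.
    pending = {}  # (qname, qtype, qclass) -> set of tags

    def send_resolver_query(key):
        pass  # stub: would actually send the upstream query

    def on_client_query(qname, qtype, qclass, tag):
        key = (qname, qtype, qclass)
        if key in pending:
            # VU#457875 mitigation: piggyback on the in-flight
            # upstream query instead of sending a duplicate.
            pending[key].add(tag)
        else:
            pending[key] = {tag}
            send_resolver_query(key)

    def on_resolver_response(qname, qtype, qclass):
        # One upstream response fans out to every waiting client
        # query; a tagged RESOLVER_RESPONSE would have to carry this
        # whole (potentially large) set of tags.
        return pending.pop((qname, qtype, qclass), set())

    on_client_query("example.com", 1, 1, "tag-a")
    on_client_query("example.com", 1, 1, "tag-b")  # piggybacks
    print(on_resolver_response("example.com", 1, 1))  # both tags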

I'm not sure if there are limits imposed on how large that set of
CLIENT_QUERYs might be in recursive DNS implementations, so, in the
general case, we might need to store a very large vector of query tags
in the RESOLVER_RESPONSE, and we'd generally prefer to have a hard bound
on the maximum size of a dnstap protobuf payload.  (E.g., given the
maximum 64 KB DNS message size plus more than enough room for dnstap
metadata, it might be reasonable to bound the overall dnstap payload
length to ~70 KB, and certainly no more than 128 KB.  Protobuf
implementations generally decode the entire payload at once into a
complete in-memory object, so for security reasons it's important to
bound the size of a protobuf message, though there are some protobuf
implementations that can do incremental parsing.)
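
As a defensive-reader sketch (assuming dnstap's usual Frame Streams
transport, where each data frame is preceded by a 32-bit big-endian
length; the exact cap here is just the bound suggested above):

    import struct

    MAX_PAYLOAD = 128 * 1024  # hard upper bound on a dnstap payload

    def read_payload(stream):
        header = stream.read(4)
        if len(header) < 4:
            return None  # EOF
        (length,) = struct.unpack("!I", header)
        if length > MAX_PAYLOAD:
            raise ValueError("oversized payload: %d bytes" % length)
        # Only now is it safe to slurp and decode the protobuf.
        return stream.read(length)

    # e.g. read_payload(io.BytesIO(b"\x00\x00\x00\x02hi")) -> b"hi"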

The more practical reason to avoid something like this is that it
probably requires some intrusive changes to existing recursive DNS
implementations.  (For the Unbound dnstap work, I tried to minimize the
intrusiveness of the changes required in order to increase the
likelihood of having the changes accepted by the upstream developer.)

IIUC, Unbound in particular has a fairly opaque callback-based interface
between what it calls the "serviced query" and the original client
query; search for "service_callback" in Unbound's
services/outside_network.h file for details.  I'm afraid that would mean
making some invasive modifications to Unbound's internal API in order to
gather the "query tag" vector when serializing RESOLVER_RESPONSEs.

Now, it's true that one could map RESOLVER_*QUERY* payloads (rather than
RESOLVER_RESPONSEs) to exactly the one CLIENT_QUERY that set each of
them off in the first place (modulo trivial corner cases like
pre-fetching), but that solution is a little incomplete without tagging
RESOLVER_RESPONSEs with the full CLIENT_QUERY tag vector; CLIENT_QUERYs
that get ganged onto an already open upstream fetch will possibly have
"dangling" query tags (i.e., no corresponding RESOLVER_* payloads with
the same tag, or an incomplete set of payloads).  And those are probably
the queries you're most interested in analyzing, too.

To sum up, I think this is a very nice use case, but it's probably
pretty hard to implement a really satisfying general solution.  I'd be
happy to be proven wrong, though!

BTW, have you looked at Casey Deccio's dnsviz tool [1]?  Not the
dnsviz.net service, but the Python tool that drives it.  I wonder if
there might be an alternative approach to analyzing the root cause of
resolution failures that involves active probing, triggered by a
hypothetical new "timeout" dnstap payload.  (This would be a lot easier
to implement; we just need to enumerate and define the different kinds
of timeouts that can occur.)
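
Purely as a hypothetical sketch (none of this is in any published
schema), the enumeration might start out looking something like:

    from enum import Enum

    # Hypothetical timeout kinds for a new dnstap "timeout" payload.
    class TimeoutKind(Enum):
        UDP_NO_RESPONSE = 1   # upstream never answered over UDP
        TCP_CONNECT = 2       # TCP connection could not be established
        TCP_NO_RESPONSE = 3   # TCP connected, but no answer arrived
        CLIENT_DEADLINE = 4   # overall client query deadline expired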

[0] http://www.kb.cert.org/vuls/id/457875

[1] https://github.com/dnsviz/dnsviz

-- 
Robert Edmonds

