rbldnsd returns NXDOMAIN for ENTs (Empty Non-Terminals) #17
Comments
Hello, thanks for the feedback. RBLDNSD was written in the last century, when QNAME minimization did not yet exist, so the code does not take it into account at all. There is also considerable consensus in the SMTP world that QNAME minimization shouldn't be used on the resolvers used by mail servers. See http://postfix.1071664.n5.nabble.com/qname-minimization-and-privacy-breaks-dnsbl-in-postfix-tt103456.html#a103458 for an example. A change to support this would require some effort, and we'd really like to learn about real-world cases where this behaviour has caused issues. |
Going a bit deeper from the technical point of view, the dnset dataset stores its data in such a way that the requested feature is impossible to implement. So, for domain names, if this feature is requested, a completely new dataset would have to be implemented. For the IP (v4 and v6) datasets, all of them, we could implement a hackish solution so that when a query for a "partial" IP address is received, rbldnsd doesn't reply NXDOMAIN but NOERROR instead. For example:
Opinions ? |
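The partial-address idea proposed above can be illustrated with a minimal Python sketch. It is not rbldnsd code: the `LISTED` set and the `rcode_for` helper are hypothetical, and the sketch assumes the stricter variant in which a partial address only gets an empty NOERROR (written `NODATA` here) when at least one listed address lies below it.

```python
import ipaddress

# Hypothetical in-memory stand-in for an rbldnsd IPv4 dataset.
LISTED = {ipaddress.ip_address("192.0.2.2"), ipaddress.ip_address("192.0.2.200")}

def rcode_for(labels):
    """labels: query name relative to the zone, e.g. ["2", "2", "0", "192"]."""
    valid = all(l.isascii() and l.isdigit() and int(l) <= 255 for l in labels)
    if not 1 <= len(labels) <= 4 or not valid:
        return "NXDOMAIN"
    # DNSBL query names carry the octets in reversed order.
    octets = [str(int(l)) for l in reversed(labels)]
    if len(octets) == 4:
        addr = ipaddress.ip_address(".".join(octets))
        return "NOERROR" if addr in LISTED else "NXDOMAIN"
    # Partial address: an empty non-terminal if any listed address lies below it.
    prefix = ".".join(octets) + "."
    below = any(str(ip).startswith(prefix) for ip in LISTED)
    return "NODATA" if below else "NXDOMAIN"

print(rcode_for(["2", "0", "192"]))  # partial address above a listed entry
```

A simpler variant would answer `NODATA` for every syntactically valid partial address, at the cost of losing NXDOMAIN for empty subtrees.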
Please do: I see no downsides to this and an incomplete solution will still be better than being totally broken. I also think that lack of support for ENTs for some data types should be well documented, because it is and has always been a bug: IIRC the first time that returning NXDOMAIN for ENTs was widely discussed as being broken was at the time of DNSSEC standardization work. |
I'm digging a bit more into the DN dataset. Another hackish solution would be that the DN dataset always returns NOERROR for every query.
This would most probably break caching for NXDOMAIN entities, though (and this consideration applies also for the IP datasets with the hack applied). This solution would probably be too aggressive. Opinions are welcomed on this topic as well. Regarding the code: should this feature be always enabled or should it be enabled by a configure option ? |
I am reviewing RFC7816 and currently pondering its ramifications. I'll probably have a better idea of what I would want to do after a bit of testing. Currently, I have concerns that returning NODATA (NOERROR / ANSWER: 0) would result in client implementations treating it as a non-listing (generating false negatives). IIRC, years ago there was discussion that DNSBL clients should treat NODATA (NOERROR / ANSWER: 0) as a non-listing. Again, I need to review and better understand the RFC to see where edge cases may exist for clients. |
Unless this is subject to serious testing then I believe that the evil we know (NXDOMAIN for ENTs) is better than trying something new like unexpected NODATA. But I still suggest that you fix NODATA handling for the IP-related datasets, for which it should be easy and a correct solution. |
Informational; Way back in 2006 one of the moderators of NANA.Blacklisting outlined how DNSBL clients should respond to "Empty non-error" responses. https://groups.google.com/g/news.admin.net-abuse.blocklisting/c/UIYFltOT4mA/m/59sRQw4UqwoJ Also, neither RFC5782 nor RFC6471 appears to outline specifically how to handle empty non-error responses. However, RFC8904 does suggest that empty non-error responses should be considered a non-listing; The result of the method states how the query did, up to the interpretation of the returned data. |
Given that the rbldns-client passes the full QNAME to the resolver, and the resolver MUST respond with that full QNAME back to the client, with any QNAME minimization schemes (RFC7816 is categorized as "Experimental") being done by the resolver before responding to the client, this is unlikely to have an impact on the clients. |
(snip)
Actually, I would believe that these entries would be cached properly. The resolver has an answer, and the answer is non-error. I would believe that resolvers would cache these replies just like any other replies.
I would suggest that because RFC7816 is listed as experimental that this be presented as a command line flag to always return "empty non-error" instead of NXDOMAIN. |
I have pushed an implementation of the qname minimization feature to the qname-minimization branch. First of all you have to compile rbldnsd with the The qname minimization behaviour is activated ONLY for specific datasets. As an example, this entry
would enable a specific dataset to show the qname minimization behaviour. This feature would need extensive testing, so help and feedback from the community would be much appreciated. |
The documentation is not clear:
And what is the rationale for making this a compile-time option? Would building rbldnsd with
I do not think that the DNSBL RFCs need to specifically mention how NOERROR answers should be treated, because there are no deviations from the usual DNS standards and behaviours.
I have been pondering this a bit and now I agree with @dennywatson: since a NOERROR answer would be cached using the same rules as an NXDOMAIN answer, there will be no caching lifetime changes for "correct" queries. The only change that I can see when answering NOERROR instead of NXDOMAIN for all non-listings is that a resolver receiving a NOERROR for faketld.dnsbldomain.net would still send a query for domain.faketld.dnsbldomain.net instead of correctly deducing that no subdomains exist. I do not know if this would actually have a practical impact, but it should be easy for Spamhaus to measure by turning the feature on and off for a while.
If no bad effects due to caching are measured, then I even think that correct support of ENTs should be enabled by default, because not breaking name servers implementing QNAME minimization (which is probably soon going to be "all of them") is much more important than not breaking already broken clients, which we are not even sure exist.
NOERROR answers should definitely always be turned on for queries to IP datasets, because we know exactly which queries should return NXDOMAIN and which ones NOERROR, and because legitimate clients are not supposed to query for incomplete IPs, so there can be no concern about incorrect handling of NOERROR answers.
(Hi @dennywatson, you may remember me from the Brussels or Dublin M3AAWG...) |
(parts snipped and reordered)
Hi, and yes. Also, you and I are aware of each other in other forums -- going back decades.
Perhaps. I would also like to avoid RFC7816, as I feel it is poorly written. In its attempt at advocacy, I feel that it suffers from some logical fallacies, glosses over some potential problems, purports to solve more than it actually does, and potentially misunderstands what was written in RFCs 1034 and 1035. Reading it with a critical eye, I'm not a fan. I don't have the time to dissect RFC7816 to reveal all of its potential flaws. Having said that, yes, ENTs probably shouldn't respond NXDOMAIN.
(reordered)
In a past life, I maintained a qmail install, and can think of one example... though this is an unusual one that most likely suffers from a host of other issues. DJB chains could be constructed in which DNSBLs are queried and qmail's rblsmtpd is triggered based on the setting of the RBLSMTPD variable. Decades ago there was a DNS package called firedns (it might still exist, I haven't bothered to look) and its command line client would exit non-zero on NXDOMAIN. One could set this up with a string of logical ANDs, so that a positive listing in two or more DNSBLs was required for actual blocking of email. This strategy could suffer from other problems, such as wildcarding because a domainer has bought the domain, and/or SERVERROR problems, but it is an example of how someone may have implemented a DNSBL check in such a way that problems might exist. !!! Implementation of this feature is a policy decision that should be expressed to its userbase !!! (reordered)
Increased protocol traffic, and slower responses to the client. Tested against Unbound. I am somewhat concerned that Unbound doesn't appear to cache empty non-error responses in any way and appears to always want to traverse the full path when it sees one. I would need to take a critical look at RFCs 1034 and 1035 to determine whether this behavior is broken. My gut says, "You have received an authoritative non-error answer, cache that! If you are going to implement an experimental RFC -- then you need to add code to accommodate what you are doing." Again, I need to review 1034 and 1035. Over-query would appear to always be the case for NXDOMAIN.
I'm not opposed to adding it into default (perhaps at a later date) as the behavior is controlled by the zonefile.
I see this as more of a known query width issue, and rework of the existing data structures to accommodate searching that structure. Yes, for an IP based either IPv4 or IPv6 this is a known width. For domainnames, less so. Overall, I have opinions. These are only opinions, and they are only mine; RFC7816;
Unbound;
Debian;
|
One condition that I neglected to point out. Against a stock build of rbldnsd; After receiving its first NXDOMAIN Unbound appears to then query the full QNAME against the last NS it is working with. I.e. there exists the possibility for significant query reduction for IPv6 datasets. |
The patch has been reviewed:
Notably:
|
Hi all. Sorry for my tardy reply. It might perhaps be useful to provide more information on the setup where I first bumped into this: Infrastructures querying public DNS resolvers usually quickly exceed query rate limits on common DNSBLs - if not even put off with a "you are querying our DNSBL via a public resolver" answer entirely. In such cases, I frequently observed DNS forwarding setups for common RBL zones, directed against their nameservers directly. This way, rate limits or policy-based decisions on public resolvers can be relatively reliably avoided. This was the environment where I noticed a script of mine apparently worked with an URIBL, but never blocked anything, despite conducting the sanity checks mentioned in RFC 5782, section 5. Since I was completely unaware of (strict) QNAME minimisation on the infrastructures' resolver, it took quite a while to figure things out. While I certainly appreciate the patch by @ammammita, the URIBL sanity test(s) in RFC 5782 should be changed to a more realistic query (perhaps for example.com) - but that is out of scope for this issue. While people or operating systems using an experimental DNS feature are somewhat to blame as well, rbldnsd's behaviour seems to be the root cause for this. (No offense intended, though. :-) ) It would be nice to see selective QNAME minimisation settings possible in Unbound, by the way. At the moment, they have not implemented that, and it's probably hard to change.
Agreed, and it unfortunately looks like my code is interpreting empty NOERROR replies as a listing. :-/ Thanks for bringing this aspect up.
Absolutely. This is something I did not have in mind, and unfortunately, I cannot think of an elegant, regression-free solution to this. |
On 4/1/2021 2:47 PM, twesterhever wrote:
Hi all. Sorry for my tardy reply.
It might be perhaps useful to provide more information on the setup
where I first bumped into this: Infrastructures querying public DNS
resolvers usually quickly exceed query rate limits on common DNSBLs - if
not even put off with a "you are querying our DNSBL via a public
resolver" answer entirely.
I believe that this touches on one of RFC7816's (QNAME minimisation)
biggest problems: it does not address the wishes and/or policies of the
site you are pushing DNS queries to. I have questions regarding the RFC
in the matter, and testing with Unbound's implementation suggests that
RFC7816 _needs_ an errata document published. Such errata should cover
caching on empty responses (i.e. RFC2308 _must_ be implemented) and a
method for the authoritative NS to explicitly opt out of such silliness.
The latter actually appears to work currently (at least from my testing)
with issuing NXDOMAIN and the caching resolver switching to the full
QNAME.
.. but then again, my testing may have had flaws.
In such cases, I frequently observed DNS forwarding setups for common
RBL zones, directed against their nameservers directly. This way, rate
limits or policy-based decisions on public resolvers can be relatively
reliably avoided.
This was the environment where I noticed a script of mine
<https://github.com/twesterhever/squid-dnsbl> apparently worked with an
URIBL, but never blocked anything, despite conducting the sanity checks
mentioned in RFC 5782, section 5. Since I was completely unaware of
(strict) QNAME minimisation on the infrastructures' resolver, it took
quite a while to figure things out.
It is somewhat worse than what I suspect that you've seen. Because
these "empty non-terminals" are _empty_ they do not provide TTL data,
the RFC does not address this at all and my testing with Unbound
suggests that they are not cached. In other words, it appeared that a
query for 2.2.0.192.dnsbl.example.com would result in five queries
against the NS server for the zone 'dnsbl.example.com' to retrieve its
answer, but an immediate subsequent request for
200.2.0.192.dnsbl.example.com still has to issue five queries against
the zone as the return (null) values for 'dnsbl.example.com',
'192.dnsbl.example.com', '0.192.dnsbl.example.com', and
'2.0.192.dnsbl.example.com' were not cached.
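The query counts in the paragraph above can be reproduced with a small Python sketch. It only enumerates the names a QNAME-minimising resolver would ask the zone's name server for, assuming one query per label starting at the zone apex; `minimized_queries` is a hypothetical helper, and the sketch says nothing about how Unbound actually orders or caches its queries.

```python
def minimized_queries(qname, zone):
    """List the names queried when walking from the zone apex down to qname."""
    q = qname.rstrip(".").split(".")
    z = zone.rstrip(".").split(".")
    if q[-len(z):] != z:
        raise ValueError("qname is not inside zone")
    extra = q[:-len(z)]            # labels below the zone apex
    names = [".".join(z)]          # first query: the apex itself
    for i in range(len(extra) - 1, -1, -1):
        names.append(".".join(extra[i:] + z))
    return names

first = minimized_queries("2.2.0.192.dnsbl.example.com", "dnsbl.example.com")
second = minimized_queries("200.2.0.192.dnsbl.example.com", "dnsbl.example.com")
print(len(first), len(second))                 # queries per lookup
print(sorted(set(first) & set(second)))        # names that a cache could have reused
```

The intersection printed at the end is exactly the set of intermediate names that would not need to be asked again if the empty answers were cached.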
While I certainly appreciate the patch by @ammammita
<https://github.com/ammammita>, the URIBL sanity test(s) in RFC 5782
should be changed to a more realistic query (perhaps for |example.com|)
- but that is out of scope for this issue. While people or operating
systems using an experimental DNS feature are somewhat to blame as well,
|rbldnsd|s behaviour seems to be the root cause for this. (No offense
intended, though. :-) )
My understanding from your original statement is that you appeared to
have been ACLed for over-querying. If this is the case, then I do not see
the latest patch helping in that regard... In fact it might actually
hurt: by not returning NXDOMAIN and allowing a QNAME
minimization enabled server to walk the full tree, the remote system
will log that many more queries.
It would be nice to see selective QNAME minimisation settings possible
in Unbound, by the way. At the moment, they have not implemented that,
and it's probably hard to change.
3. i don'[t] expect regressions when rbldnsd is used through a
resolver. It will cause regressions when queried directly and
the code interprets NOERROR as a successful listing.
Agreed, and it unfortunately looks like my code /is/ interpreting empty
|NOERROR| replies as a listing
<https://github.com/twesterhever/squid-dnsbl/blob/master/dnsbl-ip.py#L205-L217>.
:-/ Thanks for bringing this aspect up.
Please also quantify the responses received to specific answers. Should
a DNSBL start sending flagged administrative codes, or should the domain
slip out of registration and be picked up by a domainer wild-carding
everything in the domain, then I would believe that things could get ugly.
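A client-side check along the lines discussed in this thread might look like the following Python sketch. It is an assumption-laden illustration, not code from squid-dnsbl or any other client: `classify` is a hypothetical function, the rcode is passed in as a plain string, and the only rule applied to answers is the DNSBL convention that listing records point into 127.0.0.0/8.

```python
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")

def classify(rcode, a_records):
    """Map a DNSBL reply to 'listed', 'not-listed', 'suspect' or 'error'."""
    if rcode == "NXDOMAIN":
        return "not-listed"
    if rcode != "NOERROR":
        return "error"              # e.g. a server failure or a refused query
    if not a_records:
        return "not-listed"         # empty NOERROR (NODATA) is not a listing
    if all(ipaddress.ip_address(a) in LOOPBACK for a in a_records):
        return "listed"
    # An address outside 127.0.0.0/8 hints at a wildcarded or hijacked zone.
    return "suspect"

print(classify("NOERROR", []))
print(classify("NOERROR", ["127.0.0.2"]))
```

Individual DNSBLs also reserve some values inside 127.0.0.0/8 for administrative messages, so a production client would need a per-list table on top of this.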
4. When querying ipv6 addresses, up to 32 queries could be needed
to obtain the proper final response. This is a waste of resources.
Absolutely. This is something I did not have in mind, and unfortunately,
I cannot think of an elegant, regression-free solution to this.
Actually, I can; see errata statement above where that NXDOMAIN _should_
be viewed as opting out of the silliness. ;)
|
Hi there, Thanks for the updates to rbldnsd regarding this, when testing I've come across a few quirks, I'm hoping you can shed some light as to what I'm doing wrong. With the following setup things seem to work fine - (initially setting Filename: test_ip
Startup:
Dig Outcome
However I prefer to have my $SOA/$NS in a generic file with other records: Filename: test_generic
With the SOA line omitted from Startup:
I now notice that unless
I also notice that trying to look up the "@" record causes rbldnsd to exit:
Upon adding a third file - Filename: test_dn
Startup:
Queries for "com.bl.test.com" return NXDOMAIN, which doesn't seem to match the expected outcome from this patch. Queries for the domains as listed seem to return fine. Interestingly, if I start up rbldnsd without the other files (./rbldnsd -b 127.0.0.1 -4 -t 600 bl.test.com:dnset:test_dn) I still get NXDOMAIN for "com.bl.test.com". Hoping you can help look into these issues, please feel free to ask for clarification on anything. |
Spotted another weird thing, which might be down to defining different dataset types on the same domain... Using similar files to above File: test_dn
On stopping rbldnsd I get the following -
However if I add the additional "generic" file: Filename test_generic
Startup -
I get an answer from dig, however it seems to be marked "NXDOMAIN"
and the log from rbldnsd -
Note the OK=0/NXD=1 |
I just became aware of this discussion and I want to provide a different angle on query name minimization, as seen from the DNS world. (I'm a DNS software developer, formerly working on Knot Resolver and now working on BIND and various other DNS tools.) I can perfectly understand that if your data structures provide only an exact-match operation, returning a proper NXDOMAIN is hard. If that's the case, then the best DNS-protocol-compliant answer is the so-called "NODATA", i.e. RCODE=NOERROR + empty ANSWER section. You can put a SOA RR into the AUTHORITY section of such an answer to make it cacheable the same way as you would with RCODE=NXDOMAIN. This approach is compliant with the DNS spec and allows efficient caching. You could even put larger TTL on response SOAs higher in the tree so top-level nodes like A side-note:
I understand this gets convoluted quickly. I offer help with clarifying this further. |
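The NODATA-plus-SOA suggestion above can be sketched in Python as follows. The `SOA` and `Response` classes are hypothetical simplifications rather than any real DNS library's types; the one rule taken from the specifications is RFC 2308's negative-caching TTL, which is the smaller of the SOA record's own TTL and its MINIMUM field.

```python
from dataclasses import dataclass, field

@dataclass
class SOA:
    owner: str
    ttl: int        # TTL of the SOA record as sent in the response
    minimum: int    # SOA MINIMUM field

@dataclass
class Response:
    rcode: str
    answer: list = field(default_factory=list)
    authority: list = field(default_factory=list)

def nodata_response(soa):
    """NOERROR with an empty ANSWER section and the zone SOA in AUTHORITY."""
    return Response(rcode="NOERROR", authority=[soa])

def negative_ttl(resp):
    """Seconds a resolver may cache a negative answer, or None if it cannot."""
    soas = [rr for rr in resp.authority if isinstance(rr, SOA)]
    if resp.answer or not soas:
        return None
    return min(soas[0].ttl, soas[0].minimum)

resp = nodata_response(SOA(owner="bl.test.com", ttl=600, minimum=300))
print(negative_ttl(resp))
```

Without the SOA in the AUTHORITY section, `negative_ttl` returns `None`, which corresponds to the uncached behaviour reported earlier in the thread.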
Quoted from RFC 7816, section 3:
It seems like rbldnsd shows exactly the same behaviour:
As mentioned above, the latter one must return NODATA instead of NXDOMAIN, as some data below 127.zen.spamhaus.org is indeed listed in the RBL. The current behaviour was found to render applications behind resolvers using strict QNAME minimization (where no fallbacks using the FQDN queried by the client in the first place happen) unusable, as the resolver stops after having received NXDOMAIN for the first ENT.
Worse, as RFC 5782, section 5, does not specify testing entries for URIBLs below the first hierarchy (such as dbltest.com.dbl.spamhaus.org), it is impossible to determine whether a URIBL is actually usable or not, as test.dbl.spamhaus.org will always return NOERROR, while more realistic queries like example.com.dbl.spamhaus.org will silently fail as com.dbl.spamhaus.org returns NXDOMAIN instead of NODATA.
As far as I am concerned, RFC 7816 requires rbldnsd to return NODATA for Empty Non-Terminals. In my humble opinion, its current behaviour is RFC-ignorant.
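The failure mode described in this report can be reproduced with a toy Python model. Everything in it is hypothetical: the zone `dnsbl.example.org`, the single listed name, and the `authoritative` and `strict_resolve` functions stand in for a DNSBL server and for a resolver doing strict QNAME minimisation with no fallback to the full name.

```python
ZONE = "dnsbl.example.org"
LISTED = {"example.com." + ZONE}   # one listed domain, for illustration only

def authoritative(name, ent_aware):
    """Answer one query; ent_aware toggles NODATA for empty non-terminals."""
    if name in LISTED:
        return "NOERROR"
    if ent_aware and any(entry.endswith("." + name) for entry in LISTED):
        return "NODATA"            # something is listed below this name
    return "NXDOMAIN"

def strict_resolve(qname, ent_aware):
    """Walk the labels below the zone one at a time, stopping at NXDOMAIN."""
    labels = qname[: -len(ZONE) - 1].split(".")
    rcode = "NXDOMAIN"
    for i in range(len(labels) - 1, -1, -1):
        name = ".".join(labels[i:]) + "." + ZONE
        rcode = authoritative(name, ent_aware)
        if rcode == "NXDOMAIN":
            return rcode           # strict mode: nothing can exist below
    return rcode

print(strict_resolve("example.com." + ZONE, ent_aware=False))
print(strict_resolve("example.com." + ZONE, ent_aware=True))
```

With `ent_aware=False` the walk ends at `com.dnsbl.example.org`, so the listed name is never queried; with `ent_aware=True` the resolver continues down to it.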