Posted by bert hubert
Sun, 01 Oct 2006 12:47:00 GMT
Quick post to say that at RIPE 53, I’ll be presenting about the PowerDNS Recursor and specifically its implementation of my Internet-Draft (“Draft RFC”).
More details in this post.
If you are at RIPE, come and say hello, or have an excellent Krasnapolsky lunch with me!
Long-standing bugs
Over the past few weeks some very long-standing “low level irritation” PowerDNS bugs have been fixed. One of the things you learn during the maturation of software projects is that things are good once you start to get reports of obscure bugs, as this means that the big problems are out of the way!
Predictably, the bugs were related to the handling of rare errors, which reinforces my belief that the handling of rare errors tends to be very buggy: these paths rarely get exercised, and when they do, people often don’t even notice that the problem lies more in the handling than in the error itself.
Don’t try to be too smart when dealing with errors!
Posted in Linux, PowerDNS, Netherlabs | 8 comments
Posted by bert hubert
Thu, 21 Sep 2006 21:16:00 GMT
Quick update on some small things.
PowerDNS
I managed to release PowerDNS Recursor 3.1.3, which must rank as one of the most successful releases of PowerDNS ever, as I have had zero feedback despite a large number of downloads. Most big deployments have switched over. There is still a very small trickle of odd crashes, but they are so rare that it is hard to pin them down to anything.
Wireless
Our new house has a lot going for it, except wiring possibilities. It might be possible to improve this, but right now I want nothing but the best and I’m not prepared to soil my house with badly laid cables. So it has to be wireless, which for fixed computers mostly means USB. After some searching and experimenting, I can report that zd1211 derived devices work really well using the Linux zd1211rw driver. Wireless reception depends a lot on RF conditions, and having a USB receiver on a cable means you can move it around for the best reception.
The nice thing about the ZD1211 derived devices (I have two 3Com OfficeConnect adaptors) is that the authors of the driver are very approachable and work well with (and are in fact part of) the Linux kernel community. Unlike some.
New house
It still rocks, although we haven’t had much time to empty the last boxes and buy furniture that matches the quality of the house. Sadly, we are spending a lot of time in the hospital and taking care of related things.
Posted in Linux, PowerDNS, Netherlabs, Life | 9 comments
Posted by bert hubert
Fri, 08 Sep 2006 13:36:00 GMT
Programming is a lot of things. One of them is writing good error messages. We tend to think that errors are rare and this should of course be so. However, sometimes they are not, and in that case, good reporting is vital to quickly resolve the problem.
So, even though we should make sure errors do not happen, if they do, our error messages should be top notch.
About error messages
Purpose
For operators, they are vital aids in configuring software.
For system administrators, they show which external problems need to be resolved (disk full, network down, etc).
Should a program crash, the authors often only have error messages as clues to why this happened. Many crashes are preceded by errors that are reported. A good error can help generate a bug fix.
Taxonomy of error messages
- Configuration problems, for example, looking for a file in directory A while it resides in directory B;
- Unavailable resources (disk full, out of memory), connectivity problems;
- Should Never Happen.
Configuration problems
These commonly occur while software is being installed and set up. Good error reporting is of utmost importance here, as it serves two purposes:
1. Educating the operator about how the program functions;
2. Explaining what needs to be fixed.
Ad 1, an error like “Can’t start frobnicator because the discombulator is not running” teaches the operator that a frobnicator needs a discombulator. This knowledge can of course also be gleaned from the documentation, but in this case, repetition is a good thing.
Compare this error to “Can’t start process: Connection refused”, for example, and think about how helpful that is.
Ad 2, a report like “Can’t connect to product database using connection string ‘dbuser=john, dbname=changeme’: No such database” immediately tells the operator what he needs to know.
Unavailable resources
These typically occur while a program is already running and installed, but are nonetheless important. Do not log ‘Disk full’, but report ‘Disk full writing to ….’ so the operator knows which disk filled up. If a server could not be reached, log the IP address and possibly the name of the server. Any discrepancy between the two may point out a DNS configuration error.
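As a minimal sketch of the “Disk full writing to ….” advice, here is how it might look in Python, where an `OSError` already carries both the path and a readable error string. The helper name `describe_os_error`, the program name “spooler” and the file names are made up for illustration.

```python
import os
import tempfile


def describe_os_error(program, action, exc):
    """Turn an OSError into a message naming the program, action, and path."""
    # exc.filename is the path the failing call was given;
    # exc.strerror is the readable form of exc.errno.
    return f"{program}: {action} '{exc.filename}': {exc.strerror}"


# Hypothetical scenario: write into a directory that does not exist.
base = tempfile.mkdtemp()
target = os.path.join(base, "missing-subdir", "spool.dat")

try:
    with open(target, "w") as handle:
        handle.write("data")
except OSError as exc:
    report = describe_os_error("spooler", "error writing to", exc)
    print(report)
```

The same handler covers a full disk, a missing directory or a permission problem, and in each case the operator sees which path was involved.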
Out of memory is typically very hard to deal with, except when something really odd was going on. A typical example is trying to allocate a ridiculous amount of memory because of an earlier error, in which case logging what memory was being allocated for might help debug the problem. It typically will not help the user of a program.
Should never happen
These are errors of the program itself. Programmers quite often test for impossible conditions, especially if they sense these might conceivably happen one day. An example might be a module that can only connect to IPv4 servers that gets confronted by an IPv6 socket, which in turn causes calls to determine the IPv4 remote address to fail.
It is tempting to be quite rude in these messages, or say stuff like “should never happen!!”, but resist these urges. One day a ‘should not happen’ error is going to be a vital debugging clue. These errors are rare, but it pays to go, well, the extra few yards to perhaps report “unexpected address family accepting connection!”.
Guidelines
An error message should contain:
1. Who is reporting the error (which subsystem, which program, which module)
2. The action that failed
3. The subject of the action (a directory, a server, a port number, an IP address)
4. As good an indication of the actual error as possible
5. Optional - what the program is doing about it
An excellent error message is:
Webserver can't serve page, error opening file '/var/www/index.html': Permission denied, reporting HTTP 404 error
Ad 1, it is tempting to include filenames, function names, and line numbers here. OpenSSL does this a lot, for example. However, almost none of your intended audience will be able to extract useful information from the fact that the error occurred on line 123 of ‘multiplexer.c’. Make sure the module means something to the operator. It may be as simple as the name of your program.
Ad 2, this helps the operator determine if this error might be the explanation of observed problems. An error like “Webserver failed to increase TCP buffer size, continuing with default” can immediately be ruled out as an explanation for why people can’t log in to their mail.
Ad 3, an error like “Can’t open file” on its own can mean many things. One of which might be that it is not reading the configuration file you think it is, and trying to open your index.html in ‘/usr/local/www’, and not in ‘/var/www’ like you thought you configured.
Ad 4, self explanatory. Take the trouble to convert error codes into strings. Many programmers may know that ‘errno = 111’ means ‘Connection refused’, but don’t count on your users knowing this.
Ad 5, this is a fine counterpoint to item 4. “Giving a 404” is very clear for any operator of a web server. Not all errors need a followup, so reporting what the program is doing about the error is not mandatory.
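To make the five parts concrete, here is a small Python sketch that assembles a message from them. The function `format_error` and its parameter names are my own invention for this illustration; only `os.strerror()` and the `errno` constants come from the standard library.

```python
import errno
import os


def format_error(who, action, subject, err, followup=None):
    """Build a message from the five parts listed in the guidelines.

    `err` is an errno value; os.strerror() converts it to readable text,
    so the operator never has to look up a bare number.
    """
    message = f"{who} {action} '{subject}': {os.strerror(err)}"
    if followup:
        # Part 5 is optional: not every error needs a follow-up.
        message += f", {followup}"
    return message


msg = format_error(
    who="Webserver",
    action="can't serve page, error opening file",
    subject="/var/www/index.html",
    err=errno.EACCES,
    followup="reporting HTTP 404 error",
)
print(msg)
```

Forcing every call site to supply the reporter, the action and the subject makes it hard to log a bare “Can’t open file” by accident.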
Conclusion
Good error messages can save your users many days of problems. And, surprisingly, you might yourself gain even more time - how well do you know the internals of your program after a few months?
So please please, both as a user and a programmer, I ask you, spend time on error messages!
Posted in Linux, PowerDNS, Netherlabs | 12 comments
Posted by bert hubert
Mon, 14 Aug 2006 16:46:00 GMT
As previously noted, Sun is making a SunFire T2000 server available permanently for PowerDNS development, which should be good for all PowerDNS users, and probably for Sun as well. With a big PowerDNS user we are currently investigating an interaction between PowerDNS, Solaris and its Completion Ports, which may turn out not to be a PowerDNS bug. So everybody wins.
The server is arriving tomorrow at the PowerDNS offices, and we hope to have it up and running shortly.
Ok, some more spiffy ‘before and after’ graphs, this time from a Solaris 10 user:
The lower graph lists the number of queries per 5 minutes. It shows that just before and after the maintenance period (the white bit around Mon), the number of processed queries went up substantially.
The upper graph is a plot of the load average of the server in question, which can be seen to drop visibly after this period. It is probably best to concentrate on Friday vs Wednesday. Friday, which is non-PowerDNS, did 200k queries in 5 minutes at peak, at a load of 1.75.
The next Wednesday, we see a peak of 300k queries in five minutes, with a load of 0.6 at peak.
If we combine these numbers, we see the efficiency (queries divided by cpu load) go up by a factor of 4. It should be noted that this is a dual CPU machine, which explains why the load can exceed 1 when running a single name server.
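The arithmetic behind that factor, spelled out in Python using only the peak numbers quoted above (the variable names are mine):

```python
# Peak figures read off the graphs, per 5-minute interval.
before_queries, before_load = 200_000, 1.75   # Friday, non-PowerDNS
after_queries, after_load = 300_000, 0.6      # Wednesday, PowerDNS

# Efficiency as defined in the text: queries divided by load average.
before_efficiency = before_queries / before_load
after_efficiency = after_queries / after_load

ratio = after_efficiency / before_efficiency
print(f"efficiency gain: {ratio:.2f}x")
```

So the gain works out to a little over 4x. Load average is only a rough stand-in for CPU use, so treat this as an indication rather than a benchmark.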
Thanks to Jan Gyselinck for these graphs.
Posted in Linux, PowerDNS, Netherlabs | no comments
Posted by bert hubert
Wed, 09 Aug 2006 21:04:00 GMT
(Update: I’ve upgraded my Ruby on Rails, thanks for warning me! See here)
Well, big news, we’ve decided PowerDNS needs a new homepage, and that it needs to tell you why you should run PowerDNS. All pretty obvious of course, but it took us some time to realise PowerDNS use is spreading purely based on word of mouth, and not because we promote it so well (which we don’t).
The main page currently projects a sort of post-dotbomb shareware image. The wiki is fine as it goes, but only suitable for hardcore developers. And finally, the documentation contains lots of gems on how to best use PowerDNS, but it is all very spread out.
So, until we have our new homepage, some promotion. A large PowerDNS deployment is set to make 120 servers redundant. In energy costs alone this should save around 100kW, continuously (Update: ok, perhaps a bit less. Allow me some artistic license here. If you include cooling, it is not that far off.). For reference, that kind of power requires four of these to generate:
We might as well follow Sun’s lead and rename PowerDNS ‘The most Eco Friendly Nameserver’. EcoDNS has a nice ring to it.
Some more promotion. Switching to PowerDNS does not just save the environment, it also makes your mail go faster. A happy PowerDNS user sent us this graph:

This says, in Dutch, “average mail delivery time, in seconds”. Note the dramatic shift very early Thursday morning, from around 1.8 seconds to 0.8, and later around 0.65-ish.
The almost threefold speedup happened immediately after the switch to PowerDNS. This makes some kind of sense: with the massive amounts of spam these days, a mail server can spend an awful amount of time trying to resolve strange sender addresses, and traversing often very bad or weird reverse delegations. Spammers have also been known to try to make their DNS so misconfigured that DNS-based filtering attempts fail.
No matter what the exact cause is, there is a nearly threefold speedup. Made all the more spectacular by the non-zero based graph!
We’ll try to move the hype from here to serious white papers on the new homepage. But it feels good to share some of the improvements people are achieving by switching to PowerDNS.
Posted in Linux, PowerDNS, Netherlabs | no comments
Posted by bert hubert
Wed, 02 Aug 2006 21:54:00 GMT
Quick note. If you’ve sensed a disturbance in the force, it is because tens of millions of internet connections moved to PowerDNS just now.
I still have goosebumps.
Posted in Linux, PowerDNS, Netherlabs | 1 comment
Posted by bert hubert
Wed, 24 May 2006 04:41:00 GMT
PowerDNS 3.1 turned out to contain a brown paper bag bug that in retrospect should not hit too many people, but still. So I rushed out 3.1.1, which always leaves me with a bad feeling.
Furthermore, I’m off to Egypt for two weeks. While other people do work on PowerDNS, development will come to a nearly complete halt.
So here’s to hoping that 3.1.1 fixed more bugs than it caused..
See you in two weeks!
Posted in Linux, PowerDNS, Netherlabs, Life | 4 comments | no trackbacks
Posted by bert hubert
Fri, 12 May 2006 22:08:00 GMT
Talk about embarrassing. You may know I’m busy working on a draft-draft RFC (it becomes an ‘Internet-Draft’ once submitted) about making DNS safer through some implementation and operational guidelines (see dns-anti-spoofing.html and dns-anti-spoofing.txt).
While writing this document, I decided to add a section on the ‘birthday paradox attack’. Reported in 2002, this is a curious mathematical phenomenon that makes spoofing a nameserver vastly easier to do.
So I wrote down the specification:
Given the above, a recursor MUST:
* Use a new random source port from its available
  range for each outgoing query
* Make full use of all 16 bits of the ID field
* Assure that its choices of port and ID cannot
  be predicted by an attacker having knowledge of
  its (pseudo-)random generator
* Take measures to prevent having multiple equivalent
  questions outstanding to any authoritative server
Which is all fine. Except that PowerDNS did not adhere to the bit about equivalent outstanding questions! PowerDNS contains a general system that prevents heaps of identical queries from leaving the server, but that doesn’t translate well into ‘standardese’, you’d get something like ‘recursors MUST have a system that sort of prevents most of the identical queries’.
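To get a feel for why equivalent outstanding questions matter, here is a rough back-of-the-envelope calculation in Python. It is a simplified model, not taken from the draft: it assumes the source port is fixed and known, that all outstanding questions carry distinct IDs, and that each spoofed packet carries an independent, uniformly random 16-bit ID. The function name is mine.

```python
ID_SPACE = 2 ** 16  # number of possible 16-bit query IDs


def spoof_probability(outstanding, spoofed_packets):
    """Chance that at least one spoofed packet matches an outstanding ID.

    Assumes a fixed, known source port, `outstanding` distinct IDs in flight,
    and each spoofed packet carrying an independent, uniformly random ID.
    """
    miss_one = 1 - outstanding / ID_SPACE   # one packet matches nothing
    return 1 - miss_one ** spoofed_packets  # not every packet misses


single = spoof_probability(outstanding=1, spoofed_packets=700)
many = spoof_probability(outstanding=700, spoofed_packets=700)
print(f"1 outstanding question:    {single:.4f}")
print(f"700 outstanding questions: {many:.4f}")
```

Under these assumptions, 700 spoofed packets against a single outstanding question succeed about 1% of the time, while the same 700 packets against 700 equivalent outstanding questions succeed more than 99% of the time. Randomising the source port enlarges the space an attacker has to guess, which is why the first requirement matters as well.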
So, I added ‘query-chaining’ to PowerDNS, which detects this situation and puts an MThread to sleep when it tries to send out a duplicate question. When the answer to the initial question arrives, they all get woken up.
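The idea can be sketched in a few lines of Python. This is not the PowerDNS code, which is C++ and built on MThreads; it is a hypothetical callback-based illustration, and the class name `QueryChainer` and the shape of the question tuple are assumptions of mine.

```python
class QueryChainer:
    """Send each distinct question once; later askers wait for that answer."""

    def __init__(self, send):
        self._send = send      # callable that actually transmits a question
        self._waiters = {}     # question -> callbacks awaiting its answer

    def ask(self, question, on_answer):
        """Return True if a query was sent, False if it was chained."""
        if question in self._waiters:
            # An equivalent question is already outstanding: just wait for it.
            self._waiters[question].append(on_answer)
            return False
        self._waiters[question] = [on_answer]
        self._send(question)
        return True

    def answer(self, question, response):
        """Deliver a response to everyone waiting and forget the question."""
        for on_answer in self._waiters.pop(question, []):
            on_answer(response)


sent = []
results = []
chainer = QueryChainer(send=sent.append)
question = ("www.example.com.", "A", "192.0.2.53")  # name, type, server

first = chainer.ask(question, results.append)    # goes out on the wire
second = chainer.ask(question, results.append)   # chained, nothing sent
chainer.answer(question, "192.0.2.1")            # wakes both waiters
```

Only one packet leaves for the two identical questions, so at most one ID is outstanding for that question at any time.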
Due to the throttling code already in place, and the source port randomisation, this does not improve our security significantly, but at least I’m now in compliance with my own draft-draft RFC :-)
Code is linked here.
Posted in PowerDNS, Netherlabs | 1 comment | no trackbacks
Posted by bert hubert
Tue, 09 May 2006 20:15:00 GMT
I’ve long been a somewhat active member of the relevant DNS mailing lists, ‘namedroppers’ and ‘dnsop’, both affiliated with the IETF DNS workgroups.
I consider myself a bit of an outcast in the DNS community as I don’t sing the praises of DNSSEC or BIND, but I suspect this is not entirely fair as there are quite a number of people who are far more outcast than I am. So I suspect I’m on the fringe of the DNS community in the sense that I occasionally take part in useful email discussion, either on list or privately with relevant parties.
I recently called upon nameserver authors and operators to either upgrade their nameserver so it performs adequate anti-spoofing measures, or switch to a nameserver implementation that does (like tinydns or of course PowerDNS).
This call fell on very deaf ears, it appears. The BIND people promised to look into it but, as noted then, without an apparent sense of urgency. Not a lot has happened since, except that I’ve reiterated my recommendation privately to a number of relevant people.
In the meantime, I’ve been told the Microsoft nameserver is about 4 times easier to spoof than BIND, but I’ve been unable to verify this.
So, I did what I never thought I’d do, I wrote something intended to be an RFC. In short, this RFC specifies that a recursor MUST implement adequate anti-spoofing measures, and details what this entails.
Read all about it as old school text or rendered as pretty HTML. The RFC-compliant output is made possible by the interesting but quirky tool xml2rfc.
I’ll spend some more time polishing the document before submitting it as an Internet Draft. I also need to figure out the correct procedure to set things in motion.
I sincerely hope nameservers that are easy to spoof clean up their act quickly, hopefully even before my draft hits the standards track.
Posted in PowerDNS, Netherlabs | 12 comments | no trackbacks
Posted by bert hubert
Sat, 06 May 2006 21:58:00 GMT
Welcome back after this 9-day hiatus from my Blog!
Ok, what has happened. I had two good experiences with local electronics stores here in Delft. Goris was unable to provide me with the proper cable to hook up my shiny new WiFi directional antenna, but they referred me to HEC, which did have the components to make the cable. My skills with the soldering iron are humorous at best. However the people at HEC kindly offered to make the cable for me! So now I finally have a working combination of antenna, cable and adapter. And to make things perfect, Goris allowed me to test my new WiFi card to verify Linux compatibility. Luckily it all works. I hope to hook up pahu tomorrow.
Slight damper on today is that I was fined for riding my bicycle through a street here in Delft that turned out to be for pedestrians only. 30 euros too. I normally am all in favour of the rule of law but this makes little sense. It is fortunate therefore that the actual fine contained a number of errors which I am sure invalidate it, so I wasted no time in drafting a written protest. I’m not usually like this but I was pissed off at the inanity of this fine.
PowerDNS & Windows
As staunch a supporter as I am of Open Source, my technology wants to go places. So, I downloaded the ‘free’ version of Visual Studio Express 2005 from Microsoft. And a fine compiler it is! I had fixed a bunch of initial incompatibilities using the (also fine) Minimalist GNU for Windows (MinGW). I think this is the first Microsoft C++ compiler that can really be taken seriously. VC++ debugging mode found two real bugs in PowerDNS, which motivated me to turn on the ‘debugging mode’ of the G++ libstdc++ as well, which uncovered two further bugs!
This strengthens my feelings that porting to different platforms helps uncover bugs which aren’t (yet) a problem but might be.
Ahu’s quick guide to porting to windows:
- Use VC++ 2005, earlier versions have a lot more problems with constructions g++ accepts. It also appears that VC++ 2005 is smart with respect to UNIX/DOS line endings.
- Separate the really different things to different files, which share one header file. Don’t make #ifdef soup!
- Make a single include file that includes OS-dependent include files (like windows.h).
- On windows, one can only write and read from sockets using send(to) and recv(from). As these functions work for UNIX as well, use these functions exclusively on sockets.
- To close a socket under windows, you need closesocket() and not close. Candidate for the file mentioned under 1.
- Windows has different errno traditions. All network (‘winsock’) related errors need WSAGetLastError(). See here.
- Use ‘Tortoise’ Subversion for revision control, integrates really well with both Windows and UNIX. Also smart about line endings.
- If, as for me, your prime development platform is UNIX, install the MinGW cross-compiler so you can easily verify the code at least compiles for Windows. This helps prevent code-rot at an early stage.
- Get a Windows buddy :-) Many thanks to Michel Stol, who is far more at home in Windows than I am.
PowerDNS 3.1
I hope to release PowerDNS 3.1 shortly, and make things settle down a bit then. Since the previous blog post, I added full blown IPv6 outgoing support, with IPv6 achieving full parity - any IPv6 nameservers that are faster than their IPv4 partners will receive more queries.
The ‘--export-etc-hosts’ stuff also works fine now, which should allow many networks to simply run unconfigured, save for that option, and have everything Just Work.
For more, see here.
Posted in Linux, PowerDNS, Netherlabs, Life | 3 comments | no trackbacks