Posted by bert hubert
Mon, 01 Jan 2007 15:58:00 GMT
I wish everybody a very good 2007! For PowerDNS, 2006 certainly was a very good year.
In some (large) places, the Recursor now commands a 40% market share, while the authoritative server is also expanding its user base around the world, with multi-million domain deployments now no longer as newsworthy as they once were.
The Chaos Computer Club held its annual congress last week, and they chose the PowerDNS Recursor to provide the DNS service to go with their 10 gigabit connection. I’m pleased to report that the PowerDNS process was fired up only once, and that it held steady for the entire congress, with no complaints. This would usually not be that strange, but the CCC clientèle are among the most critical internet users to be found on the planet.
Many thanks to Stefan Schmidt and other CCC admins for their vote of confidence!
I’m working on understanding ‘Ruby on Rails’, which will probably end up as a HOWTO aimed at seasoned programmers. The internet abounds with “you won’t believe how easy Ruby on Rails is” demonstrations, but the hard truth is that below the surface, a lot of magic is happening. The kind of magic the discerning programmer wants to grasp so as to make the most of it.
A very small start to this HOWTO can be found here.
It may also allow experienced programmers to teach themselves Ruby in less time than it would take them to read a 750 page book.
Posted by bert hubert
Thu, 14 Dec 2006 21:21:00 GMT
After PowerDNS 3.1.4 turned out to be boringly stable, fixing all reported crashes, I decided it was time to do the massive speedup I’d been promising people for some time.
With some help from my friends over at #offtopic2, I was able to use the TSC register of my CPU to measure down to the nanosecond how much time things were taking within PowerDNS. Previously I’d concentrated on profiling macro performance, but nanosecond resolution allows one to study fully how much time is spent within each function.
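A minimal sketch of this kind of measurement, not the code used in PowerDNS. It assumes GCC or Clang on x86 for the `__rdtsc()` intrinsic and falls back to a monotonic clock elsewhere. The names `readTicks`, `workUnderTest` and `ticksPerCall` are made up for illustration.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

// Read a cheap, high-resolution tick counter.
static inline std::uint64_t readTicks()
{
#if defined(__x86_64__) || defined(__i386__)
  return __rdtsc();  // x86 time stamp counter
#else
  // Fallback for other CPUs: nanoseconds from a monotonic clock.
  return static_cast<std::uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(
          std::chrono::steady_clock::now().time_since_epoch()).count());
#endif
}

// Hypothetical stand-in for the function being measured.
static int workUnderTest()
{
  char buf[64];
  return std::snprintf(buf, sizeof buf, "%s:%d", "127.0.0.1", 53);
}

// Average number of ticks per call over `rounds` calls (rounds must be > 0).
static std::uint64_t ticksPerCall(int rounds)
{
  volatile int sink = 0;  // keeps the compiler from optimising the calls away
  const std::uint64_t start = readTicks();
  for (int i = 0; i < rounds; ++i)
    sink = sink + workUnderTest();
  const std::uint64_t stop = readTicks();
  return (stop - start) / static_cast<std::uint64_t>(rounds);
}
```

On x86 the result is in TSC ticks, not nanoseconds. Turning ticks into time requires dividing by the tick frequency, which depends on the CPU.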
Using this technique, it became apparent we take a whopping 60 microseconds to answer even the most basic of questions. We make up for this by being pretty fast at complicated questions. But 60 microseconds mean we are limited to about 15000 questions/second, max.
First I started shaving microseconds. It turns out snprintf is truly slow, taking up to 5 microseconds for some strings. Additionally, we wasted a lot of time on needlessly copying
The unsurpassed Boost::Multi_Index container has a spectacular feature, called ‘compatible keys’, which means we can look up answers using a question key that is a bare piece of memory instead of a proper std::string. This again saved a few microseconds.
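The same idea can be sketched with the standard library alone. This is not Boost code: it uses the C++14 heterogeneous lookup of `std::set` as a stand-in for ‘compatible keys’. `RawKey`, `NameLess`, `NameCache` and `cached` are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>
#include <set>
#include <string>

// A bare piece of memory used as a lookup key: no allocation, no copy.
struct RawKey
{
  const char* data;
  std::size_t len;
};

// Orders std::string against std::string and against RawKey, byte by byte.
struct NameLess
{
  using is_transparent = void;  // enables find() with a non-std::string key

  static int cmp(const char* a, std::size_t alen, const char* b, std::size_t blen)
  {
    const std::size_t n = alen < blen ? alen : blen;
    const int c = n ? std::memcmp(a, b, n) : 0;
    if (c != 0)
      return c;
    if (alen == blen)
      return 0;
    return alen < blen ? -1 : 1;
  }
  bool operator()(const std::string& a, const std::string& b) const
  { return cmp(a.data(), a.size(), b.data(), b.size()) < 0; }
  bool operator()(const std::string& a, const RawKey& b) const
  { return cmp(a.data(), a.size(), b.data, b.len) < 0; }
  bool operator()(const RawKey& a, const std::string& b) const
  { return cmp(a.data, a.len, b.data(), b.size()) < 0; }
};

using NameCache = std::set<std::string, NameLess>;

// Look up a name that lives inside a buffer without building a std::string.
inline bool cached(const NameCache& cache, const char* buf, std::size_t len)
{
  return cache.find(RawKey{buf, len}) != cache.end();
}
```

The lookup key points straight into the incoming buffer, so the hot path performs no allocation at all.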
Put together, this brought down the 60 usec to perhaps 40, which is nice, but not stunning.
But the big savings only came when I did the only thing that actually makes code fast: do less.
So - when encoding the answer to a question, we no longer do the whole “DNS label compression”-routine, as we know the “label” of the answer to a question can always be encoded as the fixed bytes 0xc00c - we don’t need to calculate it.
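Why 0xc00c? A DNS compression pointer is two bytes with the top two bits set, followed by a 14 bit offset into the packet. The DNS header is 12 bytes long, so the name in the question always starts at offset 12. A small sketch, with hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>

// The fixed DNS header is 12 bytes, so the question name starts at offset 12.
constexpr std::size_t dnsHeaderSize = 12;

// Encode a compression pointer: two high bits set, then a 14 bit offset.
constexpr std::uint16_t compressionPointer(std::size_t offset)
{
  return static_cast<std::uint16_t>(0xC000u | (offset & 0x3FFFu));
}

static_assert(compressionPointer(dnsHeaderSize) == 0xC00C,
              "a pointer to the question name is always 0xc00c");
```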
Going beyond that, when generating a simple answer, don’t generate an answer packet, but simply tack on the answer to the original question, and update the ‘answer count’.
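A rough sketch of what ‘tacking on’ an answer could look like, not the actual PowerDNS code. It assumes the packet holds a 12 byte header followed by exactly one question and nothing else, and it only handles a single A record. `appendARecord` is a made-up name.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Turn a query into a response in place by appending one A record.
// Assumption: `packet` is a 12 byte header plus exactly one question.
inline void appendARecord(std::vector<std::uint8_t>& packet, std::uint32_t ttl,
                          const std::array<std::uint8_t, 4>& ip)
{
  packet[2] |= 0x80;  // set the QR bit: this is now a response
  packet[6] = 0x00;   // answer count, high byte
  packet[7] = 0x01;   // answer count, low byte: one answer

  // Name as a pointer to offset 12, then TYPE A and CLASS IN.
  const std::uint8_t head[] = {0xC0, 0x0C, 0x00, 0x01, 0x00, 0x01};
  packet.insert(packet.end(), head, head + sizeof head);

  // TTL, 32 bits in network byte order.
  for (int shift = 24; shift >= 0; shift -= 8)
    packet.push_back(static_cast<std::uint8_t>((ttl >> shift) & 0xFF));

  // RDLENGTH = 4, followed by the IPv4 address itself.
  packet.push_back(0x00);
  packet.push_back(0x04);
  packet.insert(packet.end(), ip.begin(), ip.end());
}
```

A real server has more header flags to take care of. The point here is only that the question bytes are reused untouched.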
Also, if we see we have an ‘instant answer’ available for a question, don’t bother to launch a whole ‘MThread’ to generate it, but return synchronously.
The upshot of all this is that we can now answer most questions in… 4 microseconds, down from 60. 15-fold speedups are rather rare usually.
We didn’t speed up everything that much though, only the majority of queries. However, even the uncached queries will benefit from the microsecond shaving performed earlier, and run around twice as fast.
I can’t wait to do a live benchmark on all this. I’m estimating we should now be able to do over 50000 “real” queries/second on a 3GHz P4, which would put us an order of magnitude above the open source competition, and even beat, by a large factor, the numbers I hear quoted for commercial alternatives. These are hard to compare as their numbers are under NDA.
It might not even be easy to generate that much testing data.
Will keep you posted!
Posted by bert hubert
Fri, 24 Nov 2006 21:55:00 GMT
Yesterday I visited a “software development seminar” of ASML, a rather well disguised recruiting event of this Dutch manufacturer of the world’s most advanced lithography machines.
When I studied physics, I organized the Delftse Bedrijvendagen, the then largest career fair for university students of The Netherlands. As part of that, I was exposed to almost all recruiters of large Dutch companies, including ASML. And the ASML people never failed to leave me light headed.
In brief, lithography is a major piece of the process of actually making chips. It is the part where you actually put the chip on the substrate, using high energy photons. Current 65nm chips consist of many layers, and each of these layers needs to be overlaid on the previous one to a precision of a few nanometres.
To achieve this precision, the individual positioning tolerances of the wafer need to be exact within a nanometre. This is a stunning achievement in itself. For those of you in the non-metric world, there are around 25 million nanometres to an inch. So you should be impressed.
However, this is nothing yet. The lithography machines (‘wafer steppers’) are very expensive, as is the facility that hosts them. And, as there are many layers in a chip, the actual speed of the wafer stepper is of utmost importance.
The machines ASML builds actually illuminate the ‘reticle’ at speeds exceeding 5 metres a second. This is 11 miles/hour. At nanometre precision.
You should have progressed beyond “impressed” to “stunned” by now.
But this is nothing yet. As in microscopy, where water is used to improve resolution, it makes sense to immerse your chip in water while it is being exposed. So the ASML people do that. At nanometre precisions, at those stunning speeds.
To put things in perspective, the wafer is NOT flat to within a nanometre, it bends a bit. So to achieve the precision desired, the wafer is first scanned, so all its imprecisions can be compensated for.
Extreme stuff. I’m sure they don’t have this in “Star Trek”.
I left the event deeply confused - I’m already completely busy with everything I do, and PowerDNS is getting to be quite the empire. The rest of my business is doing great as well.
But my physics background makes me appreciate the incredible things happening over at ASML. Oh well. Like any job, I’m sure it would have downsides. Also, I’m not the kind of person to hold a regular job. But if you want to do stuff on the leading edge of technology, you should at least consider working there. I hear they have 300 vacancies planned for software engineers. They also have some blogs, by the way.
Their current challenge is to move their 15 million lines of C to a new platform that will control their next generation of devices, some of which need to move terabyte amounts of data in under a second.
Anyhow, the seminar was interesting. Tom Gilb presented his “Evolutionary Project Management” concepts, which match rather well with how I tend to manage my projects. One of his main points is that when people start to apply “waterfall” diagrams to software projects, you are lost anyhow. I thought so all along, but it is nice to hear a “guru” confirm it.
Inspired by the breakthrough technologies over at ASML, I’ve picked up my own speech recognition research again, after an 18 month hiatus. The initial results bode well. I get very good frequency and time definition on real speech, with code totalling 750 lines. I hope to get some actual recognition going in the coming week.
Posted by bert hubert
Thu, 23 Nov 2006 10:02:00 GMT
Within the last 12 months, both of my parents have passed away, both after prolonged illness. Here you can see them in happier times a few years ago.
We’ll miss them terribly.
Posted by bert hubert
Sat, 04 Nov 2006 22:26:00 GMT
Back after a hiatus of 23 days. We are still spending quite some time at the hospital, but getting used to it. After seven weeks or so the panic starts to wear off.
For all you Americans
Please go vote if this is an election year for you. Although I’m here in The Netherlands, I care as the effects of your vote are felt around the world.
Andy Tanenbaum of Minix fame is keeping a database of polls; this is his current projection for the new US Senate and House:
I’m not telling you what to vote, but please don’t waste the chance to influence the shape of the world for the next few years!
Posted by bert hubert
Thu, 12 Oct 2006 19:55:00 GMT
Many thanks to my brother who read my previous post and promptly offered to procure new disks for me, they are now in production. Thanks Jaap!
C & C++
One of the things that is easy to forget about C++ is that, while not (really) a superset of C, it does offer the ability to call C functions from C++, and makes some pretty strong statements about the ability to exchange data between the two languages.
C++ does not come with a set of ‘foundation classes’, and while the “standard template library” is strong on data structures, and algorithms to manipulate them, nothing is offered in the way of network communications infrastructure.
Many attempts have been made to rectify this situation, but these tend to be somewhat heavy handed, or overly complex.
ComboAddress. This C++ union is laid out in memory just like the venerable struct sockaddr_in, and through its second member, also just like
The upshot is that we have a C++ union with interesting methods, that allows us to specify destination addresses, either IPv4 or IPv6, with ease, but that can also be passed to the standard Berkeley C socket functions!
These functions promptly forget they are passed a C++ union, and interpret their argument as a struct sockaddr family member.
int sock = socket(AF_INET, SOCK_STREAM, 0);
ComboAddress ca("127.0.0.1", 6666);
if (connect(sock, ca) < 0)
  unixDie("connecting to server");
‘unixDie()’ is a simple function that uses strerror to throw a runtime_error with a descriptive error message.
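A guess at what such a function might look like, based only on the description above. The real PowerDNS implementation may well differ.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Throw a runtime_error saying what failed and why, based on errno.
[[noreturn]] inline void unixDie(const std::string& why)
{
  const int saved = errno;  // capture before anything else can overwrite it
  throw std::runtime_error(why + ": " + std::strerror(saved));
}
```

The exact text after the colon comes from the C library, so it differs between platforms.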
If you are really paying attention, you might have noticed that the ‘connect’ function above is not a real C function, and you would be right. It is a very thin wrapper that saves some typing:
inline int connect(int fd, const ComboAddress& remote)
{
  return connect(fd, (struct sockaddr*) &remote, remote.getSocklen());
}
int fd = accept(sock, &ca);
if (fd >= 0)
  cout << "Connection from " << ca << endl;
The tiny bit of code that makes up the ComboAddress can be found in the PowerDNS Recursor source code. I find that it nicely exposes the vast power of the Berkeley sockets API, while taking a lot of the tedium out of calling the host of functions needed to convert between printable IP addresses, port numbers, and the actual stuff the sockets API expects.
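For readers who don’t have the source at hand, here is a stripped down sketch of the idea. It is emphatically not the PowerDNS code: `MiniAddress` and its member names are invented for illustration, and it assumes a POSIX system with `inet_pton`.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ComboAddress: a union of the IPv4 and IPv6
// socket address structs, with a few convenience methods on top.
union MiniAddress
{
  struct sockaddr_in sin4;
  struct sockaddr_in6 sin6;

  MiniAddress(const std::string& ip, std::uint16_t port)
  {
    std::memset(&sin6, 0, sizeof(sin6));  // sin6 is the larger member
    if (inet_pton(AF_INET, ip.c_str(), &sin4.sin_addr) == 1) {
      sin4.sin_family = AF_INET;
      sin4.sin_port = htons(port);
    }
    else if (inet_pton(AF_INET6, ip.c_str(), &sin6.sin6_addr) == 1) {
      sin6.sin6_family = AF_INET6;
      sin6.sin6_port = htons(port);
    }
    else
      throw std::runtime_error("Unable to parse IP address '" + ip + "'");
  }

  // Size to pass to the sockets API, depending on the address family.
  socklen_t getSocklen() const
  {
    return sin4.sin_family == AF_INET ? sizeof(sin4) : sizeof(sin6);
  }
};
```

Because the union is laid out like the C structs it contains, a pointer to it can be cast to `struct sockaddr*` and handed to `connect()`, `bind()` and friends together with `getSocklen()`.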
And this is all possible because a bunch of guys with serious ‘Unix beards’ decided that C and C++ should remain family members. Thanks!
Posted by bert hubert
Fri, 06 Oct 2006 22:52:00 GMT
Well, I reported previously that the server that powers this blog fell 9 feet, and appeared to have survived? Since that event, one of the disks reported odd errors every once in a while, but those appeared to point to a bad cable. I replaced it, but no joy, problems remained.
So tonight I decided to back up that disk completely, and take it out of use. And lo, during the backup it decided to pack up! It made a noise like a passing moped, and ceased to work. The backup was almost entirely done.
I restored the backup to another computer and mounted it via NFS (over wifi no less!), and things (including this blog) are back in production again. I’ll have to buy new disks ASAP though.
PowerDNS RIPE presentation
RIPE was lots of fun, although my presentation did not go as well as I’d hoped. I’ve been distracted by grave medical problems in my family, which mean that I spend a lot of my time in the hospital. It might’ve been better to not do the presentation. Some people did tell me they enjoyed it though. Oh well.
For the first time, I had the pleasure of answering a question from a webcam viewer! RIPE offers the great service that remote attendees can ask questions over IRC or Jabber, and a RIPE employee will then relay the question. A tremendous service!
Lunch at RIPE was fantastic, and it was very nice to meet many friends again. All in all a good day.
Posted by bert hubert
Sun, 01 Oct 2006 12:47:00 GMT
Quick post to say that at RIPE 53, I’ll be presenting about the PowerDNS Recursor and specifically its implementation of my Internet-Draft (“Draft RFC”).
More details in this post.
If you are at RIPE, come and say hello, or have an excellent Krasnapolski lunch with me!
Long standing bugs
Over the past few weeks some very long standing “low level irritation” PowerDNS bugs have been fixed. One of the things you learn during the maturation of software projects is that things are good once you start to get reports of obscure bugs, as this means that the big problems are out of the way!
Predictably, the bugs were related to the handling of rare errors, which also reinforces my belief that the handling of rare errors tends to be very buggy, as these paths rarely get exercised, and when they do, people often don’t even notice the problem is more in the handling than in the error.
Don’t try to be too smart when dealing with errors!
Posted by bert hubert
Thu, 21 Sep 2006 21:16:00 GMT
Quick update on some small things.
I managed to release PowerDNS Recursor 3.1.3 which must rank as one of the most successful releases of PowerDNS ever, as I have had zero feedback, despite a large number of downloads. Most big deployments have switched over. There is still a very small trickle of odd crashes though, but they are so rare it is hard to pin them down to anything.
Our new house has a lot going for it, except wiring possibilities. It might be possible to improve this, but right now I want nothing but the best and I’m not prepared to soil my house with badly laid cables. So it has to be wireless, which for fixed computers mostly means USB. After some searching and experimenting, I can report that zd1211 derived devices work really well using the Linux zd1211rw driver. Wireless reception depends a lot on RF conditions, and having a USB receiver on a cable means you can move it around for the best reception.
The nice thing about the ZD1211 derived devices (I have two 3Com OfficeConnect adaptors) is that the authors of the driver are very approachable and work well with (and are in fact part of) the Linux kernel community. Unlike some.
It still rocks, although we haven’t had much time to empty the last boxes and buy furniture that matches the quality of the house. Sadly, we are spending a lot of time in the hospital and taking care of related things.
Posted by bert hubert
Fri, 08 Sep 2006 13:36:00 GMT
Programming is a lot of things. One of them is writing good error messages. We tend to think that errors are rare and this should of course be so. However, sometimes they are not, and in that case, good reporting is vital to quickly resolve the problem.
So, even though we should make sure errors do not happen, if they do, our error messages should be top notch.
About error messages
For operators, they are vital aids in configuring software.
For system administrators, they show which external problems need to be resolved (disk full, network down, etc).
Should a program crash, the authors often only have error messages as clues to why this happened. Many crashes are preceded by errors that are reported. A good error can help generate a bug fix.
Taxonomy of error messages
- Configuration problems, for example, looking for a file in directory A while it resides in directory B;
- Unavailable resources (disk full, out of memory), connectivity problems;
- Should Never Happen.
Configuration problems
These commonly occur while software is being installed and set up. Good error reporting is of utmost importance here, as it serves two purposes:
1. Educating the operator about how the program functions;
2. Explaining what needs to be fixed.
Ad 1, an error like “Can’t start frobnicator because the discombulator is not running” teaches the operator that a frobnicator needs a discombulator. This knowledge can of course also be gleaned from the documentation, but in this case, repetition is a good thing.
Compare this error to “Can’t start process: Connection refused”, for example, and think about how helpful that is.
Ad 2, a report like “Can’t connect to product database using connection string ‘dbuser=john, dbname=changeme’: No such database” immediately tells the operator what he needs to know.
Unavailable resources
These typically occur while a program is already running and installed, but are nonetheless important. Do not log ‘Disk full’, but report ‘Disk full writing to ….’ so the operator knows which disk filled up. If a server could not be reached, log the IP address and possibly the name of the server. Any discrepancy between the two may point out a DNS configuration error.
Out of memory is typically very hard to deal with, except when something really odd was going on. A typical example is trying to allocate a ridiculous amount of memory because of an earlier error, in which case logging what memory was being allocated for might help debug the problem. It typically will not help the user of a program.
Should never happen
These are errors of the program itself. Programmers quite often test for impossible conditions, especially if they sense these might conceivably happen one day. An example might be a module that can only connect to IPv4 servers that gets confronted by an IPv6 socket, which in turn causes calls to determine the IPv4 remote address to fail.
It is tempting to be quite rude in these messages, or say stuff like “should never happen!!”, but resist these urges. One day a ‘should not happen’ error is going to be a vital debugging clue. These errors are rare, but it pays to go, well, the extra few yards to perhaps report “unexpected address family accepting connection!”.
An error message should contain:
1. Who is reporting the error (which subsystem, which program, which module)
2. The action that failed
3. The subject of the action (a directory, a server, a port number, an IP address)
4. As good an indication of the actual error as possible
5. Optional - what the program is doing about it
An excellent error message is:
Webserver can't serve page, error opening file '/var/www/index.html': Permission denied, reporting HTTP 404 error
Ad 1, it is tempting to include filenames, function names, and line numbers here. OpenSSL does this a lot, for example. However, almost none of your intended audience will be able to extract useful information from the fact that the error occurred on line 123 of ‘multiplexer.c’. Make sure the module means something to the operator. It may be as simple as the name of your program.
Ad 2, this helps the operator determine if this error might be the explanation of observed problems. An error like “Webserver failed to increase TCP buffer size, continuing with default” can immediately be ruled out as an explanation for why people can’t log in to their mail.
Ad 3, an error like “Can’t open file” on its own can mean many things. One of which might be that it is not reading the configuration file you think it is, and trying to open your index.html in ‘/usr/local/www’, and not in ‘/var/www’ like you thought you configured.
Ad 4, self explanatory. Take the trouble to convert error codes into strings. Many programmers may know that ‘errno = 111’ means ‘Connection refused’, but don’t count on your users knowing this.
Ad 5, this is a fine counterpoint to item 4. “Giving a 404” is very clear for any operator of a web server. Not all errors need a followup, so reporting what the program is doing about the error is not mandatory.
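The five parts listed above map naturally onto a small helper. This is only a sketch: `ErrorReport` and `formatError` are hypothetical names, and the layout of the final string is just one reasonable choice.

```cpp
#include <string>

// The five parts of a good error message.
struct ErrorReport
{
  std::string who;       // subsystem or program reporting the error
  std::string action;    // the action that failed
  std::string subject;   // what the action was applied to
  std::string reason;    // the actual error, as text rather than a number
  std::string followup;  // optional: what the program is doing about it
};

// Glue the parts together; the follow-up is left out when empty.
inline std::string formatError(const ErrorReport& e)
{
  std::string msg = e.who + " " + e.action + " '" + e.subject + "': " + e.reason;
  if (!e.followup.empty())
    msg += ", " + e.followup;
  return msg;
}
```

Filled in with “Webserver”, “can't serve page, error opening file”, “/var/www/index.html”, “Permission denied” and “reporting HTTP 404 error”, this reproduces the example message given earlier. Having separate fields also makes it harder to forget one of the parts.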
Good error messages can save your users many days of problems. And, surprisingly, you might yourself gain even more time - how well do you know the internals of your program after a few months?
So please, please, both as a user and a programmer, I ask you, spend time on error messages!