<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>bert hubert finally blogs: Category Linux</title>
    <link>http://blog.netherlabs.nl/articles/category/linux</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>code, musings and more</description>
    <item>
      <title>Some debugging techniques, and &amp;quot;C++ introspection&amp;quot;</title>
      <description>&lt;p&gt;After too much posting on IETF mailing lists, and not achieving anything, I&amp;#8217;ve gone back to coding a bit more.&lt;/p&gt;

&lt;p&gt;There are two things I want to share - the first because I had a devil of a time figuring out how to do something, and I hope that posting here will help fellow-sufferers find the solution via Google. The second thing I want to talk about because, and this is getting to be rare, I programmed something &lt;em&gt;cool&lt;/em&gt;, and I just need to tell someone about it.&lt;/p&gt;

&lt;p&gt;I pondered explaining it to my lovely son Maurits (4 months old today), but I don&amp;#8217;t want to ruin his brain. 
&lt;center&gt;
&lt;img src="http://ds9a.nl/maurits-6-september.jpg" width=90%&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;h2&gt;Debugging iterators&lt;/h2&gt;

&lt;p&gt;In most programming languages there are a lot of things that compile just fine, or generate no parsing errors at runtime, but are still accidents waiting to happen.&lt;/p&gt;

&lt;p&gt;Tools abound to expose such silent errors, usually at a horrendous performance cost. But this is fine, as errors can be found by the developer, and fixed before release.&lt;/p&gt;

&lt;p&gt;As part of our arsenal, we have the venerable &lt;a href="http://www.valgrind.org"&gt;Valgrind&lt;/a&gt; that detects things such as reading from memory that had not previously been written to. In addition, other tricks are available, such as changing functions that &amp;#8216;mostly do X, and rarely Y&amp;#8217; so that they always do Y. This quickly finds programs that skipped dealing with Y (which might be a rare error condition, or realloc(3) returning a new memory address for your data).&lt;/p&gt;

&lt;p&gt;Finally, many programming environments by default perform very little checking (in the interest of speed) - for example, they will gladly allow you to compare something that points to data in collection A to something that points to collection B - a comparison that never makes sense, classical apples and oranges.&lt;/p&gt;

&lt;p&gt;My favorite C++ compiler, G++, comes with so-called &amp;#8216;debugging iterators&amp;#8217; that are very strict - anything that is not absolutely correct becomes an error, sometimes at compile time, sometimes at runtime.&lt;/p&gt;

&lt;p&gt;Together with Valgrind, this is one of the techniques I like to whip out when the going gets tough. &lt;/p&gt;
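
&lt;p&gt;To make this concrete, here is a minimal sketch of the apples and oranges comparison described above. The function names are made up for this illustration, and the exact runtime diagnostic will depend on your libstdc++ version.&lt;/p&gt;

&lt;pre&gt;
// Build with debugging iterators enabled: g++ -D_GLIBCXX_DEBUG example.cc
#include &amp;lt;vector&amp;gt;

// Buggy on purpose: compares an iterator into 'a' with one into 'b'.
// This compiles without complaint, but the answer is meaningless. With
// debugging iterators, libstdc++ should abort at runtime when this runs.
bool buggyIsEmpty(const std::vector&amp;lt;int&amp;gt;&amp;amp; a, const std::vector&amp;lt;int&amp;gt;&amp;amp; b)
{
  return a.begin() == b.end();
}

// What was intended: both iterators come from the same container.
bool isEmpty(const std::vector&amp;lt;int&amp;gt;&amp;amp; a)
{
  return a.begin() == a.end();
}
&lt;/pre&gt;

&lt;p&gt;Without the define, &lt;code&gt;buggyIsEmpty()&lt;/code&gt; quietly returns a meaningless answer, which is exactly the kind of accident waiting to happen that is worth a slow debugging build.&lt;/p&gt;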

&lt;p&gt;Sadly, debugging iterators (which are turned on by adding &lt;a href="http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt12ch30s03.html"&gt;-D_GLIBCXX_DEBUG&lt;/a&gt;) conflict with one of my favorite C++ libraries, &lt;a href="http://www.boost.org"&gt;Boost&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To make a long story short, to compile a version of Boost with debugging iterators, issue:&lt;/p&gt;

&lt;pre&gt;
$ bjam define=_GLIBCXX_DEBUG
&lt;/pre&gt;

&lt;p&gt;This single line of text may not look all that important, but it took me half a day of debugging to figure this out. So if you get this error:&lt;/p&gt;

&lt;pre&gt;
dnsgram.o:
(.rodata._ZTVN5boost15program_options11typed_valueISscEE[vtable for 
boost::program_options::typed_value&amp;lt;std::basic_string&amp;lt;char, 
std::char_traits&amp;lt;char&amp;gt;, std::allocator&amp;lt;char&amp;gt; &amp;gt;, char&amp;gt;]+0x18): 
undefined reference to 
`boost::program_options::value_semantic_codecvt_helper&amp;lt;char&amp;gt;::parse(boost::any&amp;amp;, 
std::__debug::vector&amp;lt;std::basic_string&amp;lt;char, std::char_traits&amp;lt;char&amp;gt;, 
std::allocator&amp;lt;char&amp;gt; &amp;gt;, std::allocator&amp;lt;std::basic_string&amp;lt;char, 
std::char_traits&amp;lt;char&amp;gt;, std::allocator&amp;lt;char&amp;gt; &amp;gt; &amp;gt; &amp;gt; const&amp;amp;, bool) const'
&lt;/pre&gt;

&lt;p&gt;Then compile your own version of Boost as outlined above.&lt;/p&gt;

&lt;h2&gt;C++ Introspection &amp;amp; Statistics&lt;/h2&gt;

&lt;p&gt;C++ is an old-school language, perhaps the most modern language of the old school. This means that it sacrifices a lot of things to allow programs to run at stunning &amp;#8216;near bare metal&amp;#8217; speeds. One of the things that C++ does not offer therefore is &amp;#8216;introspection&amp;#8217;.&lt;/p&gt;

&lt;p&gt;What this means is that if you have a class called &amp;#8220;ImportantClass&amp;#8221;, that class does not know its own name at runtime. When a program is running, it is not possible to ask by name for an &amp;#8220;ImportantClass&amp;#8221; to be instantiated.&lt;/p&gt;

&lt;p&gt;If you need this ability, you need to register your ImportantClass manually by its name &amp;#8220;ImportantClass&amp;#8221;, and store a pointer to a function that creates an ImportantClass for you when you need it.&lt;/p&gt;
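
&lt;p&gt;A minimal sketch of such a manual registry follows. &lt;code&gt;Creatable&lt;/code&gt;, &lt;code&gt;registerClass()&lt;/code&gt; and &lt;code&gt;createByName()&lt;/code&gt; are names invented for this example, not part of any library, and a C++11 compiler is assumed.&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;map&amp;gt;
#include &amp;lt;memory&amp;gt;
#include &amp;lt;string&amp;gt;

// Hypothetical common base class, so a factory has something to return.
struct Creatable
{
  virtual ~Creatable() {}
  virtual std::string className() const = 0;
};

struct ImportantClass : Creatable
{
  std::string className() const override { return "ImportantClass"; }
};

// A factory is a plain pointer to a function that creates an object.
typedef std::unique_ptr&amp;lt;Creatable&amp;gt; (*Factory)();

std::unique_ptr&amp;lt;Creatable&amp;gt; makeImportantClass()
{
  return std::unique_ptr&amp;lt;Creatable&amp;gt;(new ImportantClass);
}

// The name to factory table. A function-local static is used so that it
// exists before anything tries to register itself.
std::map&amp;lt;std::string, Factory&amp;gt;&amp;amp; factories()
{
  static std::map&amp;lt;std::string, Factory&amp;gt; table;
  return table;
}

void registerClass(const std::string&amp;amp; name, Factory factory)
{
  factories()[name] = factory;
}

// Returns a fresh object, or an empty pointer if the name is unknown.
std::unique_ptr&amp;lt;Creatable&amp;gt; createByName(const std::string&amp;amp; name)
{
  auto it = factories().find(name);
  if (it == factories().end())
    return nullptr;
  return it-&amp;gt;second();
}
&lt;/pre&gt;

&lt;p&gt;After one &lt;code&gt;registerClass("ImportantClass", makeImportantClass)&lt;/code&gt; at startup, &lt;code&gt;createByName("ImportantClass")&lt;/code&gt; hands back a new object. The catch is that every class needs its own registration line, written by hand.&lt;/p&gt;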

&lt;p&gt;Doing so manually is usually not a problem, except of course when it is. In PowerDNS, I allocate a heap (or a stack even) of runtime statistics. Each of those statistics is a variable (or a function) with a certain name.&lt;/p&gt;

&lt;p&gt;In more modern languages, it would probably be easy to group all these variables together (with names like numQueries, numAnswers, numUDPQueries etc), and allow these statistics to be queried using their names. So, an external program might call &amp;#8216;get stat numQueries&amp;#8217;, and PowerDNS would look up the numQueries name, and return its value.&lt;/p&gt;

&lt;p&gt;No such luck in C or C++!&lt;/p&gt;

&lt;p&gt;So - can we figure out something smart, say, with a macro? Yes and no. The problem is that when we declare a variable in C or C++ which we want to be accessible from elsewhere in the program, it needs to happen either inside a struct or class, or at global scope. This in turn means that we can&amp;#8217;t execute code there. So, what we&amp;#8217;d like to do, but can&amp;#8217;t, is:&lt;/p&gt;

&lt;pre&gt;
struct Statistics {
         uint64_t numPackets;
         registerName(&amp;numPackets, "numPackets");
         uint64_t numAnswers;
         registerName(&amp;numAnswers, "numAnswers");
} stats;
&lt;/pre&gt;

&lt;p&gt;stats.numPackets is indeed available, but the line after its definition will generate an error. This is sad, since the above could easily be generated from a macro, so we could do:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;DEFINESTAT(numPackets, "Number of packets received");&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Which would simultaneously define numPackets, as well as make it available as &amp;#8220;numPackets&amp;#8221;, and store a nice description of it somewhere.&lt;/p&gt;

&lt;p&gt;But alas, this is all not possible because of the reasons outlined above.&lt;/p&gt;

&lt;p&gt;So - how do we separate the data structure from the &amp;#8216;registerName()&amp;#8217; calls, while retaining the cool &amp;#8216;DEFINESTAT&amp;#8217; feature where everything is in one place?&lt;/p&gt;

&lt;p&gt;In C++, files can be included using the #include statement. Most of the time, this is used to include so called &amp;#8216;header&amp;#8217; files - but nothing is stopping us from using this feature for our own purposes.&lt;/p&gt;

&lt;p&gt;The trick is to put all the &amp;#8216;DEFINESTAT&amp;#8217; statements in an include file, and include it not once, but twice!&lt;/p&gt;

&lt;p&gt;First we do:&lt;/p&gt;

&lt;pre&gt;
#define DEFINESTAT(name, desc) uint64_t name;
struct Statistics {
#include "statistics.h"
} stats;
&lt;/pre&gt;

&lt;p&gt;This defines the Statistics struct, containing all the variables we want to make available. These are nicely expanded using our DEFINESTAT definition.&lt;/p&gt;

&lt;p&gt;Secondly, we #undef DEFINESTAT, and redefine it as:&lt;/p&gt;

&lt;pre&gt;
#define DEFINESTAT(name, desc) registerName(&amp;stats.name, #name, desc);
&lt;/pre&gt;

&lt;p&gt;Then we insert this in a function:&lt;/p&gt;

&lt;pre&gt;
void registerAllNames()
{
#include "statistics.h"
}
&lt;/pre&gt;

&lt;p&gt;This will cause the same statistics.h file to be loaded again, with the same DEFINESTAT lines in there, but this time DEFINESTAT expands to a call that registers each variable, its name (#name turns the macro argument into a string, so numPackets becomes &amp;#8220;numPackets&amp;#8221;), and its description.&lt;/p&gt;

&lt;p&gt;The rest of our source can now call &amp;#8216;stats.numPackets++&amp;#8217;, and if someone wants to query the &amp;#8220;numPackets&amp;#8221; variable, it is available easily enough through its name since it has been registered using registerName.&lt;/p&gt;
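
&lt;p&gt;For completeness, here is a sketch of the whole trick in a form that compiles as a single file. It makes two assumptions that are not part of the above: the &lt;code&gt;STATISTICS_LIST&lt;/code&gt; macro stands in for the contents of statistics.h (so nothing needs to be included twice), and &lt;code&gt;registerName()&lt;/code&gt; simply stores its arguments in a &lt;code&gt;std::map&lt;/code&gt;, with a hypothetical &lt;code&gt;getStat()&lt;/code&gt; to look them up by name.&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;map&amp;gt;
#include &amp;lt;string&amp;gt;

// Stand-in for the contents of statistics.h, so the sketch fits in one file.
#define STATISTICS_LIST \
  DEFINESTAT(numPackets, "Number of packets received") \
  DEFINESTAT(numAnswers, "Number of answers sent")

// First expansion: declare one counter per statistic.
#define DEFINESTAT(name, desc) std::uint64_t name = 0;
struct Statistics {
  STATISTICS_LIST
} stats;
#undef DEFINESTAT

struct StatEntry
{
  std::uint64_t* value;
  std::string description;
};

std::map&amp;lt;std::string, StatEntry&amp;gt; g_statRegistry;

void registerName(std::uint64_t* value, const std::string&amp;amp; name, const std::string&amp;amp; desc)
{
  g_statRegistry[name] = StatEntry{value, desc};
}

// Second expansion: register each counter under its own name.
#define DEFINESTAT(name, desc) registerName(&amp;amp;stats.name, #name, desc);
void registerAllNames()
{
  STATISTICS_LIST
}
#undef DEFINESTAT

// Returns true and fills 'out' if a statistic with this name was registered.
bool getStat(const std::string&amp;amp; name, std::uint64_t&amp;amp; out)
{
  auto it = g_statRegistry.find(name);
  if (it == g_statRegistry.end())
    return false;
  out = *it-&amp;gt;second.value;
  return true;
}
&lt;/pre&gt;

&lt;p&gt;Call &lt;code&gt;registerAllNames()&lt;/code&gt; once at startup. From then on, incrementing &lt;code&gt;stats.numPackets&lt;/code&gt; is still a plain integer increment, and the map is only consulted when somebody asks for a statistic by name.&lt;/p&gt;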

&lt;p&gt;The upshot of this all is that we have gained the ability to &amp;#8216;introspect&amp;#8217; our Statistics structure, without any runtime overhead nor any further language support.&lt;/p&gt;

&lt;p&gt;As stated above, more modern languages make this process easier.. but not as fast!&lt;/p&gt;

&lt;p&gt;I hope you enjoyed this arcane coolness as much as I did. But I doubt it :-)&lt;/p&gt;</description>
      <pubDate>Thu, 18 Sep 2008 21:44:00 +0200</pubDate>
      <guid isPermaLink="false">urn:uuid:2dea2a0a-d535-458b-be86-f7daf276eeaf</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2008/09/18/some-debugging-techniques-and-c-introspection</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
    </item>
    <item>
      <title>(a)synchronous programming</title>
      <description>&lt;p&gt;Ok, I&amp;#8217;m going to lecture a bit, a bad habit of mine. The summary is that an important enhancement of the Linux kernel has been proposed, but in order to understand the significance of this enhancement,  you need a lot of theory, which follows below.&lt;/p&gt;

&lt;p&gt;I use the word &amp;#8220;computer&amp;#8221; sometimes when I properly mean &amp;#8220;the operating system&amp;#8221;. This exposes a problem of this post: I&amp;#8217;m trying to explain something deeply theoretical to a general audience. Perhaps it didn&amp;#8217;t work. See for yourself.&lt;/p&gt;

&lt;h2&gt;Doing many things at once&lt;/h2&gt;

&lt;p&gt;People generally tend not to be very good at doing many things at once, and surprisingly, computers are not much different in this respect. &lt;/p&gt;

&lt;p&gt;First about human beings. We can do one thing at a time, reasonably well. There are people that claim they can multi-task, but if you look into it, that generally means doing one thing that is really simple, while simultaneously talking on the phone.&lt;/p&gt;

&lt;p&gt;This is exemplified by how we answer a second phone call, i.e. by saying &amp;#8220;The other line is ringing, I&amp;#8217;ll call you back&amp;#8221;, or conversely, telling the other line they&amp;#8217;ll have to wait.&lt;/p&gt;

&lt;p&gt;We emphatically don&amp;#8217;t try to have two conversations at once, and even if we had two mouths, we still wouldn&amp;#8217;t attempt it. &lt;/p&gt;

&lt;p&gt;Let&amp;#8217;s take a look at a web server, the program that makes web pages available to internet browsers. The basic steps are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Wait for new connections from the internet&lt;/li&gt;
&lt;li&gt;Once a new connection is in, read from it which page it wants to see (for example, &amp;#8216;GET http://blog.netherlabs.nl/ HTTP/1.1&amp;#8217;).&lt;/li&gt;
&lt;li&gt;Find that page in the computer&lt;/li&gt;
&lt;li&gt;Send it to the web browser that connected to us&lt;/li&gt;
&lt;li&gt;Go to 1.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Compare this to answering a phone call: step 1 is the part where you wait for the phone to ring, and answer it when it does. Step 2 is hearing what the caller wants, step 3 is figuring out the answer to the query, and step 4 is sharing that answer.&lt;/p&gt;

&lt;p&gt;This all seems natural to us, as it is the way we think. And programmers, contrary to what people think, are human beings, too.&lt;/p&gt;

&lt;p&gt;Where this simple process breaks down is that, much like a regular phone call, we can only serve a new web page once the old one is done sending. &lt;/p&gt;

&lt;p&gt;And here is where things get interesting - although we people have a hard time doing multiple things at once, we can give the problem to the computer.&lt;/p&gt;

&lt;p&gt;What is the easiest way of doing so? Well, if we want to increase the capacity of a telephone service we do so.. by adding people. So on the programming side of things, we do the same thing, only virtually: we order the computer (or more exactly, the operating system) to split itself in two!&lt;/p&gt;

&lt;p&gt;The new list of steps now becomes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Wait for new connections from the internet&lt;/li&gt;
&lt;li&gt;Once a new connection is in, split the computer in two. &lt;/li&gt;
&lt;li&gt;One half of the computer goes back to step 1, the other half continues this list&lt;/li&gt;
&lt;li&gt;(2) Read from it which page it wants to see (for example, &amp;#8216;GET http://blog.netherlabs.nl/ HTTP/1.1&amp;#8217;).&lt;/li&gt;
&lt;li&gt;(2) Find that page&lt;/li&gt;
&lt;li&gt;(2) Send it to the web browser&lt;/li&gt;
&lt;li&gt;(2) Done - remove this &amp;#8220;half&amp;#8221; of the computer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I&amp;#8217;ve prefixed the things the second computer does with &amp;#8220;(2)&amp;#8221;. This looks like the best of both worlds. We can &amp;#8220;serve&amp;#8221; many web pages at the same time, and we didn&amp;#8217;t need to do complicated things. In other words, we could continue thinking like human beings, and use our intuition, by thinking of the analogies with answering phone calls.&lt;/p&gt;
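
&lt;p&gt;On a POSIX system, the &amp;#8220;splitting&amp;#8221; is the &lt;code&gt;fork()&lt;/code&gt; system call. What follows is a rough sketch of the list above, assuming a socket that is already listening and leaving out most error handling. &lt;code&gt;handleConnection()&lt;/code&gt; and &lt;code&gt;serveForking()&lt;/code&gt; are names made up for this example, and the page lookup is faked with a fixed string.&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;sys/socket.h&amp;gt;
#include &amp;lt;sys/types.h&amp;gt;
#include &amp;lt;sys/wait.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;

// Steps 4 to 6: read the request, "find" the page, send it back.
void handleConnection(int fd)
{
  char request[4096];
  ssize_t got = read(fd, request, sizeof(request));
  if (got &amp;lt;= 0)
    return;
  const std::string body = "hello\n";  // hypothetical page lookup
  const std::string reply = "HTTP/1.1 200 OK\r\nContent-Length: " +
    std::to_string(body.size()) + "\r\nConnection: close\r\n\r\n" + body;
  // A real server would loop here until everything has been written.
  ssize_t sent = write(fd, reply.data(), reply.size());
  (void)sent;
}

// Steps 1 to 3 and 7, given a socket that is already listening.
void serveForking(int listenFd)
{
  for (;;) {
    int fd = accept(listenFd, nullptr, nullptr);  // step 1: wait
    if (fd &amp;lt; 0)
      continue;
    pid_t pid = fork();                           // step 2: split in two
    if (pid == 0) {                               // the "(2)" half
      close(listenFd);
      handleConnection(fd);
      close(fd);
      _exit(0);                                   // step 7: this half goes away
    }
    close(fd);                                    // original half: back to step 1
    while (waitpid(-1, nullptr, WNOHANG) &amp;gt; 0)
      ;                                           // clean up finished halves
  }
}
&lt;/pre&gt;

&lt;p&gt;Note how &lt;code&gt;handleConnection()&lt;/code&gt; reads like the phone call: one request, one answer, and no awareness that anything else might be going on.&lt;/p&gt;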

&lt;p&gt;So, are we done now? Sadly no. What basically has happened is that we have invoked a piece of magic: let&amp;#8217;s split the computer in two. That is all fine, but somebody has to do the splitting. This job is farmed out to the CPU (the processor) and the operating system (Windows, Linux etc), and they have to deal with making sure it appears the computer can do two things at the same time.&lt;/p&gt;

&lt;p&gt;Because the truth is.. people can&amp;#8217;t do it, and neither can computers. They fake it. &lt;/p&gt;

&lt;p&gt;This faking comes at a cost, incurred both while splitting the computer (&amp;#8220;forking&amp;#8221;), and by making the computer juggle all its separate parts. Finally, it turns out that practically speaking, you can divide a computer up into only a limited number of parts before the charade falls down.&lt;/p&gt;

&lt;p&gt;Busy websites have tens of millions of visitors, so we&amp;#8217;d need to be able to split the computer into at least that many parts, while in practice the limit lies at perhaps 100,000 slices, if not less.&lt;/p&gt;

&lt;h2&gt;Now what&lt;/h2&gt;

&lt;p&gt;Several solutions to this problem have been invented. Some involve not quite splitting up the entire computer and making split parts share more of the resources (like for example, memory). This is called &amp;#8216;threading&amp;#8217;. Perhaps this could be compared with not hiring more people to answer the telephone, but instead giving the people you have more heads, so as to save money.&lt;/p&gt;

&lt;p&gt;In the end, all these solutions run into a brick wall: it is hard to maintain the illusion that the computer can do multiple things at the same time, AND have it actually do a million things at the same time.&lt;/p&gt;

&lt;p&gt;So in the end, we have to bite the bullet, and just make sure the program itself can handle many many things at once, without needing the magic of pretending the computer can do it for us.&lt;/p&gt;

&lt;h2&gt;&amp;#8220;Asynchronous programming&amp;#8221;&lt;/h2&gt;

&lt;p&gt;This is where things get hard, and this is to be expected, as it was our basic premise that people can&amp;#8217;t do multiple things at the same time, and what&amp;#8217;s worse, they have a hard time even thinking about what it would be like.&lt;/p&gt;

&lt;p&gt;The new algorithm looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Instruct the computer to tell us when &amp;#8220;something has happened&amp;#8221;&lt;/li&gt;
&lt;li&gt;Figure out what happened:
&lt;ul&gt;
&lt;li&gt;If there is a new connection, instruct the computer that from now on, it should tell us if new data arrived on that connection&lt;/li&gt;
&lt;li&gt;If something has happened to one of those connections we&amp;#8217;ve told the computer about, read the data sent to us on that connection. Then find the information requested on that connection, and instruct the computer to tell us when there is &amp;#8220;room&amp;#8221; to send that data&lt;/li&gt;
&lt;li&gt;If the computer told us there was &amp;#8220;room&amp;#8221;, send the data that was previously requested on that connection. If we are done sending all the data, tell the computer to disconnect, and no longer inform us of the state of the connection.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Go back to 1.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If this feels complicated, you&amp;#8217;d be right. However, this is how all very high performance computer applications work, because the &amp;#8220;faking&amp;#8221; described above doesn&amp;#8217;t really &amp;#8220;scale&amp;#8221; to tens of thousands of connections.&lt;/p&gt;
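
&lt;p&gt;One way to sketch this loop is with the POSIX &lt;code&gt;poll()&lt;/code&gt; call, which plays the part of &amp;#8220;tell us when something has happened&amp;#8221;. &lt;code&gt;eventLoopOnce()&lt;/code&gt; and its &lt;code&gt;pending&lt;/code&gt; map are inventions for this example, and the reply is a fixed string. A real server would also make its sockets non-blocking, ignore SIGPIPE, cope with requests that arrive in pieces, and probably use something that scales better than &lt;code&gt;poll()&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;map&amp;gt;
#include &amp;lt;poll.h&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;sys/socket.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;
#include &amp;lt;vector&amp;gt;

// One pass through steps 1 and 2. 'pending' maps each open connection to
// the reply bytes still to be sent on it (empty while we await a request).
void eventLoopOnce(int listenFd, std::map&amp;lt;int, std::string&amp;gt;&amp;amp; pending)
{
  std::vector&amp;lt;pollfd&amp;gt; fds;
  pollfd listener = {listenFd, POLLIN, 0};
  fds.push_back(listener);
  for (const auto&amp;amp; conn : pending) {
    // Wait for "room" if a reply is queued, for new data otherwise.
    short want = conn.second.empty() ? POLLIN : POLLOUT;
    pollfd p = {conn.first, want, 0};
    fds.push_back(p);
  }

  if (poll(fds.data(), fds.size(), -1) &amp;lt;= 0)  // step 1: something happened
    return;

  for (const pollfd&amp;amp; p : fds) {                   // step 2: figure out what
    if (p.revents == 0)
      continue;
    if (p.fd == listenFd) {                       // a new connection
      int fd = accept(listenFd, nullptr, nullptr);
      if (fd &amp;gt;= 0)
        pending[fd];                              // start watching it for data
    } else if (pending[p.fd].empty()) {           // data arrived
      char request[4096];
      if (read(p.fd, request, sizeof(request)) &amp;lt;= 0) {
        close(p.fd);
        pending.erase(p.fd);
      } else {
        pending[p.fd] = "HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nhello\n";
      }
    } else {                                      // there is "room" to send
      std::string&amp;amp; reply = pending[p.fd];
      ssize_t sent = write(p.fd, reply.data(), reply.size());
      if (sent &amp;gt; 0)
        reply.erase(0, sent);
      if (sent &amp;lt; 0 || reply.empty()) {        // done, or failed: disconnect
        close(p.fd);
        pending.erase(p.fd);
      }
    }
  }
}
&lt;/pre&gt;

&lt;p&gt;Step 3 is then nothing more than calling &lt;code&gt;eventLoopOnce()&lt;/code&gt; forever. Compare this with the forking version: all the bookkeeping about who is waiting for what has moved out of the operating system and into our own program.&lt;/p&gt;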

&lt;p&gt;How does this translate to the telephone situation? It would be like we have lots of small answering machines, that lots of callers can talk to at the same time. Whenever someone has finished a question, the operator would listen to that answering machine, and leave the answer on the machine, and go on to the next machine that has a finished message.&lt;/p&gt;

&lt;p&gt;From this description, it is clear it would not work faster that way if you&amp;#8217;d try it for real. However, in many countries, if you call a directory service to find a telephone number, you&amp;#8217;ll get half of this. Your call is answered by a real human being, who asks you questions to figure out which phone number you are looking for. But once it has been found, the operator presses a button, and the result of your query is sent to a computer, which then reads it to you, allowing the operator to already start answering a new call. Rather smart.&lt;/p&gt;

&lt;h2&gt;Something in between&lt;/h2&gt;

&lt;p&gt;If the previous bit was hard to understand, I make no apologies: this is just how complicated things are in the world of computing. However, we programmers also hate to deal with complicated things, so we try to avoid stuff like this.&lt;/p&gt;

&lt;p&gt;People have invented many ways of allowing programmers to think &amp;#8216;linearly&amp;#8217;, as if only a single thing is happening at a time, without having to split the entire computer.&lt;/p&gt;

&lt;p&gt;One way of doing this is having a facade that makes things go linearly, until the program has to wait for something (a new connection, &amp;#8220;room&amp;#8221; to send data etc), and then switch over to processing another connection. Once that connection has to wait for something, chances are that what our earlier &amp;#8216;wait&amp;#8217; was waiting for has happened, and that program can continue.&lt;/p&gt;

&lt;p&gt;This truly offers us the best of both worlds: we can program as if only a single thing is happening at a time, something we are used to, but the moment the computer has to wait for something, we are switched automatically to another part of the program, which is also written as if it is the only thing happening.&lt;/p&gt;

&lt;p&gt;Actually making this happen is pretty hard however, because traditional computer programming environments don&amp;#8217;t clearly separate actions that could lead to &amp;#8220;waiting&amp;#8221; from actions that should happen instantly.&lt;/p&gt;

&lt;p&gt;A prime example of the first kind of action is &amp;#8220;waiting for a new connection&amp;#8221; - this might in theory take forever, especially if your website is really unpopular. &lt;/p&gt;

&lt;p&gt;Things that should happen instantly include for example asking the computer what time it thinks it is.&lt;/p&gt;

&lt;p&gt;Traditional operating systems can be instructed to be mindful of new incoming connections, and not keep the program waiting for them. This is what we described in the complicated &amp;#8220;if X happened, if Y happened&amp;#8221; scenario above.&lt;/p&gt;

&lt;p&gt;They can also do the same for reading from the network and writing to the network, both things that might take time. This means you can ask the operating system &amp;#8216;let me know when I can read so I don&amp;#8217;t have to wait for it, and I can process other connections in the meantime&amp;#8217;.&lt;/p&gt;

&lt;p&gt;Furthermore, there are some &lt;em&gt;limited&lt;/em&gt; tricks to do the same for reading a file. The problem is that back in the 1970s when most operating system theory was being invented, disks were considered so fast, nobody thought it possible you&amp;#8217;d ever need to meaningfully wait for one. Of course disks weren&amp;#8217;t faster back then, but computers were &lt;em&gt;slower&lt;/em&gt;, and massively so. So by comparison, disks were really fast. &lt;/p&gt;

&lt;p&gt;The upshot is that in most operating systems, disk reads are grouped with &amp;#8220;stuff that should happen instantly&amp;#8221;, whereas every computer user by now has experienced this is emphatically not the case.&lt;/p&gt;

&lt;p&gt;Modern operating systems offer only a limited solution to this problem, called &amp;#8216;asynchronous input/output&amp;#8217;, which allows one to more or less tell the computer to notify us when it has read a certain piece of data from disk.&lt;/p&gt;

&lt;p&gt;However, it doesn&amp;#8217;t offer the same facility for doing a lot of other things that might take time, like &lt;em&gt;finding&lt;/em&gt; the file in the first place, or opening it. Things that in the real world take a lot of time.&lt;/p&gt;

&lt;p&gt;So, we can&amp;#8217;t truly enjoy the best of both worlds as sketched above, in which the programmer could write simple programs that get switched out every time they have to wait for something.&lt;/p&gt;

&lt;h2&gt;Enter &amp;#8216;Generic AIO&amp;#8217;&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://zabbo.net"&gt;Zach Brown&lt;/a&gt;, who is employed by Oracle to work on Linux, has now dreamed up something that appears to never have been done before: &lt;em&gt;everything&lt;/em&gt; can now be considered something that &amp;#8220;might take time&amp;#8221;. &lt;/p&gt;

&lt;p&gt;This means that you can ask Linux to find a certain file for you, and it immediately allows you to process other connections that need attention. Once the operating system has found the file for you, it is available for you without waiting.&lt;/p&gt;

&lt;p&gt;Although almost every advance in operating system design has at one point been researched already, this approach appears to be rather revolutionary.&lt;/p&gt;

&lt;p&gt;It has ignited vigorous discussion within the Linux community about the feasibility of this approach, and whether it truly is the dreamt-of &amp;#8220;best of both worlds&amp;#8221;, but to this author, it surely looks like a breakthrough.&lt;/p&gt;

&lt;p&gt;Especially since it unites the worlds of &amp;#8220;waiting on a read/write from the network&amp;#8221; with &amp;#8220;waiting for a file to be read from disk&amp;#8221;. &lt;/p&gt;

&lt;p&gt;Time will tell if &amp;#8220;Generic AIO&amp;#8221; will become part of Linux. In the meantime, you can read more about it on &lt;a href="http://lwn.net/Articles/219954/"&gt;LWN&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Sun, 04 Feb 2007 13:14:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:2f317d85-d240-4254-a961-235eecbaeb1e</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2007/02/04/a-synchronous-programming</link>
      <category>Linux</category>
      <category>Netherlabs</category>
      <category>Life</category>
    </item>
    <item>
      <title>This draft is a work item of the DNS Extensions Working Group of the IETF!</title>
      <description>&lt;p&gt;The workings of the Internet are described, or even proscribed, by the so called &amp;#8216;Requests For Comments&amp;#8217;, or RFCs. These are the laws of the internet.&lt;/p&gt;

&lt;p&gt;Today the IETF DNS Extensions working group accepted an &amp;#8220;Internet-Draft&amp;#8221;  &lt;a href="http://virtu.nl"&gt;Remco van Mook&lt;/a&gt; and I have been working on. And the cool bit is that over time, many such accepted &amp;#8220;Internet-Drafts&amp;#8221; turn into RFCs!&lt;/p&gt;

&lt;p&gt;Read about what our draft does 
&lt;a href="http://blog.netherlabs.nl/articles/2006/05/09/i-bit-the-bullet-and-wrote-an-rfc-to-be"&gt;here&lt;/a&gt;
and &lt;a href="http://blog.netherlabs.nl/articles/2006/05/13/in-violation-of-my-own-draft-draft-rfc"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The actual Internet-Draft can be found &lt;a href="http://www.ietf.org/internet-drafts/draft-ietf-dnsext-forgery-resilience-00.txt"&gt;over at the IETF&lt;/a&gt;, or over here &lt;a href="http://ds9a.nl/tmp/draft-ietf-dnsext-forgery-resilience.html"&gt;as pretty HTML&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In short, this RFC documents and standardises some of the stuff DJBDNS and PowerDNS have been doing to make the DNS a safer place.&lt;/p&gt;

&lt;p&gt;Besides the fact that it is important to update the DNS standards to reflect this practice, it is also rather a cool thought to actually be writing an RFC, especially one that has the magic stanzas &amp;#8220;Standards Track&amp;#8221; and &amp;#8220;Updates 1035&amp;#8221; in it.&lt;/p&gt;

&lt;p&gt;So we are well pleased! Over the coming months we&amp;#8217;ll have to tune the draft so it conforms to the consensus of the DNSEXT working group, and hopefully somewhere around March, it will head towards the IESG, after which an actual RFC should be issued.&lt;/p&gt;

&lt;p&gt;Exciting!&lt;/p&gt;</description>
      <pubDate>Fri, 12 Jan 2007 22:16:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:332bd24c-7b93-4fd2-a46e-17e5eb75618a</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2007/01/12/this-draft-is-a-work-item-of-the-dns-extensions-working-group-of-the-ietf</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
      <category>Life</category>
    </item>
    <item>
      <title>Wishing you a good 2007!</title>
      <description>&lt;p&gt;I wish everybody a very good 2007! For PowerDNS, it certainly has been a very good year.&lt;/p&gt;

&lt;p&gt;In some (large) places, the Recursor now commands a 40% market share, while the authoritative server is also expanding its user base around the world, with multi-million domain deployments now no longer as newsworthy as they once were.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/Chaos_Computer_Club"&gt;Chaos Computer Club&lt;/a&gt; held its &lt;a href="http://en.wikipedia.org/wiki/Chaos_Communication_Congress"&gt;annual congress&lt;/a&gt; last week, and they chose the PowerDNS Recursor to provide the DNS service to go with their 10 gigabit connection. I&amp;#8217;m pleased to report that the PowerDNS process was fired up only once, and that it held steady for the entire congress, with no complaints. This would usually not be that strange, but the CCC clientèle are among the most critical internet users to be found on the planet.&lt;/p&gt;

&lt;p&gt;Many thanks to Stefan Schmidt and other CCC admins for their vote of confidence!&lt;/p&gt;

&lt;h2&gt;Rails&lt;/h2&gt;

&lt;p&gt;I&amp;#8217;m working on understanding &amp;#8216;Ruby on Rails&amp;#8217;, which will probably end up as a HOWTO aimed at seasoned programmers. The internet abounds with &amp;#8220;you won&amp;#8217;t believe how easy Ruby on Rails is&amp;#8221; demonstrations, but the hard truth is that below the surface, a lot of magic is happening. The kind of magic the discerning programmer wants to grasp so as to make the most of it.&lt;/p&gt;

&lt;p&gt;A very small start to this HOWTO can be found &lt;a href="http://ds9a.nl/ror-hard-way"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It may also allow experienced programmers to teach themselves Ruby in less time than it would take them to read a 750 page book.&lt;/p&gt;</description>
      <pubDate>Mon, 01 Jan 2007 16:58:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:c95eab70-b67c-4470-8e15-afcc75dd1886</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2007/01/01/wishing-you-a-good-2007</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
      <category>Life</category>
    </item>
    <item>
      <title>PowerDNS speedups</title>
      <description>&lt;p&gt;After PowerDNS 3.1.4 turned out to be boringly stable, fixing all reported crashes, I decided it was time to do the massive speedup I&amp;#8217;d been promising people for some time.&lt;/p&gt;

&lt;p&gt;With some help from my friends over at &lt;a href="http://offtopic2.net"&gt;#offtopic2&lt;/a&gt;, I was able to use the TSC register of my CPU to measure down to the nanosecond how much time things were taking within PowerDNS. Previously I&amp;#8217;d concentrated on profiling macro performance, but nanosecond resolution allows one to study fully how much time is spent within each function.&lt;/p&gt;

&lt;p&gt;Using this technique, it became apparent we take a whopping 60 microseconds to answer even the most basic of questions. We make up for this by being pretty fast at complicated questions. But 60 microseconds mean we are limited to about 15000 questions/second, max.&lt;/p&gt;

&lt;p&gt;First I started shaving microseconds. It turns out &lt;code&gt;snprintf&lt;/code&gt; is truly slow, taking up to 5 microseconds for some strings. Additionally, we wasted a lot of time on needlessly copying &lt;code&gt;std::strings&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The unsurpassed Boost::Multi_Index container has a spectacular feature, called &amp;#8216;compatible keys&amp;#8217;, which means we can look up answers using a question key that is a bare piece of memory instead of a proper &lt;code&gt;std::string&lt;/code&gt;. This again saved a few microseconds.&lt;/p&gt;

&lt;p&gt;Put together, this brought down the 60 usec to perhaps 40, which is nice, but not stunning.&lt;/p&gt;

&lt;p&gt;But the big savings only came when I did the only thing that actually makes code fast: do less.&lt;/p&gt;

&lt;p&gt;So - when encoding the answer to a question, we no longer do the whole &amp;#8220;DNS label compression&amp;#8221;-routine, as we know the &amp;#8220;label&amp;#8221; of the answer to a question can always be encoded as the fixed bytes 0xc00c - we don&amp;#8217;t need to calculate it.&lt;/p&gt;
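
&lt;p&gt;Why 0xc00c? In the DNS wire format of RFC 1035, a name can be replaced by a two-byte pointer: the top two bits are set, and the remaining 14 bits are an offset from the start of the packet. The DNS header is 12 bytes long, so the name in the question starts at offset 12, and 0xc00c points exactly there. A small sketch to check the arithmetic, with helper names made up for this example:&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;cstdint&amp;gt;

const unsigned kDnsHeaderSize = 12;  // fixed DNS header length, per RFC 1035

// A compression pointer is two bytes whose top two bits are both set.
bool isCompressionPointer(std::uint8_t first)
{
  return (first &amp;amp; 0xc0) == 0xc0;
}

// The remaining 14 bits are an offset from the start of the packet.
unsigned pointerOffset(std::uint8_t first, std::uint8_t second)
{
  return ((first &amp;amp; 0x3fu) &amp;lt;&amp;lt; 8) | second;
}
&lt;/pre&gt;

&lt;p&gt;Here &lt;code&gt;pointerOffset(0xc0, 0x0c)&lt;/code&gt; comes out at 12, the size of the header, so the answer can simply say &amp;#8220;same name as the question&amp;#8221; without comparing any labels.&lt;/p&gt;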

&lt;p&gt;Going beyond that, when generating a simple answer, don&amp;#8217;t generate an answer packet, but simply tack on the answer to the original question, and update the &amp;#8216;answer count&amp;#8217;.&lt;/p&gt;

&lt;p&gt;Also, if we see we have an &amp;#8216;instant answer&amp;#8217; available for a question, don&amp;#8217;t bother to launch a whole &amp;#8216;MThread&amp;#8217; to generate it, but return synchronously.&lt;/p&gt;

&lt;p&gt;The upshot of all this is that we can now answer most questions in&amp;#8230; 4 microseconds, down from 60. 15-fold speedups are rather rare usually.&lt;/p&gt;

&lt;p&gt;We didn&amp;#8217;t speed up everything that much though, only the majority of queries. However, even the uncached queries will benefit from the microsecond shaving performed earlier, and run around twice as fast.&lt;/p&gt;

&lt;p&gt;I can&amp;#8217;t wait to do a live benchmark on all this. I&amp;#8217;m estimating we should now be able to do over 50000 &amp;#8220;real&amp;#8221; queries/second on a 3GHz P4, which would put us an order of magnitude above the open source competition, and even beat, by a large factor, the numbers I hear quoted for commercial alternatives. These are hard to compare as their numbers are under NDA.&lt;/p&gt;

&lt;p&gt;It might not even be easy to generate that much testing data..&lt;/p&gt;

&lt;p&gt;Will keep you posted!&lt;/p&gt;</description>
      <pubDate>Thu, 14 Dec 2006 22:21:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:5a67e3b9-1e38-4bf5-8851-4111e50a61dc</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2006/12/14/powerdns-speedups</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
    </item>
    <item>
      <title>The joys of mixing C and C++</title>
      <description>&lt;p&gt;Many thanks to my brother who read my previous post and promptly offered to procure new disks for me, they are now in production. Thanks Jaap!&lt;/p&gt;

&lt;h2&gt;C &amp;amp; C++&lt;/h2&gt;

&lt;p&gt;One of the things that is easy to forget about C++ is that, while not (really) a superset of C, it does offer the ability to call C functions from C++, and makes some pretty strong statements about the ability to exchange data between the two languages.&lt;/p&gt;

&lt;p&gt;C++ does not come with a set of &amp;#8216;foundation classes&amp;#8217;, and while the &amp;#8220;standard template library&amp;#8221; is strong on data structures, and algorithms to manipulate them, nothing is offered in the way of network communications infrastructure.&lt;/p&gt;

&lt;p&gt;Many attempts have been made to rectify this situation, but these tend to be somewhat heavy-handed, or overly complex.&lt;/p&gt;

&lt;p&gt;Enter the &lt;code&gt;ComboAddress&lt;/code&gt;. This C++ union is laid out in memory just like the venerable &lt;code&gt;struct sockaddr_in&lt;/code&gt;, and through its second member, also just like &lt;code&gt;struct sockaddr_in6&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;The upshot is that we have a C++ union with interesting methods, that allows us to specify destination addresses, either IPv4 or IPv6, with ease, but that can also be passed to the standard Berkeley C socket functions! &lt;/p&gt;

&lt;p&gt;These functions promptly forget they are passed a C++ union, and interpret their argument as a &lt;code&gt;struct sockaddr&lt;/code&gt; family member.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;    int sock = socket(AF_INET, SOCK_STREAM, 0);
    ComboAddress ca("127.0.0.1", 6666);
    if (connect(sock, ca) &amp;lt; 0)
        unixDie("connecting to server");
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;&amp;#8216;unixDie()&amp;#8217; is a simple function that uses strerror to throw a runtime_error with a descriptive error message.&lt;/p&gt;

&lt;p&gt;If you are really paying attention, you might have noticed that the &amp;#8216;connect&amp;#8217; function above is not a real C function, and you would be right. It is a very thin wrapper that saves some typing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;  inline int connect(int fd, const ComboAddress&amp;amp; remote)
  {
          return connect(fd, (struct sockaddr*) &amp;amp;remote, remote.getSocklen());
  }
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;Another example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;  int fd = accept(sock, &amp;amp;ca);
  if(fd &amp;gt;= 0)
           cout &amp;lt;&amp;lt; "Connection from " &amp;lt;&amp;lt; ca &amp;lt;&amp;lt;endl;
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;The tiny bit of code that makes up the ComboAddress can be found in the PowerDNS Recursor source code. I find that it nicely preserves the vast power of the Berkeley sockets API, while taking a lot of the tedium out of calling the host of functions needed to convert between printable IP addresses, port numbers, and the actual stuff the sockets API expects.&lt;/p&gt;

&lt;p&gt;And this is all possible because a bunch of guys with serious &amp;#8216;Unix beards&amp;#8217; decided that C and C++ should remain family members. Thanks!&lt;/p&gt;</description>
      <pubDate>Thu, 12 Oct 2006 21:55:00 +0200</pubDate>
      <guid isPermaLink="false">urn:uuid:8c25c6bb-2e4f-474b-a63c-8e456efdab45</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2006/10/12/the-joys-of-mixing-c-and-c</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
    </item>
    <item>
      <title>Disk died, RIPE report</title>
      <description>&lt;p&gt;Well, I reported previously that the server that powers this blog fell 9 feet, and appeared to have survived? Since that event, one of the disks reported odd errors every once in a while, but those appeared to point to a bad cable. I replaced it, but no joy, problems remained.&lt;/p&gt;

&lt;p&gt;So tonight I decide to back up that disk completely, and take it out of use. And lo, &lt;em&gt;during&lt;/em&gt; the backup it decides to pack up! It made a noise like a passing moped, and ceased to work. Backup was almost entirely done.&lt;/p&gt;

&lt;p&gt;I restored the backup to another computer and mounted it via NFS (over wifi no less!), and things (including this blog) are back in production again. I&amp;#8217;ll have to buy new disks ASAP though. &lt;/p&gt;

&lt;h2&gt;PowerDNS RIPE presentation&lt;/h2&gt;

&lt;p&gt;RIPE was lots of fun, although my presentation did not go as well as I&amp;#8217;d hoped. I&amp;#8217;ve been distracted by grave medical problems in my family, which mean that I spend a lot of my time in the hospital. It might&amp;#8217;ve been better to not do the presentation. Some people did tell me they enjoyed it though. Oh well.&lt;/p&gt;

&lt;p&gt;For the first time, I&amp;#8217;ve had the pleasure of answering a question from a webcam viewer! RIPE offers the great service that remote attendees can ask questions over IRC or Jabber, and a RIPE employee will then relay the question. A tremendous service!&lt;/p&gt;

&lt;p&gt;Lunch at RIPE was fantastic, and it was very nice to meet many friends again. All in all a good day.&lt;/p&gt;</description>
      <pubDate>Sat, 07 Oct 2006 00:52:00 +0200</pubDate>
      <guid isPermaLink="false">urn:uuid:42134c4d-f2be-4400-869b-ef60480605d1</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2006/10/07/disk-died-ripe-report</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Life</category>
    </item>
    <item>
      <title>RIPE 53, PowerDNS tidbits</title>
      <description>&lt;p&gt;Quick post to say that at &lt;a href="http://www.ripe.net/ripe/meetings/ripe-53/"&gt;RIPE 53&lt;/a&gt;, I&amp;#8217;ll be presenting about the PowerDNS Recursor and specifically its implementation of my &lt;a href="http://www.ietf.org/internet-drafts/draft-hubert-dns-anti-spoofing-00.txt"&gt;Internet-Draft&lt;/a&gt; (&amp;#8220;Draft RFC&amp;#8221;).&lt;/p&gt;

&lt;p&gt;More details in &lt;a href="http://mailman.powerdns.com/pipermail/pdns-users/2006-October/003806.html"&gt;this post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are at RIPE, come and say hello, or have an excellent Krasnapolsky lunch with me!&lt;/p&gt;

&lt;h2&gt;Long-standing bugs&lt;/h2&gt;

&lt;p&gt;Over the past few weeks some very long-standing &amp;#8220;low level irritation&amp;#8221; PowerDNS bugs have been fixed. One of the things you learn during the maturation of software projects is that things are good once you start to get reports of obscure bugs, as this means that the big problems are out of the way!&lt;/p&gt;

&lt;p&gt;Predictably, the bugs were related to the handling of rare errors, which also reinforces my belief that the handling of rare errors tends to be very buggy, as these paths rarely get exercised, and when they do, people often don&amp;#8217;t even notice the problem is more in the handling than in the error.&lt;/p&gt;

&lt;p&gt;Don&amp;#8217;t try to be too smart when dealing with errors!&lt;/p&gt;</description>
      <pubDate>Sun, 01 Oct 2006 14:47:00 +0200</pubDate>
      <guid isPermaLink="false">urn:uuid:58f338f7-6121-4eb6-a800-7ac0f6850b48</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2006/10/01/ripe-53-powerdns-tidbits</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
    </item>
    <item>
      <title>Odds &amp;amp; ends</title>
      <description>&lt;p&gt;Quick update on some small things. &lt;/p&gt;

&lt;h2&gt;PowerDNS&lt;/h2&gt;

&lt;p&gt;I managed to release PowerDNS Recursor 3.1.3 which must rank as one of the most successful releases of PowerDNS ever, as I have had zero feedback, despite a large number of downloads. Most big deployments have switched over. There is still a very small trickle of odd crashes though, but they are so rare it is hard to pin them down to anything. &lt;/p&gt;

&lt;h2&gt;Wireless&lt;/h2&gt;

&lt;p&gt;Our new house has a lot going for it, except wiring possibilities. It might be possible to improve this, but right now I want nothing but the best and I&amp;#8217;m not prepared to soil my house with badly laid cables. So it has to be wireless, which for fixed computers mostly means USB. After some searching and experimenting, I can report that ZD1211-derived devices work really well using the Linux zd1211rw driver. Wireless reception depends a lot on RF conditions, and having a USB receiver on a cable means you can move it around for the best reception.&lt;/p&gt;

&lt;p&gt;The nice thing about the ZD1211-derived devices (I have two 3Com OfficeConnect adaptors) is that the authors of the driver are very approachable and work well with (and are in fact part of) the Linux kernel community. Unlike some.&lt;/p&gt;

&lt;h2&gt;New house&lt;/h2&gt;

&lt;p&gt;It still rocks, although we haven&amp;#8217;t had much time to empty the last boxes and buy furniture that matches the quality of the house. Sadly, we are spending a lot of time in the hospital and taking care of related things. &lt;/p&gt;</description>
      <pubDate>Thu, 21 Sep 2006 23:16:00 +0200</pubDate>
      <guid isPermaLink="false">urn:uuid:493b3c21-e7d6-4f68-ae2c-25d521731b3d</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2006/09/21/odds-ends</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
      <category>Life</category>
    </item>
    <item>
      <title>The perfect error message</title>
      <description>&lt;p&gt;Programming is a lot of things. One of them is writing good error messages. We tend to think that errors are rare and this should of course be so. However, sometimes they are not, and in that case, good reporting is vital to  quickly resolve the problem.&lt;/p&gt;

&lt;p&gt;So, even though we should make sure errors do not happen, if they do, our error messages should be top notch.&lt;/p&gt;

&lt;h1&gt;About error messages&lt;/h1&gt;

&lt;h2&gt;Purpose &lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;For operators, they are vital aids in configuring software&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For system administrators, they show which external problems need to be resolved (disk full, network down, etc).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Should a program crash, the authors often only have error messages as clues to why this happened. Many crashes are preceded by errors that are reported. A good error can help generate a bug fix.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Taxonomy of error messages&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Configuration problems, for example, looking for a file in directory A while it resides in directory B;&lt;/li&gt;
&lt;li&gt;Unavailable resources (disk full, out of memory), connectivity problems;&lt;/li&gt;
&lt;li&gt;Should Never Happen.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;Configuration problems&lt;/h2&gt;

&lt;p&gt;These commonly occur while software is being installed and set up. Good error reporting is of utmost importance here, as it serves two purposes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Educating the operator about how the program functions;&lt;/li&gt;
&lt;li&gt;Explaining what needs to be fixed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ad 1, an error like &amp;#8220;Can&amp;#8217;t start frobnicator because the discombulator is not running&amp;#8221; teaches the operator that a frobnicator needs a discombulator. This knowledge can of course also be gleaned from the documentation, but in this case, repetition is a good thing. &lt;/p&gt;

&lt;p&gt;Compare this error to &amp;#8220;Can&amp;#8217;t start process: Connection refused&amp;#8221;, for example, and think about how helpful that is.&lt;/p&gt;

&lt;p&gt;Ad 2, a report like &amp;#8220;Can&amp;#8217;t connect to product database using connection string &amp;#8216;dbuser=john, dbname=changeme&amp;#8217;: No such database&amp;#8221; immediately tells the operator what he needs to know.&lt;/p&gt;

&lt;h2&gt;Unavailable resources&lt;/h2&gt;

&lt;p&gt;These typically occur while a program is already running and installed, but are nonetheless important. Do not log &amp;#8216;Disk full&amp;#8217;, but report &amp;#8216;Disk full writing to &amp;#8230;.&amp;#8217; so the operator knows which disk filled up. If a server could not be reached, log the IP address and possibly the name of the server. Any discrepancy between the two may point out a DNS configuration error.&lt;/p&gt;

&lt;p&gt;Out of memory is typically very hard to deal with, except when something really odd was going on. A typical example is trying to allocate a ridiculous amount of memory because of an earlier error, in which case logging what memory was being allocated for might help debug the problem. It typically will not help the user of a program.&lt;/p&gt;

&lt;h2&gt;Should never happen&lt;/h2&gt;

&lt;p&gt;These are errors of the program itself. Programmers quite often test for impossible conditions, especially if they sense these might conceivably happen one day. An example might be a module that can only connect to IPv4 servers that gets confronted by an IPv6 socket, which in turn causes calls to determine the IPv4 remote address to fail. &lt;/p&gt;

&lt;p&gt;It is tempting to be quite rude in these messages, or say stuff like &amp;#8220;should never happen!!&amp;#8221;, but resist these urges. One day a &amp;#8216;should not happen&amp;#8217; error is going to be a vital debugging clue. These errors are rare, but it pays to go, well, the extra few yards to perhaps report &amp;#8220;unexpected address family accepting connection!&amp;#8221;.&lt;/p&gt;

&lt;h1&gt;Guidelines&lt;/h1&gt;

&lt;p&gt;An error message should contain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Who is reporting the error (which subsystem, which program, which module)&lt;/li&gt;
&lt;li&gt;The action that failed&lt;/li&gt;
&lt;li&gt;The subject of the action (a directory, a server, a port number, an IP address)&lt;/li&gt;
&lt;li&gt;As good an indication of the actual error as possible.&lt;/li&gt;
&lt;li&gt;Optional - what the program is doing about it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;An excellent error message is:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Webserver can't serve page, error opening file '/var/www/index.html': Permission denied, reporting HTTP 404 error&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Ad 1, it is tempting to include filenames, function names, and line numbers here. OpenSSL does this a lot, for example. However, almost none of your intended audience will be able to extract useful information from the fact that the error occurred on line 123 of &amp;#8216;multiplexer.c&amp;#8217;. Make sure the module means something to the operator. It may be as simple as the name of your program.&lt;/p&gt;

&lt;p&gt;Ad 2, this helps the operator determine if this error might be the explanation of observed problems. An error like &amp;#8220;Webserver failed to increase TCP buffer size, continuing with default&amp;#8221; can immediately be ruled out as an explanation for why people can&amp;#8217;t log in to their mail.&lt;/p&gt;

&lt;p&gt;Ad 3, an error like &amp;#8220;Can&amp;#8217;t open file&amp;#8221; on its own can mean many things. One of which might be that it is not reading the configuration file you think it is, and trying to open your index.html in &amp;#8216;/usr/local/www&amp;#8217;, and not in &amp;#8216;/var/www&amp;#8217; like you thought you configured. &lt;/p&gt;

&lt;p&gt;Ad 4, self explanatory. Take the trouble to convert error codes into strings. Many programmers may know that &amp;#8216;errno = 111&amp;#8217; means &amp;#8216;Connection refused&amp;#8217;, but don&amp;#8217;t count on your users knowing this.&lt;/p&gt;

&lt;p&gt;Ad 5, this is a fine counterpoint to item 4. &amp;#8220;Giving a 404&amp;#8221; is very clear for any operator of a web server. Not all errors need a followup, so reporting what the program is doing about the error is not mandatory.&lt;/p&gt;

&lt;h1&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;Good error messages can save your users many days of problems. And, surprisingly, you might yourself gain even more time - how well do you know the internals of your program after a few months? &lt;/p&gt;

&lt;p&gt;So please please, both as a user and a programmer, I ask you, spend time on error messages!&lt;/p&gt;</description>
      <pubDate>Fri, 08 Sep 2006 15:36:00 +0200</pubDate>
      <guid isPermaLink="false">urn:uuid:34c6bf3d-0bc1-4c52-84c7-65b39cdf8b6e</guid>
      <author>bert.hubert@netherlabs.nl (bert hubert)</author>
      <link>http://blog.netherlabs.nl/articles/2006/09/08/the-perfect-error-message</link>
      <category>Linux</category>
      <category>PowerDNS</category>
      <category>Netherlabs</category>
    </item>
  </channel>
</rss>
