Posted by bert hubert
Tue, 27 Jan 2009 21:56:00 GMT
Hi everybody!
What a day! Remco van Mook and I received a message today that our RFC Draft (full text here) has entered the ‘AUTH48’ stage. This means that it has been assigned a number (RFC 5452!), and that barring meteor strikes or similar things, we are now finally done. Yay!
We spent 2 years and 9 months on this. It felt like even more. I’ve been told the draft has already made a difference in some places - from now on, DNS implementations that have certain bad spoofing behaviour MUST clean up their act :-)
In short, had this RFC been followed, the whole Kaminsky DNS scare could have been prevented. Do note that the draft is 2 years older than Kaminsky’s discovery. The DNS community should have listened to Dan Bernstein *10* years ago.
Some more thoughts on this subject can be found here. I’m slightly bitter.
As if the RFC weren’t enough excitement for one day, I also released PowerDNS Authoritative Server 2.9.22, the first release of the authoritative server in almost 20 months. Because of this long delay, a lot of effort was spent field testing this release before it ‘went gold’ (to use an expression I really despise).
I sincerely hope we shook out most of the bugs. The PowerDNS community really delivered, and many of our enthusiastic users deployed pre-release code on their significant installations, to make sure everybody else would be able to upgrade with confidence.
Read the whole story here.
Posted in PowerDNS | 3 comments
Posted by bert hubert
Sun, 16 Nov 2008 21:21:00 GMT
1. Idea - estimates for time to completion range from 3 days to 3 weeks
2. Pretty convincing first stab ‘look how cool this would be’
3. The Hard Slog to get something that actually works. Estimates now range from 3 months to 3 years.
4. First real users pop up, discovery is made that all assumptions were off
5. Starts to look good to the first real user
6. Elation!
7. Someone actually uses the code for real, the bugs come out in droves
8. A zillion bugs get addressed, harsh words are spoken
9. Elation!
10. The guy you had previously told that 100 million users would not ‘in principle’ be a problem actually took your word for it, and deployed it on said user base. Harsh words are spoken.
11. Fundamentals are reviewed, large fractions of the code base reworked
12. Product now actually does what everybody hoped it would do.
13. Even very unlikely bugs have cropped up by now, and have been addressed. Even rare use cases are now taken into account.
14. If a user complains of a crash at this stage, you can voice doubts about the quality of his hardware or operating system.
PowerDNS went through all these stages, and took around 5 years to do so.
Not all parts are at ‘stage 14’ yet, but for the Recursor, I seriously ask people to run ‘memtest’ if they report a crash.
The above 14 points are never traversed without users that care. For PowerDNS, step ‘4’ was performed by Amaze Internet and step ‘7’ by ISP Services. 1&1 (called Schlund back then) was instrumental in step ‘10’ when they started using it on millions of domains.
For the PowerDNS Recursor, steps ‘4’ and ‘7’ not only happened over at XS4ALL, but they also paid for it all!
Step ‘10’ occurred over at AOL and Neuf Cegetel, who together connected the Recursor to 35 million subscribers or so.
Finally, the parts of PowerDNS that have reached the end of the list above have done so because of literally hundreds if not thousands of operators that have made the effort to report their issues, or voice their wishes.
Many thanks to everybody!
Hmm, the above does not sound very professional..
I’ve heard the theory that some people think they can plan software development more professionally. I used to believe them too. But any real project I’ve heard of went through the stages listed above. No schedule, no Microsoft Project sheet, no Gantt Chart I know about ever even came close to reality.
But I’d love to be wrong, because I agree fully that it would be great if software development was more predictable.
This is especially true since the aforementioned “process” necessarily involves several very committed users, who have to voice the harsh words, but do have to stick with the project.
So please comment away if your real life experiences are different - I’d love to hear!
Posted in PowerDNS, Netherlabs | no comments
Posted by bert hubert
Thu, 18 Sep 2008 19:44:00 GMT
After too much posting on IETF mailing lists, and not achieving anything, I’ve gone back to coding a bit more.
There are two things I want to share - the first because I had a devil of a time figuring out how to do something, and I hope that posting here will help fellow-sufferers find the solution via Google. The second thing I want to talk about because, and this is getting to be rare, I programmed something cool, and I just need to tell someone about it.
I pondered explaining it to my lovely son Maurits (4 months old today), but I don’t want to ruin his brain.
Debugging iterators
In most programming languages there are a lot of things that compile just fine, or generate no parsing errors at runtime, but are still accidents waiting to happen.
Tools abound to expose such silent errors, usually at a horrendous performance cost. But this is fine, as errors can be found by the developer, and fixed before release.
As part of our arsenal, we have the venerable Valgrind that detects things such as reading from memory that had not previously been written to. In addition, other tricks are available, such as changing functions that ‘mostly do X, and rarely Y’ so that they always do Y. This quickly finds programs that skipped dealing with Y (which might be a rare error condition, or realloc(3) returning a new memory address for your data).
Finally, many programming environments by default perform very little checking (in the interest of speed) - for example, they will gladly allow you to compare something that points to data in collection A to something that points to collection B - a comparison that never makes sense, classical apples and oranges.
My favorite C++ compiler, G++, comes with so called ‘debugging iterators’ that are very strict - anything that is not absolutely correct becomes an error, sometimes at compile time, sometimes at runtime.
Together with Valgrind, this is one of the techniques I like to whip out when the going gets tough.
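To make the apples-and-oranges case concrete, here is a minimal sketch. The function names are made up for illustration. The correct comparison is shown as code, and the mistake is shown only in a comment, since actually running it is undefined behaviour. Building with g++ -D_GLIBCXX_DEBUG is what switches the checking on.

```cpp
#include <algorithm>
#include <vector>

// True when this file was compiled with libstdc++ debug mode switched on,
// for example: g++ -D_GLIBCXX_DEBUG example.cc
constexpr bool debugIteratorsEnabled()
{
#ifdef _GLIBCXX_DEBUG
  return true;
#else
  return false;
#endif
}

// Correct: the iterator returned by std::find is compared against the end
// of the very same container that was searched.
bool contains(const std::vector<int>& haystack, int needle)
{
  auto it = std::find(haystack.begin(), haystack.end(), needle);
  return it != haystack.end();
}

// The mistake would look like this, with 'other' being a different vector:
//
//   auto it = std::find(haystack.begin(), haystack.end(), needle);
//   return it != other.end();   // iterators from two different containers
//
// A regular build accepts this silently. With debugging iterators the
// comparison is checked at runtime and the program is stopped on the spot.
```

The point of the sketch is only that both versions compile; it takes the strict iterators to tell them apart.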
Sadly, debugging iterators (which are turned on by adding -D_GLIBCXX_DEBUG) conflict with one of my favorite C++ libraries, Boost.
To make a long story short, to compile a version of Boost with debugging iterators, issue:
$ bjam define=_GLIBCXX_DEBUG
This single line of text may not look all that important, but it took me half a day of debugging to figure this out. So if you get this error:
dnsgram.o:
(.rodata._ZTVN5boost15program_options11typed_valueISscEE[vtable for
boost::program_options::typed_value, std::allocator >, char>]+0x18):
undefined reference to
`boost::program_options::value_semantic_codecvt_helper::parse(boost::any&,
std::__debug::vector,
std::allocator >, std::allocator, std::allocator > > > const&, bool) const'
Then compile your own version of boost as outlined above.
C++ Introspection & Statistics
C++ is an old-school language, perhaps the most modern language of the old school. This means that it sacrifices a lot of things to allow programs to run at stunning ‘near bare metal’ speeds. One of the things that C++ therefore does not offer is ‘introspection’.
What this means is that if you have a class called “ImportantClass”, that class does not know its own name at runtime. When a program is running, it is not possible to ask by name for an “ImportantClass” to be instantiated.
If you need this ability, you need to register your ImportantClass manually by its name “ImportantClass”, and store a pointer to a function that creates an ImportantClass for you when you need it.
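A minimal sketch of such manual registration could look like the following. Everything except the name ImportantClass is hypothetical (Base, registerClass, makeByName and the map-based registry are just one way to do it), and this is not taken from PowerDNS.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical common base class, so one registry can hold many types.
struct Base
{
  virtual ~Base() = default;
  virtual std::string name() const = 0;
};

struct ImportantClass : Base
{
  std::string name() const override { return "ImportantClass"; }
};

// A 'maker' is a function that creates an object for us on request.
using Maker = std::function<std::unique_ptr<Base>()>;

// Name -> maker table. A function-local static avoids initialisation
// order problems between source files.
std::map<std::string, Maker>& registry()
{
  static std::map<std::string, Maker> r;
  return r;
}

// Manual registration: the class does not know its own name, so we supply it.
void registerClass(const std::string& name, Maker maker)
{
  registry()[name] = std::move(maker);
}

// Instantiate by name at runtime; returns nullptr for unknown names.
std::unique_ptr<Base> makeByName(const std::string& name)
{
  auto it = registry().find(name);
  if (it == registry().end())
    return nullptr;
  return it->second();
}
```

After a single call such as registerClass("ImportantClass", ...) at startup, the rest of the program can ask for makeByName("ImportantClass") without knowing the type at compile time.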
Doing so manually is usually not a problem, except of course when it is. In PowerDNS, I allocate a heap (or a stack even) of runtime statistics. Each of those statistics is a variable (or a function) with a certain name.
In more modern languages, it would probably be easy to group all these variables together (with names like numQueries, numAnswers, numUDPQueries etc), and allow these statistics to be queried using their names. So, an external program might call ‘get stat numQueries’, and PowerDNS would look up the numQueries name, and return its value.
No such luck in C or C++!
So - can we figure out something smart, say, with a macro? Yes and no. The problem is that when we declare a variable in C which we want to be accessible from elsewhere in the program, it needs to happen either inside a struct or class, or at global scope. This in turn means that we can’t execute code there. So, what we’d like to do, but can’t, is:
struct Statistics {
uint64_t numPackets;
registerName(&numPackets, "numPackets");
uint64_t numAnswers;
registerName(&numAnswers, "numAnswers");
} stats;
stats.numPackets is indeed available, but the line after its definition will generate an error. This is sad, since the above could easily be generated from a macro, so we could do:
DEFINESTAT(numPackets, "Number of packets received");
Which would simultaneously define numPackets, as well as make it available as “numPackets”, and store a nice description of it somewhere.
But alas, this is all not possible because of the reasons outlined above.
So - how do we separate the data structure from the ‘registerName()’ calls, while retaining the cool ‘DEFINESTAT’ feature where everything is in one place?
In C++, files can be included using the #include statement. Most of the time, this is used to include so called ‘header’ files - but nothing is stopping us from using this feature for our own purposes.
The trick is to put all the ‘DEFINESTAT’ statements in an include file, and include it not once, but twice!
First we do:
#define DEFINESTAT(name, desc) uint64_t name;
struct Statistics {
#include "statistics.h"
} stats;
This defines the Statistics struct, containing all the variables we want to make available. These are nicely expanded using our DEFINESTAT definition.
Secondly, we #undef DEFINESTAT again, and redefine it as:
#define DEFINESTAT(name, desc) registerName(&stats.name, #name, desc);
Then we insert this in a function:
void registerAllNames()
{
#include "statistics.h"
}
This will cause the same statistics.h file to be loaded again, with the same DEFINESTAT lines in there, but this time DEFINESTAT expands to a call that registers each variable, its name (#name turns the macro argument into a string, so numPackets becomes “numPackets”), and its description.
The rest of our source can now call ‘stats.numPackets++’, and if someone wants to query the “numPackets” variable, it is available easily enough through its name since it has been registered using registerName.
The upshot of this all is that we have gained the ability to ‘introspect’ our Statistics structure, without any runtime overhead nor any further language support.
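For those who want to try the trick without creating a separate file, here is a single-file sketch. It swaps the double #include for a list macro (STATISTICS_LIST, a name made up here) that stands in for the contents of statistics.h; the expansion happens twice in exactly the same way. The map-based registry, the registerName signature and getStat are assumptions for illustration, not the PowerDNS code.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for statistics.h: one DEFINESTAT line per statistic.
#define STATISTICS_LIST \
  DEFINESTAT(numPackets, "Number of packets received") \
  DEFINESTAT(numAnswers, "Number of answers sent")

// First pass: every DEFINESTAT line becomes a member of the struct.
#define DEFINESTAT(name, desc) uint64_t name = 0;
struct Statistics
{
  STATISTICS_LIST
} stats;
#undef DEFINESTAT

// Hypothetical registry mapping a name to the variable and its description.
struct StatEntry
{
  uint64_t* value;
  std::string description;
};
std::map<std::string, StatEntry> g_registry;

void registerName(uint64_t* value, const std::string& name, const std::string& desc)
{
  g_registry[name] = StatEntry{value, desc};
}

// Second pass: the same lines now become registerName() calls.
// #name turns the macro argument into a string literal.
#define DEFINESTAT(name, desc) registerName(&stats.name, #name, desc);
void registerAllNames()
{
  STATISTICS_LIST
}
#undef DEFINESTAT

// Look up a statistic by name; throws std::out_of_range if it is unknown.
uint64_t getStat(const std::string& name)
{
  return *g_registry.at(name).value;
}
```

Call registerAllNames() once at startup. After that, stats.numPackets++ costs a plain increment, while getStat("numPackets") finds the same variable by name.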
As stated above, more modern languages make this process easier.. but not as fast!
I hope you enjoyed this arcane coolness as much as I did. But I doubt it :-)
Posted in Linux, PowerDNS, Netherlabs | no comments
Posted by bert hubert
Wed, 09 Jul 2008 19:31:00 GMT
Yesterday it was announced that there is an unspecified but major DNS vulnerability, and that Microsoft, Nominum and ISC had fixes available.
It is amusing to note that this has been hailed as a major feat of cooperation, with the vulnerable parties spun as being part of a secret industry cabal that has just saved the world from very bad things.
To say the least, I find this a funny way of presenting things! The vulnerability is still not public, but the secret cabal shared it with me. Perhaps it is fair to say I am part of the cabal - I nearly traveled to the secret meeting at the Microsoft campus, but the imminent birth of my son made me decide not to go.
The DNS vulnerability that was presented yesterday is indeed a very serious problem, and I am glad steps are now taken to fix the broken software that was vulnerable. Dan Kaminsky is to be praised for discovering the issue and coordinating the release.
However - the parties involved aren’t to be lauded for their current fix. Far from it. It has been known since 1999 that all nameserver implementations were vulnerable to issues like the one we are facing now. In 1999, Dan J. Bernstein released his nameserver (djbdns), which already contained the countermeasures being rushed into service now. Let me repeat this. Wise people already saw this one coming 9 years ago, and had a fix in place.
In 2006 when my own resolving nameserver entered the scene, I decided to use the same security strategy as implemented in djbdns (it is always better to steal a great idea than to think up a mediocre one!). Some time after that, I realised none of the other nameservers had chosen to do so, and I embarked on an effort to move the IETF DNS-EXT working group to standardise and thus mandate this high security behaviour.
This didn’t really go anywhere, but some months ago I noticed particularly strenuous resistance in the standardisation of the so called ‘forgery resilience draft’, and after some prodding it became clear it was felt my draft was in danger of drawing attention to the then unannounced DNS vulnerability, and that it were best if we’d all shut up about it for a few months, perhaps until July 2008 until all the vendors would have had time to get their act together.
And now we’ve seen the release, and it is being hailed as great news. But it isn’t. Dan Bernstein has been ignored since 1999 when he said something should be done. I’ve been ignored since 2006. The IETF standardisation languished for two years.
This is not a success story. It has in fact been a remarkable failure.
To end on a positive note - I am very glad Dan Kaminsky’s work caused some collective eye opening, and I hope good things come from this. DNS has long lacked critical attention, and in the end this might bring about sorely needed improvements.
DNS very recently celebrated its 25th birthday - I look forward to seeing the venerable Domain Name System succeed in the coming 25 years!
Posted in PowerDNS, Netherlabs | 8 comments
Posted by bert hubert
Sun, 11 Nov 2007 10:36:00 GMT
While running the risk of turning this blog into a lecture series, I can’t
resist. This post will dive into cryptography, and I hope to be able to
transfer the sense of wonder that caught me when I first read about Diffie-Hellman
key exchange many years ago.
Let’s assume you are in a room with two other people, and that you want to
share a secret with one of them, but not with the other. In the tradition of
cryptography, we’ll call these three people Alice (you), Bob (your friend)
and Eve (who wants to ‘Eavesdrop’ on your secrets).
Let’s also assume that the room is very quiet, so you can’t whisper, and
everybody can hear what everybody else is saying. Furthermore, you are far enough away that you can’t pass paper messages.
So how could you (Alice) share a secret with Bob? Anything you want to tell
Bob, will be overheard by Eve. You might try to think up a code, but you’ll
still have to explain the code, and both Bob and Eve will hear it.
It turns out that using the magic of public key cryptography, this is
possible - sharing a secret while people are listening in.
The room with Alice, Bob and Eve is not a very relevant example, but replace
Alice by ‘The allied forces’, ‘Bob’ by a resistance fighter equipped with a
radio, and ‘Eve’ by the occupying enemy, and things start to make sense.
Or, in today’s terms, replace Bob by Amazon.com, and Eve by a hacker
interested in getting your credit card number.
So how does it work?
To send a secret, two things are needed: an ‘algorithm’ and a ‘key’. A famous
algorithm is the ‘Caesar cypher’, which consists of shifting all letters by
a fixed amount. So an A might become a B, a B would become a C etc etc.
The key in this case is how much you want to shift the letters, in the
sample above the key is ‘1’. If the key had been ‘2’, an A would’ve become a
C, a B would’ve become a D etc.
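For the programmers among us, the Caesar cypher fits in a few lines. This is a small sketch in C++; the function name caesarShift is made up, and only the letters A-Z and a-z are shifted, everything else is left alone.

```cpp
#include <string>

// Shift every letter by 'key' positions, wrapping around after Z.
// A negative key shifts back, so it undoes a positive key of the same size.
std::string caesarShift(const std::string& text, int key)
{
  std::string out = text;
  const int k = ((key % 26) + 26) % 26;  // normalise the key to 0..25
  for (char& c : out) {
    if (c >= 'A' && c <= 'Z')
      c = static_cast<char>('A' + (c - 'A' + k) % 26);
    else if (c >= 'a' && c <= 'z')
      c = static_cast<char>('a' + (c - 'a' + k) % 26);
  }
  return out;
}
```

With key 1, caesarShift("AB", 1) gives "BC", and with key 2 it gives "CD", matching the examples above.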
Typically, you can discuss the algorithm in public, but you need to keep the
key secret. In terms of Alice and Bob, they will be able to communicate in
secret once they’ve been able to establish a key that Eve does not know
about.
Once everybody has agreed to use the Caesar cypher, the problem shifts to
exchanging how many letters we will shift. We can’t just say this out loud,
since both Bob and Eve will hear it.
Diffie-Hellman
Way back in 1976, Whitfield Diffie and Martin Hellman published the details
of what has become known as the Diffie-Hellman key exchange algorithm,
although they both credit Ralph Merkle with some of the key ideas.
The process basically works as follows. Alice and Bob each think of a random
number, that they keep a secret. Then they both do some calculations based
on this number, and say the result of those calculations out loud.
Then both Alice and Bob combine the results of the calculations with their own
secret random number, and out pops a shared random secret number. This
shared random secret number is now known by Alice and Bob, but not by Eve.
And it is this secret that now becomes the key.
How is this possible?
Eve heard both Alice and Bob say a random number, exactly the same numbers
that Alice and Bob heard. Yet only Alice and Bob now know the shared secret.
How is this possible?
The trick lies in the calculation, by which means Alice and Bob only shared
parts of the numbers they chose initially. Then both Alice and Bob combined
those parts with their full random numbers.
It is this trick of revealing only parts of random numbers, and then
combining the part of the other party with your full number, that leads to a
shared secret.
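In the textbook form of the exchange, the 'calculation' is raising a public number to the power of your secret, modulo a public prime. The sketch below uses toy values that are not from this post (prime 23, base 5, secrets 6 and 15), and the function names are made up. Real exchanges use numbers hundreds of digits long.

```cpp
#include <cstdint>

// Compute (base ^ exp) mod 'mod' by repeated squaring.
// Only safe for mod below 2^32, so the multiplications cannot overflow.
uint64_t powMod(uint64_t base, uint64_t exp, uint64_t mod)
{
  uint64_t result = 1 % mod;
  base %= mod;
  while (exp > 0) {
    if (exp & 1)
      result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1;
  }
  return result;
}

// What Alice or Bob say out loud: derived from the secret, but not the secret.
uint64_t publicValue(uint64_t g, uint64_t p, uint64_t mySecret)
{
  return powMod(g, mySecret, p);
}

// Combine what the other party said with your own full secret.
uint64_t sharedSecret(uint64_t theirPublic, uint64_t mySecret, uint64_t p)
{
  return powMod(theirPublic, mySecret, p);
}
```

With p = 23 and g = 5, Alice (secret 6) says 8 out loud and Bob (secret 15) says 19. Both then arrive at the shared number 2, while Eve has only heard 23, 5, 8 and 19.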
Show me
On this page I wrote a very simple Diffie-Hellman example program that runs entirely within your web browser. You can either use it alone, or with a friend - which is the most fun. It works over the phone, or over an instant messenger (IRC, MSN etc). Follow the instructions, encode a message, paste it to your friend, and if your friend followed the instructions, and he pastes the encoded message into the decoder, he should see your secret message!
This is even more fun in a chat room with actual Eves present.
Please be aware that the sample is a joke - don’t use it to share real secrets! However, the technology it employs is real, and this truly is how people exchange keys - only the numbers are far larger (300 digits), and the actual encryption is not a Caesar cypher.
So how does it really work?
More information can be found on the wikipedia page about Diffie-Hellman, especially in the ‘external links’ section.
Posted in PowerDNS | 1 comment
Posted by bert hubert
Wed, 21 Feb 2007 22:13:00 GMT
Enjoyed a fun and stimulating “DNS & Crypto Power Lunch” with Dan Bernstein (left) and Tanja Lange (not in picture). As was to be expected, the intersection of cryptography and (secure) DNS was discussed, and some evil plans might ensue! If implemented in djbdns and PowerDNS, we might actually achieve something..
Posted in PowerDNS, Netherlabs, Life | no comments
Posted by bert hubert
Thu, 08 Feb 2007 21:39:00 GMT
Just a quick note that I’ll be presenting at The future of VoIP 2 event as organised by the Internet Society of The Netherlands, part of the (global) “Internet Society”.
The event takes place on the 15th of March, in The Hague. For more details, see the links above.
As always, I love to meet PowerDNS users, or in fact, anybody interested in doing interesting things with DNS. So should you be there, it would be good to talk.
Posted in PowerDNS | no comments
Posted by bert hubert
Fri, 12 Jan 2007 21:16:00 GMT
The workings of the Internet are described, or even prescribed, by the so called ‘Requests For Comments’, or RFCs. These are the laws of the internet.
Today the IETF DNS Extensions working group accepted an “Internet-Draft” Remco van Mook and I have been working on. And the cool bit is that over time, many such accepted “Internet-Drafts” turn into RFCs!
Read about what our draft does here and here.
The actual Internet-Draft can be found over at the IETF, or over here as pretty HTML.
In short, this RFC documents and standardises some of the stuff DJBDNS and PowerDNS have been doing to make the DNS a safer place.
Besides the fact that it is important to update the DNS standards to reflect this practice, it is also rather a cool thought to actually be writing an RFC, especially one that has the magic stanzas “Standards Track” and “Updates 1035” in it.
So we are well pleased! Over the coming months we’ll have to tune the draft so it conforms with the consensus of the DNSEXT working group, and hopefully somewhere around March, it will head towards the IESG, after which an actual RFC should be issued.
Exciting!
Posted in Linux, PowerDNS, Netherlabs, Life | 10 comments
Posted by bert hubert
Mon, 01 Jan 2007 15:58:00 GMT
I wish everybody a very good 2007! For PowerDNS, 2006 certainly has been a very good year.
In some (large) places, the Recursor now commands a 40% market share, while the authoritative server is also expanding its user base around the world, with multi-million domain deployments now no longer as newsworthy as they once were.
The Chaos Computer Club held its annual congress last week, and they chose the PowerDNS Recursor to provide the DNS service to go with their 10 gigabit connection. I’m pleased to report that the PowerDNS process was fired up only once, and that it held steady for the entire congress, with no complaints. This would usually not be that strange, but the CCC clientèle are among the most critical internet users to be found on the planet.
Many thanks to Stefan Schmidt and other CCC admins for their vote of confidence!
Rails
I’m working on understanding ‘Ruby on Rails’, which will probably end up as a HOWTO aimed at seasoned programmers. The internet abounds with “you won’t believe how easy Ruby on Rails is” demonstrations, but the hard truth is that below the surface, a lot of magic is happening. The kind of magic the discerning programmer wants to grasp so as to make the most of it.
A very small start to this HOWTO can be found here.
It may also allow experienced programmers to teach themselves Ruby in less time than it would take them to read a 750 page book.
Posted in Linux, PowerDNS, Netherlabs, Life | 7 comments
Posted by bert hubert
Thu, 14 Dec 2006 21:21:00 GMT
After PowerDNS 3.1.4 turned out to be boringly stable, fixing all reported crashes, I decided it was time to do the massive speedup I’d been promising people for some time.
With some help from my friends over at #offtopic2, I was able to use the TSC register of my CPU to measure down to the nanosecond how much time things were taking within PowerDNS. Previously I’d concentrated on profiling macro performance, but nanosecond resolution allows one to study fully how much time is spent within each function.
Using this technique, it became apparent we take a whopping 60 microseconds to answer even the most basic of questions. We make up for this by being pretty fast at complicated questions. But 60 microseconds mean we are limited to about 15000 questions/second, max.
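The arithmetic behind that limit, as a tiny sketch. It assumes a single thread that handles one question at a time; the function name is made up.

```cpp
// Upper bound on questions per second for one thread that spends
// 'microsecondsPerQuery' on each question and does nothing else.
constexpr double maxQueriesPerSecond(double microsecondsPerQuery)
{
  return 1000000.0 / microsecondsPerQuery;
}
```

At 60 microseconds this gives roughly 16,667 questions per second as the theoretical ceiling, which is where the 'about 15000' comes from. At 4 microseconds the same formula gives 250,000.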
First I started shaving microseconds. It turns out snprintf is truly slow, taking up to 5 microseconds for some strings. Additionally, we wasted a lot of time on needlessly copying std::strings.
The unsurpassed Boost::Multi_Index container has a spectacular feature, called ‘compatible keys’, which means we can lookup answers using a question key that is a bare piece of memory instead of a proper std::string. This again saved a few microseconds.
Put together, this brought down the 60 usec to perhaps 40, which is nice, but not stunning.
But the big savings only came when I did the only thing that actually makes code fast: do less.
So - when encoding the answer to a question, we no longer do the whole “DNS label compression”-routine, as we know the “label” of the answer to a question can always be encoded as the fixed bytes 0xc00c - we don’t need to calculate it.
Going beyond that, when generating a simple answer, don’t generate an answer packet, but simply tack on the answer to the original question, and update the ‘answer count’.
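A rough sketch of these two shortcuts, not the actual PowerDNS code. The value 0xc00c is a DNS compression pointer: the top two bits mark it as a pointer, and the remaining 14 bits hold offset 12, which is where the question name starts, right after the 12-byte header. The function names are made up, and the sketch assumes a query with exactly one question and no further records behind it, answered with a single A record.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// A compression pointer is two bytes: top two bits set, then a 14-bit
// offset from the start of the message. Offset 12 yields 0xC00C.
constexpr uint16_t compressionPointer(uint16_t offset)
{
  return static_cast<uint16_t>(0xC000 | (offset & 0x3FFF));
}

// Append a 16-bit value in network byte order.
static void put16(std::vector<uint8_t>& p, uint16_t v)
{
  p.push_back(static_cast<uint8_t>(v >> 8));
  p.push_back(static_cast<uint8_t>(v & 0xFF));
}

// Append a 32-bit value in network byte order.
static void put32(std::vector<uint8_t>& p, uint32_t v)
{
  put16(p, static_cast<uint16_t>(v >> 16));
  put16(p, static_cast<uint16_t>(v & 0xFFFF));
}

// Turn a query into a response by tacking one A record onto it.
// Assumption: 'packet' holds a header plus one question and nothing else.
// Returns an empty vector if the packet is too short to have a header.
std::vector<uint8_t> answerInPlace(std::vector<uint8_t> packet,
                                   const std::array<uint8_t, 4>& ipv4,
                                   uint32_t ttl)
{
  const std::size_t headerSize = 12;
  if (packet.size() < headerSize)
    return std::vector<uint8_t>();

  packet[2] |= 0x80;                       // set the QR bit: this is a response
  packet[6] = 0;                           // answer count (ANCOUNT) = 1
  packet[7] = 1;

  put16(packet, compressionPointer(12));   // name: 'same as the question'
  put16(packet, 1);                        // TYPE A
  put16(packet, 1);                        // CLASS IN
  put32(packet, ttl);                      // time to live
  put16(packet, 4);                        // RDLENGTH: 4 bytes of address
  packet.insert(packet.end(), ipv4.begin(), ipv4.end());
  return packet;
}
```

No name is parsed or compressed and no new packet is built; the response is the query plus 16 bytes.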
Also, if we see we have an ‘instant answer’ available for a question, don’t bother to launch a whole ‘MThread’ to generate it, but return synchronously.
The upshot of all this is that we can now answer most questions in… 4 microseconds, down from 60. 15-fold speedups are rather rare usually.
We didn’t speed up everything that much though, only the majority of queries. However, even the uncached queries will benefit from the microsecond shaving performed earlier, and run around twice as fast.
I can’t wait to do a live benchmark on all this. I’m estimating we should now be able to do over 50000 “real” queries/second on a 3GHz P4, which would put us an order of magnitude above the open source competition, and even beat, by a large factor, the numbers I hear quoted for commercial alternatives. These are hard to compare as their numbers are under NDA.
It might not even be easy to generate that much testing data..
Will keep you posted!
Posted in Linux, PowerDNS, Netherlabs | 1 comment