Sun T2000, PowerDNS, 'Completion Ports'

Posted by bert hubert Wed, 19 Apr 2006 15:40:00 GMT

Powered up the Sun CoolThreads T2000 again today to work on first-class Solaris support for PowerDNS. As mentioned previously, because of the new anti-spoofing measures, PowerDNS needs to listen to hundreds or sometimes thousands of sockets. The traditional UNIX approach was to tell the kernel which sockets hold your interest, call select(2), and look at the sockets it tells you are active. And then you have to do the whole thing again, i.e., report all sockets to the kernel again.

All modern UNIXes come with a better solution: tell the kernel which sockets hold your interest, ask the kernel which are active, do work, ask the kernel which are active. In other words, there is no need to set up everything again for each packet.

I implemented epoll and kqueue for Linux and FreeBSD yesterday; today I did Solaris completion ports. Some things to note:

  1. A ‘Completion Port’ does not survive fork(2). So create the port after forking.
  2. The port_getn(3C) function takes two parameters to specify how many events you want to receive, a minimum and a maximum. This is different from what kqueue and epoll do. The manpage does not make this overly clear. (UPDATE: ok, it does, I can’t read)
  3. Contrary to kqueue and epoll, once you’ve received an event from the port, you need to add back the socket if you are still interested. I think this is a slight optimisation for PowerDNS as the common case is indeed to remove a socket once an event has hit.
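As a rough illustration of the "register once, then keep asking" model, here is a minimal sketch in Python rather than PowerDNS's C++. It uses the standard `selectors` module, which picks epoll, kqueue or plain select depending on the platform. The helper names (`make_udp_socket`, `poll_once`, `demo`) and the `rearm` flag are made up for this example; `rearm=True` only imitates the re-register-after-each-event behaviour from note 3, it does not use Solaris ports.

```python
import selectors
import socket


def make_udp_socket():
    """Create a non-blocking UDP socket bound to an ephemeral loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.setblocking(False)
    return sock


def poll_once(selector, timeout=1.0, rearm=False):
    """Ask the kernel which registered sockets are readable and read from them.

    With rearm=True the socket is unregistered once its event has fired and
    registered again, imitating a one-shot interface (an assumption made for
    illustration only).
    """
    received = []
    for key, _events in selector.select(timeout):
        sock = key.fileobj
        data, _peer = sock.recvfrom(512)
        received.append((key.data, data))
        if rearm:
            selector.unregister(sock)
            selector.register(sock, selectors.EVENT_READ, key.data)
    return received


def demo(rearm=False):
    selector = selectors.DefaultSelector()  # epoll, kqueue, ... or select
    socks = [make_udp_socket() for _ in range(100)]
    for index, sock in enumerate(socks):
        # Registration happens once, not once per packet.
        selector.register(sock, selectors.EVENT_READ, index)

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    results = []
    for target in (7, 42):
        sender.sendto(b"ping", socks[target].getsockname())
        results.extend(poll_once(selector, rearm=rearm))

    sender.close()
    for sock in socks:
        selector.unregister(sock)
        sock.close()
    selector.close()
    return results
```

Calling `demo()` registers 100 sockets a single time and then polls twice; only the socket that actually received a datagram is reported on each poll.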

To really get all those cores and strands working, I split PowerDNS 24 ways today and tried to benchmark it. This appears to have worked as it fully saturated my fine xs4all ADSL connection :-) If I want to fully test this server I may need to get it hosted somewhere with real bandwidth.

I released the third 3.0 pre-release of PowerDNS today (including .deb and .rpm!). If you have a chance, please test it.

One large deployment has already moved to this release and I’m very pleased to note that due to the multiplexer work, the additional anti-spoofing sockets have no measurable impact on the CPU load.

PowerDNS Recursor 3.0 Feature Complete!

Posted by bert hubert Tue, 18 Apr 2006 22:26:00 GMT

Ahhhh, I’d thought I’d never make this. As mentioned previously, I’d been trying to convince myself and others for a while now that the PowerDNS recursor was nearly “done”. And I think it is “done” now.

Besides the stuff mentioned below, I added ‘server-id’ today, which allows you to query which exact server in a cluster you are talking to. Additionally, version.bind now also works on the recursor. Of course you can override it!

Then there is rec_control top-remotes which gives a list of the top-20 IP addresses which query you most.

Big things

Went to the in-laws yesterday, which involves a many-hour car drive, so I took the excellent Unix Network Programming (volume 1) with me. My brother-in-law drove most of the way.

I picked up two things:

  1. Connected connectionless sockets
  2. IPv6 address conversion techniques

Connected connectionless sockets

UDP, the protocol DNS uses most of the time, is connectionless, which means that questions are just sent, there is no connection setup or teardown. This has a lot of advantages, but there is also a downside. If we send a packet to a host that is unreachable, or has no nameserver running, we often get an ICMP packet that tells us so. However, as the socket is not connected, the kernel is not in a position to tell userspace about the error: which socket would it be for?

I knew one could connect(2) a UDP socket, but I thought it was just convenience, it saves some typing because there is less need to pass addresses around.

However, UNP informed me that once connected, these ICMP errors do get passed back to userspace. And since I raised the PowerDNS level of security, we have one socket per question anyhow, so we might as well connect.

And lo, when I hooked this up, 5% of outgoing queries now get an error instead of a timeout. This in turn means that we save a 2 second timeout in 5% of outgoing queries, which should translate into a measurable speedup in perceived DNS performance!
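The effect is easy to try on a loopback interface. The following is a small Python sketch, not PowerDNS code: `unused_udp_port` and `query` are hypothetical helpers. On a connected UDP socket, many systems report an ICMP "port unreachable" as a connection-refused error on the next socket call, so the caller can stop waiting early. Whether the error arrives depends on the operating system and on any firewall in between.

```python
import socket


def unused_udp_port():
    """Find a loopback UDP port that nothing is listening on right now."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    port = probe.getsockname()[1]
    probe.close()
    return port


def query(address, payload=b"question", timeout=2.0):
    """Send one datagram on a connected UDP socket and classify the outcome."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.connect(address)  # sends nothing; it only fixes the peer address
        sock.send(payload)
        sock.recv(512)
        return "answer"
    except ConnectionRefusedError:
        return "refused"       # the ICMP error was passed back to userspace
    except socket.timeout:
        return "timeout"       # no answer and no error within the timeout
    finally:
        sock.close()
```

Something like `query(("127.0.0.1", unused_udp_port()))` will typically come back as "refused" almost immediately instead of waiting out the full timeout.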

I then looked it up, djbdns already does this of course. For me there are enough reasons not to run djbdns/dnscache, but there is a lot of good stuff in there!

IPv6 address conversion

The BSD socket interface is protocol family agnostic, meaning you can use the same interface for IPv4, IPv6, DECNET and whatnot. In practice, this is not as easy as it sounds.

Previously, the PowerDNS recursor had some crude hacks to integrate IPv6, today I sat down to do it well, resulting in the ComboAddress, which is a union of a sockaddr_in and a sockaddr_in6, and has some helper functions to convert it from ‘network’ to ‘presentation’ format. With these functions, it was a breeze to convert the recursor to be fully IPv6 native.

One of the harder things to do was the IPv6 netmask matching code for the allow-from setting, but even that worked rather nicely.
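To give an idea of what netmask matching on packed addresses involves, here is a minimal Python sketch. It is not the ComboAddress code; the function names are invented for this example. It converts between presentation and network format with `inet_pton`/`inet_ntop` and then compares the leading bits, which works the same way for 4-byte IPv4 and 16-byte IPv6 addresses.

```python
import socket


def to_network(text):
    """Presentation -> network format; returns (family, packed bytes)."""
    family = socket.AF_INET6 if ":" in text else socket.AF_INET
    return family, socket.inet_pton(family, text)


def to_presentation(family, packed):
    """Network -> presentation format."""
    return socket.inet_ntop(family, packed)


def matches(address, netmask):
    """True if address falls inside netmask, written as 'prefix/bits'."""
    prefix_text, bits_text = netmask.split("/")
    bits = int(bits_text)
    addr_family, addr = to_network(address)
    mask_family, prefix = to_network(prefix_text)
    if addr_family != mask_family:
        return False
    whole, rest = divmod(bits, 8)
    if addr[:whole] != prefix[:whole]:   # compare the full bytes first
        return False
    if rest == 0:
        return True
    mask = (0xFF << (8 - rest)) & 0xFF   # then the leftover high bits
    return (addr[whole] & mask) == (prefix[whole] & mask)
```

For example, `matches("2001:db8::1", "2001:db8::/32")` is true, while an IPv4 address never matches an IPv6 netmask.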

There is a nice quote, often attributed to Antoine de Saint-Exupery:

“Perfection is achieved not when you have nothing more to add, but when you have nothing left to take away.”

I hope the ‘taking away’ part can start in PowerDNS from now on. I already removed the unsafe --single-socket bypass. Although this did have slightly higher performance, it brings down security.

Storks, epoll, kqueue

Posted by bert hubert Mon, 17 Apr 2006 08:23:00 GMT

We visited my father on holiday last Saturday, hoping to get his wireless internet up and running. Sadly it turns out there are around 3 different antenna connectors and I think around 3 on the computer side of things. The antenna is very spiffy though. Will investigate this week how to best provide wireless - the non plus ultra solution would be a repeater, which would save heaps of cables.

While there, I was tasked with getting some eggs from the chicken coop. Being a city person, I wondered if the chickens would be ok with that and it took some convincing to get me into the cage :-) It suddenly hit me that eggs do not come out of a box but out of the back of a chicken!!! Gross! Oh well.

Then we saw a stork, which is rare enough in The Netherlands. And then another one, until there were four. Impressive birds.

PowerDNS

As mentioned previously, I implemented a lot of anti-spoofing technology in PowerDNS recently. This did come at some operating system cost - listening on so many sockets at the same time exposes shortcomings in the traditional unix select function. Luckily, modern Unixes like FreeBSD, Linux and Solaris all come with replacements, called kqueue, epoll and ports respectively. I wrote a simple multiplexer and implemented support for select, kqueue and epoll, and it all appears to work. I’ll do Solaris once I have a chance.

This also moved all different packet handlers to separate functions, everything previously was within one big loop in main().

With the major exception being the ability to serve from IPv6 (AAAA etcetera works of course), the PowerDNS recursor is now feature complete for 3.0, so I released version 3.0-pre2 (freshmeat).

Holy cow! 1.3 million additional IP addresses served by PowerDNS

Posted by bert hubert Fri, 14 Apr 2006 20:48:00 GMT

Went for dinner yesterday with a friend of mine at one of the Indian restaurants here in Delft. This place is marvelous. It has had a non-working phone number listed on its window for the past 8 years. The music is probably not agreeable even for people from India. The beer tends to taste a bit funny. The entrance is dark, and looks like it has been burgled repeatedly. The staff is clumsy. But the food! Oh my.

Ate too much and went home a bit sleepy.

Surprise email

One of the interesting bits about authoring an open source program is that you know both a lot and nearly nothing about your customers. Sometimes PowerDNS users share everything with me and other developers. I’ve been mailed more root passwords than I care to remember (I have a fully functioning PGP key btw, please use it if you trust me with passwords!).

On the other hand, there are a lot of ‘stealth users’ who don’t come out of the closet. I tend to hear from them only if they hit a problem - which is rare.

So imagine my surprise yesterday when one of the larger access providers in Europe, with a double digit market share in their large country, suddenly announced they’d switched all their nameservers to PowerDNS. 1.3 million additional homes served by my humble code.

I can tell you, that rattles me. Especially since DNS is absolutely 100% vital to using the internet.

So, that inspired me to take the last step in attempting to make PowerDNS the best recursor on the planet.

Spoofing

If you can fool DNS, you can fool a user. DNS is the phone book of the internet, if you manage to give out false data, browsers will head to the wrong servers. Same goes for email. All very bad.

The worse news is that DNS is a breeze to “spoof”, in other words, it is easy to slip in bad data. I set up a somewhat contrived network here today and I was able to spoof both BIND 9 and PowerDNS in less than two seconds. I must admit that the conditions I tested under were highly ideal, but nothing that can’t be achieved in the real world with concerted effort.

And given the huge number of people I now feel responsible for, this is unacceptable.

One of the brightest people I know, Dan J. Bernstein, also writes nameservers. He can be very stubborn and opinionated, but some of his ideas are first rate. You have him to thank (in part) for today’s more liberal cryptography research climate as well. So, I took a lot of inspiration from his work. Read more below.

To spoof a nameserver, one needs to know three things:

  1. Which questions the target nameserver (‘spoofee’) is asking
  2. The exact network end-point it is expecting answers on
  3. The 16-bit ID of the question

You can generally figure out 1) pretty easily, especially if you can force a nameserver to make queries. 2) is easy if the network end-point doesn’t change. 3) can be dealt with by scanning all 65536 ids.

I reduced all three factors today:

  1. I made the PowerDNS recursor default to not accepting questions from the internet at large. This reduces the chances of a spoofer to force questions.

  2. I copied Dan J. Bernstein’s system of using a new random network end-point for each question, which means you’ll have to try to guess this end-point too, just like you have to guess the ID. This does put a heavy load on the OS as we now have to listen to perhaps thousands of ports! So I made this optional, but on by default.

  3. If the recursor sees more than 20 failed guesses for the ID, it considers the whole query timed out. I spent a heap of time thinking how to do this elegantly, I had to lie down at one point and close my eyes briefly. This may look like a sinful mid-day nap but don’t let appearances fool you! The solution is to only do the accounting once a packet with a proper ID is in, and deal with it then, and not keep a list of failed guesses.
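The arithmetic behind measures 2 and 3 can be sketched in a few lines of Python. The 16-bit ID space and the cap of 20 failed guesses come from the text above; the number of usable source ports is an assumption and is left as a parameter.

```python
ID_SPACE = 2 ** 16  # the 16-bit DNS query ID


def search_space(port_count=1):
    """Number of (port, ID) combinations a blind spoofer has to cover."""
    return ID_SPACE * port_count


def hit_probability(guesses, port_count=1):
    """Chance that `guesses` distinct blind guesses include the right one."""
    return min(1.0, guesses / search_space(port_count))
```

With a fixed port, 20 guesses give `hit_probability(20)`, which is 20/65536 or about 0.03% per query. If the end-point is additionally drawn from, say, 1000 ports (a made-up figure), the same 20 guesses are a thousand times less likely to succeed.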

This was literally the last major piece of PowerDNS that was not ‘best of breed’. Now all I need to do is clean up the code a tad and integrate full IPv6 support, and it should be Perfect.

Wonder what I’ll do then though :-)

Mirjam

I haven’t mentioned my good friend and wife Mirjam enough, and she’s complained a bit. So for the record, Mirjam has been doing a fine job, or at least making a valiant attempt, at making me leave the computer every once in a while. And I think she still believes me a bit when I say PowerDNS is ‘nearly done’. Now I have to believe it too.

The general mediocrity of the world

Posted by bert hubert Thu, 13 Apr 2006 14:39:00 GMT

I like statistics, and I disagree with the famous line “lies, damn lies and statistics”. Without statistics, all we have are data points, and no facts.

One of the better books if you really want to get into this kind of thing is Statistics for Technology.

Ok, why do I mention this? Have you ever noticed how mediocre many things are? Or even, many people? And specifically, programmers and their code! A lot of it only just passes the ‘not obviously broken’ mark.

At first I found this very upsetting, as if people simply don’t want to do a better job. And although this is partially true, we shouldn’t get mad because it is just.. statistics. And why? Because the most common skill level is bound to be somewhere around the median, which is to say mediocre. By definition in fact. Hence all the crappy code and products out there.

(I know there are distributions where this wouldn’t be the case, see the fine book mentioned above).

My grandfather used to say, never get angry at things. And as an addendum, this holds to an even greater extent for numbers. So, let go of your anger. I know I should.

Here goes.

Go read the RFC already

The ‘fine’ company E-Tech just cost me three days of my life in worry. Like I don’t worry enough already. As you may know, the PowerDNS recursor powers the internet access of hundreds of thousands of people these days. And while we’ve tested PowerDNS by sending it billions of recorded packets, and verifying the output against Other Nameservers, some problems cropped up.

Slowly a pattern emerged - some customers who relied on their (ADSL) routers of varying brands to relay their DNS traffic lost the ability to resolve domains the instant we switched them over to PowerDNS, cutting them off from the internet.

Sometimes rebooting the router helped, but often not, or not for long.

So we tried to get test data.. and there was none. Much searching ensued, tcpdump running on heaps of servers. And there was no data. When customers reported the problem, there were no packets to inspect.

But one thing became clear - PowerDNS did not compress the first part of the DNS answer. Compression is an optional part of DNS that allows more data to fit in a packet, and PowerDNS does its best to do that.

However, a programming oversight meant that this first part was not compressed - wasting a few bytes.

And lo, several DNS proxies in crappy routers had decided that compression was not just optional but mandatory. And would crash upon receiving a non-compressed first answer. And not relay any DNS packets anymore.

Which explained why we didn’t find any packets to analyse.

So somewhere in the dreaded E-Tech router, there is code that probably doesn’t even understand compression, but assumes that the magic bytes ‘c0 0c’ are a sort of “start of answer” marker. Which is the impression you might get if you’ve only looked at a few packets and decided to parrot your way out.
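For the record, ‘c0 0c’ is nothing magic. In the DNS wire format a byte with the two top bits set starts a compression pointer, and the remaining 14 bits are an offset into the packet. Offset 12 is simply the first byte after the 12-byte header, which is where the question name lives. A minimal Python sketch (the function name is invented):

```python
def decode_pointer(two_bytes):
    """Return the offset a DNS compression pointer refers to, or None."""
    first, second = two_bytes[0], two_bytes[1]
    if (first & 0xC0) != 0xC0:
        return None  # an ordinary label length byte, not a pointer
    return ((first & 0x3F) << 8) | second
```

So `decode_pointer(b"\xc0\x0c")` yields 12: "the name is the same as the one in the question". An uncompressed answer just spells the name out again, starting with a plain length byte.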

Oh well.

Thanks to Paul Duivestein for helping debug this problem, which also affects older firmware revisions of Thomson 510 modems.

Other news

Ok, it is not all debugging. I added a small hack to make the PowerDNS recursor benefit from an additional processor, and boy, does it work well. I implemented no locking, no concurrency, nothing whatsoever. What basically happens is that you run two wholly separate recursors that share one socket. This does double your memory needs, but also completely doubles performance. I’ve tried to max out a dual 3GHz Xeon, but I need to improve my replaying code. Some extrapolation appears to show you could run a sustained 15,000 to 17,000 DNS questions/second on this machine.
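The "separate processes, one socket" idea can be sketched on any Unix with fork. The following Python is a toy, not the recursor: `serve` and `run` are hypothetical names, and each worker just answers with its own pid. The point is that the bound socket is inherited across fork, so the kernel hands each incoming datagram to whichever worker happens to be waiting, without any locking in userspace.

```python
import os
import signal
import socket


def serve(sock):
    """Worker loop: answer every datagram with this worker's pid."""
    while True:
        _data, peer = sock.recvfrom(512)
        sock.sendto(str(os.getpid()).encode(), peer)


def run(workers=2, queries=20):
    """Fork `workers` processes onto one socket; return (pids, repliers)."""
    shared = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    shared.bind(("127.0.0.1", 0))
    address = shared.getsockname()

    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:  # child: inherits the already bound socket
            try:
                serve(shared)
            finally:
                os._exit(0)
        pids.append(pid)
    shared.close()  # the parent itself does not serve

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    repliers = []
    try:
        for _ in range(queries):
            client.sendto(b"question", address)
            reply, _peer = client.recvfrom(512)
            repliers.append(int(reply.decode()))
    finally:
        client.close()
        for pid in pids:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
    return pids, repliers
```

Every query gets exactly one answer, from one of the workers. Nothing is shared between them except the socket, which is also why memory use doubles.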

Oh, and about my own mediocrity :-) The PowerDNS recursor ran into a lot of broken answers trawling the net for data, which generated heaps of debugging output. Seeking to silence this flood I investigated the broken answers coming from the authoritative servers on the internet. And 95% of them came from.. other PowerDNS deployments. It turns out that the authoritative PowerDNS server simply truncates answers at 512 bytes, and sets the ‘truncated’ bit. This is within the RFC as the RFC doesn’t specify how to truncate.

But the recursor code balked at the partial record at the end of this truncated packet. So I fixed that. I previously harboured angry thoughts at the servers emitting packets that ugly. Oh well.

And it doesn’t stop there. At one point moving over the recursor to the wonderful MOADNSParser architecture, I needed some code to parse incoming answers to figure out their DNS ID, label and status, so I could wake up the proper MThread, which would then fully parse the packet. And what did I use to extract these small bits of information? The full blown MOADNSParser of course. This in turn means that each answer packet gets parsed twice.

I now no longer use the full blown parser to extract this information and this alone reduced CPU load by 20%. See, I know all about mediocrity.
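Pulling just the ID, status and question name out of an answer needs very little code, since they sit at fixed or easily found positions. A minimal Python sketch, not the actual recursor code (`peek` is an invented name): the ID is in the first two bytes, the rcode in the low four bits of the flags word, and the question name starts right after the 12-byte header.

```python
import struct


def peek(packet):
    """Extract ID, rcode and question name without a full parse."""
    ident, flags = struct.unpack_from("!HH", packet, 0)
    rcode = flags & 0x000F
    labels = []
    pos = 12  # the question follows the 12-byte header
    while True:
        length = packet[pos]
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("unexpected pointer in question name")
        label = packet[pos + 1:pos + 1 + length]
        labels.append(label.decode("ascii", errors="replace"))
        pos += 1 + length
    return ident, rcode, ".".join(labels).lower()
```

This sketch does no bounds checking beyond what Python gives for free; a truncated packet raises an exception instead of returning garbage.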

Oh well.. I hope I pass the ‘not obviously broken’ mark.

Optimisation: sometimes the best move is not to play

Posted by bert hubert Sun, 09 Apr 2006 22:18:00 GMT

Spent some time this weekend setting up my father’s holiday location, which was very good. Time flew by and we had to race to pick up my car, which was on a company property that would be locked up at night. His new place has Wifi, but he is too far removed from the access point. I ordered a very nice directional antenna from the Wifi Shop, hope it shows up soon. Aiming it should be hard work, as getting line of sight requires mounting the antenna on a large pole. Should be worthwhile though!

Movies & PowerDNS

As regular readers of this blog will know, I’m currently working hard on making the PowerDNS recursor the fastest and most able nameserver on this planet.

To this end, I’ve done a lot of micro-optimisation, which means trying to make individual functions as fast as possible. Time and time again however, I’ve discovered that gcc 4.1 does a very fine job already.

Until I started finding very major optimisations, which remind me of one of my favorite movies, War Games, which has a famous quote, ‘Sometimes the best move is not to play’. This has turned out to be very apt.

The best way to speed up a program on a modern platform.. is by making it do less. So I spent some time doing that. DNS at its core is case-insensitive, so www.powerdns.com is equal to WwW.PoWeRdNs.CoM. There are basically two ways to go about this. The first is to lowercase everything up front, the second is to make your comparison functions case-insensitive.

The problem with the first solution is that, while allowed, many DNS clients will react badly when they receive an answer for ‘www.google.com’ when they asked for ‘WWW.GOOGLE.COM’. So you have to keep the original question around. This costs memory, and time.

Previously PowerDNS followed approach 1. I’ve now moved it to true case insensitivity. I also solved the strange and murky world of the trailing dot, ‘www.google.com’ versus ‘www.google.com.’.

All this proved to be a small speedup, but it feels a lot better now that we are not lowercasing everything all the time.
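Approach 2 boils down to a comparison function that folds ASCII case on the fly and treats a trailing dot as optional, so the original spelling of the question never has to be copied or modified. A minimal Python sketch with invented names, working on bytes:

```python
def fold(ch):
    """Fold one ASCII letter (as a byte value) to lower case."""
    return ch + 32 if 65 <= ch <= 90 else ch


def strip_dot(name):
    """Drop a single trailing dot, if there is one."""
    return name[:-1] if name.endswith(b".") else name


def dns_equal(left, right):
    """Case-insensitive comparison that ignores a trailing dot."""
    left, right = strip_dot(left), strip_dot(right)
    if len(left) != len(right):
        return False
    return all(fold(a) == fold(b) for a, b in zip(left, right))
```

With this, `dns_equal(b"www.powerdns.com", b"WwW.PoWeRdNs.CoM.")` is true while both inputs stay exactly as they arrived.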

Even bigger news in the ‘not to play’ department was short-circuiting the whole wonderful MOADNSParser (mother of all etc) structure. This system is wonderful, a joy to program, and about as efficient as it can get while remaining safe.

However - to decode and encode A, CNAME and NS records, it is overkill. So, large parts of PowerDNS now do those records by hand, saving a lot of malloc/free (new/delete) calls. Especially for A records, which constitute the bulk of all queries, this makes a big difference.

I wonder very much why I did not do this earlier, but I think it is because I love the MOADNSParser too much. I should be honest however and know that it is overkill to use so much code for parsing the 4 bytes of an IP address!
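To show how little there is to an A record: its rdata is just the four address bytes in network order. A by-hand encoder and decoder in Python (function names invented, and of course far removed from the C++ in PowerDNS) is a few lines:

```python
import socket


def encode_a(address):
    """Dotted quad -> the 4 bytes of A-record rdata."""
    return socket.inet_aton(address)


def decode_a(rdata):
    """The 4 bytes of A-record rdata -> dotted quad."""
    if len(rdata) != 4:
        raise ValueError("A record rdata must be exactly 4 bytes")
    return "%d.%d.%d.%d" % tuple(rdata)
```

No generic record objects, no allocation beyond the result itself.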

PowerDNS appears to be 20-30% faster because of the work today. On FreeBSD, a loaded PowerDNS now spends 25% of its time in the kernel, which tells me we are doing all right.

Wardrobe, Statistics, PowerFiler, PowerDNS

Posted by bert hubert Fri, 07 Apr 2006 20:06:00 GMT

Wardrobe

I should be looking snazzy again, spent a heap of money on clothing the past two days. I tend to neglect my appearance, which detracts from my engineering brilliance :-)

Statistics

In other news, I used the Sun T2000 to do statistics on the .COM zone, and learned some performance related things. One trick is if you do stuff on the commandline, it pays to stretch out things over several processes using pipes, as more strands will be busy. This is the opposite of what you do on smaller systems.

Much calculating later, I now have some solid figures on PowerDNS use and I have to say I’m rather pleased. It appears there are far far more PowerDNS users than mailing list subscribers, which makes me a bit sad. These people might miss vital PowerDNS security announcements. So, if you are a PowerDNS user, head over to the mailinglists and subscribe, even if only to the announcements list.

PowerFiler (PowerArchiver)

I announced the open sourcing of PowerArchiver the day before yesterday and have since decided to change the name to PowerFiler as PowerArchiver is an existing product. I also bought powerfiler.net and powerfiler.org and brought them online, but there is no content yet. I’m pondering hosting the entire website on trac, but this is not perfect. But then again, what is.

I made PowerFiler compile on FreeBSD, which required all of a two line change. See the previous entry on how to get PowerFiler!

PowerDNS

Ahh - big day again. You may recall a large Dutch ISP trialling PowerDNS. They are so confident they switched over 3 more of their most important nameservers to PowerDNS. I feel all warm and fuzzy.

Also, I’ve suspected this before, but there are things about PowerDNS I don’t know. Like TUPA, The Ultimate PowerDNS Admin. It looks cool. Then I hear rumours about Druid DNS releasing cool stuff, as well as Frontiernet. I’m very happy to see these projects but please drop me a line if you are developing things on top of PowerDNS, I might be able to help, or at least mention you to other PowerDNS users.

PowerArchiver open sourced

Posted by bert hubert Wed, 05 Apr 2006 18:23:00 GMT

For the very adventurous, I’ve just open sourced PowerArchiver on:

       svn co svn://svn.powerdns.com/archiver

From the original ’IDEA’ file:

Allow user to scan file paper documents to folders. One folder might be two pages of an itemised bill, or several pages of a letter. Scanning is a one-button job.

The goal is absolute ease. For example, a folder could be configured while a page is scanning. Furthermore, papers should be as readable on screen as possible. To this end, the user can mark the ‘interesting’ rectangle of a page, which is shown by default such that the entire page fits on the screen.

Has a ton of dependencies, and there is no documentation. But have fun. This program has at least two active users: Jasper and me.

The license is GPL of course.

The stuff you expect (trac, homepage etc) will follow soon.

100% nerd day today

Posted by bert hubert Wed, 05 Apr 2006 15:53:00 GMT

This is not what we call a day you can tell your friends about. Woke up too early given the time I went to sleep and started hacking away and am still at it.

Today I solved a number of very major PowerDNS problems. Makes you wonder how the program ever worked.

Ok, I did do some other things, I ordered a Wifi directional antenna for my father, who wants to be able to access the internet from his holiday location.

On to the PowerDNS bugs.

“Phantom inability to resolve . records”

This happened every once in a while, immediately on startup. And the problem disappears if you turn on --trace output, so it is nearly impossible to debug. I was about to consider a career in carpentry when, just for fun, I decided to let things run through valgrind.

Lo and behold, it pointed straight to the problem. For generating proper errors, PowerDNS passes an ‘error’ variable around, by reference. This means that the compiler is not sure if you are accessing it uninitialized, and emits no warnings.

Valgrind however paints the entire memory of your application and detects reads from unpainted areas. More power to Valgrind! See commit 658.

“Phantom debugging output on querying ipwhois.rfc-ignorant.org”

This one really had me hunting for ghosts. Just after I committed a heap of cleanups which seemed obvious enough, I ran my benchmarks, with debugging output turned off.

But output appeared… And only for ipwhois.rfc-ignorant.org. Everything about the output was wrong. Earlier in the day, I had tweaked debugging output a bit because of the bug that disappeared while tracing, so I investigated if something was left, but it wasn’t.

I tried lots of other domains, but it only happened with ipwhois.rfc-ignorant.org.

One thing that was very off were the packet counters being displayed, which provided the hint.

Feast your eyes on this:

       QUESTION SECTION:
       ipwhois.rfc-ignorant.org. IN NS

       ANSWER SECTION:
       ipwhois.rfc-ignorant.org. 31533451 IN NS localhost.rfc-ignorant.org.

       ADDITIONAL SECTION:
       localhost.rfc-ignorant.org. 1051 IN A 127.0.0.1

And in the background.. I had another PowerDNS recursor running on port 53, which duly received questions from the version being benchmarked on port 5300. And that version did have debugging output turned on.

Over 8000 SOA records for .COM

Ok, this is a big one. Various people had been reporting that under prolonged and heavy load, PowerDNS would slow down. To a certain extent this is normal: the cache grows, memory gets fragmented, but what people told me was far bigger than this.

On investigating another problem this morning, I dumped the cache of a machine that had been testing for 12 hours.. and found 8000 SOA records for .COM, all slightly different.

Turns out that the SOA serial of .COM gets raised many times during the day but that PowerDNS does not let these new records supplant the old SOA record, but simply adds them.

Why would this slow down the recursor? On emitting an NXDOMAIN, PowerDNS will look up the SOA record of a domain.. and find thousands of records, and try to add all of them to the answer. Luckily it stops after 512 bytes, but still.

I fixed this in commit 657.
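The difference between adding and supplanting is easy to show with a toy cache. This Python sketch is only an illustration of the bug class, not what commit 657 contains; the class and its method names are made up.

```python
class Cache:
    """Toy record cache keyed on (name, type)."""

    def __init__(self):
        self.records = {}

    def append(self, name, rtype, rdata):
        """The problematic behaviour: every new variant piles up."""
        self.records.setdefault((name.lower(), rtype), []).append(rdata)

    def replace(self, name, rtype, rdatas):
        """Store a fresh record set, supplanting whatever was cached."""
        self.records[(name.lower(), rtype)] = list(rdatas)

    def get(self, name, rtype):
        return self.records.get((name.lower(), rtype), [])


def soa_count(method, updates=8000):
    """Feed `updates` SOA records with rising serials; count what is cached."""
    cache = Cache()
    for serial in range(updates):
        rdata = "serial=%d" % serial
        if method == "append":
            cache.append("com.", "SOA", rdata)
        else:
            cache.replace("com.", "SOA", [rdata])
    return len(cache.get("COM.", "SOA"))
```

`soa_count("append")` ends up with 8000 records for one name, `soa_count("replace")` with one, which is all an NXDOMAIN answer needs.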

Massive numbers of PTR records

Ok, try to resolve 10.64.158.85.in-addr.arpa. It has 877 PTR records, for a total TCP packet of 22 kilobytes. PowerDNS valiantly tried to compress these 877 labels, but failed because you can’t refer to labels at offsets of 16384 bytes or more: a compression pointer only has 14 bits for the offset. Except that the recursor did not care for this limitation and happily inserted larger offsets.

I also removed some ‘magic’ code that tried to be way too smart wrt label compression, but I wonder if the code had some magic uses I can’t figure out right now. Removing it didn’t seem to break anything though.
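The limit follows directly from the pointer format: two bytes, of which the top two bits are the marker, leaving 14 bits of offset. A packet writer therefore has to check the offset before emitting a pointer and fall back to writing the name out in full. A minimal Python sketch (invented names, not the recursor's code):

```python
MAX_POINTER_OFFSET = 0x3FFF  # 14 bits


def encode_pointer(offset):
    """Return the 2-byte pointer for offset, or None if it cannot be encoded."""
    if not 0 <= offset <= MAX_POINTER_OFFSET:
        return None  # caller should emit the uncompressed name instead
    return bytes([0xC0 | (offset >> 8), offset & 0xFF])
```

Without the range check, an offset like 20000 would spill into the marker bits or overflow the byte, and the receiver would follow the pointer to the wrong place.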

Tweaking of CLOCK record eviction

Turned out we weren’t scanning enough of the cache to evict enough records to keep a steady cache size. Still an area to watch though. It does appear that the max-cache-entries setting works very well, allowing operators to set a maximum size measured in cache entries.

Shopping, pizza, SSHFP

Posted by bert hubert Mon, 03 Apr 2006 20:33:00 GMT

Last week I received an invitation to become a member of a superstore. Due to planning idiocies, large supermarkets aren’t allowed outside of city centers (get this) here in The Netherlands, so you need to be a ‘member’, which is only possible if you have a business. Luckily I have several.

But the odd thing was that I already was a member. But they did send a 20 euro discount voucher! So I went there today to ‘apply’ for membership, and they duly discovered I already was a member. Could I still cash my discount voucher?

Much pondering, calling of supervisors, issuing of stamps later, it was decided I could. 20 euros is no small thing so I immediately splurged lots more on vital stuff like…

Pizza

.. several kinds of ready-made pizza products. Avid readers of this blog will know that I have a worrying addiction to these ancient flat-breads, so why would I try pre-made stuff? Well, I can be stupendously lazy at times, that’s why. I’ve not yet met a good programmer who isn’t lazy, so this is good news.

On to the vital stats. The ‘Nestle’ perishable pizza bottom looks expensive, well made, and even _is_ expensive. When prepared in my pizza oven, it even looks perfect, very thin, very Neapolitan. It tastes like cardboard though.

I also tried a smaller no-brand pizza bottom which is thicker and looks less professional, but which tasted a lot better. Nothing compared to my home made dough though.

I bought several bottles of varying kinds of ‘pasta sauce’ and it turns out they are all fine. My new pizzas don’t have a lot of sauce on them, I find that even the very cheap ready-made sauce is very good.

All in all, a worthwhile experience.

PowerDNS

A day that didn’t really go anywhere. Spent quite some time fighting different-endian PCAP files. PowerDNS contains technology to replay recorded DNS streams for verification and analysis purposes, for which it needs to be able to parse tcpdump files. It turns out these come in both big- and little-endian flavours.

Furthermore, Solaris has a 2*64-bit struct timeval, whereas pcap files use regular 32-bit time_t values. So I had to abstract that all out. Didn’t commit it to SVN yet as part of the code doesn’t work yet.
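Both issues come down to reading the file with explicit field sizes and an explicit byte order instead of mapping it onto native structs. In the classic pcap format the first four bytes are the magic number 0xa1b2c3d4, written in the byte order of the capturing machine, and each packet is preceded by a 16-byte header with four 32-bit fields. A minimal Python sketch with invented function names, which ignores newer variants such as nanosecond-resolution files:

```python
import struct

MAGIC = 0xA1B2C3D4


def byte_order(global_header):
    """Return '<' or '>' depending on how the capture file was written."""
    raw = global_header[:4]
    if struct.unpack("<I", raw)[0] == MAGIC:
        return "<"
    if struct.unpack(">I", raw)[0] == MAGIC:
        return ">"
    raise ValueError("not a classic pcap file")


def parse_record_header(order, raw):
    """Decode the per-packet header: 32-bit seconds, microseconds, two lengths."""
    seconds, micros, captured, original = struct.unpack(order + "IIII", raw[:16])
    return seconds + micros / 1e6, captured, original
```

Because the field widths are spelled out in the format string, it no longer matters how wide struct timeval is on the machine doing the replay.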

Peter Zijlstra previously educated me on the use of ‘clock algorithms’ for cache pruning. PowerDNS currently prunes based on the TTL of records, which is probably not the best thing to do. A long-lived record has no need to outstay a shorter lived one if it is never queried.

My local sources now put a record in the back of a linked list every time it is accessed or created (many thanks to Joaquín Mª López Muñoz for explaining how this works). When we want to prune, we start with the least used records, which are at the beginning.

When the recursor tries to find a record in the cache and finds it to be expired, it can simply ignore it. It will be refreshed soon anyhow.

It would appear this could speed up PowerDNS a lot, and also enable us to limit ourselves to a fixed amount of memory used (see below).
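The move-to-the-back scheme described above can be sketched with Python's `OrderedDict`, which keeps insertion order and can move an entry to the end cheaply. This is a toy with invented names, not the recursor's cache, and it is closer to plain least-recently-used ordering than to a textbook clock algorithm.

```python
from collections import OrderedDict


class RecordCache:
    """Toy cache: front of the ordering = least recently used."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def put(self, key, value, expires):
        self.entries[key] = (value, expires)
        self.entries.move_to_end(key)  # created or updated: to the back
        self.prune()

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires = item
        if expires <= now:
            return None  # expired: simply ignore it
        self.entries.move_to_end(key)  # accessed: to the back
        return value

    def prune(self):
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict from the front
```

A long-lived record that nobody asks for drifts to the front and is evicted first, regardless of its TTL, and the entry count gives a hard upper bound on cache size.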

Also, implemented RFC 4255, SSHFP, which took all of 30 minutes, counting the implementation of hex-encoded records. Without that infrastructure, it would’ve taken 3 minutes.

This does not do anything yet - the recursor does not need to know about SSHFP and the authoritative nameserver doesn’t use the innovative ‘MOADNSParser’ yet. I’ll probably change that before 2.9.21.
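For reference, the SSHFP rdata defined in RFC 4255 is one byte of algorithm, one byte of fingerprint type, and then the raw fingerprint, which is shown as hex in presentation format. A minimal Python sketch of the conversion in both directions (function names invented, and unrelated to the MOADNSParser code):

```python
import binascii
import struct


def sshfp_to_wire(text):
    """'2 1 <hex>' -> rdata bytes: algorithm, fingerprint type, fingerprint."""
    algorithm, fp_type, hex_fp = text.split()
    header = struct.pack("!BB", int(algorithm), int(fp_type))
    return header + binascii.unhexlify(hex_fp)


def sshfp_to_text(rdata):
    """rdata bytes -> presentation format with a hex-encoded fingerprint."""
    algorithm, fp_type = struct.unpack_from("!BB", rdata, 0)
    hex_fp = binascii.hexlify(rdata[2:]).decode("ascii")
    return "%d %d %s" % (algorithm, fp_type, hex_fp)
```

Most of the work is the generic hex encoding, which matches the observation that the record itself is only a few minutes of effort once that exists.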

Unix

I also spent some time trying to get the Linux implementation of getrusage to fill out the ‘integral’ memory fields. Turns out no OS I have access to uses these fields as their definition has traditionally been crap. SUSv3 also doesn’t mention these fields at all. So that was some lost effort. It appears you’ll have to do something different on each unix to discover real memory usage.

In working on this, I managed to get UML compiled and working without too much work, which is a first. The UML defconfig is not very good though, will send Jeff Dike a patch.

I did discover mallinfo(2) today, which is present in all unixes it appears, and provides information on the memory allocation subsystem. The numbers nedmalloc output here appear to be bogus though.
