Posted by bert hubert
Sun, 17 Jun 2007 21:07:00 GMT
Ok, people have been harassing me that I should update my blog more often. This strikes me as somewhat odd, as blogging is not mandatory - sometimes I feel the need to share some thoughts with my readers (it appears there are 3000 of you!), and sometimes I don’t.
I still don’t have a lot to say, but perhaps this might interest you.
“The trouble with physics”
I’ve been following cold fusion, and other ‘alternative’ physics subjects for a long time now, and I keep tabs on quite a number of interesting investigators. Over time it has become clear to me that physics is dangerously locked in to the ‘mainstream’.
Careers are built on getting grants; grants are disbursed by risk-averse boards; journals are very worried about their reputation, and rely on vested scientists to review papers. The upshot of this is that it is very dangerous for a physicist to do ‘interesting’ research.
I felt like this for a long time, but as I’m not part of the physics community, and in fact never got past half of my physics degree, what I feel is not very interesting.
However, when Lee Smolin feels something, it is. He’s written a captivating book called ‘The trouble with physics’. Its main point is that physics has become stuck in a rut called String Theory, which is a complicated set of ideas that has for decades been hailed as the next big thing.
Dr Smolin describes the current state of physics very well, and he appears to confirm the feelings I describe above.
I heartily recommend this book; it is one of the few books that continue where an earlier generation of ‘books for laymen’ stopped.
There are some indications the physics community is more open to ‘interesting’ results again, which should be very good. I’ve made a small list of things I find interesting, and keep track of.
“Cold Fusion”, or as it is often called these days “Low Energy Nuclear Reactions”
I’ve blogged about this before, but this field appears to be heating up again in a big way. Basically, hot nuclear fusion (which powers the sun, as well as hydrogen bombs), would solve most of our energy problems. However, it turns out to be very hard to make a hot nuclear fusion reactor that survives its own operation AND generates energy.
Cold fusion started with the claim by Messrs Pons and Fleischmann to have found proof of hydrogen fusing under ‘kitchen table’ conditions. It quickly turned out nobody could (reliably) reproduce their results, and controversy ensued. Additionally, our current understanding of physics appears to prohibit ‘cold fusion’.
However, over the following 18 years, it never went away entirely. There is a slow but steady trickle of results that appear to form the smoke to a possible fire. Dr Dieter Britz keeps track of all cold fusion related papers and reports; his database now contains over 1200 items.
Some of the die-hards in researching cold fusion have been a group of employees of a US Naval laboratory, called SPAWAR. Recently, they’ve developed a very simple experiment that reproducibly shows signs of “low energy nuclear reactions”. There has now been at least one replication of their simple experiment, which appears to show the same signs.
The experiment is simple enough that it can be performed at home, and I am sometimes tempted! I’ve since found that quite a number of replications are already going on, so no need to try to build a laboratory at home :-)
More information can be found here
Strange gravity effects
Much of the same goes for experiments with rotating superconductors affecting gravity. It should be realised that gravity is truly unstoppable: as far as we know, there is nothing that could ever ‘shield’ one from this universal force.
If one could do that, spaceflight would become a lot easier. It would also put a rather large dent in our understanding of physics - although gravity is poorly understood anyhow.
The Russian metallurgist Evgeney Podkletnov grabbed the attention around 1992 and 1996 with papers describing gravity shielding above a rapidly rotating superconducting disk. The problem was that his disk was very hard to make, so the experiment was not easy to reproduce.
His reports were interesting enough to get NASA to try however, but they never really managed to replicate his conditions. Interestingly, one of the theorists (Ning Li) involved with the experiment appears to have vanished!
This is the stuff of conspiracy theories, but it has been reported that Boeing was at one stage involved in making devices based on this theory, although this has been widely denied. Meanwhile, Podkletnov has withdrawn some of his papers. All very messy.
However, some time ago, Martin Tajmar, a scientist working for the European Space Agency, made similar claims, which nevertheless differ considerably in detail. Interestingly, Tajmar and colleagues also have theories on why their spinning superconductors produce gravity effects.
It appears their work is being taken seriously. I’ve been in contact with them, and although they didn’t want to reveal a lot, they did say they expected to report new results.
Less controversial, but no less strange, is the current state of our understanding of gravity, which includes such incredible things as invisible objects which do have gravity (dark matter), as well as invisible things that offer ‘negative gravity’ (dark energy). We currently only know that we need these dark things to explain the universe - we just don’t know what lies behind these science fiction-like names!
Unlike other things mentioned in this post, dark energy and dark matter are 100% part of mainstream physics - even though we have only faint ideas on the physical nature of these forms of ‘matter’.
Steorn
I’ve blogged about this fascinating company before, so I’ll only post an update here.
It is hard to figure out their strategy. They claim to have discovered a device which generates free energy, and that they are trying to make some money from this invention, while also making it generally available. They’ve assembled a jury of 22 scientists which is supposed to validate their technology, but this is expected to take a long time.
In the meantime, their CEO has been posting quite a lot on their forum, dropping hints on how their device works, while otherwise retaining a high level of secrecy.
One of the forum members, Mike Rosing (known as ‘drmike’) heard enough to design an experiment to test at least part of what Steorn intimates lies behind their technology.
This revolves around ‘magnetic viscosity’, which is one of the darker areas of how permanent magnets work. Drmike now has data, but no results yet, as he has to extract these from his heaps of data.
I’ve been in contact with Mike, and we’ve worked out something I might try to program to extract results from his data, but I didn’t yet find time to work on it.
Steorn is said to demonstrate their device in London in July, and Mike and others are going to see this demonstration, and I’m again sorely tempted to join in :-)
More information can be found on the two blogs that follow Steorn, called Free Energy Tracker and Dispatches from the Future.
Posted by bert hubert
Wed, 21 Feb 2007 22:13:00 GMT
Enjoyed a fun and stimulating “DNS & Crypto Power Lunch” with Dan Bernstein (left) and Tanja Lange (not in picture). As was to be expected, the intersection of cryptography and (secure) DNS was discussed, and some evil plans might ensue! If implemented in djbdns and PowerDNS, we might actually achieve something..
Posted in PowerDNS, Netherlabs, Life | no comments
Posted by bert hubert
Thu, 08 Feb 2007 21:39:00 GMT
Just a quick note that I’ll be presenting at The future of VoIP 2 event as organised by the Internet Society of The Netherlands, part of the (global) “Internet Society”.
The event takes place on the 15th of March, in The Hague. For more details, see the links above.
As always, I love to meet PowerDNS users, or in fact, anybody interested in doing interesting things with DNS. So should you be there, it would be good to talk.
Posted in PowerDNS | no comments
Posted by bert hubert
Sun, 04 Feb 2007 12:14:00 GMT
Ok, I’m going to lecture a bit, a bad habit of mine. The summary is that an important enhancement of the Linux kernel has been proposed, but in order to understand the significance of this enhancement, you need a lot of theory, which follows below.
I use the word “computer” sometimes when I properly mean “the operating system”. This exposes a problem of this post: I’m trying to explain something deeply theoretical to a general audience. Perhaps it didn’t work. See for yourself.
Doing many things at once
People generally tend not to be very good at doing many things at once, and surprisingly, computers are not much different in this respect.
First about human beings. We can do one thing at a time, reasonably well. There are people that claim they can multi-task, but if you look into it, that generally means doing one thing that is really simple, while simultaneously talking on the phone.
This is exemplified by how we answer a second phone call, i.e., by saying “The other line is ringing, I’ll call you back”, or conversely, telling the other line they’ll have to wait.
We emphatically don’t try to have two conversations at once, and even if we had two mouths, we still wouldn’t attempt it.
Let’s take a look at a web server, the program that makes web pages available to internet browsers. The basic steps are:
1. Wait for new connections from the internet
2. Once a new connection is in, read from it which page it wants to see (for example, ‘GET http://blog.netherlabs.nl/ HTTP/1.1’).
3. Find that page in the computer
4. Send it to the web browser that connected to us
5. Go to 1.
Compare this to answering a phone call: step 1 is the part where you wait for the phone to ring, and answer it when it does. Step 2 is hearing what the caller wants, step 3 is figuring out the answer to the query, and step 4 is sharing that answer.
This all seems natural to us, as it is the way we think. And programmers, contrary to what people think, are human beings, too.
Where this simple process breaks down is that, much like a regular phone call, we can only serve a new web page once the old one is done sending.
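For the programmers among you, the five steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `PAGES` dictionary stands in for "the computer" where pages are found, and the names `find_page` and `serve_forever` are made up for this example.

```python
import socket

# Hypothetical stand-in for the pages stored "in the computer".
PAGES = {"/": "<h1>hello</h1>"}


def find_page(request_line: str) -> bytes:
    """Steps 2 and 3: parse 'GET <path> HTTP/1.1' and look the page up."""
    parts = request_line.split()
    path = parts[1] if len(parts) >= 2 else "/"
    body = PAGES.get(path)
    if body is None:
        return b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    data = body.encode()
    return b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(data) + data


def serve_forever(listener: socket.socket) -> None:
    """Serve one connection at a time; everybody else has to wait."""
    while True:                                  # step 5: go to 1
        conn, _addr = listener.accept()          # step 1: wait for a connection
        with conn:
            request = conn.recv(4096).decode("latin-1")   # step 2: read the request
            first_line = request.split("\r\n", 1)[0]
            conn.sendall(find_page(first_line))  # steps 3 and 4: find and send
```

While `sendall` is busy with one visitor, nobody else gets past `accept`, which is exactly the "one phone call at a time" limitation.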
And here is where things get interesting - although we people have a hard time doing multiple things at once, we can give the problem to the computer.
What is the easiest way of doing so? Well, if we want to increase the capacity of a telephone service we do so.. by adding people. So on the programming side of things, we do the same thing, only virtually: we order the computer (or more exactly, the operating system) to split itself in two!
The new list of steps now becomes:
1. Wait for new connections from the internet
2. Once a new connection is in, split the computer in two.
3. One half of the computer goes back to step 1, the other half continues this list
4. (2) Read from it which page it wants to see (for example, ‘GET http://blog.netherlabs.nl/ HTTP/1.1’).
5. (2) Find that page
6. (2) Send it to the web browser
7. (2) Done - remove this “half” of the computer
I’ve prefixed the things the second computer does with “(2)”. This looks like the best of both worlds. We can “serve” many web pages at the same time, and we didn’t need to do complicated things. In other words, we could continue thinking like human beings, and use our intuition, by thinking of the analogies with answering phone calls.
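On Unix-like systems the "split the computer in two" step is the `fork` call. A minimal Python sketch of the list above, assuming a Unix system (there is no `os.fork` on Windows); the helper names `handle_in_child` and `serve_forking` and the `handle` callback are invented for this illustration:

```python
import os
import socket


def handle_in_child(conn: socket.socket, handle) -> int:
    """Split in two: the child runs the "(2)" steps, the parent returns at once."""
    pid = os.fork()
    if pid == 0:
        # Child, the "(2)" half: read, find, send, then disappear.
        try:
            handle(conn)
            conn.close()
        finally:
            os._exit(0)      # done - remove this "half" of the computer
    conn.close()             # parent: the child owns the connection now
    return pid


def serve_forking(listener: socket.socket, handle) -> None:
    while True:
        conn, _addr = listener.accept()      # step 1
        handle_in_child(conn, handle)        # steps 2 and 3
        # Collect children that have finished, without waiting for the rest.
        try:
            while os.waitpid(-1, os.WNOHANG)[0] > 0:
                pass
        except ChildProcessError:
            pass                             # no children left to collect
```

The parent never reads or writes a page itself; it only accepts connections and hands each one to a fresh copy of the program.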
So, are we done now? Sadly no. What basically has happened is that we have invoked a piece of magic: let’s split the computer in two. That is all fine, but somebody has to do the splitting. This job is farmed out to the CPU (the processor) and the operating system (Windows, Linux etc), and they have to deal with making sure it appears the computer can do two things at the same time.
Because the truth is.. people can’t do it, and neither can computers. They fake it.
This faking comes at a cost, incurred both while splitting the computer (“forking”), and by making the computer juggle all its separate parts. Finally, it turns out that practically speaking, you can divide a computer up into only a limited number of parts before the charade falls down.
Busy websites have tens of millions of visitors, so we’d need to be able to split the computer into at least that many parts, while in practice the limit lies at perhaps 100,000 slices, if not less.
Now what
Several solutions to this problem have been invented. Some involve not quite splitting up the entire computer and making split parts share more of the resources (like for example, memory). This is called ‘threading’. Perhaps this could be compared with not hiring more people to answer the telephone, but instead giving the people you have more heads, so as to save money.
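A thread-per-connection variant might look like the sketch below. It uses Python's standard `threading` module; `handle_in_thread` and the `handle` callback are hypothetical names for this example. Unlike a forked copy, every thread shares the memory of the one program.

```python
import socket
import threading


def handle_in_thread(conn: socket.socket, handle) -> threading.Thread:
    """Give the existing program an extra "head" for this one connection."""
    def run() -> None:
        with conn:           # close the connection when the handler is done
            handle(conn)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def serve_threaded(listener: socket.socket, handle) -> None:
    while True:
        conn, _addr = listener.accept()   # wait for a new connection
        handle_in_thread(conn, handle)    # and immediately go back to waiting
```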
In the end, all these solutions run into a brick wall: it is hard to maintain the illusion that the computer can do multiple things at the same time, AND have it actually do a million things at the same time.
So in the end, we have to bite the bullet, and just make sure the program itself can handle many many things at once, without needing the magic of pretending the computer can do it for us.
“Asynchronous programming”
This is where things get hard, and this is to be expected, as it was our basic premise that people can’t do multiple things at the same time, and what’s worse, they have a hard time even thinking about what it would be like.
The new algorithm looks like this:
1. Instruct the computer to tell us when “something has happened”
2. Figure out what happened:
   - If there is a new connection, instruct the computer that from now on, it should tell us if new data arrived on that connection
   - If something has happened to one of those connections we’ve told the computer about, read the data sent to us on that connection. Then find the information requested on that connection, and instruct the computer to tell us when there is “room” to send that data
   - If the computer told us there was “room”, send the data that was previously requested on that connection. If we are done sending all the data, tell the computer to disconnect, and no longer inform us of the state of the connection.
3. Go back to 1.
If this feels complicated, you’d be right. However, this is how all very high performance computer applications work, because the “faking” described above doesn’t really “scale” to tens of thousands of connections.
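To make the algorithm above concrete, here is a rough sketch using Python's standard `selectors` module. It rests on simplifying assumptions: the whole request is taken to arrive in a single read, `find_page` is a hypothetical callback that turns a request into the bytes to send, and `max_events` exists only so the loop can be stopped in a demonstration.

```python
import selectors
import socket


def serve_events(listener: socket.socket, find_page, max_events=None) -> None:
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    pending = {}      # connection -> bytes we still have to send
    handled = 0
    while max_events is None or handled < max_events:
        for key, mask in sel.select():            # "tell us when something happened"
            handled += 1
            sock = key.fileobj
            if sock is listener:                  # a new connection
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            elif mask & selectors.EVENT_READ:     # data arrived on a connection
                request = sock.recv(4096)
                if not request:                   # the other side hung up
                    sel.unregister(sock)
                    sock.close()
                    continue
                pending[sock] = find_page(request)
                sel.modify(sock, selectors.EVENT_WRITE)   # now wait for "room"
            else:                                 # there is "room" to send
                sent = sock.send(pending[sock])
                pending[sock] = pending[sock][sent:]
                if not pending[sock]:             # all sent: disconnect
                    del pending[sock]
                    sel.unregister(sock)
                    sock.close()
    sel.close()
```

Note how the state of every connection (here just `pending`) has to be kept by hand between events; nothing in the program reads top to bottom like the phone call any more.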
How does this translate to the telephone situation? It would be like we have lots of small answering machines, that lots of callers can talk to at the same time. Whenever someone has finished a question, the operator would listen to that answering machine, and leave the answer on the machine, and go on to the next machine that has a finished message.
From this description, it is clear it would not work faster that way if you’d try it for real. However, in many countries, if you call a directory service to find a telephone number, you’ll get half of this. Your call is answered by a real human being, who asks you questions to figure out which phone number you are looking for. But once it has been found, the operator presses a button, and the result of your query is sent to a computer, which then reads it to you, allowing the operator to already start answering a new call. Rather smart.
Something in between
If the previous bit was hard to understand, I make no apologies: this is just how complicated things are in the world of computing. However, we programmers also hate to deal with complicated things, so we try to avoid stuff like this.
People have invented many ways of allowing programmers to think ‘linearly’, as if only a single thing is happening at a time, without having to split the entire computer.
One way of doing this is having a facade that makes things go linearly, until the program has to wait for something (a new connection, “room” to send data etc), and then switch over to processing another connection. Once that connection has to wait for something, chances are that what our earlier ‘wait’ was waiting for has happened, and that program can continue.
This truly offers us the best of both worlds: we can program as if only a single thing is happening at a time, something we are used to, but the moment the computer has to wait for something, we are switched automatically to another part of the program, that is also written as if it is the only thing happening.
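One present-day example of such a facade is Python's `asyncio`, where every `await` marks a spot at which the program may be switched to another connection. The sketch below is an illustration only, not the mechanism any particular server uses; the page body is a placeholder and the function names are made up.

```python
import asyncio


async def handle_connection(reader, writer) -> None:
    """Reads like the simple phone call, yet many of these run interleaved."""
    await reader.readline()            # may wait: other connections run meanwhile
    body = b"<h1>hello</h1>"           # placeholder for "find that page"
    writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body) + body)
    await writer.drain()               # may wait for "room" to send
    writer.close()
    await writer.wait_closed()


async def main(host: str = "127.0.0.1", port: int = 8080) -> None:
    server = await asyncio.start_server(handle_connection, host, port)
    async with server:
        await server.serve_forever()

# To try it: asyncio.run(main())
```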
Actually making this happen is pretty hard however, because traditional computer programming environments don’t clearly separate actions that could lead to “waiting” from actions that should happen instantly.
A prime example of the first kind of action is “waiting for a new connection” - this might in theory take forever, especially if your website is really unpopular.
Things that should happen instantly include for example asking the computer what time it thinks it is.
Traditional operating systems can be instructed to be mindful of new incoming connections, and not keep the program waiting for them. This is what we described in the complicated “if X happened, if Y happened” scenario above.
They can also do the same for reading from the network and writing to the network, both things that might take time. This means you can ask the operating system ‘let me know when I can read so I don’t have to wait for it, and I can process other connections in the meantime’.
Furthermore, there are some limited tricks to do the same for reading a file. The problem is that back in the 1970s when most operating system theory was being invented, disks were considered so fast, nobody thought it possible you’d ever need to meaningfully wait for one. Of course disks weren’t faster back then, but computers were slower, and massively so. So by comparison, disks were really fast.
The upshot is that in most operating systems, disk reads are grouped with “stuff that should happen instantly”, whereas every computer user by now has experienced this is emphatically not the case.
Modern operating systems offer only a limited solution to this problem, called ‘asynchronous input/output’, which allows one to more or less tell the computer to notify us when it has read a certain piece of data from disk.
However, it doesn’t offer the same facility for doing a lot of other things that might take time, like finding the file in the first place, or opening it. Things that in the real world take a lot of time.
So, we can’t truly enjoy the best of both worlds as sketched above, which would mean the programmer could write simple programs, which would be switched every time his program has to wait for something.
Enter ‘Generic AIO’
Zach Brown, who is employed by Oracle to work on Linux, has now dreamed up something that appears to never have been done before: everything can now be considered something that “might take time”.
This means that you can ask Linux to find a certain file for you, and it immediately allows you to process other connections that need attention. Once the operating system has found the file for you, it is available for you without waiting.
Although almost every advance in operating system design has at one point been researched already, this approach appears to be rather revolutionary.
It has ignited vigorous discussion within the Linux community about the feasibility of this approach, and if it truly is the dreamt of “best of both worlds”, but to this author, it surely looks like a breakthrough.
Especially since it unites the worlds of “waiting on a read/write from the network” with “waiting for a file to be read from disk”.
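The kernel proposal itself is not shown here. As a rough userspace approximation of the same idea, one can push the "might take time" file work (finding, opening, reading) onto a helper thread and keep serving connections in the meantime. The sketch assumes Python 3.9 or later for `asyncio.to_thread`; `find_and_read` and `fetch_file` are hypothetical names.

```python
import asyncio
import os


def find_and_read(path: str) -> bytes:
    """Ordinary blocking calls: find the file, open it, read it."""
    os.stat(path)                      # "finding the file" can already take time
    with open(path, "rb") as handle:
        return handle.read()


async def fetch_file(path: str) -> bytes:
    # The blocking work runs in a helper thread; while we await the result,
    # the event loop is free to process other connections.
    return await asyncio.to_thread(find_and_read, path)
```

This fakes with threads what the proposal aims to offer directly from the operating system, so it says nothing about the performance of the real thing.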
Time will tell if “Generic AIO” will become part of Linux. In the meantime, you can read more about it on LWN.
Posted in Linux, Netherlabs, Life | 2 comments
Posted by bert hubert
Fri, 12 Jan 2007 21:16:00 GMT
The workings of the Internet are described, or even prescribed, by the so called ‘Requests For Comments’, or RFCs. These are the laws of the internet.
Today the IETF DNS Extensions working group accepted an “Internet-Draft” Remco van Mook and I have been working on. And the cool bit is that over time, many such accepted “Internet-Drafts” turn into RFCs!
Read about what our draft does here and here.
The actual Internet-Draft can be found over at the IETF, or over here as pretty HTML.
In short, this RFC documents and standardises some of the stuff DJBDNS and PowerDNS have been doing to make the DNS a safer place.
Besides the fact that it is important to update the DNS standards to reflect this practice, it is also rather a cool thought to actually be writing an RFC, especially one that has the magic stanzas “Standards Track” and “Updates 1035” in it.
So we are well pleased! Over the coming months we’ll have to tune the draft so it conforms to the consensus of the DNSEXT working group, and hopefully somewhere around March, it will head towards the IESG, after which an actual RFC should be issued.
Exciting!
Posted in Linux, PowerDNS, Netherlabs, Life | 10 comments
Posted by bert hubert
Mon, 01 Jan 2007 15:58:00 GMT
I wish everybody a very good 2007! For PowerDNS, 2006 certainly was a very good year.
In some (large) places, the Recursor now commands a 40% market share, while the authoritative server is also expanding its user base around the world, with multi-million domain deployments now no longer as newsworthy as they once were.
The Chaos Computer Club held its annual congress last week, and they chose the PowerDNS Recursor to provide the DNS service to go with their 10 gigabit connection. I’m pleased to report that the PowerDNS process was fired up only once, and that it held steady for the entire congress, with no complaints. This would usually not be that strange, but the CCC clientèle are among the most critical internet users to be found on the planet.
Many thanks to Stefan Schmidt and other CCC admins for their vote of confidence!
Rails
I’m working on understanding ‘Ruby on Rails’, which will probably end up as a HOWTO aimed at seasoned programmers. The internet abounds with “you won’t believe how easy Ruby on Rails is” demonstrations, but the hard truth is that below the surface, a lot of magic is happening. The kind of magic the discerning programmer wants to grasp so as to make the most of it.
A very small start to this HOWTO can be found here.
It may also allow experienced programmers to teach themselves Ruby in less time than it would take them to read a 750 page book.
Posted in Linux, PowerDNS, Netherlabs, Life | 7 comments
Posted by bert hubert
Thu, 14 Dec 2006 21:21:00 GMT
After PowerDNS 3.1.4 turned out to be boringly stable, fixing all reported crashes, I decided it was time to do the massive speedup I’d been promising people for some time.
With some help from my friends over at #offtopic2, I was able to use the TSC register of my CPU to measure down to the nanosecond how much time things were taking within PowerDNS. Previously I’d concentrated on profiling macro performance, but nanosecond resolution allows one to study fully how much time is spent within each function.
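The TSC register itself is not reachable from a scripting language, but the idea of timing a single function at nanosecond resolution can be sketched with Python's `time.perf_counter_ns`. This is an analogy, not the PowerDNS code: `time_call_ns` is a made-up helper, and in Python the interpreter overhead will dominate the numbers.

```python
import time


def time_call_ns(func, *args, repeat: int = 1000) -> int:
    """Return the fastest of `repeat` timings of func(*args), in nanoseconds."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter_ns()
        func(*args)
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


# Example: how long does formatting one string take?
format_cost = time_call_ns(lambda: "%s.%d" % ("example", 42))
```

Taking the minimum rather than the average filters out runs that were disturbed by other activity on the machine.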
Using this technique, it became apparent we take a whopping 60 microseconds to answer even the most basic of questions. We make up for this by being pretty fast at complicated questions. But 60 microseconds mean we are limited to about 15000 questions/second, max.
First I started shaving microseconds. It turns out snprintf is truly slow, taking up to 5 microseconds for some strings. Additionally, we wasted a lot of time on needlessly copying std::strings.
The unsurpassed Boost::Multi_Index container has a spectacular feature, called ‘compatible keys’, which means we can lookup answers using a question key that is a bare piece of memory instead of a proper std::string. This again saved a few microseconds.
Put together, this brought down the 60 usec to perhaps 40, which is nice, but not stunning.
But the big savings only came when I did the only thing that actually makes code fast: do less.
So - when encoding the answer to a question, we no longer do the whole “DNS label compression”-routine, as we know the “label” of the answer to a question can always be encoded as the fixed bytes 0xc00c - we don’t need to calculate it.
Going beyond that, when generating a simple answer, don’t generate an answer packet, but simply tack on the answer to the original question, and update the ‘answer count’.
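A sketch of these two shortcuts in Python, for illustration only (PowerDNS itself is C++, and the function names here are invented). It assumes a query with exactly one question and an IPv4 answer. The bytes 0xc00c are a DNS compression pointer to offset 12, which is where the question name starts right after the 12-byte header.

```python
import socket
import struct


def encode_name(name: str) -> bytes:
    """Encode 'www.example.com' as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"


def make_query(query_id: int, name: str) -> bytes:
    """Build a minimal query: 12-byte header plus one question (type A, class IN)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", 1, 1)


def answer_in_place(query: bytes, ipv4: str, ttl: int = 3600) -> bytes:
    """Reuse the question packet: flip it to a response, set the answer count
    to 1, and tack one A record onto the end."""
    flags = struct.unpack_from("!H", query, 2)[0] | 0x8000      # QR bit: response
    header = (query[:2] + struct.pack("!H", flags) + query[4:6]
              + struct.pack("!H", 1) + query[8:12])             # answer count = 1
    # Name = fixed pointer 0xc00c, then type A, class IN, TTL, data length 4.
    record = struct.pack("!HHHIH", 0xC00C, 1, 1, ttl, 4) + socket.inet_aton(ipv4)
    return header + query[12:] + record
```

Nothing about the name is calculated or copied for the answer: the question section is reused byte for byte, and the record refers back to it with two constant bytes.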
Also, if we see we have an ‘instant answer’ available for a question, don’t bother to launch a whole ‘MThread’ to generate it, but return synchronously.
The upshot of all this is that we can now answer most questions in… 4 microseconds, down from 60. 15-fold speedups are rather rare usually.
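As a back-of-the-envelope check of these numbers, assuming a single thread that does nothing but answer questions back to back:

```python
def max_queries_per_second(usec_per_query: float) -> float:
    """Upper bound on throughput when every query costs the same time."""
    return 1_000_000 / usec_per_query


before = max_queries_per_second(60)   # roughly 16,700 questions/second
after = max_queries_per_second(4)     # 250,000 questions/second
speedup = 60 / 4                      # the 15-fold speedup
```

These are ceilings for the simplest questions only; real traffic also contains queries that are not answered this quickly, so measured rates will be lower.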
We didn’t speedup everything that much though, only the majority of queries. However, even the uncached queries will benefit from the microsecond shaving performed earlier, and run around twice as fast.
I can’t wait to do a live benchmark on all this. I’m estimating we should now be able to do over 50000 “real” queries/second on a 3GHz P4, which would put us an order of magnitude above the open source competition, and even beat, by a large factor, the numbers I hear quoted for commercial alternatives. These are hard to compare as their numbers are under NDA.
It might not even be easy to generate that much testing data..
Will keep you posted!
Posted in Linux, PowerDNS, Netherlabs | 1 comment
Posted by bert hubert
Fri, 24 Nov 2006 21:55:00 GMT
Yesterday I visited a “software development seminar” of ASML, a rather well disguised recruiting event of this Dutch manufacturer of the world’s most advanced lithography machines.
When I studied physics, I organized the Delftse Bedrijvendagen, the then largest career fair for university students of The Netherlands. As part of that, I was exposed to almost all recruiters of large Dutch companies, including ASML. And the ASML people never failed to leave me light headed.
In brief, lithography is a major piece of the process of actually making chips. It is the part where you actually put the chip on the substrate, using high energy photons. Current 65nm chips consist of many layers, and each of these layers needs to be overlaid on the previous one to a precision of a few nanometres.
To achieve this precision, the individual positioning tolerances of the wafer need to be exact within a nanometre. This is a stunning achievement in itself. For those of you in the non-metric world, there are around 25 million nanometres to an inch. So you should be impressed.
However, this is nothing yet. The lithography machines (‘wafer steppers’) are very expensive, as is the facility that hosts them. And, as there are many layers in a chip, the actual speed of the wafer stepper is of utmost importance.
The machines ASML builds actually illuminate the ‘reticle’ at speeds exceeding 5 metres a second. This is 11 miles/hour. At nanometre precision.
You should have progressed beyond “impressed” to “stunned” by now.
But this is nothing yet. As in microscopy, where water is used to improve resolution, it makes sense to immerse your chip in water while it is being exposed. So the ASML people do that. At nanometre precisions, at those stunning speeds.
To put things in perspective, the wafer is NOT flat to within a nanometre, it bends a bit. So to achieve the precision desired, the wafer is first scanned, so all its imprecisions can be compensated for.
Extreme stuff. I’m sure they don’t have this in “Star Trek”.
I left the event deeply confused - I’m already completely busy with everything I do, and PowerDNS is getting to be quite the empire. The rest of my business is doing great as well.
But my physics background makes me appreciate the incredible things happening over at ASML. Oh well. Like any job, I’m sure it would have downsides. Also, I’m not the kind of person to hold a regular job. But if you want to do stuff on the leading edge of technology, you should at least consider working there. I hear they have 300 vacancies planned for software engineers. They also have some blogs, by the way.
Their current challenge is to move their 15 million lines of C to a new platform that will control their next generation of devices, some of which need to move terabyte amounts of data in under a second.
Anyhow, the seminar was interesting. Tom Gilb presented his “Evolutionary Project Management” concepts, which match rather well with how I tend to manage my projects. One of his main points is that when people start to apply “waterfall” diagrams to software projects, you are lost anyhow. I thought so all along, but it is nice to hear a “guru” confirm it.
Inspired by the breakthrough technologies over at ASML, I’ve picked up my own speech recognition research again, after an 18 month hiatus. The initial results bode well. I get very good frequency and time definition on real speech, with code totalling 750 lines. I hope to get some actual recognition going in the coming week.
Posted in PowerDNS, Netherlabs, Life | 9 comments
Posted by bert hubert
Thu, 23 Nov 2006 10:02:00 GMT
Within the last 12 months, both of my parents have passed away, both after prolonged illness. Here you can see them in happier times a few years ago.
We’ll miss them terribly.
Posted in Life | 5 comments
Posted by bert hubert
Sat, 04 Nov 2006 22:26:00 GMT
Back after a hiatus of 23 days. We are still spending quite some time at the hospital, but getting used to it. After seven weeks or so the panic starts to wear off.
For all you Americans
Please go vote if this is an election year for you. Although I’m here in The Netherlands, I care as the effects of your vote are felt around the world.
Andy Tanenbaum of Minix fame is keeping a database of polls; this is his current projection for the new US senate and house:

I’m not telling you what to vote, but please don’t waste the chance to influence the shape of the world for the next few years!