Calculating the chance of spoofing an agile source port randomised resolver
Posted by bert hubert Mon, 04 Aug 2008 22:15:00 GMT
This post sets out to calculate how hard it is to spoof a resolver that takes simple, unilateral steps to prevent such spoofing.
Unilateral in this case means that any resolver can implement these steps, without changing the DNS protocol or authoritative server behaviour. Everybody who implements the ideas below immediately improves the general security of the DNS.
To save you all the reading: the simple unilateral measures described below can bring the chance of being spoofed down to under 1% after a year of non-stop 50 megabit/s packet blasting.
Note however that my math or my ideas may be wrong. Please read carefully!
Work so far
Recapping, calculations so far show that a fully source port randomized nameserver can be spoofed with a 64% chance of success within 24 hours, requiring around 0.4TB of packets to be generated. If 2 hours are available, the chance of success is 8.1%.
This assumes that the attacker is able to generate around 50 megabits/s, and, just as important, that the resolver is able to process 50k incoming responses/s.
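For those who want to verify the arithmetic, the following back-of-the-envelope sketch (my own, using nothing beyond the assumptions stated above: 50,000 processed responses per second, each with a 2^-32 chance of being accepted, attempts independent) reproduces the roughly 64% and 8% figures:

```python
# Back-of-the-envelope check of the spoofing chances quoted above.
# Assumptions (matching the text): the resolver processes R = 50,000
# spoofed responses per second, and each one independently has a
# 2^-32 chance of matching the open source port and query ID.

R = 50_000            # spoofed responses processed per second
P_HIT = 2.0 ** -32    # chance a single spoofed response is accepted

def combined_chance(seconds: float) -> float:
    """Chance that at least one spoofed response has been accepted."""
    return 1.0 - (1.0 - P_HIT) ** (R * seconds)

print(f"24 hours: {combined_chance(24 * 3600):.1%}")   # ~63%
print(f" 2 hours: {combined_chance(2 * 3600):.1%}")    # ~8%
```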
Details are on http://www.ops.ietf.org/lists/namedroppers/namedroppers.2008/msg01194.html with the caveat that where it says “1.5 - 2 GB”, it should say “36GB”.
Since that posting, quite a number of people have studied the calculations, and they appear to hold up.
Note that the present post does not address the dangers created by those able to actively intercept and modify traffic - people with such abilities have little need to spoof DNS anyhow.
Situation
The current situation is not acceptable - the resources needed to perform a successful spoofing attack are available, if not generally, then to a relevant subset of Internet users.
It turns out that it takes a lot of effort to get the world to apply even a minor patch, even one that has received tremendous front-page coverage on all the security websites - the source port randomisation patches in this case.
So if we want to move quickly, we need a solution that can be rolled out without having to upgrade large parts of the Internet.
Agile countermeasures
There are a number of strategies a resolver could employ to make itself effectively unspoofable; some of these include:
A) Sending out questions over TCP/IP
B) Repeating questions a number of times, and requiring the answers to be equivalent
The problem with these two techniques is that they imply a certain overhead and increase the general CPU utilization on the Internet.
Furthermore, there are strategies that make spoofing harder, but not impracticable:
C) Case games (‘dns-0x20’), sketched briefly below
D) Use multiple source addresses
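As an aside, the ‘dns-0x20’ trick of option C works roughly as follows; this is purely my own illustrative sketch, not a reference implementation. The resolver randomises the letter case of the query name, and because authoritative servers echo the question section back verbatim, a genuine answer repeats that exact casing - which an off-path spoofer would additionally have to guess:

```python
import random

def encode_0x20(qname: str) -> str:
    """Randomise the case of every letter in the query name, adding
    roughly one bit of entropy per letter that a spoofer must guess."""
    return "".join(c.upper() if random.getrandbits(1) else c.lower()
                   for c in qname)

def casing_matches(sent_qname: str, received_qname: str) -> bool:
    """Only believe a response whose question echoes our exact casing."""
    return sent_qname == received_qname

sent = encode_0x20("www.example.com")      # e.g. 'wWw.exAMpLe.Com'
print(sent, casing_matches(sent, sent))    # a genuine echo matches
```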
A very comprehensive enumeration of techniques can be found on http://www.ops.ietf.org/lists/namedroppers/namedroppers.2008/msg01131.html
Sadly, it appears that nothing on this impressive list really gets us out of the woods without more or less unacceptable overhead - where ‘out of the woods’ is arbitrarily defined as reducing the spoofing success rate to below 1% after a year of non-stop trying at 50 megabit/s.
Detecting a spoofing attempt in progress
Since any spoofed response has a chance of around 2^-32 of being accepted, it stands to reason around 2^31 bogus responses will arrive at the resolver before the attacker manages to achieve his goals.
Since we know we have effective countermeasures available, like A and B mentioned above, we could deploy these in case a spoofing attempt is detected. Remember that A and B are generally available, but that we don’t want to resort to them all the time, for all domains, because of their overhead.
Occasionally, port numbers get modified in transit. Additionally, responses to queries sometimes arrive late enough that a new equivalent query has since been sent. This means we should not consider a single response mismatch to be a sign of a spoofing attempt in progress.
If we allow X mismatches before falling back on A or B, the chance of a single query being spoofed is:
      X
  ---------  ~=  2^-32 * X
  N * P * I

(Here N is the number of authoritative server addresses the query could have gone to, P the number of source ports in use and I the number of query IDs; with a single server address and full 16-bit source port and ID randomisation, N * P * I ~= 2^32, hence the approximation.)
Assuming each individual attack lasts W seconds (W being roughly the latency between the authentic authoritative server and the resolver), the combined spoofing chance after T seconds becomes:

1 - (1 - 2^-32 * X)^(T/W)  ~=  2^-32 * X * T/W
Putting in 20 for X and 0.1s for W, this gets us a combined spoofing chance of 0.4% for a full day. Interesting, but not good enough, especially since the attacker might well send only X packets per attempt, and launch far more attempts.
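As a quick sanity check of that 0.4% figure (my own calculation, using nothing beyond the approximation above):

```python
# Combined spoofing chance, using the formula above:
# 1 - (1 - 2^-32 * X)^(T/W)  ~=  2^-32 * X * T/W

X = 20          # mismatched responses tolerated before falling back to A or B
W = 0.1         # seconds per attempt: roughly the resolver <-> authoritative latency
T = 24 * 3600   # one day of non-stop attempts

combined = 1.0 - (1.0 - X * 2.0 ** -32) ** (T / W)
print(f"{combined:.2%}")   # ~0.40% per day
```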
However, if the attacker has a defined goal about what to spoof, each successive attempt might be for a different domain name, or differ in other respects, but all those attempts will share some characteristics.
Two things will be identical, or at least reasonably distinctive: the source address of the spoofed packet (i.e. the network address of the authentic authoritative server), plus the ‘zone cut apex’. This last bit requires some understanding of the way a resolver works.
(I made up the phrase ‘zone cut apex’; I think there are people with better knowledge of DNS verbiage than I, and I’d love to hear of a better name.)
When a resolver asks NS1.EXAMPLE.COM for ‘this.is.a.long.example.com’, the resolver knows it is asking NS1.EXAMPLE.COM in its role as ‘EXAMPLE.COM’ nameserver - this is how it selected that server.
This means that an attacker might try many variations on ‘this.is.a.long.example.com’, but what will remain identical is the ‘zone cut apex’, which is ‘EXAMPLE.COM’. What will also remain identical is the (small) set of example.com servers available.
I’ll get into more detail after Dan Kaminsky has given his presentation. The upshot, however, is that multiple different attempts can be correlated, and thus be counted together in the spoofing-detection counts.
If we conservatively decide to impose a 5 minute ‘fallback to A or B’-regime for a {source IP, zone cut apex}-tuple, and leave the detection limit at 20, an attacker only gets one window every 5 minutes in which to make an attempt.
This is equivalent to setting W equal to 300 seconds above, yielding a combined chance of 0.05% of spoofing a domain after a year of trying.
Well within our goal.
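To make the bookkeeping concrete, here is a minimal sketch of the kind of per-{source IP, zone cut apex} accounting I have in mind. The names, thresholds and structure are purely illustrative assumptions of mine:

```python
import time
from collections import defaultdict

MISMATCH_LIMIT = 20      # the detection limit used in the calculation above
FALLBACK_PERIOD = 300    # seconds to keep forcing countermeasure A or B

mismatch_count = defaultdict(int)   # {(auth_ip, zone_cut_apex): mismatches seen}
fallback_until = {}                 # {(auth_ip, zone_cut_apex): unix timestamp}

def register_mismatch(auth_ip: str, zone_cut_apex: str) -> None:
    """Call for every response whose port, ID or question does not match."""
    key = (auth_ip, zone_cut_apex.lower())
    mismatch_count[key] += 1
    if mismatch_count[key] >= MISMATCH_LIMIT:
        fallback_until[key] = time.time() + FALLBACK_PERIOD
        mismatch_count[key] = 0

def must_fall_back(auth_ip: str, zone_cut_apex: str) -> bool:
    """Should queries to this server, for this zone cut, use A or B right now?"""
    key = (auth_ip, zone_cut_apex.lower())
    return fallback_until.get(key, 0.0) > time.time()

# Example: a burst of bad answers claiming to come from ns1.example.com's
# address, for names below the example.com zone cut.
for _ in range(25):
    register_mismatch("192.0.2.1", "example.com")
print(must_fall_back("192.0.2.1", "example.com"))   # True for the next 5 minutes
```

A real resolver would of course need to bound and expire this state, but the point is how little of it is required.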
Reality intrudes
Sadly, the reality is that we won’t recognize all spoofed packets that guessed wrong, so to speak. Typical operating systems will only let a nameserver know about packets that arrived on a socket open for that purpose.
In the very worst case, the server is only listening on a single port, and by the time a single mismatch is received by the nameserver process, on average 32000 will have arrived on the network interface.
This means that in the calculation above, if we don’t take additional measures, we need to set X to 32000, leading to a combined monthly spoofing chance of 6.4% (and a yearly near-certainty).
Fine tuning things
By raising the fallback period to an hour, the yearly spoofing chance becomes 6.5%, assuming we only see 1 in every 32000 spoof attempts.
If in addition a small number of sockets is kept open, say 10, to function as ‘canary ports’, X reduces to 3200, and the yearly spoofing chance is back at a low 0.65%.
Canary ports serve no function except to detect spoofing attempts. For efficiency reasons, these ports may simply be ports that had previously been used for source port randomisation, but kept around somewhat longer.
The number of canary ports and the fallback period can be tuned at will to achieve the desired level of spoofing resilience.
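The linear approximation makes it easy to play with those two knobs; the numbers below are my own quick recalculation of the figures above:

```python
# Yearly spoofing chance as a function of X (the effective number of spoof
# packets an attacker gets in per detection event) and the fallback period,
# using the linear approximation chance ~= 2^-32 * X * T/W from above.

YEAR = 365 * 24 * 3600

def yearly_chance(x: float, fallback_seconds: float) -> float:
    return x * 2.0 ** -32 * YEAR / fallback_seconds

print(f"{yearly_chance(32000, 3600):.2%}")   # single listening port, 1 hour fallback: ~6.5%
print(f"{yearly_chance(3200, 3600):.2%}")    # 10 canary ports, 1 hour fallback: ~0.65%
```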
Remaining Problems
Countermeasure ‘A’ does not work for domains not offering TCP service. Countermeasure ‘B’ does not work for domains where a single authoritative server generates differing answers to repeated questions. This might happen with (hardware) load balancers, or with load-balanced, desynchronised nameservers.
Best results might be achieved by alternately trying countermeasure A and B - any server that does not support TCP and sends out inconsistent replies is in for some tough love if someone attempts to spoof the domains it hosts.
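The decision logic could look something like the sketch below; send_tcp and send_udp are hypothetical helpers standing in for the resolver's transport code, and this is merely one way the alternation might be arranged:

```python
def verified_resolve(query, send_tcp, send_udp):
    """Decision logic only. send_tcp/send_udp are hypothetical callbacks that
    return a parsed answer, or None on failure or timeout."""
    # Countermeasure A: an off-path attacker cannot spoof a TCP answer.
    answer = send_tcp(query)
    if answer is not None:
        return answer

    # Countermeasure B: repeat the question over UDP and require the
    # answers to be equivalent before believing either of them.
    first, second = send_udp(query), send_udp(query)
    if first is not None and first == second:
        return first

    # No TCP service and inconsistent UDP answers: this domain cannot be
    # resolved safely while a spoofing attempt is in progress.
    return None
```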
Conclusion
If the calculations and ideas elaborated above hold up, it appears feasible to achieve arbitrarily low spoofing chances, without doing any further protocol work.
Importantly, such changes would allow individual resolvers to protect their users without having to wait for changes to all authoritative servers.
In other words, everybody who participates receives an immediate benefit.