Tracewrangler has supported IPv6 from the start (although without extension headers other than fragmentation), but last weekend I realized that I could improve the sanitization feature thanks to something that IPv6 lacks compared to IPv4: subnet masks. This may sound funny, but the missing subnet masks actually help.
The subnet mask conundrum
Imagine you want to sanitize two IPv4 addresses found in two packets: 192.168.1.1 and 192.168.1.129.
Let’s assume we replace 192.168.1.1 with 172.16.1.1. Can we assume that 192.168.1.129 is in the same subnet, and replace it with 172.16.1.129? Uhm, no. We don’t know if the network mask is 255.255.255.0, or maybe 255.255.255.128, or some other mask for an even smaller subnet. So unless someone tells us what the subnet mask for each IP address is, we have a problem.
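To make the ambiguity concrete, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `same_subnet` is made up for this illustration and has nothing to do with how Tracewrangler works internally; it just shows that the answer depends entirely on a mask the packets never tell us.

```python
import ipaddress

def same_subnet(a, b, prefix_len):
    """Return True if both IPv4 addresses fall into the same network
    for the given prefix length (hypothetical helper for illustration)."""
    net_a = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    net_b = ipaddress.ip_network(f"{b}/{prefix_len}", strict=False)
    return net_a == net_b

# Same two addresses, two plausible masks, two different answers.
with_24 = same_subnet("192.168.1.1", "192.168.1.129", 24)  # 255.255.255.0
with_25 = same_subnet("192.168.1.1", "192.168.1.129", 25)  # 255.255.255.128
```

With a /24 mask the two addresses share a network; with a /25 they do not, and a capture file alone usually doesn’t say which one applies.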
With IPv6, this is not a problem anymore. The prefix is always 64 bits wide, and the interface ID holds the remaining 64 bits. So when I see two IPv6 addresses with the same 64 bit prefix I can sanitize them and put them in the same new “subnet” by assigning the same replacement prefix. So I thought I’d do a quick overhaul of the IPv6 sanitization form to allow that kind of easy randomization setting instead of reusing the IPv4 mechanism.
Of course the “full randomization” feature is still available, randomizing the full 128 bits of each address. But the default is now “randomize both and synchronize prefixes”. This means that each newly found prefix is replaced with a new, random prefix, and the interface ID (the “host part”) is randomized as well. IPv6 addresses with the same original prefix still end up with the same new prefix, which keeps the IPs in the same network automatically. If you want, you can even keep the original interface IDs if you have no problem with them staying the same (this is not the default because EUI64 based addresses can expose MAC addresses). Nice.
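The idea of synchronized prefixes can be sketched in a few lines of Python. This is not Tracewrangler’s code; the class `PrefixSyncSanitizer` and its `keep_interface_ids` option are names I made up for the illustration, and the sketch deliberately ignores special ranges such as link local and multicast addresses, which need separate handling.

```python
import ipaddress
import secrets

MASK64 = (1 << 64) - 1

class PrefixSyncSanitizer:
    """Hypothetical sketch: random replacement with synchronized /64 prefixes."""

    def __init__(self, keep_interface_ids=False):
        self.keep_interface_ids = keep_interface_ids
        self.prefix_map = {}   # original 64 bit prefix -> replacement prefix
        self.address_map = {}  # original address -> replacement address

    def sanitize(self, text):
        original = ipaddress.IPv6Address(text)
        # An address seen before must always map to the same replacement.
        if original in self.address_map:
            return self.address_map[original]

        value = int(original)
        prefix, interface_id = value >> 64, value & MASK64

        # Each new prefix gets one random replacement, shared by every
        # address that carried the same original prefix. (A collision
        # between two random 64 bit prefixes is possible but negligible.)
        if prefix not in self.prefix_map:
            self.prefix_map[prefix] = secrets.randbits(64)

        if not self.keep_interface_ids:
            interface_id = secrets.randbits(64)

        replacement = ipaddress.IPv6Address(
            (self.prefix_map[prefix] << 64) | interface_id
        )
        self.address_map[original] = replacement
        return replacement
```

Calling `sanitize("2001:db8:1:2::10")` and `sanitize("2001:db8:1:2::20")` on the same instance gives two addresses that share their upper 64 bits, while an address from a different original prefix lands in a different new “subnet”.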
Not so nice…
The trouble started when I began changing the actual sanitization/address replacement logic. I realized that my old code had not worked correctly under all circumstances, so what I thought would be a quick change turned into almost two full weekend nights of overhauling code.
You wonder what’s so complicated about replacing a couple of IPv6 addresses? Well, let’s do a little experiment/challenge (if you like): take a look at the following IPv6 Neighbor Solicitation packet (kudos to Cloudshark for providing the neat nesting feature for single frames). So, can you sanitize the IPv6 addresses so that an analyst examining the result reaches exactly the same findings as with the original?
By the way, here’s the file in Cloudshark, in case you want to see more or download it: https://www.cloudshark.org/captures/f4909502e0b9
If you’ve tried (or chose not to bother), read on, and I’ll lead you through the steps required. There’s a lot more to this than meets the eye.
- The first address is the source address, which is fe80::f070:aeff:feec:6830. Two things are noticeable about this address: it’s a link local address, and it’s based on the station’s MAC address (EUI64). From a sanitization point of view we might argue that we can leave this address as it is, because it is just a link local address and cannot be accessed from outside the local network. But since it is based on a MAC address it may allow identification of the network card and the PC it is used by, and we probably don’t want that. So it needs to be replaced. Of course we have to replace the original IPv6 address with another link local address, but we also need to keep it an EUI64 address, based on a MAC. So our first step turns into two sub steps: sanitizing the original MAC address it is based on, and generating a new IPv6 address based on the new sanitized MAC. Which automatically means that we have to store MAC address replacements as well as IPv6 address replacements, and replace all other occurrences of the same original MAC with the new MAC.
- The destination address is ff02::1:ffec:7458, which is a Multicast address, and a very special one: the Solicited Node Multicast address. It is a really important address which is used to find neighbor systems when their link layer address is unknown (remember: no more ARP for IPv6, it’s all ICMPv6 and Solicited Node Multicast now). Now this IP is a troublemaker, and not in a small way. First of all, the address needs to stay a Multicast address (easy), it needs to be replaced (still easy), and it needs to stay synchronized with the new target address it is the solicited node address for (ouch!). And we don’t know that target address yet (double ouch!). Last but not least, the destination MAC address also needs to be synchronized with the solicited node address. You can see both share the same last four octets: ff:ec:74:58.
- The target address is found in the ICMPv6 layer: fe80::f080:37ff:feec:7458. As you can see, it has the same last three bytes as the solicited node multicast address, as it should have. And after sanitization, this needs to be true again! To make things more complicated, this is yet another EUI64 address, so we need to deduce the MAC address it belongs to, create a new MAC, store the mapping, and create a new link local address based on the new MAC. Then, when we have the new address, we need to determine its Solicited Node Multicast address for step 2.
- The ICMPv6 layer also holds a Source link-layer address option, which is the MAC address the Neighbor Solicitation was sent from. You can see that it’s the same as the source MAC, so this needs to be replaced in sync. And remember, the source MAC is part of the EUI64 formatted source IPv6 address, so those two are already linked to each other.
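The dependencies from the four steps above can be sketched in Python. Again, this is only an illustration and not Tracewrangler’s actual code: all function names, the `mac_map` dictionary and the policy of generating random locally administered MACs are assumptions of mine. The sketch also assumes that both the source and the target address are EUI64 based, as they are in the example packet.

```python
import ipaddress
import secrets

mac_map = {}  # original MAC -> replacement MAC, kept across all frames

def mac_from_eui64(address):
    """Recover the MAC an EUI64 based IPv6 address was built from."""
    iid = ipaddress.IPv6Address(address).packed[8:]
    if iid[3:5] != b"\xff\xfe":
        raise ValueError(f"{address} is not EUI64 based")
    # Undo the flipped universal/local bit and drop the ff:fe filler.
    return bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]

def link_local_from_mac(mac):
    """Build the fe80::/64 EUI64 address for a MAC."""
    iid = bytes([mac[0] ^ 0x02]) + mac[1:3] + b"\xff\xfe" + mac[3:6]
    return ipaddress.IPv6Address(bytes.fromhex("fe80" + "00" * 6) + iid)

def solicited_node(address):
    """ff02::1:ffXX:XXXX, using the low 24 bits of the address."""
    low24 = ipaddress.IPv6Address(address).packed[13:]
    return ipaddress.IPv6Address(bytes.fromhex("ff02" + "00" * 9 + "01ff") + low24)

def multicast_mac(address):
    """Ethernet multicast MAC: 33:33 plus the last four address octets."""
    return b"\x33\x33" + ipaddress.IPv6Address(address).packed[12:]

def replace_mac(mac):
    """Look up or create a random replacement MAC (assumed policy:
    locally administered, unicast)."""
    if mac not in mac_map:
        first = (secrets.randbits(8) | 0x02) & 0xFE
        mac_map[mac] = bytes([first]) + secrets.token_bytes(5)
    return mac_map[mac]

def sanitize_neighbor_solicitation(src_ip, target_ip):
    """Derive all replacement values in dependency order."""
    new_src_mac = replace_mac(mac_from_eui64(src_ip))
    new_target_ip = link_local_from_mac(replace_mac(mac_from_eui64(target_ip)))
    # The destination can only be computed once the new target is known.
    new_dst_ip = solicited_node(new_target_ip)
    return {
        "src_mac": new_src_mac,
        "src_ip": link_local_from_mac(new_src_mac),
        "dst_ip": new_dst_ip,
        "dst_mac": multicast_mac(new_dst_ip),
        "target_ip": new_target_ip,
        "source_link_layer": new_src_mac,
    }
```

Fed with the addresses from the packet, `mac_from_eui64("fe80::f070:aeff:feec:6830")` yields f2:70:ae:ec:68:30, and `solicited_node("fe80::f080:37ff:feec:7458")` yields ff02::1:ffec:7458, matching the original destination address. The important part is the order: the target address has to be sanitized before the IPv6 destination and the destination MAC can be written.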
Now, all of those steps are normally performed top down, meaning from the ICMPv6 layer down to Ethernet. This is because the lower layers need to know what they’re carrying. Problem is, with sanitization you often need to look at things in other layers, and cannot simply replace values only within the current layer. This is especially true for IPv6, because it has a lot of “interaction” between layer 2 and 3, and even 4 (if I consider ICMP layer 4).
Sanitizing packets is not as easy as it may seem. If you’re just patching some address bytes you’re more often than not going to end up with a capture file that makes no sense anymore, unless you are very thorough. This is especially true when there are as many dependencies between MAC, IP address and ICMP as you have with IPv6. You also need a database to look up previous replacements, because otherwise you’ll lose the dependencies between multiple frames. Some tools try to avoid this by using hashing algorithms on addresses (which keeps them consistent without the need for a database), but that kind of replacement fails miserably in cases like the one I just presented: the hash of a target address has no relation to the hash of its solicited node address, so the derived values no longer match.
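A tiny Python experiment shows why per-address hashing breaks the Neighbor Solicitation. The hashing scheme here (salted SHA-256 truncated to 16 bytes) and the names `hash_address` and `demo-salt` are made up for the demonstration; real tools may hash differently, but any scheme that treats each address in isolation has the same issue.

```python
import hashlib
import ipaddress

def hash_address(address, salt=b"demo-salt"):
    """Hypothetical database-free replacement: hash each address on its own."""
    packed = ipaddress.IPv6Address(address).packed
    return ipaddress.IPv6Address(hashlib.sha256(salt + packed).digest()[:16])

def solicited_node(address):
    """ff02::1:ffXX:XXXX, using the low 24 bits of the address."""
    low24 = ipaddress.IPv6Address(address).packed[13:]
    return ipaddress.IPv6Address(bytes.fromhex("ff02" + "00" * 9 + "01ff") + low24)

target = "fe80::f080:37ff:feec:7458"
destination = "ff02::1:ffec:7458"

# In the original packet the destination is derived from the target.
relation_before = solicited_node(target) == ipaddress.IPv6Address(destination)

# After hashing both independently, the derivation no longer holds:
# the hashed destination is just 128 unrelated bits.
hashed_target = hash_address(target)
hashed_destination = hash_address(destination)
relation_after = solicited_node(hashed_target) == hashed_destination
```

Hashing does keep each single address consistent across frames, but `relation_after` comes out false: the new destination is no longer the solicited node address of the new target, and most likely isn’t even a multicast address anymore.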