Determining frame forwarding latency
In some situations the question arises of how much a frame was delayed by a device it has to pass through, e.g. firewalls, load balancers and sometimes even routers and switches. Novice network analysts often think that for this you need to synchronize the clocks of the capture PCs down to microseconds or even better, but that is not necessary for this kind of measurement. It is possible to capture the packets with completely different time settings on the capture PCs left and right of the device whose delay you need to determine.
The trick
The trick to determine the device delay is pretty simple: you just need to find a packet that is answered by another packet coming back. As long as you can match request and response on both sides of the device you can do the math. Of course you need to capture on both sides of the device at the same time, so that both captures contain the very same request and response packets; you can’t capture one side first and the other side second. This means that you usually need two capture PCs, one for each side of the device in question. Take a look at the following graph:
As you can see, you can determine the device delay by subtracting the smaller delta time from the larger delta time. The result is the time it took the two packets (request and response) to pass through the device, represented by the two small boxes in the graph above. For the one-way delay (which is really the average of both directions), just divide the result by 2.
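To show why the clock offset between the two capture PCs doesn't matter, here is a minimal Python sketch of the calculation. The function name and all timestamps are made up for illustration, and it assumes that each capture PC's clock runs steadily while the request and response are being recorded.

```python
def device_delay(inner_req, inner_resp, outer_req, outer_resp):
    """Return (delay_for_both_packets, one_way_average) in seconds.

    inner_* are timestamps from the capture PC on the requesting side,
    outer_* are timestamps from the capture PC on the other side.
    Each pair only has to be consistent with itself, not with the other.
    """
    inner_delta = inner_resp - inner_req  # includes the device twice
    outer_delta = outer_resp - outer_req  # does not include the device
    total = inner_delta - outer_delta
    return total, total / 2


# Hypothetical timestamps: the outer capture PC's clock is off by an hour.
offset = 3600.0
total, one_way = device_delay(
    inner_req=5.000000, inner_resp=5.020500,
    outer_req=5.000100 + offset, outer_resp=5.020300 + offset,
)
print(f"both packets: {total * 1e6:.0f} us, one way: {one_way * 1e6:.0f} us")
```

Since each delta is calculated from timestamps of the same clock, the offset cancels out before the two deltas are compared.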
Example
For a very simple example I sent a ping (yes, “one ping only!” 🙂 ), which gets NATted, to make it a little more fun. This is the internal capture of the NAT device:
And here is the same ping after being NATted, on the external side of the NAT device:
You can’t see the absolute time (as I’ve chosen to cut it from the screen shot, because it just doesn’t matter), but you can see the delta times between the echo request going out to the internet, and the echo reply coming back.
Internal Ping Delta Time: 0.031064 seconds
External Ping Delta Time: 0.030690 seconds
Difference between the two: 0.000374 seconds
So the total delay the device added to the echo request and the echo reply is 374 microseconds. Divided by 2 that’s 187 microseconds, which is the average one-way delay caused by the NAT device. Of course it may be useful to take multiple samples to get a more exact average, but that is just routine work – knock yourself out if you want. I’m done here 🙂
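If you do want to take multiple samples, a rough sketch of the routine work could look like this in Python. Only the first pair of delta times is the measurement from the ping above; the other two pairs are invented for illustration.

```python
from statistics import mean

# (internal delta, external delta) in seconds. The first pair is the
# ping from this post; the other two are hypothetical samples.
samples = [
    (0.031064, 0.030690),
    (0.029870, 0.029510),
    (0.032145, 0.031755),
]

# One-way average per sample: half of the difference between the deltas.
one_way_delays = [(inner - outer) / 2 for inner, outer in samples]

for delay in one_way_delays:
    print(f"{delay * 1e6:.0f} us")
print(f"average one-way delay: {mean(one_way_delays) * 1e6:.0f} us")
```

In practice you would export the delta times from both captures instead of typing them in, but the math stays the same.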
Jasper,
If I can, I take a tcpdump or iptrace with the option of capturing all packets on all interfaces; both support this function. Then I compare the inbound and outbound times of the packets. Same thing really, but you don’t need the two PCs.
Chris,
thanks for your comment!
Wireshark/dumpcap also allows capturing on multiple interfaces, so you could use one PC with at least two cards and capture on both sides of the device in question. What I never do is capture ON the device in question, because if I suspect that it is the source of my problems I cannot trust anything it does, including recording its own packets. Especially when it comes down to timings.
The delta time trick also works for large distances, e.g. if you want to measure latency between two distant capture points to see how much packets are delayed in a certain network segment. In that case you usually have no chance of placing a single capture device but need to deploy two. And the delta time calculation still works in that case.
Thanks, Jasper. It’s a really simple way.
The only thing you have to consider is that the frame forwarding delay may not be equal in both directions, depending on queueing or firewall/NAT rules.
So instead of 2 x 0.5 s (let’s say), you could have 0.1 s + 0.9 s, which gives the same total.
Hi Vladimir,
yes, that’s correct – this method determines the average of the frames passing the device, so if one way takes significantly longer than the other the result will not be as accurate as it could be. For that kind of scenario you’d need to use a capture device that is capable of capturing on both sides at the same time with specialized capture cards to avoid time problems you’d run into with normal PC NICs.
I also like to do bi-directional traceroutes from either end on a multi-hop connection to determine the device and link to look at closely.