Diagnosing intermittent “network” problems

There is one question customers ask me all the time: can I help diagnose a problem on the network? My answer has two parts:

  1. If we can capture the problem situation in packets, I will find it
  2. When I find it, I’ll tell you if it’s a network problem (which, in my experience of over 10 years, is the case only about 20% of the time)

The trouble is: some problems are not easy to capture in packets, because you don’t know the correct capture location (usually meaning “there’s too many possible locations”), or the exact time at which the packets showing the symptoms need to be recorded, or both.

A look at a portable USB3 network TAP

A while ago I wrote a post for LoveMyTool about how I managed to power my Garland Gigabit TAP with a USB cable, which got me into a discussion about the ProfiTap USB3 device on LinkedIn. I had used 100Mbit USB2 ProfiTap devices before and had some issues with them on Linux, so I was a bit skeptical about the new ProfiShark 1G as well. In the end, the nice people at Comcraft offered to send me a sample, and I am always happy to get my hands on interesting capture solutions to see how they perform.

Determining frame forwarding latency

In some situations the question arises how much a frame was delayed by a device it has to pass through, e.g. firewalls, load balancers, and sometimes even routers and switches. Novice network analysts usually think that this requires synchronizing the clocks of the capture PCs down to microseconds or better, but that is not necessary for this kind of measurement. It is possible to capture the packets with completely different time settings on the capture PCs to the left and right of the device whose delay you need to determine.
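One way to see why unsynchronized clocks can be good enough is the following minimal Python sketch. It is an illustration under stated assumptions, not necessarily the method used in the full post: it assumes a request and its matching response can be identified in both captures, and that neither clock drifts noticeably during the few milliseconds of one exchange. Each side's round-trip time is then computed from a single clock, so the offset between the two capture PCs cancels out. The `Exchange` class, the function name, and all timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """One request/response pair as seen at a single capture point."""
    request_ts: float   # capture timestamp of the request, in seconds
    response_ts: float  # capture timestamp of the matching response, in seconds

    @property
    def rtt(self) -> float:
        # Both timestamps come from the same clock, so its offset cancels here.
        return self.response_ts - self.request_ts


def device_round_trip_delay(near: Exchange, far: Exchange) -> float:
    """Delay added by the device for request and response combined.

    `near` is captured on the side where the request enters the device,
    `far` on the side where it leaves. The far-side round trip is contained
    in the near-side round trip, so the difference is the time the device
    spent forwarding the request plus the time it spent forwarding the response.
    """
    return near.rtt - far.rtt


# Hypothetical timestamps: the far-side capture PC's clock is about an hour off.
near = Exchange(request_ts=100.000000, response_ts=100.004200)
far = Exchange(request_ts=3700.000150, response_ts=3700.004050)

delay = device_round_trip_delay(near, far)
print(f"device added {delay * 1e6:.0f} us for both directions combined")
```

Note that this only yields the sum of both directions. Splitting it into a per-direction delay needs an additional assumption, for example that the device delays both directions equally.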

How millisecond delays may kill database performance

Mike, an old buddy of mine, is one of the best database application development consultants I have ever met. We worked together for the same company for a couple of years before I got into network analysis and he started his own company. A couple of months ago I found out that there was going to be a conference in my home town where Mike was on the organization team. After some friendly banter on Twitter about him having to come to my city (Düsseldorf; which guys from Cologne like Mike don’t like ;-)) he told me that I should turn in a proposal for a talk. I said I could do that, but not on any database development topic – but maybe a generic network application performance talk might be interesting for those attending. So I did, and it got rejected, despite Mike advocating for me. Darn.

Well, it’s a nice topic for a blog post nonetheless. So here we go.

The drawbacks of local packet captures

Probably the most common way of capturing network data is not a decision between SPAN or TAP – it is Wireshark simply being installed on one of the computers that need to be analyzed. While this is an easy way to capture network packets, it is also an easy way to get “wrong” results, because there are a lot of side effects when capturing packets directly on a computer. I discussed a lot of these side effects in my Sharkfest 2013 talk “PA-14: Top 5 False Positives” already, but let’s go check them out again.