Capturing damaged frames
One of the questions that I often got in my network analysis classes was how to capture damaged frames. It is an obvious thing to ask, since frames with bad checksums will most certainly have to be retransmitted or are at least a nice indicator that something went wrong while transporting the frame.
The answer to that question has changed since the days I was working as a sales engineer selling Sniffer Pro and Sniffer Distributed by Network Associates (which was split into McAfee and Network General again, the latter of which is now owned by NetScout). Back in the days of hubs, Sniffer had a selling point in providing special network card drivers that could make the capture card accept damaged frames, which is something “normal” network cards do not do. So back then (2003-2005) I usually said “sure, get the Sniffer Pro or Sniffer Distributed, install the special Xircom/Adaptec 620xx drivers, and you can pick up those damaged frames”.
Well, today, the story is a little different: Xircom doesn’t exist anymore (they were bought by Intel), and Adaptec doesn’t do network cards any longer (as far as I can tell). Also, who’s going to buy Sniffer Pro laptop software when there’s Wireshark for free?
Anyway, back to the topic of capturing packets with CRC errors. The bad news is: you can’t, unless you’ve got special equipment – a normal PC/Mac won’t do. But the good news is: you don’t need special equipment in most cases anyway, because the damaged frames will most likely never make it to the capture device in the first place. The reason for that lies in the way most switches work these days: they wait until the full frame has been received on the incoming port before sending it back out on the outgoing port. That mode is called “Store and Forward” and has the advantage that the switch can check the Ethernet checksum (also called “Frame Check Sequence” or “FCS”) and avoid sending damaged frames any further. So if a damaged frame comes in, it will be dropped, and the analyzer will never get it because it doesn’t make it to the cable leading to the capture device.
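To make the store-and-forward check a little more concrete, here is a minimal Python sketch of what happens to a frame once it has been received completely. It assumes the frame is available as raw bytes with the four FCS bytes still attached; the function names and the sample frame are made up for illustration, and a real switch does all of this in hardware.

```python
import struct
import zlib


def ethernet_fcs(frame_without_fcs: bytes) -> bytes:
    """Return the 4-byte FCS for a frame (destination MAC through payload)."""
    # Ethernet uses the same CRC-32 as zlib; the FCS goes on the wire
    # least significant byte first.
    return struct.pack("<I", zlib.crc32(frame_without_fcs) & 0xFFFFFFFF)


def fcs_is_valid(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over everything but the last 4 bytes and compare."""
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return ethernet_fcs(body) == fcs


def store_and_forward(frame_with_fcs: bytes):
    """Hypothetical switch behaviour: forward good frames, drop damaged ones."""
    return frame_with_fcs if fcs_is_valid(frame_with_fcs) else None


# Made-up frame: broadcast destination, arbitrary source, EtherType 0x0800, padding.
body = bytes.fromhex("ffffffffffff" "020000000001" "0800") + bytes(46)
good = body + ethernet_fcs(body)
damaged = bytes([good[0] ^ 0x01]) + good[1:]  # flip a single bit "on the wire"

print(store_and_forward(good) is not None)  # the intact frame is forwarded
print(store_and_forward(damaged) is None)   # the damaged frame is dropped
```

The damaged frame never leaves the function, which is exactly why the analyzer behind the switch has nothing to capture.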
So far, so good. Now let’s assume the frame gets damaged on the outgoing port towards the analyzer but was fine before – guess what? You’ll still not be able to capture it, because as soon as the frame arrives at the network card of your PC/Mac, the card will realize that it is damaged and discard it on its own. It will not forward the frame to the operating system, and that means Wireshark (or any other software you run) will never even see it. That doesn’t matter much anyway, because if the frame gets damaged only on the last cable towards the analyzer, you’d capture something with no relevance to the real network (except for the fact that the switch port seems to be damaged). This is where Sniffer (as well as other commercial sniffers) had an advantage, because its customized drivers could force the card to pass the damaged frame up to the OS. That made a lot of sense back when we used hubs in our networks, but not so much today, when we use switches.
Now, does this all mean that everything is lost unless you have special capture equipment? The answer is no. First of all, if you suspect damaged frames, you should not try to capture the frames, because that’s way too complicated. Instead, check the switches for errors on their port counters – that’s much easier. Network analysis doesn’t always mean that you have to run Wireshark (or any other packet analysis tool) – it means you need to determine the problem as fast as possible, with any tool you have. In this case, the switch port statistics are much more useful than packet captures, e.g. on a Cisco device (in this case with just one CRC error out of 17,798,612 frames):
Switch#sh int gi0/12
GigabitEthernet0/12 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet, address is 6073.5cb1.d10d (bia 6073.5cb1.d10d)
  Description: LabSw1Port12
  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 00:00:01, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 4000 bits/sec, 2 packets/sec
  5 minute output rate 3162000 bits/sec, 756 packets/sec
     17798612 packets input, 4621907918 bytes, 0 no buffer
     Received 9763081 broadcasts (9755288 multicasts)
     0 runts, 0 giants, 0 throttles
     1 input errors, 1 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 9755288 multicast, 0 pause input
     0 input packets with dribble condition detected
     1322271091 packets output, 590272371495 bytes, 0 underruns
So no, I would not replace this switch, as it is doing fine 😉 Another example (a HP 2848 switch, since I’m pretty much vendor neutral):
ProCurve Switch 2848# sh int 11
 Status and Counters - Port Counters for port 11
  Name  : LabSw2Port11
  Link Status     : Up
  Totals (Since boot or last clear) :
   Bytes Rx        : 979,213,062          Bytes Tx        : 4,238,768,731
   Unicast Rx      : 3,386,949,054        Unicast Tx      : 1,700,237,211
   Bcast/Mcast Rx  : 11,366               Bcast/Mcast Tx  : 23,972,752
  Errors (Since boot or last clear) :
   FCS Rx          : 142                  Drops Rx        : 0
   Alignment Rx    : 0                    Collisions Tx   : 0
   Runts Rx        : 0                    Late Colln Tx   : 0
   Giants Rx       : 0                    Excessive Colln : 0
   Total Rx Errors : 154                  Deferred Tx     : 0
  Rates (5 minute weighted average) :
   Total Rx (bps) : 5370608               Total Tx (bps) : 5605024
   Unicast Rx (Pkts/sec) : 27             Unicast Tx (Pkts/sec) : 53
   B/Mcast Rx (Pkts/sec) : 0              B/Mcast Tx (Pkts/sec) : 2
   Utilization Rx : 00.53 %               Utilization Tx : 00.56 %
Nothing much to worry about here, either: there are a couple of CRC errors, but the switch has been running for half a year or more by now, so those numbers are still pretty small.
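To put “pretty small” into numbers, the error counters can be set against the number of frames received. The following Python sketch does that for both ports; the function names are made up, the regular expressions only assume the IOS-style wording shown in the Cisco output, and using unicast plus broadcast/multicast frames as the ProCurve frame total is my own assumption.

```python
import re


def crc_error_ratio(crc_errors: int, frames_received: int) -> float:
    """Fraction of received frames that failed the FCS check."""
    if frames_received <= 0:
        raise ValueError("frames_received must be positive")
    return crc_errors / frames_received


def parse_cisco_counters(show_interface_text: str):
    """Pull the 'packets input' and 'CRC' counters out of IOS-style text."""
    frames = re.search(r"(\d+) packets input", show_interface_text)
    crc = re.search(r"(\d+) CRC", show_interface_text)
    if not frames or not crc:
        raise ValueError("counters not found")
    return int(crc.group(1)), int(frames.group(1))


# The two relevant lines from the Cisco output above.
cisco_text = (
    "17798612 packets input, 4621907918 bytes, 0 no buffer "
    "1 input errors, 1 CRC, 0 frame, 0 overrun, 0 ignored"
)
crc, frames = parse_cisco_counters(cisco_text)
cisco_ratio = crc_error_ratio(crc, frames)
print(f"Cisco gi0/12: {cisco_ratio:.2e}")

# ProCurve port 11: FCS Rx against unicast plus broadcast/multicast frames received.
hp_frames = 3_386_949_054 + 11_366
hp_ratio = crc_error_ratio(142, hp_frames)
print(f"ProCurve port 11: {hp_ratio:.2e}")
```

Both ports come out at a handful of bad frames per hundred million received, which is why neither of them is worth replacing.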
By the way, Wireshark (like most other network analyzers) does not keep the Ethernet FCS in the trace file – the frame must have arrived with a good FCS anyway, because otherwise Wireshark/dumpcap wouldn’t have received it at all. But if you have a device that does keep the FCS and writes it into the capture file, this is what it would look like in Wireshark:
Looks like Wireshark calls the FCS a “Trailer” in this case. In the decode, these bytes are in fact the last four bytes in the frame; Wireshark just shows them as part of the Ethernet header.
You can make Wireshark interpret the “trailing” bytes as an FCS field by enabling the Ethernet protocol preference “Assume Packets Have FCS”.
Thanks, Sake, I didn’t check that setting, it was kinda late 🙂
That was informative. We have a similar issue, and we need to capture the corrupted frames. The device (Cisco) is connected to a switch, and we are running SPAN. But the switch is dropping these frames/packets before they are forwarded to SPAN. We thought of connecting a PC directly to capture these corrupted frames.
But reading this article, I would say we have no way of reading the corrupted frames. Is that the right conclusion?
A PC will not be good enough unless it is equipped with a specialized capture card (e.g. Riverbed TurboCAP, Napatech, Fiberblaze). Standard network cards do not help to capture corrupted frames, correct. You’ll also be needing a full duplex TAP to send the damaged frames to the capture card.
Linux, Wireshark, and a Realtek 8169 with the rx-fcs and rx-all options enabled are all you need to capture damaged frames; no need for any overpriced equipment. Works on ANY PC.
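For reference, rx-fcs and rx-all are feature flags that are toggled with ethtool on Linux. The Python sketch below only builds and prints the commands rather than running them; the interface name "eth0" and the function name are placeholders, and whether a given driver actually lets you switch these features on is something you have to check on your own hardware.

```python
import shlex


def rx_fcs_commands(interface: str):
    """Build ethtool commands that ask the NIC to keep the FCS and bad frames."""
    return [
        # rx-fcs: hand the 4 FCS bytes to the OS instead of stripping them.
        ["ethtool", "-K", interface, "rx-fcs", "on"],
        # rx-all: also deliver frames the NIC would normally discard.
        ["ethtool", "-K", interface, "rx-all", "on"],
        # List the feature state afterwards to see whether the driver accepted it.
        ["ethtool", "-k", interface],
    ]


for cmd in rx_fcs_commands("eth0"):  # "eth0" is a placeholder interface name
    print(shlex.join(cmd))
```

Run the printed commands as root. If the feature list shows rx-fcs or rx-all as off and fixed, the driver does not support changing them.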
Hi Oliver, interesting, I didn’t know that was possible – I’ll have to try that one myself. Thanks!
We are considering inserting a “TAP” into an RGMII interface. To do that, we would split those signals and route them to an external PHY. One feature we want to create is the ability to capture error frames (CRC errors).
I searched the whole internet and could not find any info on the following:
If a PHY device gets packets with FCS errors, will it forward the packets or drop them? I guess PHYs are not “store and forward”, so I assume they will forward them. Some of the Marvell PHYs count CRC errors to inform users, but I couldn’t find a word on whether they drop the frames or send them further.
You reviewed the Profitap, and we are thinking of feeding the Profitap.
So, the idea is:
1. Use 2 PHYs to capture full duplex RGMII (keep the signal lengths up to the PHYs similar so as not to influence the packet sequence order on full duplex RGMII)
2. Connect the PHYs to Profitap.
3. Analyse on PC connected to Profitap over usb3.0
Would this work? 🙂
Sorry, somehow your comment got caught up in the SPAM queue and I had to rescue it first.
I haven’t worked with RGMII interfaces myself, but a PHY normally doesn’t drop error packets – that’s usually a functionality implemented in the NIC drivers. So my expectation would be that your setup should work, but you’ll only really know when you try 🙂