Capturing network packets is the first step in any kind of network analysis or network forensics situation. Few people ever consider this an important step, but this is really where the analysis result can be heavily distorted if you’re not careful. During Sharkfest 2016 I talked about how important the capture process and its preparations are, and decided to start a series of blog posts about how to do network packet captures. So here we go with the first one, starting with basic network capturing in a wired environment.
First Things First
Capturing network packets is something you do for a number of reasons:
- Checking out what a certain program does on the network
- Troubleshooting connectivity problems, e.g. refused or aborted TCP sessions
- Diagnosing slow performance of interactive or bulk transfer applications
- Network Forensics, building a timeline of events happening on the network
- Deep Packet Inspection, looking for known attack patterns (“Indicators of Compromise”)
- Building flow data records (Metadata, NetFlow)
- Reverse engineering network protocols
- Reconstruction & extraction of files transported over the network
- Stealing credentials or other security tokens (if you’re looking for this, check this post)
As you can see, there’s a wide range of topics here. The key point to ask yourself when capturing network packets is always this:
How precise do you need your capture to be?
When you don’t care about precision, you can consider capturing packets on a client PC/Mac, a server, or any other device that is part of the communication. That’s what I call a “local capture”, because it picks up the packets on the device as they are created or passing by, and it has two important aspects:
- it’s very very easy to do (you only need capture software installed)
- it messes up the capture results worse than any other capture setup by picking up distorted results
If you can live with the distorted packets, do a local capture. If you can’t, and you need your capture to be precise (which is especially true for most troubleshooting cases where timings or connection aborts are involved), you need to do something better. It boils down to having a dedicated capture device at hand, that is completely passive (meaning: not creating/sending any packets itself) and which records all the packets it sees on its network capture card with the highest precision possible. So we’ll look at capture setups using a dedicated capture device for now.
Remember the days when Ethernet was just one of the layer 2 network protocols we had to connect computers, servers and other nodes to each other? Some of you may even still remember the Thick Yellow Cable you had to literally drill into to connect new nodes to the network using a “Vampire Tap”:
Or that 10Base2 coaxial cable with the T-connectors and the cable end terminators? Yeah, life was simple back then – well, not really: the network was pretty unreliable, the terminators failed all the time (at least it felt like they were), and if you had 5 hours for a Friday night LAN party it took at least 3 hours to get the network up and running (having “known good” spare terminator resistors was vital back then ;-) ).
But: capturing network packets was really easy in those days. Why? Because of CSMA/CD (is it weird that I can still tell you off the top of my head that this stands for “Carrier Sense Multiple Access with Collision Detection”?). So what does that mean? It means: everyone on the network sees all the packets there are. It’s a shared medium – if anyone sends a packet, everybody else can read it while it travels down the cable:
As you can see, the network card of the “Capture Device” will see everything that happens on the network, as will any other device present and running.
Connecting everything to one long cable with T-Connectors got more and more inconvenient over time, especially since it wasn’t easy to add new nodes to an existing cable quickly. So instead of having to open the cable, insert a new T-Connector and connect everything again, so-called “hubs” were introduced. They’re basically boxes with ports where you can plug nodes in and out at any time. But it was still the same game: hubs simply sent all packets entering the hub device out on all ports. Network capture heaven – and sometimes you can still find hubs being used years after they should have been replaced by switches:
Still powered on at my son's school. pic.twitter.com/vl1VmBsrEl
— Matthew Norwood (@matthewnorwood) February 6, 2016
Now, the problem with hubs is simple: in a shared medium, collisions occur when two nodes try to talk at the same time (think Walkie Talkie: only one can press the talk button or nobody will hear anything). That’s what’s called “half duplex” – listen, or talk, but not both.
Shared Ethernet got into trouble when more and more nodes were attached to the network, and more and more packets were being sent. Collisions would happen so often that it really hurt performance – some say, 30-40% total utilization was the maximum you could get on a shared medium. BTW, Token Ring had much higher performance, but it wasn’t a free & cheap standard like Ethernet, so guess who survived? ;-)
Anyway, capturing packets in the days of hubs was easy, but it also meant that the network analyst had to deal with problems caused by the shared medium: “physical” errors. A collision was something that could happen at any time, and it meant that both nodes trying to send a packet had to back off, wait a random time, and try again – so the rate of collisions on a network was a factor to be considered for network performance. And you could pick up funny packets like this one:
You can see the destination MAC starting with ca:5e:00:50 followed by a lot of 0x55 values. A series of hexadecimal 55 looks like this in binary:
The binary pattern 0101010101[…] is an Ethernet preamble which is used to signal “Hey, I want to send a packet” and in the decode shown above in Figure 2 it collided into a packet that was already starting – so you can literally see a collision that happened while the packet was being captured.
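If you want to check the hex-to-binary step yourself, here is a minimal Python sketch. The run length of seven bytes is my choice for the demo (it matches the length of a standard Ethernet preamble before the start-of-frame delimiter); the capture above may show more or fewer of them.

```python
# A run of 0x55 bytes, as seen in the hex dump of the collided packet.
# Seven bytes is an assumption for this demo, not taken from the capture.
preamble = bytes([0x55] * 7)

# Render every byte as eight binary digits.
bits = " ".join(f"{byte:08b}" for byte in preamble)
print(bits)
```

Each 0x55 turns into 01010101, so a series of them gives the alternating bit pattern described above.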
Moar Speeeed! with Switches
Now, to improve Ethernet performance, switches were created. And if hubs are like Walkie Talkies, switches are like telephones: you can talk and listen at the same time (well, technically. In reality, men usually can’t. Women often can. They do have better multitasking features I guess ;-) ). Being able to send and receive at the same time is called “full duplex”.
To be able to allow full duplex communication, switches act like a telephone switchboard operated by persons connecting calls: if a packet enters the switch, it will be transported only to the destination port, while all other ports will never see the packet (including a packet capture device wondering what’s going on), just like a call is only routed to the person being called:
A network capture device on a switch will not get any packets belonging to a conversation (“Unicast”) between two nodes on that switch, or between nodes on other switches that will pass the switch the device is connected to. And that’s simply because the capture device is not the real target for the packets. The “Target” is specified by the destination MAC address, in case you’re wondering, not by IP address or anything else – switches are layer 2 devices, which means they basically work with MAC addresses. So the capture device will only receive packets that are sent to all nodes (“Broadcasts”) and maybe some packets that are sent to groups of nodes (“Multicast”), depending on how the switch handles Multicast packets (I’m not going into Multicast aware switching at this point). And the reason for that is that the switch has an “inventory” of all nodes connected to its ports, so it knows where to send Unicast packets to.
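To make the Unicast/Broadcast/Multicast distinction a bit more tangible, here is a small Python sketch that classifies a destination MAC address the way layer 2 sees it: all bits set means Broadcast, and a set lowest bit in the first byte (the group bit) means Multicast. The function name `classify_mac` and the sample addresses are made up for illustration.

```python
def classify_mac(mac: str) -> str:
    """Return 'broadcast', 'multicast' or 'unicast' for a destination MAC."""
    octets = [int(part, 16) for part in mac.replace("-", ":").split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    if all(o == 0xFF for o in octets):
        return "broadcast"
    # The least significant bit of the first octet marks a group address.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"


# Illustrative addresses only, not taken from any capture in this post.
for mac in ("ff:ff:ff:ff:ff:ff", "01:00:5e:00:00:fb", "00:1b:21:3c:9d:10"):
    print(mac, "->", classify_mac(mac))
```

Only the first two kinds have a chance of reaching a capture device that sits on an ordinary switch port.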
Using switches instead of hubs basically means that network analysts lost their easy access to all packets on the network, and it won’t ever be that way again – so anyone asking for things like “I want to capture all packets in our network” – there’s just no way to do that anymore (well, realistically. You can try to get close to something like a full scale network capture, but it’ll most likely kill your network productivity big time).
Let’s have some fun: Pop quiz
Question 1: how many Unicast packets will a capture device record on a switch like the one in Figure 5 if it keeps capturing for a while?
Wrong, if your answer was “none”. It will pick up single Unicast packets every once in a while, but just one per connection. The reason is that the switch removes entries from its location inventory after a certain duration and then has to learn the node’s location again. This feature is important to allow nodes to be unplugged and moved to a different port, because otherwise packets would end up at the old port and never reach the node at the new port. To learn which port a node is connected to, the switch forwards any packet it doesn’t know the destination port for to all ports. This is also called “port flooding” or “MAC flooding”. The switch hopes that the destination is connected at one of the ports, and when an answer packet arrives on one of them, it adds that port to the inventory again. All further packets after the flooded packets are forwarded to the port it just learned, and the capture device will not see any of them until the next flood.
Question 2: So what happens if packets are sent to a destination that doesn’t exist or never answers?
Well, the switch cannot learn the location, and keeps flooding the packets.
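The answers to the first two questions can be played through with a toy model. This is only a sketch of the learn/age/flood idea in Python, not how any real switch is implemented: the class name `LearningSwitch`, the MAC addresses and the aging time of 300 seconds are all assumptions for the demo.

```python
class LearningSwitch:
    """Toy layer 2 switch: learn sender MACs, flood unknown destinations."""

    def __init__(self, ports, aging_seconds=300.0):
        self.ports = list(ports)
        self.aging_seconds = aging_seconds  # assumed value, real switches differ
        self.table = {}  # MAC address -> (port, time last seen as a sender)

    def forward(self, in_port, src, dst, now):
        """Return the list of ports the packet is sent out on."""
        # Drop inventory entries that have not been refreshed recently.
        self.table = {
            mac: (port, seen)
            for mac, (port, seen) in self.table.items()
            if now - seen <= self.aging_seconds
        }
        # Learn (or refresh) the port the sender is connected to.
        self.table[src] = (in_port, now)

        entry = self.table.get(dst)
        if entry is None:
            # Unknown destination: flood to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        out_port = entry[0]
        return [] if out_port == in_port else [out_port]


switch = LearningSwitch(ports=[1, 2, 3, 4])  # capture device sits on port 4
A, B, GHOST = "00:00:00:00:00:0a", "00:00:00:00:00:0b", "00:00:00:00:00:99"

first = switch.forward(1, A, B, now=0)    # B unknown: flooded
reply = switch.forward(2, B, A, now=1)    # A known: port 1 only
later = switch.forward(1, A, B, now=2)    # B known: port 2 only
aged = switch.forward(1, A, B, now=400)   # B aged out: flooded again
ghost = [switch.forward(1, A, GHOST, now=401 + i) for i in range(3)]

for name, ports in [("first", first), ("reply", reply), ("later", later), ("aged", aged)]:
    print(f"{name:6} -> ports {ports}, capture device sees it: {4 in ports}")
print("ghost  ->", ghost)
```

The capture device on port 4 gets the very first packet and the first one after the entry has aged out, but nothing in between. Packets to the node that never answers are flooded every single time.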
Question 3: What if a packet is only passing through another switch I’m not even connected to, can I capture that at all?
No. The packets will not come and visit the switch you’re connected to just to be picked up. You need to capture where the packets are passing through. This is like watching cars in traffic on a street from a window of a building – if you don’t have a line of sight to the street they’re on you can’t see them. To be honest: We will see later that there are some solutions to this problem, but in my opinion most of them aren’t pretty (read: they hurt capture precision, and I don’t like that).
Roger. Copy that!
So, how do we get network packets when we have to deal with switches? Well, we need to tell the switch to send copies of all packets we want to the port where the capture device is connected:
For a switch to be able to copy packets like that it needs to have a feature usually called “SPAN” or “Port Mirror”. Without it, you can’t capture Unicast packets except flooded packets. To configure a SPAN or Mirror session you need to be able to access your switch via SSH (if it only has Telnet, shoot, er, patch/replace it) or HTTPS (again, if it only has HTTP…). For example, one of my HP switches has a web front end only (but at least there is one), which I can use to configure a mirror session:
On my Cisco switch that does have a command line, I prefer using that to create a monitor session:
Switch(config)#monitor session 1 source interface gigabitEthernet 1/7 both
Switch(config)#monitor session 1 destination interface gigabitEthernet 1/24
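If you set up monitor sessions often, you may want to generate those two lines instead of typing them. Here is a minimal Python sketch that does only that. The helper name `monitor_session_commands` is hypothetical, and the output follows the Cisco syntax shown above; other vendors and platforms use different commands, so treat it as a template, not a universal tool.

```python
def monitor_session_commands(session, source, destination, direction="both"):
    """Build the two config lines for a simple local monitor session."""
    # rx = packets entering the source port, tx = leaving it, both = both ways.
    if direction not in ("rx", "tx", "both"):
        raise ValueError(f"unknown direction: {direction!r}")
    return [
        f"monitor session {session} source interface {source} {direction}",
        f"monitor session {session} destination interface {destination}",
    ]


for line in monitor_session_commands(1, "gigabitEthernet 1/7", "gigabitEthernet 1/24"):
    print(line)
```

The function only builds text; pasting it into the switch (and checking the result) is still up to you.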
Switches and SPAN ports
There are four types of switches when it comes to SPAN/port mirroring:
- “Dumb” switches that do not have a feature like that. They are usually very cheap and found in consumer stores. When it comes to network captures those switches are showstoppers: you can’t capture packets with them in any useful kind of way.
- “Old managed” switches, which can be configured (usually using Telnet, *argh*), but which do not have a SPAN/mirror feature – I’ve seen those at hospitals sometimes (3Com devices), and yes, it was in 2009. My advice: replace them at the earliest opportunity.
- “Managed” switches, that can be configured via web browser, SSH or any other protocol. Those cost more than dumb switches, depending on their feature set. My recommendation is to always consider buying managed switches.
- “Hard wired mirror port” switches. Those are switches that can’t be configured via software but come with a hard wired mirroring feature, e.g. copying all packets entering and leaving port 1 to port 5.
If you’re just trying to capture packets in a home environment I usually recommend one of these web manageable portable switches, which can be inserted into a link you want to capture (or use them as permanent network devices if the port count is enough for your needs, of course):
- Cisco SG200 8 Port Web Managed Gigabit switch
- NETGEAR ProSAFE GS105Ev2 5 Port Gigabit Web Managed Switch
Or, if you’re looking for a cheap hard wired mirroring switch for spontaneous captures without having to configure anything (and being USB powered, so no “Power Socket Search”™ either):
The other way of capturing network packets is to use a TAP. TAPs are professional devices that are inserted into a physical link, providing an additional output you can connect your capture device to in order to see everything happening on the link. TAPs have the big advantage of being much more precise than SPAN/mirror capture setups, but they come with two disadvantages:
- They cost money, while the switch you already have often allows SPAN/mirror sessions “for free” (well, it’s part of the switch price tag, so it’s not really “free”)
- You have to disconnect the link to insert the TAP, and disconnect it again if you want to pull the TAP back out at some point. This isn’t a big deal for a user PC connection, but often quite complicated to do on a backbone link that has to stay connected 24/7.
I will cover TAPs and their features/advantages in at least one later network capture blog post, so let’s leave it at that for now (this sentence is important to keep TAP vendors and Tim O’Neill from “ahem”ing me :-) ).
The Hub-Out no-no
There is… no, was, a third way to grab packets in switched networks, which was often called “hubbing out” – the idea was to insert a hub into a link (similar to a TAP), and connect a capture device to the hub as well. By putting in the hub the link became a shared medium again, so the capture device would see what happened on the link. I used to do that myself, carrying small USB powered 5-port hubs with me all the time, because SPAN port features were still rare 10-15 years ago. Today, I strongly recommend not to “hub out” anymore:
- it forces the link from full duplex to half duplex, heavily changing your problem scenario, and reintroducing collisions
- if you are unlucky, some device will still try full duplex, leading to a duplex mismatch (coming with a collision ratio like you wouldn’t believe)
- they only work for 10/100 Mbit links
- you can rarely find a hub anymore
- SPAN ports are much more common now, so use them instead
Starting a network packet capture is not complicated, even though it used to be trivial in the days when we still used hubs to connect nodes to a network – everybody got everything. Now, we’re using switches everywhere, so we need port mirrors / SPAN capabilities, or use TAPs. If your switches can’t mirror packets, you’re not going to be successful with packet captures unless you deploy a TAP. We will see in later chapters that some capture setups can get quite complex with many details to consider (especially in virtual environments), but for now, this should be enough.
In a way, capturing packets on a WiFi is similar, because wireless allows anyone to see everything (if in range, of course). But WiFi captures have their own complexities as well, mostly because you’ll have to deal with radio waves, different channels, signal strength and encryption most of the time.
In the next part to come we’ll look at the role of network cards, link speeds and other things that may become relevant in a capture situation.