Determining network protocols

If you spend enough time using Wireshark or any other network analysis tool, sooner or later you’ll be able to read bare hex dumps of packets, at least partially (it’s a little bit like Neo seeing the Matrix). So maybe you run across a text dump of a packet like this one:

0000  00 0d b9 21 95 18 c8 60 00 16 7c cc 08 00 45 00   ...!...`..|...E.
0010  00 34 6b 8a 40 00 80 06 00 00 c0 a8 7c 64 51 d1   .4k.@.......|dQ.
0020  b3 45 c4 60 00 50 19 00 52 e7 00 00 00 00 80 02   .E.`.P..R.......
0030  20 00 42 4a 00 00 02 04 05 b4 01 03 03 02 01 01    .BJ............
0040  04 02

Starting off easy

After doing it often enough you’ll see a couple of values that tell you what you’re dealing with almost right away. One of the first things to look for is whether you can spot a frequently used Ethertype, e.g. for IPv4, IPv6, or ARP. IPv4 has Ethertype 0x0800, ARP has 0x0806, and IPv6 has 0x86dd (be careful not to spell this “eighty-six double D” in a network analysis class, or students will start to giggle 🙂 ). So let’s see – in our example above you can find one of them really easily – skip the first 12 bytes (destination MAC and source MAC, both 6 bytes), and you’ll see the Ethertype (the bytes 08 00 near the end of the first line):

0000  00 0d b9 21 95 18 c8 60 00 16 7c cc 08 00 45 00   ...!...`..|...E.
0010  00 34 6b 8a 40 00 80 06 00 00 c0 a8 7c 64 51 d1   .4k.@.......|dQ.
0020  b3 45 c4 60 00 50 19 00 52 e7 00 00 00 00 80 02   .E.`.P..R.......
0030  20 00 42 4a 00 00 02 04 05 b4 01 03 03 02 01 01    .BJ............
0040  04 02
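If you'd rather let a few lines of Python do the counting, the lookup can be sketched like this. It's only a minimal sketch: it assumes a plain Ethernet II frame without a VLAN tag, and the names `ETHERNET_HEADER`, `ETHERTYPES` and `parse_ethernet` are made up for this example.

```python
import struct

# First 14 bytes of the example packet:
# destination MAC (6 bytes), source MAC (6 bytes), Ethertype (2 bytes).
ETHERNET_HEADER = bytes.fromhex("000db9219518" "c86000167ccc" "0800")

# The three Ethertypes mentioned in the text (hypothetical lookup table).
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}


def parse_ethernet(header: bytes):
    """Return (destination MAC, source MAC, Ethertype) of an untagged frame."""
    # "!" means network byte order (big endian); H is an unsigned 16-bit value.
    dst, src, ethertype = struct.unpack("!6s6sH", header[:14])
    return dst.hex(":"), src.hex(":"), ethertype


dst, src, ethertype = parse_ethernet(ETHERNET_HEADER)
print(dst, src, f"0x{ethertype:04x}", ETHERTYPES.get(ethertype, "unknown"))
```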

So it’s an IPv4 packet. The next value we see after the Ethertype is 0x45, which again tells us that this packet is IPv4. Why? Let’s take a look at the number 0x45. If you write it in binary, it’s 0100 0101. I often use the calculator in Windows in programmer mode (activate it with Alt + 3) to convert between Dec, Hex, Oct and Bin, especially while working on TraceWrangler. It’s really useful, because it has the binary view active all the time, so you can just enter a number in any of the modes and see the binary representation right away (up to 64 bits, starting from 0 at the lower right):

calculator

Now, there’s a little trick not everybody knows, which is that the single digits of hex values are represented by groups of 4 bits, called a “nibble”. If we take our number 0x45, it has two nibbles: 0100 and 0101. If you convert 0100 to hex, it’s “4”, and 0101 is “5”. Easy, right?

The first nibble tells us the IP version. Why? Because the definition of the IPv4 header structure in RFC 791 tells us that those four bits are for the IP version. The second nibble tells us how big the IP header is, just not in bytes, but in 32-bit values (a.k.a. 4 bytes). So 5 * 4 = 20, which is the most common IPv4 header size (when there are no options in use, which these days they most likely aren’t). It’s specified in 32-bit values because the total header must be 32-bit aligned for faster processing anyway. Plus, 4 bits would otherwise only allow specifying values up to 15 bytes, which is not enough.
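If you don't have a calculator at hand, the nibble split is just a shift and a mask. Here's a small sketch in Python (the variable names are my own for this example):

```python
first_ip_byte = 0x45

# Binary view, padded to 8 bits: 01000101
print(f"{first_ip_byte:08b}")

version = first_ip_byte >> 4       # high nibble 0100 -> IP version 4
ihl_words = first_ip_byte & 0x0F   # low nibble 0101 -> 5 words of 32 bits
header_bytes = ihl_words * 4       # 5 * 4 = 20 bytes

print(version, ihl_words, header_bytes)

# The largest value a nibble can hold is 15, so the header can be
# at most 15 * 4 = 60 bytes long.
max_header_bytes = 0x0F * 4
print(max_header_bytes)
```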

Okay, now we know we have Ethernet, IPv4 (with a header size of 20 bytes), so the next thing is probably either ICMP, UDP or TCP. Of course there are many other protocols that run on IP, but they are rare compared to the other three. SCTP is probably the one exception you’ll see sometimes. Okay, which one is it? For that we need to know where to look, because the IP header has a value for that (check the structure in the IPv4 RFC if you want):

0000  00 0d b9 21 95 18 c8 60 00 16 7c cc 08 00 45 00   ...!...`..|...E.
0010  00 34 6b 8a 40 00 80 06 00 00 c0 a8 7c 64 51 d1   .4k.@.......|dQ.
0020  b3 45 c4 60 00 50 19 00 52 e7 00 00 00 00 80 02   .E.`.P..R.......
0030  20 00 42 4a 00 00 02 04 05 b4 01 03 03 02 01 01    .BJ............
0040  04 02

It’s the 8th byte in the second line, and in this case, the protocol number is 6, which is TCP (1 for ICMP, 17 for UDP which is 0x11 in hex). By the way, the full list of protocols is available at IANA.
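The same lookup as a rough Python sketch: the protocol field sits at offset 9 of the IPv4 header, which is frame offset 14 + 9 = 23 (0x17) in our dump. The `PROTOCOLS` table is a hand-picked subset of the IANA list, and the names are assumptions for this example only.

```python
import ipaddress

# The 20 bytes of the IPv4 header from the example packet (frame offsets 14..33).
IP_HEADER = bytes.fromhex("450000346b8a400080060000c0a87c6451d1b345")

# Hand-picked subset of the IANA protocol numbers.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP", 132: "SCTP"}

protocol = IP_HEADER[9]                           # protocol field
src = ipaddress.IPv4Address(IP_HEADER[12:16])     # source address
dst = ipaddress.IPv4Address(IP_HEADER[16:20])     # destination address

print(protocol, PROTOCOLS.get(protocol, "other"), src, dst)
```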

The problem beyond layer 4

Now, reading network packets in pure hex becomes much more complicated after layer 4 (except for ICMP), because unlike with the lower layers there is no guarantee anymore of what you’ll find. If you look at the words (a word being a two-byte value here) of the TCP ports in the hex view, you’ll see 0xc460 and 0x0050:

0000  00 0d b9 21 95 18 c8 60 00 16 7c cc 08 00 45 00   ...!...`..|...E.
0010  00 34 6b 8a 40 00 80 06 00 00 c0 a8 7c 64 51 d1   .4k.@.......|dQ.
0020  b3 45 c4 60 00 50 19 00 52 e7 00 00 00 00 80 02   .E.`.P..R.......
0030  20 00 42 4a 00 00 02 04 05 b4 01 03 03 02 01 01    .BJ............
0040  04 02

The first value 0xc460 (decimal 50272) is the source port, while 0x0050 (decimal 80) is the destination port. Now, can you tell the protocol running on top of TCP? You may be inclined to say “sure, it’s HTTP” – but is it really? There is no guarantee that a protocol running on port 80 is HTTP – it could be anything. Sure, 80 is the well known port for HTTP, but again: this doesn’t mean that a server administrator couldn’t use that port for any application and any protocol he likes. Somebody could be running a game server on that port just because he knows that the firewall in front of it will allow access from the outside world by just looking at the port number (and assuming, wrongly, that this must be HTTP, so it’s okay).
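To double check the two port values, here's a short Python sketch. The `WELL_KNOWN` table is a hypothetical two-entry lookup of my own, and whatever it returns is only a hint about the protocol, never proof.

```python
import struct

# First four bytes of the TCP header from the example packet (frame offset 34).
TCP_HEADER_START = bytes.fromhex("c4600050")

# Two unsigned 16-bit words in network byte order: source and destination port.
src_port, dst_port = struct.unpack("!HH", TCP_HEADER_START)
print(src_port, dst_port)

# Hypothetical lookup table: the port only tells you what is *usually* there.
WELL_KNOWN = {80: "http", 3306: "mysql"}
guess = WELL_KNOWN.get(dst_port, "unknown")
print(f"port {dst_port} is usually {guess}, but that is only a guess")
```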

So if you ask yourself how Wireshark knows when to decode something as HTTP, take a look at the HTTP protocol preferences settings:

WiresharkHTTPSettings

Of course port 80 is listed, but also 3128 (default port used by the Squid proxy), 8080 (a common proxy port as well), and others. So if you have a port for HTTP traffic that isn’t listed, you can add it and Wireshark should detect it right away. By the way, Wireshark also has some kind of heuristic to detect HTTP, because even if you remove port 1900 (the dreaded Simple Service Discovery Protocol annoying me quite often when filtering for “http”) it will still be shown when you filter on “http”. As soon as Wireshark sees certain values that are typical for HTTP it will assume that the conversation is in fact HTTP, even if neither port is listed in the HTTP preferences:

WiresharkHTTPHeuristics
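To illustrate the difference between port-based and heuristic detection, here is a toy sketch in Python. It is not how Wireshark actually implements its dissectors; the port set is just a subset of the ports mentioned above, and the function names, the method list and the sample payloads are made up for this example.

```python
# Subset of the ports mentioned in the text (assumption for this sketch).
HTTP_PORTS = {80, 3128, 8080}

# A few request methods that typically start an HTTP-style message.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ",
                b"OPTIONS ", b"NOTIFY ", b"M-SEARCH ")


def looks_like_http(payload: bytes) -> bool:
    """Very rough heuristic: a request method or a status line at the start."""
    return payload.startswith(b"HTTP/") or payload.startswith(HTTP_METHODS)


def guess_protocol(src_port: int, dst_port: int, payload: bytes) -> str:
    """Try the port list first, then fall back to the payload heuristic."""
    if src_port in HTTP_PORTS or dst_port in HTTP_PORTS:
        return "http (port)"
    if looks_like_http(payload):
        return "http (heuristic)"
    return "unknown"


# SSDP-style request on port 1900: not in the port list, caught by the heuristic.
print(guess_protocol(50000, 1900, b"M-SEARCH * HTTP/1.1\r\n"))

# Made-up binary payload on port 80: the port match wins, even though it is wrong.
print(guess_protocol(50272, 80, b"\x00\x01binary"))
```

The second call shows the weak spot of a pure port match: anything on port 80 gets labeled as HTTP, no matter what the payload looks like.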

Forcing protocol decodes in Wireshark

If Wireshark doesn’t decode a protocol properly, it’s often because it doesn’t know which protocol is running on that port. In the following screenshot you see that it doesn’t decode the packets as MySQL because the server is running on port 80 instead of the standard port 3306:

WiresharkProtocolUnknown

You can force Wireshark to decode a specific port as a protocol by selecting a packet of that conversation and using “Decode As”:

WiresharkDecodeAs

This will lead you to a dialog where you can select the protocol you know (or think) is the correct one:

WiresharkDecodeAsMySQL

And after applying the setting, you’ll see this decode instead of the old one:

WiresharkProtocolDecodeForced

Permanent change of ports

Sometimes you want to make the change permanent, so you don’t have to use “Decode As” over and over again. You can do that by saving your “Decode As” settings, via the “Analyze” menu:

WiresharkUserSpecifiedDecodesMenu

When you use the “Save” button, Wireshark will remember the port setting next time you start it.

WiresharkUserSpecifiedDecodesDialog

There are two problems with that: first, now every packet on port 80 will be decoded as MySQL (including real HTTP packets, which will fail). Second, when sending the capture file to someone else you may want it to be decoded correctly right away, without that extra step. To solve both problems there is another neat trick, this time using the sanitization feature of TraceWrangler: simply create a sanitization task that replaces just the port number with one that Wireshark recognizes, like this (again, 3306 being the standard port for MySQL):

TraceWranglerReplacePort

and you get a new file like this that works without any “Decode As” manipulation:

WiresharkTraceWranglerPortReplaced

Happy decoding 🙂

Discussions — 4 Responses

  • Vladislav December 4, 2015 on 12:11 am

    Thank you for such a useful information! It was very helpful for me!

  • Glyn Hodges December 15, 2017 on 6:32 pm

    Very nice, concise and useful practical information that has helped me a lot today.

    Thankyou very much Jasper.

  • Sam Son May 22, 2018 on 2:28 pm

    I am developing a simple packet capturing program with libpcap as a course assignment, and this post is really helpful! Is there any other way to guess the application-level protocol from an IP packet?
