thePacketGeek

a developing networker

Differences between PyShark 0.3.3 and Documentation

08 Nov 2014 » Coding

While working with PyShark, I’ve found that some of the documentation doesn’t quite line up with the library’s actual behavior, so I’m writing this post to help anyone who runs into the same situation. The intro doc is found here, and I’ll be comparing it to what actually happens when using the newest version (0.3.3) of PyShark.

The first thing I noticed is that it’s difficult to get a basic count of the packets in the capture object. When reading from a file, the __repr__ string doesn’t include a packet count (the count is only included when using LiveCapture):


>>> cap = pyshark.FileCapture('test.pcap')
>>> cap
   <FileCapture test.pcap>
>>>

Checking len() on cap shows that the packets are only read in when requested (lazy fetching). After accessing index 10, only the first 11 packets have been loaded:

>>> cap[10]
   <UDP/HTTP Packet>
>>> len(cap)
   11
>>>
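
To make the lazy-fetching behavior concrete, here is a minimal sketch of a container that only reads items when they are requested. LazyCapture is a hypothetical toy class written for this illustration, not PyShark's actual implementation; only the _packets list name is borrowed from the post.

```python
class LazyCapture:
    """Toy stand-in (hypothetical) for a capture that reads packets on demand."""

    def __init__(self, source):
        self._source = iter(source)  # packets that have not been read yet
        self._packets = []           # packets read so far

    def __len__(self):
        # Counts only what has been read so far, not what the source holds.
        return len(self._packets)

    def __getitem__(self, index):
        # Read forward until the requested index is available.
        while len(self._packets) <= index:
            try:
                self._packets.append(next(self._source))
            except StopIteration:
                raise IndexError("packet index out of range")
        return self._packets[index]


# Strings stand in for real packet objects.
cap = LazyCapture("packet-%d" % i for i in range(100))
before = len(cap)  # nothing has been read yet
pkt = cap[10]      # forces items 0 through 10 to be read
after = len(cap)
print(before, pkt, after)
```

In this toy model, len() reports 0 before any access and 11 after asking for index 10, which mirrors the transcript above.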

I think the loading behavior was changed to improve performance, since I see references to a lazy parameter in earlier commits. There is also a current option, keep_packets, to not keep the packets in memory (in the cap._packets list). When keep_packets=False is used, you can only iterate through the packets and cannot reference them by index:


>>> cap = pyshark.FileCapture('test.pcap', keep_packets=False)
>>> len(cap)
   0
>>> cap[10]
... (truncated)
    NotImplementedError: Cannot use getitem if packets are not kept
>>> for pkt in cap:
...     print(pkt.highest_layer)
...
HTTP
HTTP
HTTP
... (truncated)
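
Since iteration is the one access pattern that works either way, a packet count can be obtained by walking the capture once. The sketch below is one possible approach, not something taken from the PyShark docs: summarize is a hypothetical helper name, and it assumes only that each packet exposes the highest_layer attribute used above. Stand-in objects replace real packets so the example runs without a capture file.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(packets):
    """Return (total, Counter of highest_layer values) in a single pass."""
    layers = Counter()
    total = 0
    for pkt in packets:
        total += 1
        layers[pkt.highest_layer] += 1
    return total, layers


# Stand-in packets (assumption: real packets expose .highest_layer as shown above).
fake_packets = (SimpleNamespace(highest_layer=name) for name in ["HTTP", "HTTP", "DNS"])
total, layers = summarize(fake_packets)
print(total, dict(layers))

# With PyShark, the call would presumably look like this (not run here):
# cap = pyshark.FileCapture('test.pcap', keep_packets=False)
# total, layers = summarize(cap)
```

Because the helper never indexes into the capture or calls len() on it, it should not depend on whether packets are kept in memory.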