Intro to PyShark
for Programmatic Packet Analysis
I can hardly believe it took me this long to find PyShark, but I am very glad I did! PyShark is a wrapper for tshark, the Wireshark command-line interface, so all of the Wireshark decoders are available to PyShark! It is so amazing that I started a new project, Cloud-Pcap, just so I could use it.
You can use PyShark to sniff from an interface or open a saved capture file, as the docs show on the overview page here:
    >>> import pyshark
    >>> # Open saved trace file
    >>> cap = pyshark.FileCapture('/tmp/mycapture.cap')
    >>> # Sniff from interface
    >>> capture = pyshark.LiveCapture(interface='eth0')
    >>> capture.sniff(timeout=10)
    >>> capture
    <LiveCapture (5 packets)>
Once a capture object is created, from either LiveCapture or FileCapture, several methods and attributes are available at both the capture and packet level. The power of PyShark is the access to all of the packet decoders built into tshark. In this post I'm just going to give a sneak peek of some of the things you can do, and a few accompanying posts will follow to go more in depth.
- Getting packet summaries (similar to tshark capture output):
    >>> for pkt in cap:
    ...     print(pkt)
    ...
    2 0.512323 0.512323 fe80::f141:48a9:9a2c:73e5 ff02::c SSDP 208 M-SEARCH * HTTP/
    3 1.331469 0.819146 fe80::159a:5c9f:529c:f1eb ff02::c SSDP 208 M-SEARCH * HTTP/
    4 2.093188 0.761719 192.168.1.1 18.104.22.168 SSDP 395 NOTIFY * HTTP/1. 0x0000 (0)
    5 2.096287 0.003099 192.168.1.1 22.214.171.124 SSDP 332 NOTIFY * HTTP/1. 0x0000 (0)
This gives access to attributes like the packet number, relative and delta times, source and destination addresses, protocol, length, and a brief info line.
- Drilling down into packet attributes by layer:
    >>> pkt. #(tab auto-complete)
    pkt.captured_length     pkt.highest_layer       pkt.ip                  pkt.pretty_print        pkt.transport_layer
    pkt.eth                 pkt.http                pkt.layers              pkt.sniff_time          pkt.udp
    pkt.frame_info          pkt.interface_captured  pkt.length              pkt.sniff_timestamp
    >>>
    >>> pkt_app = pkt[pkt.highest_layer]
    >>> pkt_app. #(tab auto-complete)
    pkt_app.get_field_value  pkt_app.raw_mode         pkt_app.request_version
    pkt_app.DATA_LAYER       pkt_app.get_raw_value    pkt_app.request
    pkt_app.chat             pkt_app.layer_name       pkt_app.request_method
    pkt_app.get_field        pkt_app.pretty_print     pkt_app.request_uri
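As a rough sketch of how a few of these attributes might be combined, the helper below builds a one-line description of a packet. `describe_packet` is a hypothetical name of my own, not part of PyShark, and the sketch assumes that a packet without an IP layer raises `AttributeError` on `pkt.ip` and that the IP layer exposes `src` and `dst` fields. The stand-in objects are made-up values that only mimic the attribute names, so the snippet runs without tshark installed.

```python
from types import SimpleNamespace


def describe_packet(pkt):
    """Build a one-line description from layer attributes (hypothetical helper)."""
    # Not every packet has an IP layer, so look it up defensively.
    ip = getattr(pkt, "ip", None)
    if ip is not None:
        endpoints = f"{ip.src} -> {ip.dst}"
    else:
        endpoints = "(no IP layer)"
    # transport_layer is assumed to be None when there is no TCP/UDP layer.
    transport = pkt.transport_layer or "-"
    return f"{pkt.highest_layer} over {transport}, {pkt.length} bytes, {endpoints}"


# Stand-in objects that mimic the attribute names listed above; with PyShark
# installed you would pass real packets from a capture instead.
fake_http = SimpleNamespace(
    highest_layer="HTTP",
    transport_layer="TCP",
    length="395",
    ip=SimpleNamespace(src="192.168.1.1", dst="192.168.1.20"),
)
fake_arp = SimpleNamespace(highest_layer="ARP", transport_layer=None, length="42")

print(describe_packet(fake_http))
print(describe_packet(fake_arp))
```

With a real capture, the same function could be called inside a `for pkt in cap:` loop.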
- Iterating through the packets and applying a function to each:
    >>> cap = pyshark.FileCapture('test.pcap', keep_packets=False)
    >>> def print_highest_layer(pkt):
    ...     print(pkt.highest_layer)
    ...
    >>> cap.apply_on_packets(print_highest_layer)
    HTTP
    HTTP
    HTTP
    HTTP
    HTTP
    ... (truncated)
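A callback does not have to print anything; it can just as easily collect results. Here is a minimal sketch, under the assumption that the callback receives one packet at a time as in the example above, that tallies packets by highest layer. `count_highest_layer` and `layer_counts` are names I made up for illustration, and the stand-in packets are fake data so the snippet runs without a capture file.

```python
from collections import Counter
from types import SimpleNamespace

layer_counts = Counter()


def count_highest_layer(pkt):
    """Callback: tally each packet's highest-layer protocol."""
    layer_counts[pkt.highest_layer] += 1


# With PyShark and a capture file available, the callback would be passed as:
#   cap = pyshark.FileCapture('test.pcap', keep_packets=False)
#   cap.apply_on_packets(count_highest_layer)
# Here, stand-in packets are fed to it directly so the sketch runs anywhere.
for name in ["HTTP", "HTTP", "DNS", "HTTP", "ARP"]:
    count_highest_layer(SimpleNamespace(highest_layer=name))

# Print the protocols from most to least common.
for layer, count in layer_counts.most_common():
    print(layer, count)
```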
And this is just a sneak peek! Who knew that getting the power of tshark and Wireshark into your Python scripts and applications would be this easy? The only caveat I've found so far is performance. I've thrown a lot of packets at PyShark, and it can really slow down once you start running through captures of a couple thousand packets. Some things have been done to conserve memory, such as the keep_packets=False option used above, and these will be covered in the following posts.
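While the slowdown is a concern, one simple way to keep experiments quick is to look at only the first few packets. The sketch below assumes the capture can be iterated lazily, as in the `for pkt in cap` loop earlier; `first_n_layers` and `fake_capture` are hypothetical names, and the generator is just a stand-in for a large capture so the snippet runs without tshark.

```python
from itertools import islice
from types import SimpleNamespace


def first_n_layers(packets, limit):
    """Return highest-layer names for at most `limit` packets, read lazily."""
    # islice stops pulling from the iterable once `limit` items are taken.
    return [pkt.highest_layer for pkt in islice(packets, limit)]


def fake_capture(total):
    """Stand-in generator that mimics iterating over a large capture."""
    for i in range(total):
        yield SimpleNamespace(highest_layer="HTTP" if i % 2 == 0 else "DNS")


# Only the first five packets are ever generated and inspected.
sample = first_n_layers(fake_capture(100_000), limit=5)
print(sample)
```

This does nothing to speed up decoding itself; it only limits how many packets are processed.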
I certainly hope you're as excited as I am at this point. There's plenty more to come, so check back soon!