Traffic Monitoring and Classification via XDP/eBPF – GSoC2022

Hi everyone!

I’m Qijia Zheng, a graduate student majoring in Cyberspace Security at the University of Science and Technology of China.

About me

Recently, I have been doing research in the field of wireless sensing, an interdisciplinary subject combining wireless communication and computer vision (deep learning). I came to Freifunk to learn more about wifi and gain a deeper understanding of wifi communication. I was also attracted by the concepts of P2P and ad-hoc networking when I first studied computer networks. I am glad to see, and be part of, a community that brings them to reality, not just on the Internet.

About the project

Why XDP/eBPF?

eBPF can be seen as a way to bypass much of the kernel's stacks, such as the network stack and the I/O stack. Collecting information inside the kernel with eBPF incurs less overhead and lets us run customized programs there. Also, existing eBPF projects like BCC and bpftrace provide higher-level frameworks for writing eBPF code, which makes life easier.

What have I done?

On my mentors’ advice, I started getting my hands dirty with a survey.

After surveying dozens of wifi analysis projects, I compiled a list of wifi statistics and created a repo to record the survey.

I evaluated XDP and BCC on my x86 laptop equipped with an AX200 NIC, but I ran into many obstacles while implementing this (see the section on the mac80211 subsystem for details).

In addition to surveying open source projects, I read some papers on rate adaptation in 802.11 wireless networks. The wifi traffic statistics used for transmission rate control can be classified into three groups: ACK, SNR and BER. Specifically, they include packet loss ratio, transmission time, frame receptions, SNR and bit errors.

Since I could not get anything from mac80211 on the AX200, I focused on collecting statistics from higher layers, which is basically what the BCC and XDP projects provide.

The official XDP repo provides an xdp_loader program. Using xdp_loader, I succeeded in loading my XDP program on my wireless adapter to get the received bytes per second and other basic statistics of Ethernet frames.
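
To make this kind of accounting concrete, here is a minimal user-space sketch in Python of the statistics mentioned above. It is not the XDP program itself (that one is written in C and runs in the kernel); the function names and the sample frames are made up for illustration, and the timestamps are assumed to be in seconds.

```python
import struct
from collections import Counter

ETH_HLEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet(frame: bytes):
    """Return (dst_mac, src_mac, ethertype) from a raw Ethernet frame."""
    if len(frame) < ETH_HLEN:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:ETH_HLEN])
    return format_mac(dst), format_mac(src), ethertype


def bytes_per_second(samples):
    """Sum frame lengths per whole second.

    `samples` is an iterable of (timestamp_in_seconds, frame_bytes) pairs
    (a hypothetical input format chosen for this sketch).
    """
    per_second = Counter()
    for timestamp, frame in samples:
        per_second[int(timestamp)] += len(frame)
    return dict(per_second)


def ethertype_counts(samples):
    """Count frames per EtherType value."""
    counts = Counter()
    for _, frame in samples:
        counts[parse_ethernet(frame)[2]] += 1
    return dict(counts)


# Two hand-built frames: a 60-byte IPv4 broadcast and a 100-byte IPv6 unicast.
frame_a = bytes.fromhex("ffffffffffff" "020000000001" "0800") + bytes(46)
frame_b = bytes.fromhex("020000000002" "020000000001" "86dd") + bytes(86)
samples = [(0.1, frame_a), (0.7, frame_b), (1.2, frame_a)]

print(bytes_per_second(samples))
print(ethertype_counts(samples))
```

In an XDP program the same counters would typically live in a BPF map that user space reads periodically; that part is left out here.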

About the mac80211 subsystem

For various reasons, I have so far only tested XDP programs on the AX200 NIC.

Obviously, the driver for the AX200 has no native XDP support.

I therefore turned my attention to generic XDP. After running the XDP-TCPDUMP tool provided by the XDP-PROJECT repo, I got a “.pcap” file containing the wireless traffic captured by the tool. The payload recorded in the file all comes from struct xdp_md *ctx. However, the packets show no IEEE 802.11 header, even with the wireless NIC working in monitor mode, which probably means the AX200 NIC removes the IEEE 802.11 header before XDP gets involved. As a result, we will not get any information about the 802.11 header by using generic XDP on the AX200.
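
One way to check which link layer a capture contains is to read the link type from the capture file header. The sketch below assumes the classic pcap file format with its 24-byte global header; the pcapng format that some capture tools write stores the link type elsewhere and is not handled here. The function names are illustrative, and the link type numbers are the ones I know from the tcpdump link-layer type registry.

```python
import struct

# Link-layer type values (tcpdump/libpcap LINKTYPE registry).
LINKTYPE_NAMES = {
    1: "ETHERNET",
    105: "IEEE802_11",
    127: "IEEE802_11_RADIOTAP",
}

PCAP_GLOBAL_HEADER_LEN = 24


def pcap_linktype(header: bytes) -> int:
    """Return the link-layer type from a classic pcap global header."""
    if len(header) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("need at least 24 bytes of pcap global header")
    magic = header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian (microsecond / nanosecond variant)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # big-endian (microsecond / nanosecond variant)
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")
    # After the magic: version major, version minor, thiszone, sigfigs,
    # snaplen, linktype.
    fields = struct.unpack(endian + "HHiIII", header[4:24])
    # Keep the low 16 bits; the upper bits may carry extra information.
    return fields[5] & 0xFFFF


def has_80211_headers(linktype: int) -> bool:
    """True if frames in the capture start with an 802.11 or radiotap header."""
    return linktype in (105, 127)


def describe_capture(path: str) -> str:
    """Read a capture file and name its link-layer type."""
    with open(path, "rb") as f:
        linktype = pcap_linktype(f.read(PCAP_GLOBAL_HEADER_LEN))
    return LINKTYPE_NAMES.get(linktype, f"other ({linktype})")
```

A capture whose link type is ETHERNET, as opposed to one of the two 802.11 types, matches the observation above that no 802.11 header is visible at the XDP hook.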

I also tried bare tracing tools such as kprobes and the function graph tracer to trace mac80211 functions like ieee80211_xmit. Unfortunately, I still got no output in the trace file. That is to say, running eBPF programs attached to kprobes on these functions is not feasible on the AX200. But we still have some cfg80211 tracepoints.

Next steps
  • Switch from the AX200 to an Atheros-series NIC to evaluate XDP on the ath9k driver and unlock the potential of attaching eBPF to dynamic tracing
  • Keep surveying papers related to transmission rate control
  • Select a proper machine learning algorithm to implement the features, and evaluate the model by Precision, Recall and F1 score
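
For reference, the three metrics named in the last item can be computed as in the sketch below. This is a plain-Python illustration for a single "positive" class; the device labels and predictions are invented examples, not results from the project.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 score for one class.

    y_true and y_pred are equal-length sequences of labels;
    `positive` is the label treated as the class of interest.
    """
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # correctly predicted as positive
        elif pred == positive:
            fp += 1  # predicted positive, actually something else
        elif truth == positive:
            fn += 1  # actually positive, but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical device-type labels for six stations.
y_true = ["phone", "phone", "laptop", "laptop", "phone", "iot"]
y_pred = ["phone", "laptop", "laptop", "phone", "phone", "iot"]

print(precision_recall_f1(y_true, y_pred, positive="phone"))
```

With more than two device types, these per-class values would usually be averaged over all classes; which averaging to use is a design choice still to be made.
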
Goals of the project
  • Create a fingerprint for STAs from the statistics collected using eBPF, in order to identify different types of devices.
  • Implement a dedicated use case that demonstrates the benefits of eBPF/XDP.
  • Provide a foundation for further network research.
Finally

I would like to thank my mentors for all the help they have given me so far.

2 thoughts on “Traffic Monitoring and Classification via XDP/eBPF – GSoC2022”

  1. Hi, that’s a very exciting project.

    Not sure if that’d be in scope or of interest for this project, but I’d be quite interested to know about the feasibility of using XDP or eBPF as a fast and lightweight way to measure the overhead of a routing protocol. Including multicast/broadcast in the case of a layer 2 routing protocol like batman-adv for instance. This could help us a lot to understand, compare and improve the various routing protocols we have, I think.

    I have been working a bit on getting a better view into how much overhead a routing protocol has and, especially for batman-adv, how much and which layer 2 specific multicast/broadcast overhead we have. Initially I started with some quick and dirty scripts around tshark (=libwireshark) [0], but that is a bit slow. Lately I have been working on using Lemoer’s bpfcountd [1], which utilizes libpcap / BPF, and made a patch for libpcap to check within a batman-adv packet [2]. That is already a lot faster. However, it still has some performance overhead and uses too much RAM for most embedded devices. If on your journey through eBPF/XDP you discover things which could help us switch bpfcountd from libpcap to eBPF or XDP, I’d be very interested to hear.

    For our bpfcountd use-case I’d like to feed it a list of hierarchical filters in tcpdump’s packet filter syntax like this [3] (from here[4]). And the bpfcountd should count the number of bytes+packets for each of these filters on all interfaces, without counting/impacting the performance of unicast payload traffic.

    Cheers, T_X

    [0]: https://github.com/T-X/wirerrd/
    [1]: https://github.com/lemoer/bpfcountd/
    [2]: https://github.com/the-tcpdump-group/libpcap/pull/980
    [3]: https://github.com/freifunk-gluon/gluon/blob/0b61595cacd095123cfc9e5cec1a5f942a334c27/package/gluon-statistics-mcast/files/lib/gluon/bpfcountd/mesh.filters
    [4]: https://github.com/freifunk-gluon/gluon/pull/2367

    1. Hi, T_X

      I am happy to know you are interested in our project.
      Evaluating XDP/eBPF on embedded systems is also one of the goals of this project. eBPF could have a natural advantage in getting rich data from the kernel. However, to be conservative, I am not going to say XDP/eBPF is a one-size-fits-all alternative for all scenarios.
      If possible, I am happy to try XDP/eBPF on batman-adv.
      Your experience in measuring L2 routing overhead inspires me a lot. Do not hesitate to post comments. I will reply on this blog when I find something interesting during the project.

      Cheers, Qijia
