Update on Traffic Monitoring and Classification Via XDP/eBPF – GSoC22

The first phase of my GSoC journey focused on get packets level statistics. The works I have done can be described as follow, (1)building the openWrt testbed to run XDP code, (2)following the issues and threads in openWrt forum and community to get familiar with the barriers of running eBPF code on openWrt, (3)leveraging XDP kernel code in official XDP-projects to collect data of wireless traffic(4) implementing my own XDP kernel code and user space loader to collect statistics like throughputs and etc. (5) designing two scenarios, co-channel interference and channel fading, to validate the variation of packets level statistics. The following sections describe my explorations in details.(All the tests are performed on Thinkpad X201i with x86_64 openWrt OS)

XDP Capacity

Like many eBPF developers on openWrt, I encountered lots of barriers when setup a user space loader. It includes big and little endian problems, XDP and eBPF library chaos, architecture related issues and etc. All the problems shown are responsible to our implementation of eBPF/XDP traffic monitoring tool, because we cannot bypass the official XDP support.

Previously, we just have one native XDP loader on openWrt – iproute2. It is indeed a method to load XDP object file, while it would do nothing about user space code, which provide the most convenient method to manipulate different statistics.

When crossing compile the XDP kernel program to BPF object file, the implicit include path of lib headers brings lots of chaos.

Getting the data collected by kernel code

Since we just had official xdp-tools support on openWrt recently. There are two ways for packets-level information collection.

  1. collecting data from /sys/kernel/debug/tracing/trace . Debugging level print is a way to retrieve data collected by kernel programs. Specifically, using bpf_trace_printk helper function. However, it is a little different for XDP kernel program to output information to debugfs. The main reason is that XDP kernel objects are driven by XDP events, that is packets ingress, we are only able to call bpf_trace_printk when the packets come in, which limits the flexibility of statistics poll. So, collecting data from debugfs can be seen a trade-off between official xdp-tools support and collection capacity.

The procedure of this way can be described as follow:

  • crossing compile the XDP kernel program to object file using openWrt SDK or even using host clang
  • uploading the object file to openWrt router and loading it using iproute2 or xdp-loader in xdp-tools package
  • getting data from debugfs and do post-processing
  1. collecting data via user space xdp-loader. Another way to collect XDP kernel data, which is also demonstrated by xdp-project, is to implement user space loader to load the kernel program and get statistics simultaneously. My method is to leverage xdp-tools‘ APIs such as attach_xdp_program provided in util to implement user space xdp-loader. The reason is xdp-tools is not stable ,while porting xdp_load_and_stats.c to openWrt is equivalent to manipulating xdp-tools package which have been done by others. However, I just followed the PR xdp-tools: include staging_dir bpf-headers to fix compiling with sdk by PolynomialDivision · Pull Request #10223 · openwrt/openwrt (github.com) to get my xdp-loader work on x86_64. In user space, packets level data collection related struct is
 struct record {
       __u64 rx_bytes;
       __u64 rx_packets;
       __u64 pps;  //packets per second
   }

The user space loader is like

   static bool load_xdp_stats_program(...) {
       ...
       if (do_load) {
           err = attach_xdp_program(prog, &opt->iface, opt->mode, pin_root_path);
       }
       ...
       if (do_unload) {
           err = detach_xdp_program(prog, &opt->iface, mode, pin_root_path);
       }
       ...
       stats_poll();
   }

As mentioned above, we have tried two ways to build up an entire system running XDP code to collect packets level information. Method (1) is more of a hacky approach while method (2) is still in progress.

Dedicated use cases

Due to the rate adaption mechanism in mac80211 subsystem, there are many situations that will affect the wireless transmission link, thereby affecting the throughput of the wireless network cards, which is reflected in the number of packets and bytes received and sent. We designed two scenarios, co-channel interference and channel fading, both related to packets level statistics, which could be our future classification samples.

For co-channel interference scenarios, we have two routers corresponding to two laptops. In the scenario without co-channel interference, router A is set to channel 1, the laptop C and router A form a wireless link using iperf3. At this point, the router B and laptop D are powered off. In the scenario with co-channel interference, the router B is set to channel 3, which brings spectral-overlap.

For channel fading scenarios, we use a pair of receiver and transmitter, then change the distance between them to validate variation of throughput, and also put some occluders on the line of sight to change the transmission state of the wireless link.

Conclusion

It has to be said that running eBPF code on openWrt is really a unique experience, and I also saw the efforts to official eBPF support in openWrt community. It is a tough journey to try different user space loader and build up a monitoring environment. Since I spent lots of time setting up my first eBPF environment and running the entire XDP program on my openWrt PC. The schedule for the next phase will be tight.

Next phase of GSoC is clear:

  1. There are some other information that xdp hook not supported like signal strength , SNR and etc. I will explore eBPF capacity to get such data
  2. We still have no bug free XDP support on openWrt so far. I will participate in community to do related work

Thanks for reading!

Traffic Monitoring and Classification via XDP/eBPF – GSoC2022

Hi everyone!

I’m Qijia Zheng, my major is Cyberspace Security, a graduate student now studying in University of Science and Technology of China.

About me

My name is Qijia Zheng. Recently, I have conducted research in the field of wireless sensing. It is a interdisciplinary subject of wireless communication and computer vision(deep learning). Now I come to Freifunk to learn the knowledge of wifi to gain a deeper understanding of wifi communication. Also, I was attracted by the concepts of P2P and ad-hoc when I began to learn computer network. I am glad to see and be a part of such a community who brings them to reality, not just on the Internet.

About the project

Why XDP/eBPF?

eBPF can be seen as a way to bypass most kernel stacks like network stack, I/O stack and etc. Using eBPF to get information in kernel could bring less overhead and run customized programs. Also, existing eBPF projects like BCC, bpftrace provide a higher-level framework to write eBPF code, which make life easier.

What I have done?

With my mentors’ advises, I got my hands dirty with the project from a survey.

After surveying tens of wifi analysis projects, I got a list of wifi statistics and created a repo to record the survey.

I evaluated XDP and BCC on my x86 laptop equipped with AX200 NIC, but I encountered lots of barriers when implementing it.(see next section for detailed)

In addition to do a survey of some open source projects, I checked some papers about Rate adaption for mac802.11 wireless networks. The statistics of wifi traffic for transmission rate control could be classified into three groups ACK, SNR and BER, specifically they are packet loss ratio, transmission time, frame receptions, SNR, bit errors and etc.

As I got nothing from mac80211 based on AX200, I focused on collecting statistics from higher layers which basically provided by BCC and XDP project.

Official XDP repo provides a xdp_loader program. By leveraging xdp_loader, I succeded to load my XDP program on my wireless adapter to get the receiving bytes per second and other basic statistics of ethernet frames.

About MAC80211 subsystem

For some reasons, I just test XDP programs on AX200 NIC.

Obviously, we don’t have native XDP support for the driver of AX200.

I put my attention on generic XDP. After runing XDP-TCPDUMP tool provided by XDP-PROJECT repo, I got a “.pcap” file containing packets of wireless traffic captured by the tool. And the payload recorded in the file are all from struct xdp_md *ctx. However, the packets show nothing about IEEE80211 header even the wireless NIC working at monitor mode, which probably means AX200 NIC removed IEEE80211 header before XDP staff getting involved. As a result, we will not get any information about mac80211 header by implementing generic XDP on AX200.

Also, I tried bare tracing tool like kprobe, function graph to trace functions of mac80211 like ieee80211_xmit. Unluckily, I still got no output in TRACE file. That is to say, running eBPF programs attaching to kprobe to on AX200 is infeasible. But we still have some tracepoints of cfg80211.

Next step
  • Switch AX200 to Atheros series to evaluate XDP on ath9k driver to unlock the potential of eBPF attaching to dynamic tracing
  • Keep surveying papers related to transmission rate control topic
  • Select proper machine learning algorithm to implement features and measure the model based on Precision, Recall and F1 score
Goals of the project
  • By leveraging the statistics collected using eBPF, we could create a fingerprint for STAs to identify different types of devices.
  • Implementing a dedicated use case that demonstrates the benefits of eBPF/XDP.
  • Providing the foundation for further research in the direction of network research.
Finally

I would like to thank my mentors for offering the helps for me till now.