Google Summer of Code 2016: External netifd Device Handlers – Milestone 1

OVERVIEW OF THE LAST WEEKS
During the last 5 to 6 weeks I have implemented the possibility to include wireless interfaces in Open vSwitch bridges, rewritten a lot of the code for creating bridges with external device handlers and brought my development environment up to speed. I am now working with an up-to-date copy of the LEDE repository.
I have also implemented the possibility for users of external device handlers to define what information and statistics of their devices look like and to query the external device handler for that data through netifd.

CHANGES TO THE DEVELOPMENT ENVIRONMENT
So far, I have been using quilt to create and manage patches for my alterations of the netifd code.
One day they were completely broken. I still do not know what and how it happened but it was not the first time and this time recovery would have been way too tedious.
This is why I switched to a git-only setup. I now have a clone of the netifd repo on my development machine that is symlinked into the LEDE source tree using the nice ‘enable package source tree override’ option in the main Makefile. I used the oppportunity to update both the LEDE source tree and the netifd repository to the most recent versions.
Before, I was working on an OpenWRT Chaos Calmer tree, because of a bug causing the Open vSwitch package to segfault with more recent kernels.
Now, everything is up-to-date: LEDE, netifd and Open vSwitch.

MY PROGRESS IN DETAIL

More Dynamic Device Creation
In actual coding news, I have refined the callback mechanism for creating bridges with external device handlers and the way they are created and brought up.
Previously, a bridge and its ports were created and activated immediately when /etc/config/network was parsed. Now, the ubus call is postponed until the first port on the bridge is brought up.
Because of the added asynchronicity, I had to add a ‘timeout and retry’-mechanism to keep track of the state of the structures in netifd and the external device handler.

A few questions have come up regarding the device handler interface. As I have explained in my first blog post, I am working on Open vSwitch integration into LEDE writing an external device handler called ovsd. Obviously, this is very useful for testing as well.
I have come across the issue of wanting to disable an bridge without deleting it. This means bringing down the bridge and removing the L2 devices from it. The device handler interface that I mirror for my ubus methods doesn’t really have a method for this. The closest thing is ‘hotplug remove’, which using feels a bit like a dirty hack to me.
I have reached out no netifd’s maintainer about this issue. For the meantime, I stick to (ab)using the hotplug mechanism.

On the ovsd side I have added a pretty central feature: OpenFlow controllers. Obviously, someone who uses Open vSwitch bridges is likely to want to use its SDN capabilities.
Controllers can be configured directly in /etc/config/network with the UCI option ‘ofcontrollers’:

config interface ‘lan’
        option ifname ‘eth0’
        option type ‘Open vSwitch’
        option proto ‘static’
        option ipaddr ‘1.2.3.4’
        option gateway ‘1.2.3.1’
        option netmask ‘255.255.255.0’
        option ofcontrollers ‘tcp:1.2.3.4:5678’
        option controller_fail_mode ‘standalone’

The format in which the controllers are defined is exactly the one the ovs-vsctl command line tool expects.
The other new UCI option below ofcontrollers configures the bridge’s behavior in case the configured controller is unreachable. It is a direct mapping to the ovs-vsctl command ‘set-fail-mode’. The default behavior in case of controller absence is called ‘standalone’ which makes the Open vSwitch behave like a learning switch. ‘secure’ disables the adding of flows if no controller is present.

Function Coverage: Information and Statistics
Netifd device handlers have functions to dump information and statistics about devices in JSON format: ‘dump_info’ and ‘dump_stats’. Usually, these just collect data from the structures in netifd and the kernel but with my external device handlers, it is not as simple. I have to relay the query to an external device handler program and parse the response. Since the interface is generic, I cannot hard-code the fields and types in the response. This is why I relied once more on the JSON data type description mechanism that I have already used for dynamic creation of device configuration descriptions.
In addition to the mandatory ‘config’ description, users can now optionally provide ‘info’ and/or ‘stats’ fields. Just like the configuration descriptions they are stored in the stub device handler structs within netifd where they are available to serve as blueprints for how information and statistics coming from external device handlers have to be parsed.

For my Open vSwitch setup, it currently looks like this in /lib/netifd/ubusdev-config/ovsd.json:
{
    “name” : “Open vSwitch”,
    “ubus_name” : “ovs”,
    “bridge” : “1”,
    “br-prefix” : “ovs”, 
    “config” : [
        [“name”, 3],
        [“ifname”, 1],
        [“empty”, 7],
        [“parent”, 3],
        [“ofcontrollers”, 1],
        [“controller_fail_mode”, 3],
        [“vlan”, 6]
    ],
    “info” : [
        [“ofcontrollers”, 1],
        [“fail_mode”, 3],
        [“ports”, 1]
    ]
}

This is how it looks when I query the Open vSwitch bridge ‘ovs-lan’:

THE NEXT STEPS

During the weeks to come I want to look into some issues which occurred sometimes when I disabled and re-enabled a bridge: Some protocol-realated configuration went missing. This could mean that sometimes the configured IP address was gone. Something which could help me overcome the problem is also in need of some work: reloading/reconfiguring devices.
Along with this, I want to get started with the documentation to prepare for the publication of the source code.

Leave a Reply

Your email address will not be published. Required fields are marked *