Final Report: Simplify LibreMesh and get it closer to OpenWrt

This summer I focused on removing LibreMesh-specific glue where OpenWrt already has a solid upstream solution, and on tightening integration with OpenWrt services:

  • Task 1 — Reboots via OpenWrt’s watchcat: new lime-hwd-watchcat module generates /etc/config/watchcat from LibreMesh UCI and reloads the service automatically. It replaces the custom deferrable-reboot.
  • Task 2 — Move DHCP to odhcpd + cluster-wide lease sharing: prototype package that makes odhcpd the main DHCPv4 server (maindhcp=1) and shares leases across nodes via CRDTs, using ubus and leasetrigger.
  • Task 3 — Remove VLAN wrappers from Babel: PR switches lime-proto-babeld to run on base ifaces and br-lan (DSA), adds a small nftables guard to avoid L2 multicast leaking Babel traffic from bat0 into the bridge.
  • Task 4 — Plan for an L3-only LibreMesh profile: opened an L3-only mesh issue (no batman-adv/anygw) to scope the work and collect feedback.

Project goals

  • Simplify LibreMesh by adopting OpenWrt-native components when they are equivalent or better.
  • Reduce maintenance burden in LibreMesh packages by deleting code where upstream covers it.
  • Keep behavior observable & testable with CI/unit tests and simple field tests on real routers.

What was built

1) Integrate OpenWrt’s watchcat via LibreMesh HWD

Why: OpenWrt ships watchcat for scheduled reboots and network checks. Using the upstream package means LibreMesh no longer maintains its own reboot logic, and users get a familiar LuCI UI.

What: lime-hwd-watchcat (Lua) reads LibreMesh UCI (config hwd_watchcat), generates the corresponding /etc/config/watchcat sections, cleans stale ones, and reloads the service. The package lives in lime-packages and targets stock OpenWrt watchcat.
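
As a rough sketch of the generation step, the helper below turns one hwd_watchcat entry into uci batch commands for /etc/config/watchcat. The function name, the section name hwd_example and the key=value argument convention are assumptions made for illustration; the real module is written in Lua and may name sections differently.

```shell
#!/bin/sh
# Hypothetical helper: print `uci batch` commands that create one
# watchcat section from key=value pairs taken from a hwd_watchcat entry.
render_watchcat_section() {
    section="$1"
    shift
    printf "set watchcat.%s=watchcat\n" "$section"
    for kv in "$@"; do
        key="${kv%%=*}"      # text before the first '='
        value="${kv#*=}"     # text after the first '='
        printf "set watchcat.%s.%s='%s'\n" "$section" "$key" "$value"
    done
}

# Usage on a router (not run here):
#   render_watchcat_section hwd_example mode=ping_reboot pinghosts=4.2.2.2 | uci batch
#   uci commit watchcat && /etc/init.d/watchcat reload
```

Printing the commands instead of applying them directly keeps the rendering step easy to inspect before anything is written to the router.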

Pull request of this task

Code of this task

Result: Reboots and connectivity checks are handled by upstream watchcat, visible in LuCI, with configuration owned by LibreMesh.

2) Replace dnsmasq DHCP with odhcpd

Why: odhcpd is OpenWrt’s native DHCP/RA daemon, well-integrated with ubus, and already authoritative for IPv6; enabling it for IPv4 simplifies the stack.

What: prototype package shared-state-odhcpd_leases:

  • UCI defaults set dhcp.odhcpd.maindhcp=1, point leasetrigger to our publisher, and register a community-scoped CRDT.
  • Publisher: on lease change, call ubus dhcp ipv4leases, distill to IP → {mac, hostname}, publish to CRDT.
  • Generator: on CRDT updates, atomically write /tmp/ethers.mesh and reload odhcpd so foreign leases are served locally too.
    (“Leasetrigger” behavior is an odhcpd feature used here to emit on lease changes.)
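
The atomic write in the generator step can be sketched as follows. This is a minimal illustration, not the package's Lua code: the function name and the "ip mac hostname" input format are assumptions, and the output uses the usual ethers layout of one "mac ip" pair per line.

```shell
#!/bin/sh
# Hypothetical generator step: read "ip mac hostname" lines on stdin and
# atomically replace the target file with "mac ip" lines.
write_ethers_atomically() {
    target="$1"
    tmp="${target}.tmp.$$"
    # Build the complete file under a temporary name in the same directory.
    while read -r ip mac _hostname; do
        if [ -n "$ip" ] && [ -n "$mac" ]; then
            printf '%s %s\n' "$mac" "$ip"
        fi
    done > "$tmp"
    # A rename within one filesystem is atomic: readers see either the old
    # file or the new one, never a half-written file.
    mv -f "$tmp" "$target"
}

# Usage on a router (not run here):
#   <merged lease lines> | write_ethers_atomically /tmp/ethers.mesh
#   /etc/init.d/odhcpd reload
```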

In the field: two LibreMesh nodes converge on the same lease table within seconds; uci show dhcp reflects maindhcp=1. Tech references for odhcpd and UCI options: docs & source.

Pull Request of this task

3) Remove VLAN interfaces from Babel (run on base ifaces / br-lan)

Why: Historically LibreMesh created per-radio VLANs for Babel. On modern DSA targets this adds complexity and hides interface intent. Running babeld on br-lan (wired) and radios directly is simpler and yields the right link metrics by default.

What: rewrites lime-proto-babeld to:

  • Configure Babel on base ifaces and add a br-lan interface (type=wired).
  • Drop the legacy “VLAN-on-wlan” indirection.
  • Add a small nftables ingress guard on bat0 to prevent Babel L2 multicast flooding into br-lan, which could otherwise create “ghost wired neighbors”.
    (Babel uses UDP 6696 and link-local multicast ff02::1:6 / 224.0.0.111; the guard drops those on bat0.)
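
A guard of this kind could look roughly like the sketch below, which prints an nftables ruleset for a given device. The table and chain names are hypothetical and the rules in the PR may differ; only the port and multicast addresses come from the description above.

```shell
#!/bin/sh
# Sketch only: print an nftables ruleset that drops Babel multicast
# arriving on the given device. Table and chain names are hypothetical.
render_babel_guard() {
    dev="$1"
    cat <<EOF
table netdev babel_guard {
    chain ingress_${dev} {
        type filter hook ingress device "${dev}" priority 0; policy accept;
        ip daddr 224.0.0.111 udp dport 6696 drop
        ip6 daddr ff02::1:6 udp dport 6696 drop
    }
}
EOF
}

# Usage on a router (not run here):
#   render_babel_guard bat0 | nft -f -
```

Hooking at ingress on the device means the packets are dropped before they reach the bridge, so no rule on br-lan itself is needed.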

Notes: The PR targets DSA (not swconfig). It also documents what to add in lime-node to enable babeld proto explicitly. See PR conversation, examples and nftables snippet.

Pull Request of this task

4) Toward an L3-only LibreMesh profile

What: Today LibreMesh typically mixes BATMAN-adv (L2) and Babel (L3): BATMAN-adv makes the whole mesh look like one big Ethernet (single broadcast domain), while Babel does hop-by-hop IP routing. An L3-only profile would switch fully to Babel for mesh connectivity, drop BATMAN-adv and anygw, and give each node its own LAN prefix.

I opened an issue to scope the work: goals, migration path, and checks the community can run while we iterate.

  • Before (BATMAN-adv + anygw): clients across the mesh often see the same IPv4 gateway address; the mesh feels like one flat LAN.
  • After (L3-only): each node runs a DHCP server for its /24 (or /64 in IPv6). Clients connected to node A live in A’s subnet; traffic to node B’s subnet routes via Babel.

Issue of this task

Minimum plan

  • Drop L2 mesh: exclude BATMAN-adv bits (lime-proto-batadv, ubus-lime-batman-adv) and remove anygw.
  • Keep it pure L3: enable Babel (lime-proto-babeld) on mesh radios and on br-lan. (Related: my PR removes the legacy VLAN-on-wlan layer so Babel runs directly on the base ifaces.)
  • Per-node addressing: use LibreMesh’s main_ipv4_address/main_ipv6_address patterns to give every router its own LAN prefix.
  • DHCP/RA: ensure a DHCP/RA server per node.
  • Gateways: if you want “plug Internet anywhere,” pair L3-only with babeld-auto-gw-mode so default routes are advertised only when WAN is healthy.
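
To make the per-node addressing idea concrete, here is a small sketch that derives a /24 LAN prefix from the last two bytes of a router's MAC address. It is only an illustration of the concept: the function name and the 10.x.y.1/24 layout are assumptions, and it does not reproduce LibreMesh's own main_ipv4_address template expansion.

```shell
#!/bin/sh
# Illustration only: derive a per-node IPv4 LAN prefix from the last two
# bytes of a MAC address. This is not LibreMesh's template engine.
node_lan_prefix() {
    mac="$1"
    # Split aa:bb:cc:dd:ee:ff on ':' into positional parameters $1..$6.
    old_ifs="$IFS"
    IFS=':'
    set -- $mac
    IFS="$old_ifs"
    # Convert the 5th and 6th bytes from hex to decimal.
    printf '10.%d.%d.1/24\n' "$((0x$5))" "$((0x$6))"
}

node_lan_prefix aa:bb:cc:dd:0e:ff   # prints 10.14.255.1/24
```

Two routers whose MAC addresses share the last two bytes would collide under this scheme, which is one reason a real profile needs a deliberate addressing plan.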

Current state (as of Aug 25, 2025)

  • watchcat integration: packaged in lime-packages (app + docs page) and usable today; config lives in LibreMesh UCI, rendering to standard OpenWrt watchcat.
  • odhcpd migration: working prototype with CRDT lease sharing; evaluation and broader testing tracked in Issue #1199.
  • Babel changes: PR #1210 open with review/approvals from maintainers.
  • L3-only profile: Issue #1211 open for community discussion and follow-ups.

What’s left / next steps

  • watchcat: collect feedback from communities and tune sane defaults per device profiles.
  • odhcpd: expand tests (IPv6 RA/DHCPv6 + v4 coexistence), harden failure modes, and benchmark lease propagation on dense meshes. Track in #1199.
  • Babel PR #1210: finalize DSA docs, test swconfig fallback or explicitly gate to DSA-only in code, and validate the bat0 ingress guard on more topologies.
  • L3-only: produce a minimal profile and a migration guide (no batman-adv, no anygw).

Lessons learned

  • Upstream keeps projects healthy. Using existing, maintained components cuts long-term maintenance and aligns you with a larger ecosystem.
  • Small, reviewable changes earn fast feedback. I saw how clear PR descriptions, minimal diffs, and reproducible test notes make reviews smoother, because maintainers can reason about the change quickly.
  • Communication is part of the code. Good issues/PRs explain why, not just what; they show logs, decisions, and trade-offs.
  • Deleting code is a feature. Replacing custom code with upstream tools (watchcat instead of deferrable-reboot or odhcpd instead of dnsmasq-DHCP) taught me that reducing surface area often improves reliability.
  • Open Source Community. I’ve learned a lot about how an open source community works: everyone makes it a priority that other people can understand and use the code easily.

Other links and references

Project page

Midterm Blog

Do we need VLAN for Babeld interfaces? (Mailing list discussion)

Conclusion

This summer I worked and learned a lot. I’m proud to have met my mentors’ expectations and contributed to a project like LibreMesh.

I want to thank my mentors Ilario and Javier for trusting me and giving me this opportunity, and especially for their patience and commitment. They were amazing!

Midterm: Simplifying LibreMesh with OpenWrt-Native Solutions

This blog documents my ongoing progress with LibreMesh during Google Summer of Code (GSoC) 2025. Specifically, I’m focusing on integrating OpenWrt-native solutions and simplifying LibreMesh modules.

Currently, I am working on two main tasks:

  1. Replacing deferrable-reboot with watchcat: Migrating LibreMesh’s custom reboot scheduling script to the built-in OpenWrt watchcat package for improved reliability.
  2. Migrating DHCP functionality from dnsmasq to odhcpd: Transitioning DHCP and IPv6 handling to OpenWrt’s native odhcpd, currently in development and community testing.

For testing and validation, I’m using three physical routers configured to simulate realistic network conditions.

Task 1: Integrate OpenWrt’s watchcat via Hardware Detection

a. Motivation

The primary motivation for replacing LibreMesh’s deferrable-reboot script with OpenWrt’s watchcat is to leverage upstream-maintained tools, reduce redundancy, and enhance maintainability. watchcat provides robust functionality including scheduled reboots and network monitoring to automatically reboot routers in case of failure.

The design involves creating a hardware-detection (HWD) module within LibreMesh, enabling dynamic generation of watchcat configurations from user-defined UCI entries.

b. Implementation Details

The new package, lime-hwd-watchcat, is implemented in Lua, performing the following:

  • Unique Section Identification: Creates unique configuration section names prefixed with hardware identifiers to avoid conflicts.
  • Cleaning Up Configurations: Removes previously generated watchcat configurations.
  • Dynamic Configuration Generation: Reads entries from LibreMesh’s UCI configuration (hwd_watchcat) and generates corresponding /etc/config/watchcat entries.
  • Service Management: Automatically reloads watchcat after applying configuration changes using the /etc/init.d/watchcat reload command.
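
The cleanup step can be sketched in shell as below. The idea is that generated sections carry a recognizable prefix, so stale ones can be deleted without touching sections the user created by hand. The function name and the hwd_ prefix are assumptions for illustration; the actual Lua module may use a different naming scheme.

```shell
#!/bin/sh
# Hypothetical cleanup step: read existing watchcat section names on stdin
# (one per line) and print `uci batch` delete commands for those that
# carry the generated-section prefix. Other sections are left alone.
render_stale_deletes() {
    prefix="$1"
    while read -r name; do
        case "$name" in
            "$prefix"*) printf 'delete watchcat.%s\n' "$name" ;;
        esac
    done
}

# Usage on a router (not run here):
#   <list of section names> | render_stale_deletes hwd_ | uci batch
#   uci commit watchcat && /etc/init.d/watchcat reload
```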

c. Testing & Validation

Testing involved:

  • Configuring custom UCI settings under config hwd_watchcat.
  • Validating automatic generation and updates to /etc/config/watchcat.
  • Confirming the proper removal of old configurations.
  • Checking watchcat service status and reload logs for expected behavior.

Default configuration:

Let’s change it with the new package, using:

uci add lime-node hwd_watchcat
uci set lime-node.@hwd_watchcat[-1].mode='ping_reboot'
uci set lime-node.@hwd_watchcat[-1].pinghosts='4.2.2.2'
uci set lime-node.@hwd_watchcat[-1].pingperiod='30s'
uci set lime-node.@hwd_watchcat[-1].period='6h'
uci set lime-node.@hwd_watchcat[-1].forcedelay='1m'
uci commit lime-node

And after running lime-config:

d. Resources

You can see the code and the PRs for this package here:

Package (Github)

Pull Request approved

Task 2: Replace dnsmasq DHCP with odhcpd

a. Motivation

OpenWrt’s native odhcpd daemon already powers IPv6 RA/DHCPv6 and integrates tightly with ubus. The goal is to phase out dnsmasq’s DHCP functionality in favor of odhcpd, leveraging its superior IPv6 support, integration with OpenWrt, and streamlined lease handling.

The new shared-state-odhcpd_leases package watches local leases, serialises them as CRDT objects via shared-state-async, and injects remote leases back into odhcpd, giving every node the same “view” of the network.

b. Implementation Details

The new package is written entirely in Lua. Its components fall into three small groups:

  1. UCI defaults script (90_odhcpd-lease-share) – executed once at install time.
    • registers a community-scoped CRDT called odhcpd-leases, telling shared-state to refresh every two minutes and to expire entries after twenty;
    • sets two critical odhcpd options:
      leasetrigger points to our publisher script, and maindhcp='1' turns odhcpd into the sole DHCP server;
    • creates a legacy-friendly symlink /etc/ethers → /tmp/ethers.mesh;
    • finally reloads odhcpd so the new trigger takes effect.
  2. Publisher (shared-state-publish_odhcpd_leases) – called by odhcpd whenever a lease changes.
    It fetches the current lease table with ubus call dhcp ipv4leases, distils it to the minimum JSON mapping IP → {mac,hostname}, and pushes that into the CRDT bus with shared-state-async insert odhcpd-leases.
  3. Generator (shared-state-generate_odhcpd_leases) – executed on every CRDT update that the node receives.
    It writes the merged dataset to /tmp/ethers.mesh, moves the file atomically, and reloads odhcpd so the daemon immediately serves and announces the foreign leases as if they were local.
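
The odhcpd part of the UCI defaults script (step 1) can be approximated with the sketch below, which prints the two option changes as uci batch commands. The function name and the publisher path passed to it are assumptions; the CRDT registration is left out because its exact configuration is not shown here.

```shell
#!/bin/sh
# Sketch of the odhcpd options set at install time. The publisher path is
# supplied by the caller; the one in the usage note is a guess.
odhcpd_lease_share_batch() {
    publisher="$1"
    printf "set dhcp.odhcpd.maindhcp='1'\n"
    printf "set dhcp.odhcpd.leasetrigger='%s'\n" "$publisher"
}

# Usage on a router (not run here, path is an assumption):
#   odhcpd_lease_share_batch /usr/bin/shared-state-publish_odhcpd_leases | uci batch
#   uci commit dhcp
#   ln -sf /tmp/ethers.mesh /etc/ethers
#   /etc/init.d/odhcpd reload
```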

Unit tests live in tests/test_publish_odhcpd_leases.lua; they stub ubus, io.popen and os.execute so the publisher logic can be validated in isolation. The tests run in CI, so regressions in JSON shape or error handling are caught before merging.

c. Testing & Validation

The following testing methods are in progress:

  • Unit tests verify robustness of JSON serialization and shared-state publishing logic, covering normal, empty, and malformed input cases.
  • Deployment tests across three routers:
    • Confirm odhcpd takes over DHCP (uci show dhcp shows maindhcp=1).
    • Check presence and updates of /tmp/ethers.mesh and /etc/ethers symlink.
    • Simulate lease assignment and verify real-time propagation between nodes.
  • Service behavior:
    • Observe leasetrigger invocation on odhcpd-update.
    • Ensure stable operation over lease churn and node restarts.

To show how this works, I have two routers running LibreMesh, node-1 and node-2.

I connect a device to node-1, then confirm the lease with ubus call dhcp ipv4leases '{}'

Then the lease trigger fires: odhcpd runs the shared-state-publish_odhcpd_leases script, which inserts the JSON blob into the CRDT bus.

Seconds later, on node-2 I dump the CRDT and see the same lease authored by node-1:

This is a minimal example; feel free to test anything you want!

d. Resources

Pull Request of the package.
Task 2 remains actively in development. Upcoming efforts will involve extensive community testing and careful analysis of how removing dnsmasq’s DHCP functionality impacts related features and dependencies.

Reflection

The first half of the project required me to dive deeper into LibreMesh’s internals than initially expected, giving me a profound appreciation for this powerful mesh networking tool.

The most valuable lesson was recognizing that removing code (such as the deferrable-reboot script or dnsmasq’s DHCP logic) can be just as rewarding as adding new features. Simplifying the stack enhances its predictability and maintainability, ultimately benefiting the entire LibreMesh community.

Conclusion

After these two initial tasks, LibreMesh now:

  • Reboots through watchcat, an OpenWrt-native tool with LuCI support,
  • Serves and synchronizes DHCP leases via odhcpd and a lightweight CRDT-based sharing mechanism,
  • Incorporates automated tests to ensure reliability and stability through continuous integration (CI).

Looking ahead, I will begin Task 3, removing VLANs from Babel interfaces, and start prototyping the layer-2-only variant of LibreMesh. I’ll continue employing the methodology proven successful so far: iterative development, backward compatibility, and comprehensive instrumentation.

If future milestones proceed as smoothly, the project will conclude with a cleaner codebase, easier network management, and clearer upgrade paths for community networks.

Simplifying LibreMesh with OpenWrt-Native Solutions

Introduction

Hello everyone!
I’m Agustin Trachta, a Computer Engineering student at the National University of Cordoba (Argentina), and I’m thrilled to be participating in Google Summer of Code 2025 with Freifunk, working on LibreMesh!

This summer, I’ll be focusing on simplifying and modernizing LibreMesh by:

  • Migrating legacy components to OpenWrt-native solutions, such as replacing custom scripts like deferrable-reboot with watchcat, and moving DHCP handling from dnsmasq to odhcpd.
  • Removing unnecessary complexity, like VLANs on Babel interfaces, which have caused compatibility and MTU issues.
  • Creating a layer-2-only variant of LibreMesh tailored for lightweight community deployments, based on BATMAN-adv.

My goal is to help make LibreMesh leaner and more accessible to new users, especially communities that rely on low-resource hardware and need plug-and-play reliability. Many of these networks operate in rural or underserved areas where internet access is limited, budgets are tight, and technical expertise is scarce. In such environments, every kilobyte of firmware matters.

By replacing legacy components with OpenWrt-native solutions, we reduce the need for LibreMesh to maintain parallel tools, making the codebase easier to understand and integrate with upstream developments. Additionally, offering a layer-2-only firmware variant allows communities to deploy simpler, lightweight networks that require minimal configuration, consume fewer resources, and are easier to debug.

What I’ll Be Working On

The first task involves replacing LibreMesh’s custom deferrable-reboot script with watchcat, an integrated OpenWrt package. The original script was created to schedule periodic reboots in order to recover from possible instability after long uptimes. It was a great idea, but watchcat already implements it in a more robust way and can also trigger a network interface restart based on ping failures. Migrating to watchcat also ensures better integration with OpenWrt’s LuCI interface.

The second task focuses on improving DHCP handling by transitioning from dnsmasq to odhcpd. While dnsmasq is widely used and remains excellent for DNS forwarding, it’s not ideal for handling modern DHCPv6 configurations and dynamic lease sharing. This migration has been requested for years by the community, and I will work hard on making this a reality!

The third task is about removing VLANs from Babel routing interfaces. LibreMesh has always used VLANs to isolate Layer 3 traffic from BATMAN-adv’s Layer 2 mesh, but this introduced issues like lower MTUs, hardware incompatibilities and added configuration burden. My work will include applying and testing relevant patches that have already been worked on, updating configurations and validating that removing VLANs does not introduce routing loops or instability.

Finally, I’ll be developing a Layer-2-only version of LibreMesh. This is another request from community members who want a simpler and lighter firmware that doesn’t include extra routing daemons. In small mesh networks, users only need a transparent Layer 2 bridge using BATMAN-adv and a gateway node doing NAT. My goal is to create a dedicated firmware profile that includes only the essential packages for a Layer-2 mesh and removes unnecessary services. This variant will help networks that value simplicity, speed, and minimal configuration, especially on older routers with tight resource constraints.

Plan of action

To ensure that each change is reliable, compatible, and valuable to the LibreMesh community, I’ll follow a staged and test-driven approach throughout the project.

I’ll begin by setting up a virtualized test environment to quickly prototype and iterate on changes. In parallel, I’ll be using at least three physical routers compatible with LibreMesh to validate the behavior of the firmware under real-world conditions.

Each task will follow a similar cycle:

  1. Development and Integration of the change in the LibreMesh build system.
  2. Configuration and Testing, both in isolated and multi-node environments.
  3. Documentation and Feedback, where I’ll share results with the community, post updates, and adapt based on what users report.

I’ll be actively engaging with the LibreMesh mailing list, GitHub issues, and chat channels to keep the process transparent and collaborative.

Conclusion

I’m very excited to be working on this project as part of the GSoC 2025. I’m looking forward to collaborating, learning, testing, and sharing throughout the summer!

I would also like to thank my mentor, Javier Jorge, who will be guiding me and teaching me a lot about open source projects and local networks.