Building a DIY SOHO router using the Yocto Project build system OpenEmbedded, Part 3

In part two of this series, I created a local configuration layer for OpenEmbedded and got the core-image-minimal build target producing an image. That image wasn’t really a router, but it did let me bring up the board and look around. In this article, I’m going to create a custom image and populate it with additional software packages configured to my requirements. I’m also going to start using Over-The-Air (OTA) software updates on the device.

Now that I’ve proven that the image works on the hardware, I can get down to the actual project of making a router. While I could continue to add things to core-image-minimal, it makes more sense at this point to stop and create my own image. Since I want something relatively small, I will still start with core-image-minimal as the base. Moving back over to meta-local-soho, I create the recipes-core/images directory and then populate core-image-minimal-router.bb with:

require recipes-core/images/core-image-minimal.bb

DESCRIPTION = "Small image for use as a router"

IMAGE_FEATURES += "ssh-server-openssh"
IMAGE_FEATURES += "empty-root-password allow-empty-password allow-root-login"

IMAGE_INSTALL += "\
    "
MENDER_STORAGE_TOTAL_SIZE_MB = "4096"

This tells bitbake that it must have core-image-minimal.bb available and to include it. I then provide a new DESCRIPTION for the new image. Next, I add a number of new features to the image. First, I use the normal hook for adding an SSH server. Then I add a line of development-mode features which, as their names imply, allow root to log in without a password. This is quite handy for development and quite unwise for production, so I’ll circle back and remove them later. Next, I give myself an empty list of additional packages to be filled out later. Finally, I tell Mender that it has 4096 megabytes of disk space to work with. I’m deliberately hiding the rest of the device’s storage from Mender so that I can control that part entirely myself. At this point I can build core-image-minimal-router, and it will complete very quickly because I’ve not yet added any packages that haven’t already been built. So it’s time to once again git add, git commit, and bitbake these changes.
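
Because the add, commit, and build cycle recurs throughout this article, it can be wrapped in a small shell helper. The following is only a sketch: the function name rebuild_image, the LAYER_DIR default, and the choice of git add -A are my assumptions, and it expects to run in a shell where the OpenEmbedded build environment is already set up so that bitbake is on the PATH.

```shell
# Hypothetical helper: stage and commit everything in the layer, then rebuild.
# LAYER_DIR and IMAGE are assumptions; point LAYER_DIR at the real checkout.
LAYER_DIR="${LAYER_DIR:-meta-local-soho}"
IMAGE="${IMAGE:-core-image-minimal-router}"

rebuild_image() {
    # $1: commit message. Each step runs only if the previous one succeeded.
    git -C "$LAYER_DIR" add -A &&
    git -C "$LAYER_DIR" commit -m "$1" &&
    bitbake "$IMAGE"
}
```

A call such as rebuild_image "Add core-image-minimal-router" then covers one full iteration.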

At this point, I want to flash the new image onto the device and boot it up. The reason for this is that the new image can be used with Mender to test any subsequent image builds. The system is now functional enough to support delivering new image updates via Mender, so it’s good to get into the habit of using the OTA update workflow. It also forces me to treat the device as if it’s really stateless. I’ll talk about how to apply an OTA update when I make the next set of changes.

Now it’s time to begin adding content to the custom image. The first thing I’m going to do is borrow some logic from packagegroup-machine-base. I don’t want to use this packagegroup directly because it would cause bitbake to build a lot of extra software that I don’t end up installing. That happens because it’s part of packagegroup-base.bb, where it exists to resolve dependencies of other parts of the packagegroup. Instead, I’m going to add:

    ${MACHINE_EXTRA_RDEPENDS} \
    ${MACHINE_EXTRA_RRECOMMENDS} \

to IMAGE_INSTALL so that any additional machine-specific functionality that’s been specified is installed to the image. Next, I’ll add kernel-modules to the list so that all of the modules that have been built for the kernel are installed to the image. This is a lot easier than listing every module I may need, especially later on when it comes to the various firewall rules I want to use. On top of all of this, I also want to drop in full versions of a number of common packages I use, and then let busybox fill in the rest.

    bind-utils \
    coreutils \
    findutils \
    iputils-ping \
    iputils-tracepath \
    iputils-traceroute6 \
    iproute2 \
    less \
    ncurses-terminfo \
    net-tools \
    procps \
    util-linux \

Almost everything in this list can be tweaked as desired. There are a couple items that serve a critical purpose and deserve an explanation:

  1. systemd calls out to $PAGER for many functions, including browsing logs with journalctl. If I don’t have the full version of less available, I won’t have a fully functional pager and browsing the output is extremely difficult.
  2. I don’t use xterm for my terminal emulator anymore so I want ncurses-terminfo installed. This ensures that the right terminfo is available and terminal output is correct.

At this point it’s time for a git add, a git commit, and then a bitbake of the image.

Now that I have a new image with additional content to try out, I want to put it on the device and confirm things work. As mentioned before, I’m using Mender in standalone mode since I have a single deployed device. Serving the new image and then applying it is very simple. On the build machine, I do the following (change qemux86-64 to match the machine in use):

$ (cd tmp-glibc/deploy/images/qemux86-64; python3 -m http.server)

And then on the device:

# mender -rootfs http://build-server.local:8000/core-image-minimal-router-qemux86-64.mender
... wait while it downloads and applies ...
# reboot

Once the device comes back up, I’ve logged back in, and confirmed I’m satisfied with my changes, I do:

# mender -commit

This marks what I am now running as the valid rootfs. However, if the device didn’t boot up or I couldn’t log in, I would simply not commit the changes and would instead reboot or otherwise power-cycle the device. If I don’t commit the changes to Mender, I get an automatic rollback to the previous install. The Python one-liner is only a convenience, too: any HTTP server on the build machine will do.
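
The device-side half of this workflow can be sketched as two small shell functions. The names artifact_url and apply_update are hypothetical, the host and port simply mirror the example above, and the mender -rootfs invocation is the one shown in this article rather than a claim about every Mender client version.

```shell
# Build the URL of the artifact served from the build machine.
# $1: host, $2: port, $3: image name, $4: machine
artifact_url() {
    printf 'http://%s:%s/%s-%s.mender\n' "$1" "$2" "$3" "$4"
}

# Hypothetical wrapper, run as root on the device: download and install
# the artifact. Rebooting and committing are left as manual steps.
apply_update() {
    mender -rootfs "$(artifact_url "$@")"
}
```

For example, apply_update build-server.local 8000 core-image-minimal-router qemux86-64 installs the same artifact as the command above. Keeping the reboot and mender -commit steps manual preserves the point of the workflow, which is that nothing becomes permanent until I have logged in and checked the result.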

At this point, it’s time to iterate over a number of features that require little more than adding to IMAGE_INSTALL. Since I’ve talked about LXC, I need to add lxc and gnupg (for verifying containers fetched with the download template). Once that’s added, I do the git add, git commit, bitbake, and mender -rootfs cycle again and confirm LXC is working. One thing I noticed when doing this was that containers didn’t autostart because the service isn’t enabled by default. Since I’m keeping this stateless, I changed that behavior with a bbappend file. I also ended up installing e2fsprogs-mke2fs so that I could further partition my device and give LXC some room to work with. This also means that I needed base-files to provide an fstab that matches my setup, rather than the stock one. Another small thing to cover is whether or not your hardware has a hardware random number generator available. If it does, you should pull rng-tools into the image. If it doesn’t, you should install haveged instead to help feed the entropy pool.
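
One way to answer the random number generator question is to ask a running kernel on the board. The sketch below assumes the kernel’s hw_random framework is enabled, in which case it lists usable devices in /sys/class/misc/hw_random/rng_available. The function name pick_entropy_package is hypothetical, and the result is only a hint about which package to add to IMAGE_INSTALL.

```shell
# Print the entropy package suggested above for this board.
# $1 (optional): file listing available hardware RNGs; the default is the
# sysfs file exposed by the Linux hw_random framework.
pick_entropy_package() {
    rng_list="${1:-/sys/class/misc/hw_random/rng_available}"
    # The file can exist but be empty when no hardware RNG driver is bound.
    if [ -r "$rng_list" ] && [ -n "$(tr -d '[:space:]' < "$rng_list")" ]; then
        echo rng-tools
    else
        echo haveged
    fi
}
```

Running pick_entropy_package on the device, with no arguments, prints the package name to use.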

Now I need to enable a functional access point. This is the first case where writing the config file is really non-trivial, so it’s done a little bit differently. The first step is to add hostapd and iw to the image and boot it. Then, on the device, I edit /etc/hostapd.conf and iterate on editing and testing until everything is set up as desired. The iw tool can be helpful here for things like performing a site scan to see what frequencies are already in use. Once I’m done with the config, I copy the file from the target over to my build server with scp as /tmp/hostapd.conf. Then it’s time to make it stateless:

$ mkdir -p recipes-connectivity/hostapd/hostapd
$ cp /tmp/hostapd.conf recipes-connectivity/hostapd/hostapd/

And then I edit recipes-connectivity/hostapd/hostapd_%.bbappend to look like this:

FILESEXTRAPATHS_prepend := "${THISDIR}/${PN}:"

SRC_URI += "file://hostapd.conf"

do_install_append() {
    install -m 0644 ${WORKDIR}/hostapd.conf ${D}${sysconfdir}
}

SYSTEMD_AUTO_ENABLE_${PN} = "enable"

This does two things. Everything except the last line tells bitbake to look in my layer for hostapd.conf and then install it. The last line enables the hostapd systemd service, because now that I have a configured AP I want it to start automatically. Now it’s time once again for the git add, git commit, and so forth cycle.
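
For the site scan mentioned during the hostapd configuration, a small filter can summarize which frequencies are already busy. This is a sketch under assumptions: the function name count_scan_freqs is hypothetical, wlan0 in the usage note is a placeholder interface name, and it relies on the scan output containing one "freq:" line per network, which iw does not promise to keep stable.

```shell
# Summarize how many visible networks use each frequency (in MHz).
# Reads the output of an iw scan on stdin and prints "<freq> <count>".
count_scan_freqs() {
    awk '$1 == "freq:" { seen[$2]++ }
         END { for (f in seen) print f, seen[f] }' | sort -n
}
```

On the device this would be used as iw dev wlan0 scan | count_scan_freqs, and the least crowded frequency is a reasonable starting point for the channel setting in hostapd.conf.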

The next step is to do the same kind of thing for dnsmasq. The good news is that this time, the dnsmasq_%.bbappend file only needs one line:

FILESEXTRAPATHS_prepend := "${THISDIR}/${PN}:"

This is because the rest of the recipe already knows to grab dnsmasq.conf from a local file. On my network, I need to pass a few special options to some DHCP clients and give certain clients specific IP addresses, so I’ve gone with dnsmasq as my lightweight but still fully featured IPv4 configuration server. I could just as easily have gone with ISC DHCPD instead, and it would look much the same as the above. Conversely, if I didn’t need those few extra rules, I could just let systemd handle DHCP serving. I left IPv6 out of that statement because I am letting systemd handle it.

The only thing missing from a router at this point, aside from turning off the developer mode features, is a firewall. There are a few ways to go about this. I already have systemd handling one of the aspects that is often associated with a firewall, setting up IPv4 NAT. If the only other thing I needed on top of this were to shut the rest of the world out, I could use ufw and potentially even leverage its ability to add iptables commands directly for slight enhancements. While I have gone that direction for some projects, it’s not a good fit for this one. Instead, I chose to go with arno-iptables-firewall because I’m going to have a more complex setup. The process of customizing the firewall configuration is similar to how I customized hostapd and dnsmasq. That is, I iteratively configure it on the device, test for functionality, and copy the configuration files to my host. This time, however, the arno-iptables-firewall_%.bbappend will look a little different:

FILESEXTRAPATHS_append := ":${THISDIR}/files"

SRC_URI += "file://firewall.conf \
            file://custom-rules \
"

do_install_append() {
    install -m 0644 ${WORKDIR}/firewall.conf \
        ${D}${sysconfdir}/arno-iptables-firewall/
    install -m 0644 ${WORKDIR}/custom-rules \
        ${D}${sysconfdir}/arno-iptables-firewall/
}

I have two files this time. The first one is the main config file, and the second one is the file that contains my custom rules. This is only necessary because I have a number of custom rules, otherwise it could be omitted.

At this point, looking back at the feature list I laid out in part one, I believe I can check all of my items off now.  I have the following all operational:

  1. access point
  2. firewall
  3. IPv4 and IPv6 network configuration
  4. containers 
  5. OTA software update

I’m building all of my software as hardened as my compiler will allow.  There’s very little state on the router itself to worry about backing up, and everything else is handled by my build server being backed up.  I’m confident in my OTA configuration as I’ve been using it for some time now in the development workflow. I’ve also tweaked the installed package list so that all of my favorite sysadmin tools are available.

At this point, it’s time to lock things down. First up, it’s time to go back to core-image-minimal-router.bb and remove that second line worth of IMAGE_FEATURES. Instead, I’m going to create a new local-user.bb recipe with my own user and SSH key. After listing local-user in IMAGE_INSTALL, I copy meta-skeleton/recipes-skeleton/useradd/useradd-example.bb to somewhere in meta-local-soho, and change it to look like this:

SUMMARY = "SOHO router user"
DESCRIPTION = "Add our own user to the image"
SECTION = "examples"
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://${COREBASE}/meta/COPYING.MIT;md5=3da9cfbcb788c80a0384361b4de20420"

SRC_URI = "file://authorized_keys"

S = "${WORKDIR}"

inherit useradd

# You must set USERADD_PACKAGES when you inherit useradd. This
# lists which output packages will include the user/group
# creation code.
USERADD_PACKAGES = "${PN}"

USERADD_PARAM_${PN} = "-u 1200 -d /data/trini -r -s /bin/bash trini"

do_install () {
    install -d -m 0755 ${D}/data/trini
    install -d -m 0700 ${D}/data/trini/.ssh

    install -m 0600 ${WORKDIR}/authorized_keys ${D}/data/trini/.ssh/

    # The new users and groups are created before the do_install
    # step, so you are now free to make use of them:
    chown -R trini ${D}/data/trini
    chgrp -R trini ${D}/data/trini
}

FILES_${PN} = "/data/trini"

# Prevents do_package failures with:
# debugsources.list: No such file or directory:
INHIBIT_PACKAGE_DEBUG_SPLIT = "1"
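
The recipe above expects an authorized_keys file that bitbake can find through its file:// lookup, and a files directory next to the recipe is one of the places searched by default. The helper below is a sketch of staging a public key there. The function name stage_authorized_keys and both paths in the usage note are assumptions, since the recipe can live anywhere in meta-local-soho.

```shell
# Hypothetical helper: copy a public key into the recipe's files directory
# as authorized_keys, where the file:// entry in SRC_URI can find it.
# $1: public key file, $2: directory containing local-user.bb
stage_authorized_keys() {
    mkdir -p "$2/files" &&
    install -m 0600 "$1" "$2/files/authorized_keys"
}
```

A call might look like stage_authorized_keys ~/.ssh/id_ed25519.pub recipes-core/local-user, followed by the usual git add and git commit so that the key is tracked with the rest of the layer.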

Now, there’s one slight problem. I added my user with a home directory under /data, which is excluded from Mender updates. The good thing is that I get persistent history and so forth. The bad thing is that my files aren’t installed there yet. So I either need to re-flash one last time or manually copy the files over from the filesystem image to the device before I reboot. Finally, I need to enable myself to use sudo. In addition to adding sudo to IMAGE_INSTALL, I also need to do one of three things: tweak the sudo recipe so that /etc/sudoers.d/ is read, tweak it so that anyone in the wheel group can use sudo and then add a wheel group, or borrow the example from meta/recipes-core/images/build-appliance-image_15.0.0.bb and do the following in core-image-minimal-router.bb:

# Take the example from recipes-core/images/build-appliance-image_15.0.0.bb
# on adding more sudoers
fakeroot do_populate_poky_src () {
    echo "trini ALL=(ALL) NOPASSWD: ALL" >> ${IMAGE_ROOTFS}/etc/sudoers
}
IMAGE_PREPROCESS_COMMAND += "do_populate_poky_src; "

With all of that built, deployed, and unit tested, it’s time to go live. My SOHO router is done and ready for production. It’s now on me to make sure this stays up to date, which in some ways is a lot better than the alternative. With my previous router, the only backup I had was a non-volatile RAM dump specific to that model of router. I now have my complete configuration, containing firewall rules, DHCP options, and more, saved. Since starting on the project I have even braved a few OTA updates and had minimal downtime.

This concludes the walkthrough of building a SOHO router with OpenEmbedded. In the final part of this series, I will describe some of the lessons I learned while designing and implementing this project.

[Go to Part Four of the series.]