6.28. MicroKernel and Overlays

RackHD utilizes microkernel images booted in RAM to perform various operations such as node discovery and firmware management.

The on-imagebuilder repository contains a set of Ansible playbooks and roles for building Debian-based Linux microkernel images and overlay filesystems. The resulting images are used primarily with the on-taskgraph workflow engine.

Three Ansible playbooks are included. They build mountable squashfs images, overlay filesystems, and initrd images.

6.28.1. Requirements

  • Any Debian/Ubuntu-based system (support for other distributions is planned; in the meantime, installing debootstrap on another distribution should be enough).
  • Ansible (apt-get install ansible)
  • Internet access OR network access to an apt cache/proxy server from the build machine

6.28.2. Terms

6.28.3. Bootstrap Process

The images produced by these playbooks are intended to be netbooted and run in RAM. They are typically booted as follows:

  1. Netboot the kernel and initrd via PXE/iPXE.
  2. The custom-built initrd runs a startup script (roles/initrd/provision_initrd/files/local) that requests a base squashfs image and an overlay filesystem from the boot server.
  3. The initrd mounts both images together (union mount) into a tmpfs and boots into that as the root.
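
The union mount in step 3 can be sketched roughly as follows. This is an illustration only, not the contents of the repository's startup script: the choice of overlayfs as the union filesystem, every path under /mnt, and the function names overlay_opts and union_mount are assumptions made for the example. It also assumes the overlay contents are unpacked into a writable directory held in tmpfs.

```shell
#!/bin/sh
# Hypothetical sketch of a union mount; paths and names are illustrative only.

# Build the overlayfs option string from the lower (read-only), upper
# (writable), and work directories.
overlay_opts() {
    printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$1" "$2" "$3"
}

# Mount a squashfs base image read-only, then layer a tmpfs-backed writable
# directory on top of it. Must be run as root; it is not invoked here.
union_mount() {
    base_img=$1   # path to the downloaded base squashfs image (assumed)
    newroot=$2    # directory that becomes the new root (assumed)

    mkdir -p /mnt/base /mnt/rw "$newroot"
    mount -t squashfs -o loop,ro "$base_img" /mnt/base
    mount -t tmpfs tmpfs /mnt/rw
    # overlayfs needs upperdir and workdir on the same filesystem.
    mkdir -p /mnt/rw/upper /mnt/rw/work
    # The overlay contents would be unpacked into /mnt/rw/upper at this point.
    mount -t overlay overlay \
        -o "$(overlay_opts /mnt/base /mnt/rw/upper /mnt/rw/work)" "$newroot"
}
```

Keeping the base image read-only underneath a writable RAM-backed layer is what allows one base image to be shared by many overlays.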

The basefs and initrd images are not intended to be changed very often. It’s more likely that one will add new provisioner roles to build custom overlays that can then be mounted with the base image built by the existing ansible roles in this repository.

6.28.4. Building Images

Instructions for building images can be found in the on-imagebuilder README.

6.28.5. Adding Provisioner Roles and Configuration Files

Instructions for adding provisioner roles and configuration files can be found in the on-imagebuilder README.

6.28.6. Changing the Global Configuration

All playbooks and roles depend on the variables defined in hosts and group_vars/on_imagebuild. These variables specify the location of the build roots and the apt server/package repositories that are used.

6.28.7. Changing the Build Root

Update the paths in hosts to the desired build root. The build root paths must be specified in both the [overlay_build_chroot] and the [on_imagebuild:vars] sections.

6.28.8. Changing the Repository URLs

It is highly recommended that an apt-cacher-ng server be used rather than the upstream archive.ubuntu.com server that is specified by default. Depending on the network connection, this can reduce the build times for the basefs and initrd images by 50%.

To do this:

  1. Run apt-get install apt-cacher-ng.

  2. Edit the apt_server variable in group_vars/on_imagebuild to equal the address of your apt cache server, for example:

    apt_server: 192.168.100.5:3142
    

The first build will still be slow, because no packages are cached, but subsequent builds will be much faster.
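
Before starting a build, it can help to confirm that the address given in apt_server is reachable from the build machine. The sketch below splits the host:port value from the example above and probes the port. The variable and function names are illustrative, and the nc options -z and -w are an assumption about the netcat variant installed on the build machine.

```shell
#!/bin/sh
# Same value as the apt_server example above.
apt_server="192.168.100.5:3142"

# Split "host:port" using POSIX parameter expansion.
cache_host="${apt_server%%:*}"
cache_port="${apt_server##*:}"

# Return success if something is listening on the cache port.
# Not invoked here; call it from the build machine.
check_apt_cache() {
    nc -z -w 3 "$cache_host" "$cache_port"
}
```

Running check_apt_cache && echo reachable on the build machine gives a quick signal that the cache is up before a long build is started.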

Note: The playbooks can only be run LOCALLY – not against remote hosts as is usual with Ansible. This is because the chroot ansible_connection type used for most builders and provisioners is not supported over ssh and other remote ansible_connection types.

6.28.9. Why Not Containers?

The goal is to optimize for size on disk and modularity. By creating many different overlays that share a base image, we avoid data duplication on the boot server (50MB base image + 10 * 5MB overlay archives vs. 10 * 55MB container images).
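
The comparison above can be worked through as a short calculation. The figures are the ones quoted in the text; the variable names are only for illustration.

```shell
#!/bin/sh
# Sizes in MB, taken from the comparison above.
base_mb=50
overlay_mb=5
count=10

# One shared base image plus one small overlay per variant.
shared_total=$((base_mb + count * overlay_mb))

# One self-contained image per variant, each carrying its own copy of the base.
container_total=$((count * (base_mb + overlay_mb)))

echo "shared base + overlays: ${shared_total} MB"
echo "standalone images:      ${container_total} MB"
```

With these figures, the shared layout needs 100 MB on the boot server, against 550 MB for ten standalone images.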

Additionally, it gives us flexibility to update the base image and any system dependencies/scripts/etc. on it without having to rebuild any overlays. For example, we use a custom rc.local script in the base image that is used to receive commands from workflows on startup. Making changes to this script should only have to be done in one place.

Please send us a note if you think this is incorrect! So long as our design constraints are preserved, we are more than open to leveraging existing container technology.

6.28.10. How to Log In to the Microkernel

RackHD provides a workflow, Graph.BootstrapUbuntu, that lets users log in to the Ubuntu-based microkernel for debugging. To start it on a node:

curl -X POST -H 'Content-Type: application/json' '<server>/api/current/nodes/<identifier>/workflows?name=Graph.BootstrapUbuntu'

When the workflow runs, it sets the node to PXE boot and then reboots it. The node boots into the Ubuntu microkernel, after which you can SSH into it from the RackHD server. The SSH username and password are both monorail. The node's IP address can be retrieved from the GET /lookups API:

curl '<server>/api/current/lookups?q=<identifier>'
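
The two requests above can be wrapped in small helpers so that the server address and node identifier are supplied once. This is a sketch under assumptions: the function names workflow_url, lookup_url, and bootstrap_node are hypothetical, and the server argument is expected to include the scheme and port of your RackHD API.

```shell
#!/bin/sh
# Hypothetical helpers; the caller supplies the server address and node identifier.

# URL that starts a workflow on a node.
workflow_url() {
    printf '%s/api/current/nodes/%s/workflows?name=%s\n' "$1" "$2" "$3"
}

# URL that looks up address records for a node.
lookup_url() {
    printf '%s/api/current/lookups?q=%s\n' "$1" "$2"
}

# Start Graph.BootstrapUbuntu on the node, then print its lookup records.
# Not invoked here.
bootstrap_node() {
    server=$1
    node=$2
    curl -X POST -H 'Content-Type: application/json' \
        "$(workflow_url "$server" "$node" Graph.BootstrapUbuntu)"
    # The node needs time to reboot into the microkernel before SSH works,
    # so the lookup may have to be repeated later.
    curl "$(lookup_url "$server" "$node")"
}
```

Once the node's IP address appears in the lookup output, connect from the RackHD server with ssh monorail@<node-ip> and enter the password monorail.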