I have been a Linux user space programmer for well over ten years, but only recently dipped my toe into kernel development. In this article I’ll go over the Linux kernel build system, bare-bones initramfs system emulation, debugging code that runs in kernel space, kernel modules, and testing kernel code within a full Linux distribution.
Requirements
I will assume that we are working in a standard bash-like shell on a modern Linux system.
We will first need to install the Linux kernel build dependencies. Specifically, we will need binutils, gcc, make, bc, cpio, flex, bison, tar, xz, gzip, rsync, perl, and python. We’ll also need the command-line tools, headers, and libraries for elfutils, gettext, and openssl.
These should be available through a package manager on every Linux distribution. To install the header and pkg-config files alongside a library, the development version of a package may be required (generally named with a -dev or -devel suffix). E.g. on Alpine Linux we would need to install the elfutils-dev package.
For system emulation we will need the qemu-system-ARCH emulator for our target architecture. To take advantage of KVM (and avoid having to set up a cross-compilation toolchain), the emulated system architecture should match our physical hardware. E.g. if our host system is running on an x86_64 CPU, then we should install and use qemu-system-x86_64.
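We can check whether the host CPU advertises hardware virtualization support before reaching for KVM: the vmx CPU flag indicates Intel VT-x and svm indicates AMD-V. (KVM also requires the /dev/kvm device, which appears once the kvm kernel modules are loaded.)

```shell
# print the first virtualization flag found in the CPU flags,
# or a message if the CPU does not advertise one
grep -E -o -m1 'vmx|svm' /proc/cpuinfo || echo "no hardware virtualization flag found"
```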
Finally, we will need the GNU debugger gdb to debug kernel and user space code.
Building the Linux kernel
The Linux kernel uses a make-based build system called Kbuild and a custom Kconfig configuration system. For now we’ll simply pull the Linux source and use make to configure and build the kernel.
$ mkdir linux_kvm_debug
$ cd linux_kvm_debug
$ wget https://github.com/torvalds/linux/archive/refs/tags/v6.17.tar.gz
$ tar -xvf v6.17.tar.gz
We’ll use the following base config file with the options required for qemu KVM emulation and GDB kernel debugging.
# mini.config
# minimum for `qemu` toybox initramfs boot
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_SYSCTL=y
CONFIG_UNIX98_PTYS=y
CONFIG_SWAP=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_BLK_DEV_WRITE_MOUNTED=y
# networking
CONFIG_ETHERNET=y
CONFIG_UNIX=y
CONFIG_IPV6=y
CONFIG_FORCEDETH=y
CONFIG_E1000=y
CONFIG_E1000E=y
# add ext4 filesystem to mount volumes
CONFIG_EXT4_FS=y
# enable modules
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# allow kernel GDB debugging
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_FRAME_POINTER=y
# optional, may use `nokaslr` parameter instead
CONFIG_RANDOMIZE_BASE=n
# KVM guest (from `kernel/configs/kvm_guest.config`)
CONFIG_NET=y
CONFIG_NET_CORE=y
CONFIG_NETDEVICES=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV=y
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_INET=y
CONFIG_TTY=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_BINFMT_ELF=y
CONFIG_PCI=y
CONFIG_PCI_MSI=y
CONFIG_DEBUG_KERNEL=y
CONFIG_VIRTUALIZATION=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
CONFIG_KVM_GUEST=y
CONFIG_S390_GUEST=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_MENU=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BLK=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_VIRTIO_NET=y
CONFIG_9P_FS=y
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
CONFIG_SCSI_LOWLEVEL=y
CONFIG_SCSI_VIRTIO=y
CONFIG_VIRTIO_INPUT=y
CONFIG_DRM_VIRTIO_GPU=y
The allnoconfig make target will use the KCONFIG_ALLCONFIG file as a base, set basic architecture defaults, and disable all other config options.
$ mkdir linux-6.17_build
$ make -C linux-6.17 O=$PWD/linux-6.17_build \
> ARCH=x86_64 \
> KCONFIG_ALLCONFIG=$PWD/mini.config \
> allnoconfig
$ make -C linux-6.17 O=$PWD/linux-6.17_build -j
The compiled kernel image will be located in linux-6.17_build/arch/x86/boot/bzImage. If you are using a different architecture, the process should be the same but with x86/x86_64 swapped for your host architecture.
Creating an initial RAM filesystem
The primary way to boot a Linux system is to use an initial RAM filesystem (initramfs). An initramfs is simply a directory that has been packaged into a cpio archive (and optionally compressed). On boot the initramfs is mounted at / and then /init is executed as the initial user space process.
Note: An initramfs can be built into the kernel binary itself, or be provided separately. We will need to boot from several different initramfs archives to test our debug kernel, so we’ll specify the initramfs archive to use on each boot.
For our first initramfs we’ll create a C program that prints "Hello, World!". Note that if we use the host C compiler with its default flags, then it will compile a binary targeting the C “runtime” for the host system. I.e. it will assume that the host lib directories (/lib, /usr/lib, etc.) are present in the initial RAM filesystem.
We could copy the relevant parts of our host sysroot directly into the initramfs, but an easier and more reliable option is to use the nolibc static C runtime provided by the Linux kernel. The kernel build system exposes a make command to produce a nolibc build sysroot.
Note: We could also use a feature complete libc that supports static linking, e.g. musl. This is generally the way to go when you need a drop-in replacement for GNU libc to build an existing project. We’ll use nolibc here because we will be writing our C code from scratch and it is simple, interesting, and provided with the kernel source code.
$ make -C linux-6.17/tools/include/nolibc O=$PWD/linux-6.17_build \
> headers_standalone
By default, the Linux kernel will mount the initial RAM filesystem as / and look for a /init executable to run as the first user-space process. We will write a basic init.c file and compile a statically linked init executable.
/* init_c/init.c */
/* crt.h: exports the `_start` symbol that initializes the runtime and calls `main` */
#include <crt.h>
/* sys.h: defines `reboot` function */
#include <sys.h>
/* unistd.h: defines `sleep`, `read`, and `write` functions */
#include <unistd.h>
int main(void) {
/* try to wait for kernel boot messages to finish */
sleep(1);
/* write our message to standard output */
const char msg[] = "Hello, World!\n";
write(STDOUT_FILENO, msg, sizeof(msg));
/* wait for user to press enter by reading from standard input */
char buff[1];
read(STDIN_FILENO, buff, sizeof(buff));
/* reboot the machine */
reboot(LINUX_REBOOT_CMD_RESTART);
}
$ gcc -static -fno-stack-protector -nostdlib -nostdinc \
> -I linux-6.17_build/sysroot/include \
> -o init_c/init \
> init_c/init.c
The init static executable will wait one second, write "Hello, World!", and then restart the computer after the user presses enter. Next we’ll use cpio to package the init_c directory into an initramfs newc archive.
$ cd init_c
$ find . | cpio -o --format=newc > ../init_c.cpio
$ cd ..
With a kernel and an initial RAM filesystem containing an init executable, we can boot an emulated system using qemu.
$ qemu-system-x86_64 --enable-kvm -nographic -no-reboot \
> -kernel linux-6.17_build/arch/x86/boot/bzImage \
> -initrd init_c.cpio \
> -append "HOST=x86_64 console=ttyS0"
(vm) Linux version 6.17.0 <SYSTEM SPECIFIC STUFF>
(vm) Command line: HOST=x86_64 console=ttyS0
(vm) # ...
(vm) Run /init as init process
(vm) Hello, World!
You should see a string of kernel boot messages, followed by Hello, World!. Pressing the Enter key should kill the virtual machine.
Debugging the Kernel with GDB
In this section we will attach a gdb session to the kernel running in our qemu virtual machine. Our kernel was built with debug information enabled and Kernel Address Space Layout Randomization (KASLR) disabled, so there are only a few things left to do before we can open gdb.
First, we need to build the kernel’s gdb python scripts. We also need to add a line to our gdbinit config file to inform gdb that these scripts are safe to load.
$ make -C linux-6.17 O=$PWD/linux-6.17_build scripts_gdb
$ mkdir -p ~/.config/gdb
$ echo "set auto-load safe-path $PWD/linux-6.17_build/" \
> >> ~/.config/gdb/gdbinit
Note: The command appended above will override any existing auto-load safe paths set in gdbinit. If your gdbinit needs to allow auto-loading scripts from multiple paths, then you must make a single set call listing all safe paths.
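For example, a gdbinit that allows auto-loading from two kernel build trees (placeholder paths below) would use one set call with the directories joined by :, the Unix path separator:

```
# ~/.config/gdb/gdbinit
set auto-load safe-path /path/to/build_a/:/path/to/build_b/
```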
Finally, when we run the VM we’ll have to add a couple of extra flags.
$ qemu-system-x86_64 --enable-kvm -nographic -no-reboot -s -S \
> -kernel linux-6.17_build/arch/x86/boot/bzImage \
> -initrd init_c.cpio \
> -append "HOST=x86_64 console=ttyS0"
The -s flag (short for -gdb tcp::1234) hosts a gdb server at address :1234. The -S flag halts the CPU so the kernel will not start booting before our gdb session can connect to the server.
We can now connect a gdb session and begin debugging.
$ gdb -q linux-6.17_build/vmlinux
Reading symbols from /home/aven/linux_kvm_debug/linux-6.17_build/vmlinux...
(gdb) target remote :1234
Remote debugging using :1234
0x000000000000fff0 in ?? ()
(gdb) hbreak start_kernel
Hardware assisted breakpoint 1 at 0xffffffff81a6d630: file /home/aven/linux_kvm_debug/linux-6.17/init/main.c, line 899.
(gdb) c
Continuing.
Breakpoint 1, start_kernel ()
at /home/aven/linux_kvm_debug/linux-6.17/init/main.c:899
899 char *command_line;
(gdb)
Above we open the uncompressed kernel binary vmlinux with gdb, connect to the qemu gdbserver target at address :1234, place a hardware breakpoint at start_kernel, and unpause the CPU until the breakpoint is hit. From here we can step through the kernel startup as we please.
However, we know that our init binary will eventually be run and issue a write system call. We can search the kernel for the function name that will be called for write.
$ grep -A4 -rn "SYSCALL_DEFINE.\?(write," linux-6.17/
fs/read_write.c:746:SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
fs/read_write.c-747- size_t, count)
fs/read_write.c-748-{
fs/read_write.c-749- return ksys_write(fd, buf, count);
fs/read_write.c-750-}
So we could either break on the ksys_write symbol (hbreak ksys_write) or use the source location to set a breakpoint directly in the syscall (hbreak fs/read_write.c:749). The result I see continuing from the start_kernel breakpoint we hit above is shown below.
(gdb) hbreak fs/read_write.c:749
Hardware assisted breakpoint 3 at 0xffffffff814156a4: file /home/aven/linux_kvm_debug/linux-6.17/fs/read_write.c, line 749.
(gdb) c
Continuing.
Breakpoint 2, __do_sys_write (fd=1, buf=0x7ffde9ee7c51 "Hello, World!\n", count=15)
at /home/aven/linux_kvm_debug/linux-6.17/fs/read_write.c:749
749 return ksys_write(fd, buf, count);
(gdb)
We interrupted the kernel side of the "Hello, World!" write system call from our init process!
Adding a shell to our initramfs with toybox
Our current initramfs contains a single init binary we compiled using nolibc. While such single binary systems can be useful for testing or building bespoke devices, in general we expect Linux systems to provide a Unix style shell with a standard set of command line utilities. Most GNU+Linux distributions use the GNU core utilities, but these are a somewhat heavy dependency and provide far more than we need for a simple debug system.
There are several Unix-in-a-box projects that provide a single “Swiss army knife binary” containing all of the basic command-line utilities, the most popular being busybox and toybox. We’ll be using toybox for our initramfs in this section, but later we’ll build an Alpine Linux system based on busybox.
$ wget https://landley.net/toybox/downloads/binaries/latest/toybox-x86_64
$ chmod +x toybox-x86_64
$ mkdir -p init_toybox/bin
$ mkdir -p init_toybox/dev
$ mkdir -p init_toybox/home
$ mkdir -p init_toybox/proc
$ mkdir -p init_toybox/sys
$ mv toybox-x86_64 init_toybox/bin/toybox
$ cd init_toybox/bin
$ for i in $(./toybox); do ln -s toybox $i; done
$ cd ../..
Note: If you are uncomfortable downloading the toybox binary, you may instead build it from source. You will need to use a C toolchain capable of building static x86_64 binaries, e.g. a gcc or clang toolchain based on musl libc.
We now have an init_toybox sysroot containing a bin directory with a symlink file for each of the toybox commands. The toybox binary checks the filename it was executed as to determine the command to execute, and running the toybox binary itself without any arguments prints a list of command names. E.g. the for loop above created the symlink bin/sh to bin/toybox, and running ./bin/sh is equivalent to executing ./bin/toybox sh which starts a shell.
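The dispatch-on-argv[0] trick is easy to imitate in a plain shell script. The sketch below (all file and command names invented for illustration) behaves differently depending on the symlink it is invoked through:

```shell
# multi.sh: a toybox-style multi-call script that dispatches on
# the name it was invoked as
cat > multi.sh << 'EOF'
#!/bin/sh
case "$(basename "$0")" in
  hello) echo "Hello!" ;;
  bye)   echo "Bye!" ;;
  *)     echo "usage: invoke as hello or bye" ;;
esac
EOF
chmod +x multi.sh
ln -sf multi.sh hello
ln -sf multi.sh bye
./hello   # prints: Hello!
./bye     # prints: Bye!
```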
Instead of compiling a static init ELF binary from C, we will write a shell script to be executed with bin/sh. The init script below is inspired by the init script generated by toybox/mkroot.
Note: When a shebang (#!) appears at the start of a text file, the file may be used as if it were a binary executable. The path following the shebang is executed with the path to the original text file passed as an argument.
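We can observe this mechanism directly by using /bin/echo as the “interpreter” (the file name below is arbitrary): the kernel runs /bin/echo with the shebang argument followed by the path of the file itself.

```shell
# a "script" whose interpreter is /bin/echo
printf '#!/bin/echo interpreted\n' > shebang_demo
chmod +x shebang_demo
./shebang_demo   # prints: interpreted ./shebang_demo
```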
#!/bin/sh
# init_toybox/init
export HOME=/home PATH=/bin
mount -t proc proc proc
mount -t sysfs sys sys
mount -t devtmpfs dev dev
ln -sf /proc/self/fd/0 /dev/stdin
ln -sf /proc/self/fd/1 /dev/stdout
ln -sf /proc/self/fd/2 /dev/stderr
mkdir -p dev/shm && chmod +t /dev/shm
mkdir -p dev/pts && mount -t devpts dev/pts dev/pts
sleep 1
echo "Welcome to sh!"
setsid -c /bin/sh <>/dev/ttyS0 >&0 2>&1
reboot -f
Our init script sets up the special /proc, /sys, /dev, /dev/pts, and /dev/shm Linux directories, sleeps for one second, prints a welcome message, and starts a new session running /bin/sh. When the shell session exits, the system is rebooted.
The setsid line is confusing to parse if you aren’t familiar with the arcane semantics of Unix shells, so let’s break it down piece by piece.
- The command setsid -c /bin/sh starts a new session with the same controlling tty as the current session, running /bin/sh as its initial process.
- The redirection n<>file opens file for reading and writing on file descriptor n. If no n is specified, then it defaults to descriptor 0. Thus <>/dev/ttyS0 opens /dev/ttyS0 for reading and writing on file descriptor 0.
- The redirection n>&m makes file descriptor n a copy of the output file descriptor m. If n is not specified, then it defaults to descriptor 1. Thus >&0 makes descriptor 1 an output copy of descriptor 0, and 2>&1 makes descriptor 2 an output copy of descriptor 1.
Taken together, the setsid line creates a new session with the same controlling tty as init that runs /bin/sh in its initial process with /dev/ttyS0 open as input on file descriptor 0 and output on descriptors 1 and 2. In other words, it starts a shell session with stdin, stdout, and stderr bound to ttyS0.
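The descriptor duplication semantics are easy to verify in any POSIX shell. In the snippet below, 2>&1 makes standard error a copy of standard output before the command runs, so both lines land in the command substitution:

```shell
# capture stdout and stderr together: 2>&1 makes descriptor 2
# a copy of descriptor 1, so both writes go to the same pipe
both=$(sh -c 'echo out; echo err 1>&2' 2>&1)
echo "$both"   # prints: out, then err
```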
Note: We specified the initial active tty to be ttyS0 in our qemu parameters, but that probably shouldn’t be hard-coded into the script. The active tty should instead be parsed from /sys/class/tty/console/active, e.g. using sed as shown below.
setsid -c /bin/sh <>/dev/$(sed 's/.* //' /sys/class/tty/console/active) >&0 2>&1
We may now package the init_toybox directory into an initramfs and boot the kernel to an sh shell.
$ chmod +x init_toybox/init
$ cd init_toybox
$ find . | cpio -o --format=newc > ../init_toybox.cpio
$ cd ..
$ qemu-system-x86_64 --enable-kvm -nographic -no-reboot -s \
> -kernel linux-6.17_build/arch/x86/boot/bzImage \
> -initrd init_toybox.cpio \
> -append "HOST=x86_64 console=ttyS0"
(vm) Linux version 6.17.0 <SYSTEM SPECIFIC STUFF>
(vm) Command line: HOST=x86_64 console=ttyS0
(vm) # ...
(vm) Run /init as init process
(vm) Welcome to sh!
(vm) $
Building and debugging kernel modules
The Linux kernel allows code to be dynamically loaded at runtime through modules. In that sense, modules are a loose kernel space equivalent to user space shared libraries.
Kernel modules are generally built using the Kbuild makefiles from the source tree of the Linux kernel being targeted. In this section we’ll write a simple "Hello, Kernel!" character device module that registers a /dev/hello device that a message can be read from.
/* hello_mod/hello.c */
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/fs.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/version.h>
#include <asm/errno.h>
#define DEVICE_NAME "hello"
static int major;
static struct class *cls;
static int hello_open(struct inode *inode, struct file *file);
static int hello_release(struct inode *inode, struct file *file);
static ssize_t hello_read(struct file *file, char __user *buf, size_t len,
loff_t *off);
static ssize_t hello_write(struct file *file, const char __user *buf,
size_t len, loff_t *off);
static struct file_operations hello_cdev_fops = {
.read = hello_read,
.write = hello_write,
.open = hello_open,
.release = hello_release,
};
static int __init hello_init(void)
{
major = register_chrdev(0, DEVICE_NAME, &hello_cdev_fops);
if (major < 0) {
pr_alert("failed to register cdev %d\n", major);
return major;
}
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 4, 0)
cls = class_create(DEVICE_NAME);
#else
cls = class_create(THIS_MODULE, DEVICE_NAME);
#endif
device_create(cls, NULL, MKDEV(major, 0), NULL, DEVICE_NAME);
pr_info("created /dev/%s\n", DEVICE_NAME);
return 0;
}
static void __exit hello_exit(void)
{
device_destroy(cls, MKDEV(major, 0));
class_destroy(cls);
unregister_chrdev(major, DEVICE_NAME);
pr_info("destroyed /dev/%s\n", DEVICE_NAME);
}
static int hello_open(struct inode *inode, struct file *file)
{
return 0;
}
static int hello_release(struct inode *inode, struct file *file)
{
return 0;
}
static ssize_t hello_read(struct file *file, char __user *buf, size_t len,
loff_t *offset)
{
const char msg[] = "Hello, User!\n";
const char *msg_off;
size_t msg_len;
unsigned long res;
if (*offset >= sizeof(msg)) {
*offset = 0;
return 0;
}
msg_off = msg + *offset;
msg_len = min(len, strlen(msg_off));
res = copy_to_user(buf, msg_off, msg_len);
*offset += (msg_len - res);
return msg_len - res;
}
static ssize_t hello_write(struct file *file, const char __user *buf,
size_t len, loff_t *off)
{
pr_alert("unsupported operation\n");
return -EINVAL;
}
module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Aven Bross <email@example.com>");
MODULE_DESCRIPTION("A kernel space hello world");
MODULE_VERSION("0.1");
We can build the module with make using the following Kbuild file.
# hello_mod/Kbuild
obj-m := hello.o
$ make -C linux-6.17_build M=$PWD/hello_mod
Now we should have a hello.ko module object in the source directory.
$ modinfo hello_mod/hello.ko
filename: /home/aven/linux_kvm_debug/hello_mod/hello.ko
version: 0.1
description: A kernel space hello world
author: Aven Bross <email@example.com>
license: GPL
srcversion: 42A234CAB9C4A0A5982807E
depends:
name: hello
vermagic: 6.17.0 mod_unload
If we want to be able to run make in the hello_mod directory to build the module, then we can add a stub Makefile as follows.
# hello_mod/Makefile
.PHONY: all clean
all:
	$(MAKE) -C ../linux-6.17_build M=$(CURDIR)
clean:
	$(MAKE) -C ../linux-6.17_build M=$(CURDIR) clean
A simple way to have the module available on our virtual machine is to package it directly into the initramfs.
$ cp hello_mod/hello.ko init_toybox/home/
$ cd init_toybox
$ find . | cpio -o --format=newc > ../init_toybox_hello.cpio
$ cd ..
$ qemu-system-x86_64 --enable-kvm -nographic -no-reboot -s \
> -kernel linux-6.17_build/arch/x86/boot/bzImage \
> -initrd init_toybox_hello.cpio \
> -append "HOST=x86_64 console=ttyS0"
(vm) Linux version 6.17.0 <SYSTEM SPECIFIC STUFF>
(vm) Command line: HOST=x86_64 console=ttyS0
(vm) # ...
(vm) Run /init as init process
(vm) Welcome to sh!
(vm) $ insmod home/hello.ko
(vm) hello: loading out-of-tree module taints kernel.
(vm) created /dev/hello
(vm) $ lsmod
(vm) Module Size Used by
(vm) hello 12288 0
(vm) $ cat /dev/hello
(vm) Hello, User!
(vm) $ rmmod hello
(vm) destroyed /dev/hello
(vm) $ lsmod
(vm) Module Size Used by
(vm) $ dmesg
(vm) [ 0.000000] Linux version 6.17.0 <VERSION SPECIFIC STUFF>
(vm) [ 0.000000] Command line: HOST=x86_64 console=ttyS0
(vm) # ...
(vm) [ 0.183990] Run /init as init process
(vm) [ 0.184379] with arguments:
(vm) [ 0.184381] /init
(vm) [ 0.184382] with environment:
(vm) [ 0.184383] HOME=/
(vm) [ 0.184384] TERM=linux
(vm) [ 0.184384] HOST=x86_64
(vm) [ 36.293897] hello: loading out-of-tree module taints kernel.
(vm) [ 36.298818] created /dev/hello
(vm) [ 65.639464] destroyed /dev/hello
To debug our module we will connect a gdb session in the same way as the previous section. Thanks to the kernel gdb scripts, we can set breakpoints in the hello module and have the associated debug symbols automatically loaded when the module is loaded.
$ gdb -q linux-6.17_build/vmlinux
Reading symbols from linux-6.17_build/vmlinux...
(gdb) target remote :1234
Remote debugging using :1234
0xffffffff815d77f3 in pv_native_safe_halt () at /home/aven/linux_kvm_debug/linux-6.17/arch/x86/kernel/paravirt.c:82
82 }
(gdb) lx-symbols
loading vmlinux
(gdb) b hello_read
Function "hello_read" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (hello_read) pending.
(gdb) c
Continuing.
Back in the VM, we can now load the module.
(vm) $ insmod /home/hello.ko
(vm) created /dev/hello
And watch as the symbols are loaded in gdb.
scanning for modules in /home/aven/linux_kvm_debug
loading @0xffffffffa0000000: /home/aven/linux_kvm_debug/hello_mod/hello.ko
Finally, we can try reading from /dev/hello.
(vm) $ cat /dev/hello
And hit our breakpoint.
Breakpoint 1, hello_read (file=0xffff8880021c0500, buf=0x6b87c0 "$ in/toybox", len=4096, offset=0xffff88800293bf08)
at hello.c:77
77 const char msg[] = "Hello, User!\n";
(gdb)
Debugging in a full Linux distribution
A basic toybox system will not always be a satisfactory test environment. For example, we may need to test how a variety of user space tools interact with our kernel code, or otherwise investigate how kernel code integrates into a fully fledged Linux operating system. In this section we’ll go over how to install a Linux distribution on a virtual drive using qemu, and then how to build, install, and debug a custom Linux kernel and/or kernel modules within the virtual machine.
We’ll be using Alpine Linux because it is a small and simple distro based on musl libc and the busybox Swiss army knife. The same basic process should work for any distribution, but the package manager, init system, and bootloader may differ.
$ mkdir vm_alpine
$ cd vm_alpine
$ wget https://dl-cdn.alpinelinux.org/alpine/v3.22/releases/x86_64/alpine-virt-3.22.2-x86_64.iso
$ truncate -s 12G hda.img
$ qemu-system-x86_64 --enable-kvm -cpu host -m 512M -smp 2 \
> -drive file=hda.img,format=raw \
> -cdrom alpine-virt-3.22.2-x86_64.iso \
> -boot d
First, we pull a virtual machine .iso image for Alpine Linux. Then we create a 12GB file that will serve as a raw disk image and store the partitions for our virtual machine. We’ll need at least that much disk space to fit our Alpine install and still have room to build and install a custom Linux kernel.
Then we boot a qemu virtual machine from the Alpine .iso image. The -m argument specifies the memory to allocate to the VM and -smp specifies the number of processor cores. The -drive argument specifies our new hda.img image to be the first hard disk provided to the VM. The -cdrom argument specifies the Alpine .iso file to be provided as a CD ROM device. The -boot d argument tells the VM to boot from the CD drive.
You may optionally add -nographic and -display curses to tell qemu to use ncurses to display the VM within the terminal emulator instead of a separate graphical window.
Once the machine boots, we should be presented with a login: prompt. We may enter root to move to a root shell environment.
Next we’ll execute setup-alpine and accept the default option for every prompt except for time zone, user, and disk drive. We’ll use our local time zone, enter no when asked to set up a user, and at the disk prompt set sda as a sys drive. This will create the standard boot, swap, and root partitions on our hda.img drive and format them accordingly.
After completing the installation, we can run poweroff to shut down the VM. If everything worked as intended, we should be able to boot a virtual machine directly from the hda.img disk.
$ qemu-system-x86_64 --enable-kvm -cpu host -m 512M -smp 2 \
> -drive file=hda.img,format=raw
We can now log in as root with the password we set up during installation.
Exiting back to our host system, we can also use our hda.img disk as a loop device to mount the VM filesystem within our host filesystem. Then we can use chroot to work with the Alpine sysroot without the indirection of a virtual machine.
#!/bin/sh
# chroot.sh
# mount the hda image in our host filesystem
losetup /dev/loop0 -P hda.img
mkdir -p $PWD/mnt
mount /dev/loop0p3 $PWD/mnt
mount /dev/loop0p1 $PWD/mnt/boot
# setup `/dev`, `/proc`, and `/sys` in `mnt`
mknod -m 666 $PWD/mnt/dev/null c 1 3
mknod -m 666 $PWD/mnt/dev/full c 1 7
mknod -m 666 $PWD/mnt/dev/ptmx c 5 2
mknod -m 644 $PWD/mnt/dev/random c 1 8
mknod -m 644 $PWD/mnt/dev/urandom c 1 9
mknod -m 666 $PWD/mnt/dev/zero c 1 5
mknod -m 666 $PWD/mnt/dev/tty c 5 0
mount -t proc none $PWD/mnt/proc
mount -o bind /sys $PWD/mnt/sys
# replace VM's name resolution conf with host's
cp $PWD/mnt/etc/resolv.conf resolv.conf
cp /etc/resolv.conf $PWD/mnt/etc/resolv.conf
# `chroot` into an `ash` shell rooted in `mnt`
chroot $PWD/mnt /bin/ash -l
# restore VM's name resolution conf
mv resolv.conf $PWD/mnt/etc/resolv.conf
# clean up `/dev`, `/proc`, and `/sys` in `mnt`
umount $PWD/mnt/proc
umount $PWD/mnt/sys
rm -f $PWD/mnt/dev/*
# flush changes to filesystem
sync
# unmount the hda image from our host filesystem
umount $PWD/mnt/boot
umount $PWD/mnt
rmdir $PWD/mnt
losetup -d /dev/loop0
We will need to run the chroot.sh script with root permissions, e.g. by running it with sudo, or by first switching to root with su and then executing the script.
From within our Alpine Linux chroot shell we can install the packages required to build a custom Linux kernel, then pull and build a debug kernel.
Note: All of this could be done from within the VM of course, but I find it much more efficient to work in a shell directly on the host system.
$ chmod +x chroot.sh
$ sudo ./chroot.sh
(chroot) $ apk add build-base linux-headers linux-virt-dev diffutils openssl-dev
(chroot) $ cd /root
(chroot) $ wget https://github.com/torvalds/linux/archive/refs/tags/v6.17.tar.gz
(chroot) $ tar -xvf v6.17.tar.gz
(chroot) $ mkdir linux-6.17_build
(chroot) $ cp /boot/config-* linux-6.17_build/.config
(chroot) $ echo "
(chroot) > CONFIG_DEBUG_INFO=y
(chroot) > CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
(chroot) > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
(chroot) > CONFIG_FRAME_POINTER=y
(chroot) > CONFIG_RANDOMIZE_BASE=n
(chroot) > CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
(chroot) > " >> linux-6.17_build/.config
(chroot) $ make -C linux-6.17 O=$PWD/linux-6.17_build olddefconfig
(chroot) $ make -C linux-6.17 O=$PWD/linux-6.17_build -j
(chroot) $ make -C linux-6.17 O=$PWD/linux-6.17_build scripts_gdb
We can then install our custom kernel and build an associated initramfs.
(chroot) $ make -C linux-6.17 O=$PWD/linux-6.17_build modules_install
(chroot) $ cp linux-6.17_build/arch/x86/boot/bzImage /boot/vmlinuz-6.17.0
(chroot) $ cp linux-6.17_build/.config /boot/config-6.17.0
(chroot) $ mkinitfs -o /boot/initramfs-6.17.0 -b / 6.17.0
Finally, we need to modify our bootloader configuration to add a default entry to boot from our new kernel and initramfs. The default bootloader for Alpine is syslinux. Below we use the update-extlinux utility, but the /boot/extlinux.conf file is fairly simple and may be edited manually.
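As a sketch of what that manual edit looks like, an extlinux.conf entry has roughly the following shape (the label is arbitrary, and the root= device must match the partition layout setup-alpine created, /dev/sda3 for a standard sys install):

```
# /boot/extlinux.conf (sketch)
DEFAULT custom-6.17
LABEL custom-6.17
  LINUX vmlinuz-6.17.0
  INITRD initramfs-6.17.0
  APPEND root=/dev/sda3 rootfstype=ext4
```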
However, the bootloader needs to know the disk device the system will be booting from, in our case /dev/sda. But our chroot environment is mounted within our host system with no /dev/sda device (and if we added the /dev/sda node, it would be the /dev/sda device for the host, not the virtual machine). Thus we’ll need to exit the chroot and boot the VM in order to configure the bootloader.
(chroot) $ exit
$ qemu-system-x86_64 --enable-kvm -cpu host -m 512M -smp 2 \
> -drive file=hda.img,format=raw
After logging in once more, we can set our new 6.17.0 kernel to be the default and update the extlinux bootloader config.
(vm) $ echo "
(vm) > default=0
(vm) > " >> /etc/update-extlinux.conf
(vm) $ update-extlinux
Finally, we can shut down the VM and boot our freshly compiled debug kernel. But first we’ll install gdb inside the VM.
(vm) $ apk add gdb
(vm) $ mkdir -p /root/.config/gdb
(vm) $ echo "set auto-load safe-path /root/linux-6.17_build/" \
(vm) > > /root/.config/gdb/gdbinit
(vm) $ poweroff
Now we can boot the VM for debugging. We’ll add the -snapshot argument so that filesystem changes are written to a temporary snapshot rather than to hda.img. This avoids corruption when we use the mounted filesystem in a chroot at the same time that the VM is running.
$ qemu-system-x86_64 --enable-kvm -cpu host -m 512M -smp 2 -s -S \
> -snapshot -drive file=hda.img,format=raw &
After logging into the VM, we can verify that the correct kernel booted by running uname.
(vm) $ uname -r
(vm) 6.17.0
It is similarly important to avoid modifying hda.img from our chroot while the VM is running. Thus we’ll make a second chroot script that mounts the VM filesystem as read-only and jumps directly into gdb.
#!/bin/sh
# gdb.sh
# mount the hda root partition as read-only in our host filesystem
losetup /dev/loop0 -r -P hda.img
mkdir -p $PWD/mnt
mount -o ro /dev/loop0p3 $PWD/mnt
# `chroot` into a gdb session
chroot $PWD/mnt /usr/bin/gdb -q
# unmount the hda image from our host filesystem
umount $PWD/mnt
rmdir $PWD/mnt
losetup -d /dev/loop0
Running this gdb.sh script should present us with a gdb session in a read-only copy of our Alpine sysroot.
$ chmod +x gdb.sh
$ sudo ./gdb.sh
(chroot) (gdb) file /root/linux-6.17_build/vmlinux
(chroot) Reading symbols from /root/linux-6.17_build/vmlinux...
(chroot) (gdb) target remote :1234
(chroot) Remote debugging using :1234
(chroot) 0x000000000000fff0 in ?? ()
(chroot) (gdb) hbreak start_kernel
(chroot) Hardware assisted breakpoint 1 at 0xffffffff82a01280: file /root/linux-6.17/init/main.c, line 898.
(chroot) (gdb) c
(chroot) Continuing.
(chroot)
(chroot) Breakpoint 1, start_kernel () at /root/linux-6.17/init/main.c:898
(chroot) 898 {
(chroot) (gdb)
We now have a simple process to develop the Linux kernel and/or out-of-tree kernel modules in a chroot, and debug the kernel space code by attaching gdb to a qemu VM.
Note that we also have gdb available within the VM itself, so it is possible to simultaneously debug user space code that is interacting with the kernel.
Building and debugging OpenZFS
In this section we’ll build a debug version of the ZFS filesystem kernel modules and user space tools, and debug them in our Alpine VM.
First we’ll chroot into our Alpine sysroot, install a few required packages, download the ZFS source code release, and build the project.
$ sudo ./chroot.sh
(chroot) $ apk add libtool automake autoconf util-linux-dev \
(chroot) > libtirpc-dev gettext-dev zfs-udev
(chroot) $ cd /root
(chroot) $ wget https://github.com/openzfs/zfs/archive/refs/tags/zfs-2.4.0-rc2.tar.gz
(chroot) $ tar -xvf zfs-2.4.0-rc2.tar.gz
(chroot) $ cd zfs-2.4.0-rc2
(chroot) $ export CFLAGS="-fno-tree-vectorize"
(chroot) $ export CXXFLAGS="-fno-tree-vectorize"
(chroot) $ export LIBS="-lintl"
(chroot) $ sh autoconf.sh
(chroot) $ ./configure --with-linux=/root/linux-6.17 \
(chroot) > --with-linux-obj=/root/linux-6.17_build \
(chroot) > --with-tirpc \
(chroot) > --with-udevdir=/usr/lib/udev \
(chroot) > --disable-systemd \
(chroot) > --enable-debuginfo \
(chroot) > --enable-debug
(chroot) $ make -j
(chroot) $ exit
Note: The environment variables, dependency packages, and ./configure arguments used above were based on the Alpine Linux zfs package. For any distro, looking at the package repository is generally a good strategy to discover how a piece of software can be built from source.
Our Alpine sysroot should now have a fresh debug version of ZFS built against our debug Linux 6.17 kernel. We can now boot our VM and chroot back in to gdb.
$ qemu-system-x86_64 --enable-kvm -cpu host -m 512M -smp 2 -s \
> -snapshot -drive file=hda.img,format=raw &
$ ./gdb.sh
(chroot) (gdb) file /root/linux-6.17_build/vmlinux
(chroot) Reading symbols from /root/linux-6.17_build/vmlinux...
(chroot) (gdb) target remote :1234
(chroot) Remote debugging using :1234
(chroot) (gdb) c
(chroot) Continuing.
On the VM side we will install the ZFS modules and utilities, and then load the zfs kernel module with modprobe (which in turn will load the spl module as a dependency).
(vm) $ cd /root/zfs-2.4.0-rc2
(vm) $ make install
(vm) $ modprobe zfs
Now that the zfs module has been loaded into the kernel, we’ll run the lx-symbols command from the Linux kernel gdb scripts in order to find the debug symbols for the zfs module. In our gdb session we’ll Ctrl+C to pause the kernel process, then add a breakpoint for the zfsdev_ioctl function that is called whenever an ioctl syscall is issued against the /dev/zfs character device.
(chroot) ^C
(chroot) Program received signal SIGINT, Interrupt.
(chroot) (gdb) lx-symbols
(chroot) loading vmlinux
(chroot) scanning for modules in /
(chroot) loading @0xffffffffa0c00000: /root/zfs-2.4.0-rc2/module/zfs.ko
(chroot) loading @0xffffffffa084d000: /root/zfs-2.4.0-rc2/module/spl.ko
(chroot) # ...
(chroot) (gdb) b zfsdev_ioctl
(chroot) Breakpoint 4 at 0xffffffffa0ea4f60: file /root/zfs-2.4.0-rc2/module/os/linux/zfs/zfs_ioctl_os.c, line 132.
(chroot) (gdb) c
(chroot) Continuing.
Finally, we can test that our breakpoint is working.
(vm) $ zpool list
(chroot) Breakpoint 1, zfsdev_ioctl (filp=0xffff88800739a900, cmd=23044, arg=140736606537392)
(chroot) at /root/zfs-2.4.0-rc2/module/os/linux/zfs/zfs_ioctl_os.c:132
(chroot) 132 {
(chroot) (gdb) c
(chroot) Continuing.
We were able to set and hit a breakpoint inside the zfs module!
Next steps
The root source of information on debugging the Linux kernel is the official documentation.
For more information on Linux kernel module development, the Linux Kernel Module Programming Guide is fantastic. I also learned a lot by tinkering with this ioctl driver example.
The toybox project includes its own mkroot system to automatically build a complete initramfs from scratch. A lot of my knowledge of initramfs comes from studying mkroot.
Above we used simple raw images for our virtual drives, but qemu also supports the qcow2 (Qemu Copy On Write 2) format. Such drives cannot be mounted as simple /dev/loopX devices, but can be mounted as /dev/nbdX network block devices via qemu-nbd.