Compiling binary executables

February 22, 2024  |  asm c zig 

Almost all compiled languages generate binary executables via the following pipeline.

text -> AST -> IR -> machine code

Here text refers to code written in the programming language in question. In this article we will follow how text passes through the pipeline to produce executable binary machine code.

All examples will be written in C99 and target x86_64 Linux. To follow along with every example you will need gcc, make, and binutils for the ld linker, as assembler, and ar archive utility. The LLVM toolchain is mentioned, but not necessary.

$ mkdir linking_examples
$ cd linking_examples

You will also need the cproc C compiler, the QBE backend, and musl libc. On most Linux systems these can be built from source with make and gcc as follows.

$ git clone --depth=1 git://c9x.me/qbe.git
$ cd qbe
$ make
$ make install
$ cd ..
$ git clone --depth=1 https://git.sr.ht/~mcf/cproc
$ cd cproc
$ ./configure
$ make
$ make install
$ cd ..
$ git clone --depth=1 https://git.musl-libc.org/git/musl
$ cd musl
$ ./configure
$ make
$ cd ..

You don’t need to run make install for musl since the examples will all use musl/lib in the working directory. Technically, you don’t need to run make install for qbe or cproc either:

  • you could add them to your path, e.g. export PATH="/path/to/cproc:$PATH";
  • or just run the local executables, e.g. use ./cproc/cproc-qbe instead of cproc-qbe.
$ echo "
int collatz(int n) {
    if (n % 2 == 0) {
        return n / 2;
    }
    return 3 * n + 1;
}
" > collatz.c
A C function that computes a step of the Collatz function

An abstract syntax tree or AST is a data structure that is built to represent parsed text code. While AST construction is an important compilation step, it usually is not of too much use to compiler users like us. The LLVM toolchain is the only popular C compiler I know of that will dump a human readable AST.

$ clang -E -Xclang -ast-dump collatz.c | clean-up-by-hand
`-FunctionDecl collatz "int (int)"
  |-ParmVarDecl used n "int"
  `-CompoundStmt
    |-IfStmt
    | |-BinaryOperator "int" "=="
    | | |-BinaryOperator "int" "%"
    | | | |-ImplicitCastExpr "int" <LValueToRValue>
    | | | | `-DeclRefExpr "int" lvalue ParmVar "n" "int"
    | | | `-IntegerLiteral "int" 2
    | | `-IntegerLiteral "int" 0
    | `-CompoundStmt
    |   `-ReturnStmt
    |     `-BinaryOperator "int" "/"
    |       |-ImplicitCastExpr "int" <LValueToRValue>
    |       | `-DeclRefExpr "int" lvalue ParmVar "n" "int"
    |       `-IntegerLiteral "int" 2
    `-ReturnStmt
      `-BinaryOperator "int" "+"
        |-BinaryOperator "int" "*"
        | |-IntegerLiteral "int" 3
        | `-ImplicitCastExpr "int" <LValueToRValue>
        |   `-DeclRefExpr "int" lvalue ParmVar "n" "int"
        `-IntegerLiteral "int" 1
A cleaned up AST from LLVM for our collatz.c file

IR stands for intermediate representation: code designed to run on a simple abstract machine. The IR step is mainly used to perform analysis and optimizations in an architecture agnostic environment. Multiple IR stages may be involved in compilation, e.g. a compiler might have a bespoke internal IR which is converted to a separate IR used by a machine code generation backend (e.g. LLVM or QBE).

$ cproc-qbe collatz.c > collatz.ssa
$ cat collatz.ssa
export function w $collatz(w %.1) {
@start.1
	%.2 =l alloc4 4
	storew %.1, %.2
@body.2
	%.3 =w loadw %.2
	%.4 =w rem %.3, 2
	%.5 =w ceqw %.4, 0
	jnz %.5, @if_true.3, @if_false.4
@if_true.3
	%.6 =w loadw %.2
	%.7 =w div %.6, 2
	ret %.7
@if_false.4
	%.8 =w loadw %.2
	%.9 =w mul 3, %.8
	%.10 =w add %.9, 1
	ret %.10
}
The QBE IR generated by cproc for collatz.c

Finally, machine code is a collection of binary data and CPU instructions that might eventually be loaded into memory and run on a physical (or virtual) machine. A human readable assembly language version of the output from the machine code step may also be generated.

$ qbe -o collatz.s collatz.ssa
$ cat collatz.s
.text
.globl collatz
collatz:
	pushq %rbp
	movq %rsp, %rbp
	movl %edi, %eax
	movl $2, %esi
	movl %eax, %ecx
	cltd
	idivl %esi
	movl %ecx, %eax
	movl %edx, %ecx
	cmpl $0, %ecx
	jz .Lbb3
	imull $3, %eax, %eax
	addl $1, %eax
	jmp .Lbb4
.Lbb3:
	movl $2, %ecx
	cltd
	idivl %ecx
.Lbb4:
	leave
	ret
.type collatz, @function
.size collatz, .-collatz
The x86_64 assembly generated by QBE from the above IR for collatz.c

The binary generated by the basic compilation process is usually an organized collection of machine code called an object file. On Linux object files use the ELF file format. Assembly language can be re-constructed from binaries, a process called disassembly.

$ as -o collatz.o collatz.s
$ objdump -da collatz.o
collatz.o:     file format elf64-x86-64
collatz.o

Disassembly of section .text:

0000000000000000 <collatz>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 f8                	mov    %edi,%eax
   6:	be 02 00 00 00       	mov    $0x2,%esi
   b:	89 c1                	mov    %eax,%ecx
   d:	99                   	cltd
   e:	f7 fe                	idiv   %esi
  10:	89 c8                	mov    %ecx,%eax
  12:	89 d1                	mov    %edx,%ecx
  14:	83 f9 00             	cmp    $0x0,%ecx
  17:	74 08                	je     21 <collatz+0x21>
  19:	6b c0 03             	imul   $0x3,%eax,%eax
  1c:	83 c0 01             	add    $0x1,%eax
  1f:	eb 08                	jmp    29 <collatz+0x29>
  21:	b9 02 00 00 00       	mov    $0x2,%ecx
  26:	99                   	cltd
  27:	f7 f9                	idiv   %ecx
  29:	c9                   	leave
  2a:	c3                   	ret
A disassembly of the generated collatz.o object file

Object files contain a symbol table section that stores names and types for select pieces of machine code. Each named entry in the symbol table is called a symbol. In C a symbol will be generated for each function and global variable not declared with the static keyword.

$ readelf -s collatz.o
Symbol table '.symtab' contains 2 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000    43 FUNC    GLOBAL DEFAULT    1 collatz
The symbol table for collatz.o contains the exported collatz function symbol

An executable is an object file with a designated place to start program execution called an entry point. In C it is common for text files to be separately compiled into object files, then the object files are combined to form a single executable.

$ echo "
int printf(const char *fmt, ...);
int collatz(int n);

int main() {
    printf("collatz(5) = %d\n", collatz(5));
    return 0;
}
" > print_collatz.c
$ cproc-qbe print_collatz.c | qbe | as -o print_collatz.o
$ ld -o print_collatz print_collatz.o collatz.o musl/lib/crt1.o musl/lib/libc.a
$ ./print_collatz
collatz(5) = 16
Linking an executable from print_collatz.o, collatz.o, and musl libc

The process of combining object files together into a single binary is called linking or static linking1. Most C compilers will automatically link object files for the C standard library (libc.a) and the C runtime (crt1.o)2. The C runtime defines an entry point that calls the main function.

$ readelf -a print_collatz.o | grep "Entry point"
  Entry point address:               0x0
$ readelf -a print_collatz | grep "Entry point"
  Entry point address:               0x40106e
An executable has a non-null entry point

Object files can contain references to symbols that they do not define. For example, our print_collatz.o file refers to two undefined symbols: collatz and printf. During linking the collatz symbol is defined in collatz.o and printf is provided by libc.a.

$ readelf -s print_collatz.o
Symbol table '.symtab' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 .data
     2: 0000000000000000    40 FUNC    GLOBAL DEFAULT    1 main
     3: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND collatz
     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
The symbol table for print_collatz.o

Sometimes several object files are combined into a single archive file or static library for convenient re-use. The musl implementation of the C standard library compiles into the static libc.a library used above.

$ echo "
int fibonacci(int n) {
    if (n <= 1) return 1;
    return fibonacci(n - 1) + fibonacci(n - 2);
}
" > fibonacci.c
$ cproc-qbe fibonacci.c | qbe | as -o fibonacci.o
$ ar -rs libfunmath.a collatz.o fibonacci.o
$ echo "
int printf(const char *fmt, ...);
int collatz(int n);
int fibonacci(int n);

int main() {
    printf("collatz(5)   = %d\n", collatz(5));
    printf("fibonacci(5) = %d\n", fibonacci(5));
    return 0;
}
" > print_funmath.c
$ cproc-qbe print_funmath.c | qbe | as -o print_funmath.o
$ ld -o print_funmath print_funmath.o libfunmath.a musl/lib/crt1.o musl/lib/libc.a
$ ./print_funmath
collatz(5)   = 16
fibonacci(5) = 8
Creating a static library and linking it into an executable

An executable that was created using only static linking is called a static executable. The big strength of static executables is that they are simple and self contained: as long as two computers share the same CPU architecture and operating system, a static executable can generally be copied from one machine to the other and run without additional setup.

Sometimes additional metadata will be passed down between compilation stages in order to preserve information that would otherwise be lost. E.g. gcc debug builds (enabled with the -g flag) use the DWARF data format to track the name and line number of the text file from which each piece of machine code originated (and a lot of other things).

$ gcc -g -c -o collatz_dbg.o collatz.c
$ gcc -g -c -o print_collatz_dbg.o print_collatz.c
$ ld -o print_collatz_dbg collatz_dbg.o print_collatz_dbg.o \
    musl/lib/crt1.o musl/lib/libc.a
$ gdb print_collatz_dbg
$(gdb) break main
$(gdb) run
Breakpoint 1, main () at print_collatz.c:5
5     printf("collatz(5) = %d\n", collatz(5));
$(gdb) step
collatz (n=5) at collatz.c:2
2     if (n % 2 == 0) {
$(gdb) step
5     return 3 * n + 1;
$(gdb) step
6   }
$(gdb) step
collatz(5) = 16
main () at print_collatz.c:6
6     return 0;
$(gdb) step
[Inferior 1 (process 367249) exited normally]
Creating a debug executable and stepping through it with gdb

While static linking with libraries is one option for code re-use, an alternative is to compile library code into a special shared library binary, also known as a dynamic library. Shared library code is not included directly in the linked executable binary, instead it is loaded into memory at runtime.

There are two primary ways that shared libraries are used: dynamic linking and dynamic loading3.

$ ld -shared -o libfunmath.so fibonacci.o collatz.o
$ ld --dynamic-linker musl/lib/libc.so -L. -rpath=. -Lmusl/lib -rpath=musl/lib \
    -lc -lfunmath -o print_funmath_dyn print_funmath.o musl/lib/crt1.o
$ ./print_funmath_dyn
collatz(5)   = 16
fibonacci(5) = 8
Creating a shared library and linking it into a dynamic executable

In dynamic linking a dynamic executable lists a path to a binary called a dynamic linker or program interpreter along with a path to each linked shared library. At runtime the dynamic linker will be used load each shared library into memory and locate the necessary symbols.

$ readelf -a print_funmath_dyn | grep "interpreter"
      [Requesting program interpreter: musl/lib/libc.so]
$ readelf -a print_funmath | grep "Shared library"
 0x0000000000000001 (NEEDED)             Shared library: [libc.so]
 0x0000000000000001 (NEEDED)             Shared library: [libfunmath.so]
$ readelf --dyn-syms print_funmath_dyn
Symbol table '.dynsym' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND collatz
     3: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND _init
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND fibonacci
     5: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND _fini
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main
A dynamic ELF executable has a program interpreter, shared libraries, and a dynamic symbol table

One benefit of dynamic linking is memory savings: the dynamic linker can sometimes have just one copy of each unique library loaded at a given time, shared between executables4. A shared library can also be patched without necessarily having to re-compile every executable that depends on it.

An executable also has the option to manually load and link a shared library while it is running. This is often called dynamic loading and it is useful for purposes such as hot code swapping, i.e. updating a program’s code while it is running.

$ echo "
int open(const char *fname, int flags, int mode);
long read(int fd, char *buffer, long len);

int inotify_init(void);
int inotify_add_watch(int fd, const char *pname, int mask);

void *dlopen(const char *fname, int flags);
void *dlsym(void *restrict handle, const char *restrict symbol);
int dlclose(void *handle);

int printf(const char *fmt, ...);

int main() {
    void *lib_handle;
    int (*foo)(int);
    struct { int wd, mask, cookie, len; char name[128]; } event;

    int infd = inotify_init();
    inotify_add_watch(infd, "./lock", 0x200);

    while (1) {
        lib_handle = dlopen("./libfoo.so", 2);
        foo = dlsym(lib_handle, "foo");
        printf("foo(5) = %d\n", foo(5));
        read(infd, (char *)&event, sizeof(event));
        dlclose(lib_handle);
    }
}
" > hot_reload.c
$ cproc-qbe hot_reload.c | qbe | as -o hot_reload.o
$ ld --dynamic-linker musl/lib/libc.so -Lmusl/lib -rpath=musl/lib -lc
    -o hot_reload hot_reload.o musl/lib/crt1.o
$ mkdir lock
$ echo "int foo(int n) { return 2 * n; }" > foo.c
$ cproc-qbe foo.c | qbe | as -o foo.o
$ ld -shared -o libfoo.so foo.c
$ ./hot_reload.c &
foo(5) = 10
$ touch lock/foo
$ echo "int foo(int n) { return 3 * n; }" > foo.c
$ cproc-qbe foo.c | qbe | as -o foo.o
$ ld -shared -o libfoo.so foo.c
$ rm lock/foo
foo(5) = 15
Using a lock file to re-load a dynamic library during runtime

Note that in general dynamic loading is only supported for dynamic executables.