From source to binary

Executable code that runs without an operating system can be created from the source code shown in Figure 1. The code can be executed on a computer with an Intel x86 processor by placing it where it will be started by boot code. It is created by compiling the source code in Figure 1 and then linking the resulting object code with object code generated from the startup code in Figure 2.

The compilation is done using a compiler for the chosen hardware. On a computer with an Intel-compatible processor running some flavour of the Linux operating system, the native gcc compiler can be used.

On a Mac computer with an Intel-compatible processor running the Mac OS X operating system, a variant of the gcc compiler can be used. It may be difficult, though, to get the program to run using the built-in gcc compiler. It is possible to build a compiler yourself, as described in this article.

It may also be possible to create executable code that runs without an operating system on a computer running some variant of Microsoft Windows. In that case, it is recommended to use an add-on program that gives the user a Unix/Linux-like environment. Examples of such programs are Cygwin and MinGW.

It will be assumed throughout the remainder of the book that, unless stated otherwise, the examples shown are compiled and linked on a personal computer running Linux. Where knowledge about other environments is available, it will be shared through links to additional information.

The programs will be executed on a computer equipped with a 32-bit Intel-compatible x86 processor, which may or may not be the same processor as the one used for compilation and linking.

As an alternative means for execution, a computer simulator will be used.

Assume that the source code in Figure 1 is stored in a file named bare_metal.c. Assuming that a compiler is available, as described above, the file bare_metal.c can be compiled using the command

gcc -c -m32 -Wall bare_metal.c

In the above command, the argument -c tells gcc to compile without linking, and the argument -m32 instructs the compiler to generate code for a 32-bit Intel processor. The argument -Wall enables a large set of commonly useful warnings, although, despite its name, not every warning that gcc offers. As a result of the command, an object file named bare_metal.o is generated.

The startup code in Figure 2 is written in assembly language. An assembler can be used for the purpose of translating the assembly code in Figure 2 to object code.

The GNU assembler is used here. It belongs to the GNU binutils package, which is normally installed together with gcc, and it is invoked using the command as.

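Assume that the assembly code in Figure 2 is stored in a file named startup.s. The file startup.s can be assembled using the command

as --32 -o startup.o startup.s

The argument --32 selects 32-bit x86 output, and -o names the resulting object file. Without -o, the assembler writes its output to a file named a.out.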

An executable program can be created by combining object files using a linker. For the program discussed here, with C source code as shown in Figure 1 and assembly startup code as shown in Figure 2, the native linker, named ld, can be used.

The program also uses code for displaying textual information to the user. This code implements the function console_put_string, which is called from the main function in Figure 1. The function console_put_string prints a string at a given position on an alphanumeric screen. The screen is of the type seen during startup of a computer, before the graphics have been activated. A screenshot of the printout is shown below, in Figure 4.
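The source of console.c is not reproduced in this section, so the following is only a minimal sketch of how such a function could work, not the book's implementation. It assumes the standard PC text mode, where the screen is an array of 80 by 25 cells of 16 bits each (character code in the low byte, colour attribute in the high byte), normally found at physical address 0xB8000. The name sketch_put_string, its parameters and the constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry of the standard PC text mode: 80 columns by 25 rows. */
enum { TEXT_COLS = 80, TEXT_ROWS = 25 };

/* Assumed attribute byte: light grey characters on a black background. */
#define TEXT_ATTR 0x07u

/*
 * Hypothetical helper, not the book's console_put_string.
 * Writes the string s into the text buffer starting at (row, col) and
 * stops at the end of the string or the end of the screen, whichever
 * comes first. Returns the number of characters written.
 */
size_t sketch_put_string(volatile uint16_t *screen, const char *s,
                         unsigned row, unsigned col)
{
    /* Range check for hosted experiments; see the note below. */
    assert(row < TEXT_ROWS && col < TEXT_COLS);

    size_t pos = (size_t)row * TEXT_COLS + col;
    size_t written = 0;

    while (s[written] != '\0' && pos < (size_t)TEXT_COLS * TEXT_ROWS) {
        /* High byte: attribute. Low byte: character code. */
        screen[pos] = (uint16_t)((TEXT_ATTR << 8) | (unsigned char)s[written]);
        pos++;
        written++;
    }
    return written;
}
```

On real hardware the first argument would be (volatile uint16_t *)0xB8000. On a hosted system an ordinary array of 80 * 25 elements can be passed instead, which makes the function easy to try out. The assert is meant for such hosted experiments; a program built without a C library would need its own range check.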

In addition, a file that controls certain aspects of the linking process is used. This type of file is called a linker script, and it contains information that controls how data and code are to be stored in memory when the program is loaded for execution.

Assume that object files have been created, by using an assembler and a compiler as described above. Assume also that the object files are stored in a directory named obj. The linking can then be done by giving a command like

ld -T link.ld --oformat=elf32-i386 -melf_i386 \
-o prog.elf obj/console.o obj/bare_metal.o obj/startup.o  

The above command invokes the linker, ld, with a set of options followed by a list of object files to link. The object files are listed at the end, with the prefix obj/, which indicates that they are stored in a directory named obj. One of them is console.o. This file is compiled from a file named console.c, with C code implementing the function console_put_string called in Figure 1.

The linking command uses the switch -T to name a linker script to be used in the linking process. Here, the linker script is stored in a file named link.ld. The contents of the linker script are shown in Figure 3.

ENTRY(start)
SECTIONS
{
  . = 0x100000; 
  .startup . : { obj/startup.o(.text) }
  .text : {
    *(.text)
  }
  .data  : {
    *(.data)
  }
  .rodata : {
    *(.rodata)
  }
  .bss : {
    *(.bss)
  }
}

Figure 3. Linker script for a standalone program, to be executed during startup of the computer.

This is the Intel-x86 view. Another view is available for ARM.

The linking, done with the command ld as shown above, generates a file, here named prog.elf. The file prog.elf is an ELF file, in an ELF format designed for use with an Intel x86 processor. The ld command uses the arguments --oformat=elf32-i386 and -melf_i386 to generate an ELF file in the desired format.
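One way to confirm that the linker produced the intended format is to inspect the first bytes of the file. The sketch below, which is not part of the book's code, checks the ELF header fields that identify a 32-bit little-endian file for the i386 and extracts the entry address. The function name is_elf32_i386 is hypothetical; the offsets and constants are those of the ELF header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if the buffer h begins with an ELF header
 * describing a 32-bit little-endian file for the i386, otherwise 0.
 * On success, and if entry is not NULL, the entry address is stored there.
 */
int is_elf32_i386(const uint8_t *h, size_t len, uint32_t *entry)
{
    assert(h != NULL);

    if (len < 28)                       /* need everything up to e_entry */
        return 0;
    if (h[0] != 0x7F || h[1] != 'E' || h[2] != 'L' || h[3] != 'F')
        return 0;                       /* ELF magic number */
    if (h[4] != 1)                      /* EI_CLASS: 1 = 32-bit objects */
        return 0;
    if (h[5] != 1)                      /* EI_DATA: 1 = little-endian */
        return 0;

    /* e_machine is a 16-bit field at offset 18; the value 3 means i386. */
    uint16_t machine = (uint16_t)(h[18] | (h[19] << 8));
    if (machine != 3)
        return 0;

    /* e_entry is a 32-bit field at offset 24 in a 32-bit ELF header. */
    if (entry != NULL) {
        *entry = (uint32_t)h[24]
               | ((uint32_t)h[25] << 8)
               | ((uint32_t)h[26] << 16)
               | ((uint32_t)h[27] << 24);
    }
    return 1;
}
```

A small hosted program could read the first 52 bytes of prog.elf with fread and pass them to this function. The entry address reported is the address the linker assigned to the symbol named in ENTRY, here start. The same header information is printed by the command readelf -h prog.elf.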

The file prog.elf can be placed on a bootable medium, e.g. a bootable USB stick, and then started, during startup of the computer, by a boot loader such as GRUB.
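GRUB commonly loads a standalone program of this kind through the Multiboot protocol. Whether the startup code in Figure 2 relies on it is an assumption here, since this section does not say. Under that assumption, the program must contain a small header near its beginning whose first three 32-bit fields (magic, flags, checksum) sum to zero modulo 2^32. The sketch below shows how the checksum is derived. The function names are hypothetical, while the magic value 0x1BADB002 is the one defined by version 1 of the Multiboot specification.

```c
#include <assert.h>
#include <stdint.h>

/* Magic number that opens a Multiboot (version 1) header. */
#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u

/*
 * Hypothetical helper: computes the checksum field so that
 * magic + flags + checksum is zero in 32-bit unsigned arithmetic.
 */
uint32_t multiboot_checksum(uint32_t flags)
{
    return (uint32_t)(0u - (MULTIBOOT_HEADER_MAGIC + flags));
}

/* Hypothetical helper: returns 1 if the three header fields are consistent. */
int multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == MULTIBOOT_HEADER_MAGIC
        && (uint32_t)(magic + flags + checksum) == 0u;
}

/* Small self-check that can be called from a hosted test program. */
void multiboot_self_check(void)
{
    /* Bit 0: align modules on page boundaries. Bit 1: request memory info. */
    uint32_t flags = 0x3u;
    assert(multiboot_header_ok(MULTIBOOT_HEADER_MAGIC, flags,
                               multiboot_checksum(flags)));
}
```

In an assembly startup file the same three values would typically be emitted as consecutive 32-bit words, with the checksum written as a negated sum so that the assembler computes it.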
