Think Beyond Operating Systems

Beyond all those OS abstractions, your computer is just sand. Well, a large dune of sand compacted into a chip a fraction of a millimeter thick, which you call the CPU. In the PC world, two major companies produce these CPUs: Intel and AMD. Intel has long been the market leader, although AMD has been gaining ground.

At this silicon level, programming varies with the architecture and its instruction set. Popular in PCs are x86 (32-bit) and x86_64 (64-bit). They are Complex Instruction Set Computers (CISC), where a single instruction can carry out several low-level operations. The smartphone market, meanwhile, has been taken by ARM’s Reduced Instruction Set Computer (RISC) processors, where each instruction performs one simple low-level operation, ideally in a single clock cycle. You have to work with several of these architectures if you want your software to reach more users. Fortunately for productivity, and unfortunately for low-level knowledge, our OSes encapsulate most of the chip-level instructions and expose general functions for us to program with relative ease. To make things worse, high-level programming languages further encapsulate these OS-level functions and give us English-like sentences to write software with.

We can learn a lot of low-level stuff just by sticking to native OS-level programming using C++, C or Assembly (as done generally in). There you play with memory segments, storage access and more of the functions that your favorite Windows, Mac or Linux exposes. But for today, let’s bid our OSes goodbye, forget our favorite Java or Python, and go assembly-deep into bare-metal programming of the silicon chip itself.

The Boot Process

Before starting our bare-metal programming, we must understand how computers boot up. We generally develop software on top of an operating system and never really need to know what lies below it. But as we have no OS now, we need a way to get the computer started before we go on to writing software.

For that, there is a small piece of unalterable (or at least not easily alterable) code residing in ROM, known as the BIOS.

  1. After the power is turned on, the CPU starts executing at a fixed ROM address, from where it eventually jumps into the BIOS code.

Cutting off some steps

Since we are neither developing our own OS nor writing large, complex software, we can keep everything under 512 bytes and ignore everything else, such as a second-stage bootloader, protected mode, a kernel and so on. We are not building anything fancy enough to need more than Real Mode, so for simplicity we also rely on BIOS interrupts wherever needed. So, let’s take advantage of the BIOS for this article.

All we actually need is the first-stage bootloader code. We’ll write a simple number guessing game that easily fits in 512 bytes. We also don’t need to include a partition table in the bootloader for this purpose. In a way, we’re not writing a bootloader at all, but a simple game that is the first thing the PC runs.

Project Prototype Model

Memory Access in Real Mode

In 16-bit Real Mode, we access memory through a logical Segment:Offset pair, which is mapped to a physical memory address by the formula

Physical Memory Address = 16 * Segment + Offset

For example, the physical memory address 0x7C00, where the BIOS loads the bootloader, can be accessed with either of the following Segment:Offset pairs.

Segment = 0x0000 and Offset = 0x7C00 OR Segment = 0x07C0 and Offset = 0x0000

Both pairs are valid because each leads to the same physical memory address, i.e.

a) 16 * 0x0000 + 0x7C00 = 0x00000 + 0x7C00 = 0x07C00

b) 16 * 0x07C0 + 0x0000 = 0x07C00 + 0x0000 = 0x07C00

Note: multiplying by 16 in hex (× 0x10) works like multiplying by 10 in decimal. Put plainly, it just means appending a 0 to the hex number. ☺
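The arithmetic above can be double-checked on any machine. The snippet below is a small host-side sketch in Python, used here only as a calculator; it is not part of the boot code, and the function name physical_address is made up for this illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real Mode translation: Physical Memory Address = 16 * Segment + Offset."""
    # Shifting left by 4 bits is the same as multiplying by 16 (0x10),
    # i.e. appending one hex digit 0 to the segment value.
    return (segment << 4) + offset


pair_a = physical_address(0x0000, 0x7C00)  # case a) from the text
pair_b = physical_address(0x07C0, 0x0000)  # case b) from the text
print(hex(pair_a), hex(pair_b))  # prints: 0x7c00 0x7c00
```

Both calls return the same value, which is why either pair can be used to reach the bootloader's load address.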

Prerequisites/Environment Setup

  1. A good knowledge of x86 assembly is highly recommended. I’ll try to explain the instructions as best I can, though.

About x86 processor registers

A good read:

The x86, specifically the Intel 80386 processor, has 16 registers, which are 32 bits in size (with the exception of the segment registers).

Four general purpose registers (that can be used in any way) are:

  1. Accumulator Register (EAX) is an extension of the 16-bit AX register, which in turn is the combination of a low 8-bit (AL) and a high 8-bit (AH) register. Initially intended for storing intermediate values during arithmetic and logic operations, this register can nevertheless be used in any way by a system programmer.
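How the smaller registers overlap the larger one can be modelled with plain bit masks. This is only a Python sketch of the layout, with an arbitrary example value and a hypothetical helper name (split_eax); the CPU itself needs no such computation, since AX, AH and AL are simply views onto the same bits.

```python
def split_eax(eax: int) -> dict:
    """Return the AX, AH and AL views of a 32-bit EAX value."""
    ax = eax & 0xFFFF       # AX is the low 16 bits of EAX
    ah = (ax >> 8) & 0xFF   # AH is the high 8 bits of AX
    al = ax & 0xFF          # AL is the low 8 bits of AX
    return {"ax": ax, "ah": ah, "al": al}


parts = split_eax(0x12345678)  # arbitrary example value
print({name: hex(value) for name, value in parts.items()})
```

For the example value 0x12345678, AX is 0x5678, AH is 0x56 and AL is 0x78.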

Two Stack Related Pointer Registers are:

  1. Stack Pointer Register (ESP) points to the top of the stack. Note that the stack grows downwards on x86 (as on most architectures). In 32-bit mode, each stack entry takes exactly 32 bits of space. Thus, every time something is pushed onto the stack, the stack grows by 32 bits (4 bytes) towards lower memory addresses, and that is where ESP then points. Every pop from the stack makes ESP point to an address 32 bits (4 bytes) higher than before. In 16-bit Real Mode, pushes and pops work on 16-bit entries by default, so SP moves by 2 bytes instead.
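The downward growth of the stack is easy to get backwards, so here is a toy model of it. This Python sketch assumes 32-bit entries as described above; the class name Stack32 and the starting address 0x7C00 are arbitrary choices for the illustration, not something the hardware prescribes.

```python
class Stack32:
    """Toy model of an x86 stack with 32-bit (4-byte) entries."""

    ENTRY_SIZE = 4  # bytes per stack entry in 32-bit mode

    def __init__(self, top: int) -> None:
        self.esp = top     # ESP starts at the (empty) top of the stack
        self.memory = {}   # address -> stored 32-bit value

    def push(self, value: int) -> None:
        self.esp -= self.ENTRY_SIZE            # stack grows towards lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.memory.pop(self.esp)      # read the entry ESP points to
        self.esp += self.ENTRY_SIZE            # ESP moves back up by 4 bytes
        return value


stack = Stack32(top=0x7C00)
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.esp))  # prints: 0x7bf8
```

After two pushes ESP is 8 bytes below where it started, and each pop moves it back up by 4 bytes while returning the most recently pushed value first.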

Two string index registers are:

  1. Source Index Register (ESI) has a 16-bit equivalent, SI, but no further 8-bit subdivision. It is used to hold the address of index 0 of a source string (a string is an array of characters, or rather ASCII bytes, with the index starting from 0).

Six 16 bit segment registers are:

  1. Code Segment Register (CS) points to the start of Code segment.

One register is special:

  1. Instruction Pointer Register (EIP) has a 16-bit equivalent, IP. It holds the address of the next instruction to be executed and is also known as the Program Counter. It is special in that it is the only register a program cannot read or write directly.

One 16-bit FLAGS register (the lower half of the 32-bit EFLAGS on the 80386) includes these flags:

  1. Overflow Flag

Time for some assembly

The program below is the smallest possible bootloader code.

Smallest useless bootloader code for x86 architecture
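To get a feel for what such a sector looks like byte for byte, its layout can be sketched on the host side. The Python snippet below is only an illustration and not the assembly listing itself. It assumes the usual PC BIOS convention that a bootable sector is 512 bytes long and ends with the signature bytes 0x55 0xAA, and it uses the two bytes 0xEB 0xFE (a short jump to itself) as the code that "just hangs". The names build_boot_sector and HANG are made up for this sketch.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = bytes([0x55, 0xAA])  # the word 0xAA55, stored low byte first


def build_boot_sector(code: bytes) -> bytes:
    """Pad machine code with zeros and append the boot signature."""
    max_code = SECTOR_SIZE - len(BOOT_SIGNATURE)  # 510 bytes left for code
    if len(code) > max_code:
        raise ValueError(f"code is {len(code)} bytes, limit is {max_code}")
    padding = bytes(max_code - len(code))         # zero fill up to offset 510
    return code + padding + BOOT_SIGNATURE


HANG = bytes([0xEB, 0xFE])  # x86 machine code: short jump back to itself
image = build_boot_sector(HANG)
print(len(image), image[:2].hex(), image[-2:].hex())  # prints: 512 ebfe 55aa
```

The snippet only builds the bytes in memory and writes nothing to disk. It also shows that the signature leaves 510 bytes for actual code and data.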

What use is code that just hangs? Let’s make the text equivalent of a game splash screen. The program below at least displays “Welcome to a number guessing game” before it hangs. We’ll introduce some new instructions here.

Bootloader that welcomes us to our game

XORing AX with itself sets its value to 0, and we store that in DS, so the data segment starts at address 0x0000. Next, SI is pointed at the start of the welcome_message string; it holds the offset from the beginning of this segment to the string. We set DS and SI because the LODSB instruction reads a byte from the address DS:SI (remember Segment:Offset?), moves that byte into the AL register and advances SI to the next byte. We then OR AL with itself, which yields 0 (and sets the zero flag) only if AL is 0, meaning the end of the string has been reached. If so, we jump to the done_printing label and stay there in an infinite loop. Otherwise we call BIOS interrupt 0x10, which is a family of sub-functions. Here we select the character-printing sub-function by setting AH to 0x0E, and setting BH to 0 selects the first display page for our message byte. You can find more about BIOS INT 10H on Wikipedia. CLI and STI clear and set the Interrupt Flag (IF) respectively, so that maskable hardware interrupts are ignored while the printing operation runs and processed again afterwards.

Before continuing with our game, let’s step aside from our current progress and write another simple program that shows how to print numbers. Because console input and output are always ASCII here, we need some manipulation to convert between numbers and their ASCII codes.
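The control flow of that print loop can be modelled in a few lines. This is a Python sketch of the logic only, not the boot code: a bytes object stands in for the data segment, the variable si plays the role of the SI register, and appending to a list stands in for the BIOS call. The function name print_string is hypothetical.

```python
def print_string(memory: bytes, si: int) -> str:
    """Model of the loop: load a byte, stop on 0, otherwise 'print' it."""
    shown = []
    while True:
        al = memory[si]        # LODSB: load the byte at DS:SI into AL ...
        si += 1                # ... and advance SI to the next byte
        if al == 0:            # OR AL, AL sets the zero flag only when AL is 0
            break              # end of string: jump to done_printing
        shown.append(chr(al))  # stands in for INT 0x10 with AH = 0x0E
    return "".join(shown)


# A zero-terminated string, as it would be laid out after the code.
memory = b"Welcome to a number guessing game\x00"
print(print_string(memory, 0))
```

The terminating zero byte is what ends the loop, which is why the string must be stored with a trailing 0.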

Printing numbers on screen
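The manipulation boils down to adding 0x30, the ASCII code of the character 0, to each decimal digit. The Python sketch below shows that arithmetic under one assumption of mine: multi-digit numbers are split by repeated division by 10, which is a common approach but not necessarily the one used in the listing. Both function names are made up for the illustration.

```python
def digit_to_ascii(digit: int) -> int:
    """Convert a single digit 0..9 to its ASCII code 0x30..0x39."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be between 0 and 9")
    return digit + 0x30  # 0x30 is the ASCII code of the character '0'


def number_to_ascii(number: int) -> bytes:
    """Convert a non-negative number to the ASCII bytes of its decimal digits."""
    if number < 0:
        raise ValueError("number must not be negative")
    codes = []
    while True:
        number, remainder = divmod(number, 10)  # peel off the lowest digit
        codes.append(digit_to_ascii(remainder))
        if number == 0:
            break
    return bytes(reversed(codes))  # digits come out lowest first, so reverse


print(hex(digit_to_ascii(9)), number_to_ascii(472))
```

Note that the digits are produced in reverse order, so a real implementation has to store them (for example on the stack) before printing.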

I’ve included ASCII Chart below for reference.


Let’s do two more examples before adding code to our target project. So far, we’ve printed text and a number on screen. Now let’s write a program to read user input, again using a BIOS interrupt, obviously.

Read user input through key press

The last example we discuss is the same “reading user input”, but for a number. Again, we can’t read a number directly, but we can manipulate an ASCII character to store it as a number.

Reading number input from user
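The conversion in this direction is the mirror image of printing: subtract 0x30 from the ASCII code of the key. A minimal Python sketch of that step follows. The range check is my own addition for safety, and the function name ascii_to_digit is hypothetical.

```python
def ascii_to_digit(code: int) -> int:
    """Convert the ASCII code of a digit key (0x30..0x39) to its value 0..9."""
    if not 0x30 <= code <= 0x39:
        raise ValueError("not the ASCII code of a decimal digit")
    return code - 0x30  # '0' is 0x30, so the difference is the digit value


print(ascii_to_digit(ord("7")))  # prints: 7
```

Any key outside the range 0x30 to 0x39 is not a digit, so a program reading numbers should reject it or ask again.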

Finally, let’s continue our game with one of the two program-flow assumptions written below.

Plan #1

  1. User A enters a number between 0 and 9 when User B is not around to see.

Plan #2

  1. The ASCII hex code of a fixed number (0–9) is hard-coded in the program, i.e. store 0x39 in the program if the number thought of is 9.

And I am going with this simple Plan #2 for this game implementation because:

  1. I am feeling too lazy coming towards the end.

Simplest number guessing BARE-METAL game
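The core check of Plan #2 is small enough to sketch in Python. This models only the comparison, under the assumption that the game compares the raw ASCII code of the pressed key with the stored code, so no conversion to a number is needed. The names SECRET_ASCII and is_correct_guess are made up for this illustration.

```python
SECRET_ASCII = 0x39  # hard-coded ASCII code of the secret number 9


def is_correct_guess(key_ascii: int) -> bool:
    """Compare the ASCII code of the pressed key with the stored secret."""
    return key_ascii == SECRET_ASCII


print(is_correct_guess(ord("9")), is_correct_guess(ord("3")))  # prints: True False
```

Storing the secret as an ASCII code is what keeps this plan simple: the key code returned by the keyboard read can be compared directly.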

The extreme FINAL part! Assemble the project with NASM, then either run it on the QEMU emulator or make a bootable USB drive.

Build and Test the raw binary file

Type the following command at the command prompt, in the same directory where you saved the above “bootloaded_game.asm” file.

nasm bootloaded_game.asm -f bin -o bootloaded_game.bin

To run it on the QEMU emulator, use the following command:

qemu-system-i386 -hda bootloaded_game.bin

If you want to make a bootable USB drive from the binary file, use the following commands.

On Linux (Note: replace sda with your USB device name, and double-check it, because dd overwrites the target device without asking)

dd if=bootloaded_game.bin of=/dev/sda bs=512 count=1

On Windows, you can install and use dd (just replace Z: with the drive letter Windows assigned to your USB drive)

dd if=bootloaded_game.bin of=\\.\Z: bs=512 count=1

Now boot your computer from the prepared USB drive the next time you power on your PC.
