Demilade Sonuga's blog

All posts

Interlude A: Some Potential Problems In The Project

2023-04-05 · 21 min read

Now that we've gotten down our heap allocator and Box, we're going to take a step back and take a look at the forest instead of the individual trees (and some potential problems with this forest).

Overview of The Project So Far

We've come a long way. In the prologue, a list of 12 things was specified and we've reached number 9 on the list--interrupts and event handling. The uefi.rs module holds the models that we use to interact with the UEFI firmware. To model bitmap files and make them easier to handle, we created bitmap.rs. The game.rs module holds the game code, which, for now, just draws the initial game on screen using the models in bitmap.rs and the screen model in uefi.rs. The font.rs module holds the font descriptions that we use to write letters on the screen, which has been useful in panicking and simple communication. The main.rs holds the driver code that sets up and runs the game.

To make the game respond to key presses and timer events, we need interrupts. It's when interrupts come in that Intel's structures enter the picture. To handle interrupts, we modeled the Interrupt Descriptor Table, wrapped some assembly instructions, and threw them into interrupts.rs. The other structures we needed to get interrupts running are in gdt.rs and tss.rs. To interact with the Programmable Interrupt Controllers, we modeled them in pics.rs. To interact with computer IO ports, we modeled ports and wrapped their instructions in the port.rs module. To process the bytes that tell which keys have been pressed/released during keyboard interrupts, we created keyboard.rs.

To add event handlers, we started working on event_hook.rs. It is while thinking through this that we realized we needed a Vec and a Box for handling the event functions. To have a Vec and a Box, we need heap memory, and an allocator for it, so we created those. We've created the Box, too, so all that is left to get on with event_hook is a Vec.

At this point, it would be nice to look at some potential problems with what we've done.

For one thing, we created heap memory, but where does our stack memory come from? What is the state of our stack memory and how do we know whether or not it's enough for our purposes? Mutable statics are unsafe so why should so many of them be in our code?

Stack Memory

Consider a chunk of memory:

 0 1 2 3 4 5 6 7 8 9 
| | | | | | | | | | |

The above represents 10 bytes of memory addressed from 0 to 9. On Intel's x86-64 systems, a register (remember this means high-speed memory in the CPU) called rsp holds the address of the stack's top. So, if rsp holds the number 5, then the next local variable that is placed on the stack will be placed in memory starting at address 9.

The stack grows downwards, so if three u8s are allocated on the stack then the first will be placed at address 5:

 0 1 2 3 4 5    6 7 8 9 
| | | | | |Used| | | | |

rsp <- 4

After that placement, the stack top decreases to 4. Then the second u8 is placed at address 4.

 0 1 2 3 4 5    6 7 8 9 
| | | | |Used|Used| | | | |

rsp <- 3

The stack top decreases to 3. Then the third u8 is placed at address 3.

 0 1 2 3    4    5    6 7 8 9 
| | | |Used|Used|Used| | | | |

rsp <- 2

rsp ends up with address 2, meaning the stack's top is now address 2. This is not a very precise statement of what goes on to allocate space for local variables, but it's very close.

After the function that uses these three bytes returns, the rsp is increased back to where it was before, namely address 5:

 0 1 2 3 4 5 6 7 8 9 
| | | | | | | | | | |

rsp <- 5

This implies that the stack top will now be 5 and any local variable to be allocated will use memory starting at address 5. The diagram above implies that the bytes used in addresses 3, 4, and 5 are cleared but they're really not. The data put there will remain there until they're overwritten. This has some serious implications: for one, it means that we can get the impression that everything is alright, simply because we can see the data, even when it's not.

This (very) simplified description demonstrates the allocation and deallocation involved in stack memory. The higher the number of functions that have been called in a chain and the bigger the sizes of their location variables, the more the stack memory that will be allocated at once, because of its linear nature.

For example:

fn function1() {
    let local_variable1: 8 bytes = ...;
    function2();
}

fn function2() {
    let local_variable: 128 bytes = ...;
    function3();
}

fn function3() {
    let local_variable: 1Kib = ...;
}

fn main() {
    function1();
}

In the above, when function1 is called, 8 bytes will be allocated on the stack for its local variable. When the call to function2 is encountered, another 128 bytes will be allocated for its local variable. When the call to function2 is encountered, another 1 Kilobyte will be allocated for its local variable.

To accommodate this, 8 bytes + 128 bytes + 1Kib of stack memory will be needed.

But if the functions weren't called in chains, if they were just called one after the other in main without any extra function calls, then only 1Kib of stack memory will be needed, because after function1 finished executing, the function will return and the 8 bytes will be deallocated. After function2 finishes executing, the 128 bytes will also be deallocated. When it's time for function3 to execute, the stack memory will be completely freed.

Following this line of reasoning, the amount of stack memory we need is determined by the largest amount of local variables that need to be kept track of at once.

According to the UEFI spec, we have 128 Kib of stack space when our code starts running (For more information on this, check the references under "What the firmware sets up for x86-64 (x64 systems)"). Remember that the firmware is the first program to start running when the computer comes on. It's this firmware that set up the rsp register for us to point to the last address in a 128Kib chunk of memory. The question now is: how do we know if this 128Kib of data is enough stack space? What if it's not enough and other data is being silently overwritten? This will definitely lead to bugs that will manifest themselves as something else later on.

For now, I think we can ignore this (emphasis on "I think"), but it will be good to keep this at the back of your mind, in case we encounter mysterious bugs. It just could be this. (Or we could compute the amount of stack space we need by adding together the sizes of all the local variables in chains of function calls and see if the biggest one is bigger than 128Kib).

Mutable Statics And Data Races

Unsafe, unsafe, unsafe. Unsafe. Everywhere. Mutable statics are very unsafe. Unsafe to read. Unsafe to write. Why? Global variables are almost always a bad idea. Any code anywhere at all could read and modify it. It's like all the functions own the variable at the same time. It's very easy for this to lead to undefined behavior.

But there is so much of it:

In main.rs

static mut SCREEN: Option<&mut Screen> = None;

static mut GDT: Option<GDT> = None;
static mut TSS: Option<TSS> = None;
static mut IDT: Option<IDT> = None;
static mut PICS: Option<PICs> = None;
static mut KEYBOARD: Option<Keyboard> = None;

Surely there must be a better way! We'll get to that later when we're refactoring. But for now, we need to understand the danger of doing stuff like this.

Mutable statics can only be made safe in a completely single-threaded scenario, and by single-threaded, I mean when an instruction is executing, no other instruction can be executed.

But in our code, there is still a form of concurrency: interrupts. When code is executing and you press a key on the keyboard, that executing code is immediately stopped and the interrupt service routine for the keyboard is executed.

So, if executing code is in the middle of a process of updating some entry in the IDT and an interrupt occurs, then the process is stopped halfway and the interrupt service routine is executed. If this interrupt service routine accesses the entry that was being updated, then undefined behavior will be the result, because, at that point, the entry will be garbage. This is what we call a data race. It's what happens when multiple lines of execution attempt to access a shared resource without restraints (which is exactly what mutable statics provide for us). And one of its defining features is unpredictability. In one run, the result may be complete undefined behavior. In another, the modification may finish quickly enough, so no error occurs.

Box Problems

Another potential problem we may encounter is the Box's drop implementation. What if the Box held another Box. drop is never called on the Box's contents, so the memory handled by the inner Box will never be released, so free memory will never be recorded as free again. Enough of this and a point will be reached where there is enough free memory but allocation will fail because the allocator believes that the memory is still being used. This is what is called a memory leak: Incorrect memory management leading to free memory that is never deallocated.

There are still other problems, but I'll leave that to you to check out.

Take Away

  • The UEFI firmware sets up 128Kib of stack memory for x86-64 systems on computer start-up.
  • Stack memory allocation and deallocation are implicit and happen from only one end.
  • Mutable statics can lead to data races.

In The Next Post

We'll get started with the Vec.

References

  • What the firmware sets up for x86-64 (x64) systems: UEFI spec, version 2.7, section 2.3.4.

You can download the UEFI spec here