A Few Things About Safety In Rust

In the previous post, we cleaned up some of our code, but it still has a few problems that a handful of changes could solve. To solve those problems, we need to understand a few things about safety in Rust.

The essence of Rust is to protect us from ourselves. Its many rules and sometimes seemingly unnecessary hassle
serve that one purpose. To get an idea of what I'm talking about here, consider the following pseudo-code:

struct Student {
    name: string,
    class: string,
    age: number
}

let ptr = 10000 as pointer to Student
print(*ptr.name)

In the above pseudo-code, some random memory location is being interpreted as a pointer to an instance of the Student struct. That is, the bits in memory starting from the address 10000 are to be interpreted as a string, followed by another string, followed by a number. On the next line, the pointer is dereferenced and the bits in memory which are supposed to be the name string are passed to a print function.

The problem with this code: What if the bits at location 10000 aren't an instance of the Student struct? What if we're misinterpreting those bits? If we are, then we will be screwing ourselves up. In a much bigger code base, a problem like that could lead to hours of bug hunting, hours that could have been used for other things. An even worse scenario is when the program seems to be working and the bug goes undetected until much later when it has silently caused mysterious bugs in some other aspect of the code. Consider this modification of the code:

struct Student {
    name: string,
    class: string,
    age: number
}

let ptr = 10000 as pointer to Student
*ptr.age = 19

The above code modifies the age field of the bits behind the pointer. If that address 10000 was the address of some other important data the application containing this code relies on, then this misinterpretation of the address and the *ptr.age = 19 will corrupt that data.

Sometimes, we programmers are smart but not smart enough. We make mistakes. Perhaps address 10000 actually holds a valid Student instance. What if a programmer working with this code makes a mistake and writes 1000 instead of 10000? The result will be hours of headaches because of one omission.

We, as the programmers, have to ensure that the address we're interpreting as a pointer to a Student instance is actually a pointer to a Student instance. This can be hard and error-prone.

This is where Rust comes in. Rust lifts the burden of verifying that we aren't screwing ourselves up off our shoulders. In safe Rust, every reference is statically guaranteed to point to valid data. What exactly do I mean by that? I mean that, with its references and the rules governing their use, Rust can make sure at compile time that we don't make mistakes like this (as long as we stick to references).
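To make that concrete, here is a minimal sketch of the Student example written with references instead of a made-up address. The describe and birthday functions are names I'm inventing purely for illustration; the point is only that a reference has to be created from a value that really exists.

```rust
struct Student {
    name: String,
    class: String,
    age: u8,
}

// `student` is a reference, so it can only have been created from a real
// `Student` value. There is no way to conjure one up from the number 10000
// in safe code.
fn describe(student: &Student) -> String {
    format!("{} ({}), age {}", student.name, student.class, student.age)
}

// Writing goes through `&mut`, which also has to come from a real value,
// so this can never scribble over some unrelated piece of memory.
fn birthday(student: &mut Student) {
    student.age += 1;
}
```

Calling describe(&s) or birthday(&mut s) only compiles if s is an actual Student that is still alive at that point, which is exactly the check we had to do by hand in the pseudo-code.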

Sometimes, we still need to interpret some arbitrary memory location as a pointer to data of some type. Take a look at our efi_main signature:

extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ... Others
}

The second argument of the efi_main function is an address, or more specifically, a raw pointer to an instance of SystemTable. How do we know that the data this pointer points to is actually an instance of SystemTable? We know from the UEFI spec. But there is no way for Rust to know that. Generally, when we're doing things that Rust can't protect us from, we have to use unsafe blocks.

When we obtain the BootServices instance from the table:

let boot_services = unsafe { (*sys_table).boot_services };

we use an unsafe block. This is unsafe because there is no way for Rust to statically verify that the address interpreted as a SystemTable instance is actually a SystemTable instance. It might just as well be a bunch of random bits, or some important data being used somewhere else in the project.
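Here is a small, self-contained sketch of the same idea. The SystemTable and BootServices structs below are trimmed-down stand-ins I'm assuming for illustration, not the real UEFI definitions, and boot_services_revision is a hypothetical helper.

```rust
// Hypothetical stand-ins for the real UEFI types, cut down to one field each.
struct BootServices {
    revision: u32,
}

struct SystemTable {
    boot_services: *mut BootServices,
}

/// Reads a field through two levels of raw pointers.
///
/// # Safety
/// `sys_table` must point to a live `SystemTable` whose `boot_services`
/// field points to a live `BootServices`.
unsafe fn boot_services_revision(sys_table: *mut SystemTable) -> u32 {
    // Holding or passing a raw pointer is safe. Dereferencing it is not,
    // because the compiler cannot check what is really at that address.
    let boot_services = unsafe { (*sys_table).boot_services };
    // The field is itself a raw pointer, so reading through it is unsafe too.
    unsafe { (*boot_services).revision }
}
```

Every unsafe block here is a spot where we, not the compiler, are vouching that the pointer is what we say it is.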

This unsafe pointer dereferencing shows up in several places in our code, and that is a problem. The more unsafe code there is, the less the compiler is there to help when we make mistakes and screw up. This is why one of the main philosophies of Rust when writing code is to limit unsafe code as much as possible. The less unsafe code, the more the compiler has our back. And when something does go wrong (safety-wise), we can trace it back to the unsafe code.

The right way to approach this task of ours is to write unsafe code in a small, limited area, manually verify that this unsafe code is definitely correct, and immediately transition back to safe code, where the compiler resumes this stressful job of verification.
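One possible shape for that, as a rough sketch only: convert the raw pointer into a reference in a single place, then hand the reference to ordinary safe code. The SystemTable struct here is again a made-up stand-in, and system_table and revision are hypothetical names, not necessarily what our actual refactor will look like.

```rust
// Hypothetical stand-in for the real UEFI type.
struct SystemTable {
    revision: u32,
}

/// The one place where the raw pointer is trusted.
///
/// # Safety
/// `ptr` must be null or point to a valid `SystemTable` that lives at
/// least as long as `'a`.
unsafe fn system_table<'a>(ptr: *const SystemTable) -> Option<&'a SystemTable> {
    // `as_ref` gives `None` for a null pointer and `Some(&T)` otherwise.
    unsafe { ptr.as_ref() }
}

// From here on we only deal with an ordinary reference, so the compiler
// checks this code like any other safe Rust.
fn revision(table: &SystemTable) -> u32 {
    table.revision
}
```

With this layout, the only thing we have to verify by hand is the call to system_table; anything that takes a &SystemTable afterwards is back under the compiler's watch.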

Take Away

The essence of Rust is to protect us from ourselves.
The more unsafe code, the less Rust helps.

In The Next Post

We'll be refactoring to limit the unsafety in our code.

References

https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html
https://doc.rust-lang.org/nomicon/safe-unsafe-meaning.html