Demilade Sonuga's blog
Getting Started II
In the previous post, we learned a few things about how the computer starts up and about how our entry point will eventually be called. We need to replace our main function in our main.rs file with an efi_main function.
Your main.rs file should be looking like this now:
#![no_std]

fn main() {
}

#[panic_handler]
fn panic_handler(panic_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
We now change it to this:
#![no_std]
#![feature(abi_efiapi)] // NEW

// DELETED: fn main() {
// DELETED: }

// NEW
extern "efiapi" fn efi_main(handle: *const core::ffi::c_void, sys_table: *mut u8) -> usize {
    0
}

#[panic_handler]
fn panic_handler(panic_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
Okay. Now, we've replaced the main function with the efi_main function, which rustc will be expecting as the entry point for our game. There are a few things I haven't said anything about yet.
Firstly, the #![feature(abi_efiapi)]. To understand why we need this, we first take a quick detour to understand calling conventions.
Calling Conventions
Consider a scenario: you are building a library in assembly (why you would do this is beyond me) and you're writing a bunch of functions to subtract one number from another (why you're writing different variants of a subtraction function is, again, beyond me). In a high level language, the functions you're writing look like this:
fn f(a, b)
    return a - b
fn g(a, b)
    return a - b
We have to realize that on the machine level, there are no variables or arguments, no a and b. All we have down there are memory and some named registers (very fast memory in the processor). On x86_64, some registers are rax, rbx, rcx, rdx, ... On a normal day, it's our compiler that determines which registers and memory will be used to hold the values identified by a and b. But remember, we're doing this in assembly so there's no compiler to do that for us.
"That's okay", we tell ourselves. For function f, we write this function beginning with the assumption that the value a will be in the register rax and that the value b will be in the register rbx.
The assembly for our f function will now look like this:
f:
    sub rax, rbx
    ret
meaning subtract rbx from rax, store the result in rax, then return to the function caller. From the assumptions we had while writing this function, this function does a - b and stores the result in rax. So, implicitly, rax holds the function's return value. After ret returns from the function, whichever code called this f function is expected to check rax for the return value.
Okay, we've written our f function. Now, let's write g. For g, we're feeling a little more creative. Instead of assuming arguments will be in rax and rbx, respectively, we'll begin with a different assumption, that the value a will be in the register rbx and the value b will be in the register rax.
The assembly for our g function will now look like this:
g:
    sub rbx, rax
    ret
meaning subtract rax from rbx, store the result in rbx, then return. From the assumptions we began with, this function does a - b and, implicitly, rbx holds the function's return value.
And finally, we're done. Our library now has two functions which take two numbers and subtract one from the other.
One month later, we're writing another assembly program to do some complex computation, and we find ourselves in a position where we would like to use our library. A high level snippet from our code looks like this
...
rax <- 10
rbx <- 5
call f
result <- rax
Do something with result
...
rax <- 10
rbx <- 5
call g
result <- rax
Do something with result
As the above shows, we call the functions f and g, beginning with the assumptions that the functions expect our a to be in rax, our b to be in rbx and that the result will be in rax.
After we're done with the code, we get a nasty shock: a - b with f gives a result of 5, but a - b with g gives a result of 10, which is just the a we passed in.
However did we go wrong? We spend hours and hours and hours searching through our code, then we find the culprit: the function g begins with a whole bunch of different assumptions from the function f. In fact, g has a completely different contract from f.
f and g expect the same values in different places. f wants arguments in rax and rbx, respectively, while g wants arguments in rbx and rax, respectively. f returns values in rax while g returns values in rbx.
Now it makes sense. Simply because of the way we passed the arguments, g will compute b - a instead of a - b, because we passed the arguments in the reverse of the order g was expecting. The result that we took from rax wasn't even g's result at all. g's result is in rbx!
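To make the mix-up concrete, here's a small sketch that plays the same scenario out in Rust instead of assembly. The Regs struct and the helper function names are made up for this illustration; they just model the two registers and the two contracts described above.

```rust
/// A toy model of the two registers used in the example.
#[derive(Debug, Clone, Copy)]
struct Regs {
    rax: i64,
    rbx: i64,
}

/// f's contract: a in rax, b in rbx, result left in rax.
fn f(regs: &mut Regs) {
    regs.rax -= regs.rbx; // sub rax, rbx
}

/// g's contract: a in rbx, b in rax, result left in rbx.
fn g(regs: &mut Regs) {
    regs.rbx -= regs.rax; // sub rbx, rax
}

/// Calls `func` the way our caller did: a in rax, b in rbx, result read from rax.
fn call_assuming_f_contract(func: fn(&mut Regs), a: i64, b: i64) -> i64 {
    let mut regs = Regs { rax: a, rbx: b };
    func(&mut regs);
    regs.rax
}

/// Calls g according to g's actual contract: a in rbx, b in rax, result read from rbx.
fn call_g_correctly(a: i64, b: i64) -> i64 {
    let mut regs = Regs { rax: b, rbx: a };
    g(&mut regs);
    regs.rbx
}

fn main() {
    println!("f with f's contract: {}", call_assuming_f_contract(f, 10, 5));
    println!("g with f's contract: {}", call_assuming_f_contract(g, 10, 5));
    println!("g with g's contract: {}", call_g_correctly(10, 5));
}
```

In this toy model, calling g with f's contract hands back whatever was sitting in rax, which is the 10 we put there ourselves. Only when the caller follows g's own contract does it get 5.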
If we caused this much pain for ourselves, just because we wrote two different functions with two different assumptions about where arguments and return values should be, you can imagine how hard it will be for someone else to use this library. They'll have to remember the different contracts for each of the functions in the library. A typical project in this 21st century could use several different libraries written by several different programmers in several different places from all around the world. Imagine having to remember the contracts for all their functions. Why would we waste time doing that when we can be writing code?
On a regular day, compilers may be doing this low level code generation for us, but we have to remember that there are different compilers and different languages. How would one compiler know that some external code it's interfacing with expects the first two arguments in rax and rbx and not the other way around?
The answer to this problem is to just round everyone up and formally specify where all these arguments, return values and other function interface details are expected. Just write it all down and put it online for everyone to see. From the convention specified, programmers can now make the right assumptions on how to interface with a function. Compiler writers can now add features to their compilers that allow programmers to write high level functions that will generate code that makes the right assumptions about whatever external interface that code is going to be interacting with.
This formal specification/contract of the right assumptions to be made about how to give what a function expects and how to receive what is expected from it is the function's calling convention.
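In Rust, this idea shows up as the string after the extern keyword, which names the calling convention a function follows. Here's a minimal sketch using the "C" convention (chosen only because it works on an ordinary host machine); the function names sub and call_through_pointer are made up for this example.

```rust
/// A function that promises to follow the platform's C calling convention.
extern "C" fn sub(a: i64, b: i64) -> i64 {
    a - b
}

/// The convention is part of the function pointer's type, so a caller
/// that only has a pointer still knows how to pass the arguments.
fn call_through_pointer(func: extern "C" fn(i64, i64) -> i64, a: i64, b: i64) -> i64 {
    func(a, b)
}

fn main() {
    let result = call_through_pointer(sub, 10, 5);
    println!("10 - 5 = {result}");
}
```

Because the convention lives in the type, the compiler will refuse to pass a function with a different convention where an extern "C" fn is expected. That's exactly the kind of mix-up we had to hunt down by hand in the assembly story.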
abi_efiapi & extern efiapi
Wow, that was a mouthful. Okay, now that we know and understand what calling conventions are, we should have a much better idea of why we need that abi_efiapi feature. In the previous post, we learned that the UEFI firmware calls the entry point function in our application.
Well, there's something else that was left out. When the firmware calls our function, it passes two arguments to it: the image handle and the system table, respectively (knowing what exactly these two things are can wait). It also expects our function to return a number.
Which registers, exactly, are these arguments and return values going to be in? We tell rustc to determine this for us with the extern "efiapi" in front of our fn efi_main.
At the moment, the feature which allows us to use this extern "efiapi" is unstable in the compiler, meaning that it's still open to a lot of change. To use the extern "efiapi", we have to opt in to the unstable feature, hence the #![feature(abi_efiapi)].
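As for the number the firmware expects back: it's a status code. In UEFI, 0 means success and error statuses have the highest bit set, which is why our efi_main returns 0. The sketch below only illustrates that. The aliases Handle and Status, the is_error helper and the entry function are names I've assumed for this example, and entry leaves out the extern "efiapi" so that it can run on a normal machine.

```rust
use core::ffi::c_void;

/// Hypothetical aliases that name the pieces of the efi_main signature.
type Handle = *const c_void;
type Status = usize;

/// UEFI reports success as 0; error statuses have the highest bit set.
const EFI_SUCCESS: Status = 0;
const ERROR_BIT: Status = 1 << (usize::BITS - 1);
const EFI_LOAD_ERROR: Status = ERROR_BIT | 1;

/// Returns true if the status has the error bit set.
fn is_error(status: Status) -> bool {
    status & ERROR_BIT != 0
}

/// Same shape as efi_main, minus the extern "efiapi", so it runs on a host.
fn entry(_handle: Handle, _sys_table: *mut u8) -> Status {
    EFI_SUCCESS
}

fn main() {
    let status = entry(core::ptr::null(), core::ptr::null_mut());
    println!("status = {status}, is error = {}", is_error(status));
    println!("load error is error = {}", is_error(EFI_LOAD_ERROR));
}
```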
Building Again
Another build:
[demilade@fedora blog-blasterball]$ cargo build
Compiling blasterball v0.1.0 (/home/demilade/Documents/blog-blasterball)
error[E0601]: `main` function not found in crate `blasterball`
--> src/main.rs:11:2
|
11 | }
| ^ consider adding a `main` function to `src/main.rs`
For more information about this error, try `rustc --explain E0601`.
error: could not compile `blasterball` due to previous error
Okay, rustc is still expecting us to have a main function. We tell it to stop expecting one with #![no_main].
#![no_std]
#![no_main] // NEW
#![feature(abi_efiapi)]

extern "efiapi" fn efi_main(handle: *const core::ffi::c_void, sys_table: *mut u8) -> usize {
    0
}

#[panic_handler]
fn panic_handler(panic_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
Another build:
error: linking with `rust-lld` failed: exit status: 1
|
= note: "rust-lld" "-flavor" "link" "/NOLOGO" "/entry:efi_main" "/subsystem:efi_application" "/tmp/rustcVwoHAt/symbols.o" "/home/demilade/Documents/blog-blasterball/target/x86_64-unknown-uefi/debug/deps/blasterball-1595e28c28203baa.4fuqjhq87bc4qh3x.rcgu.o" "/LIBPATH:/home/demilade/Documents/blog-blasterball/target/x86_64-unknown-uefi/debug/deps" "/LIBPATH:/home/demilade/Documents/blog-blasterball/target/debug/deps" "/LIBPATH:/home/demilade/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-uefi/lib" "/home/demilade/Documents/blog-blasterball/target/x86_64-unknown-uefi/debug/deps/librustc_std_workspace_core-fd3a90d976264b72.rlib" "/home/demilade/Documents/blog-blasterball/target/x86_64-unknown-uefi/debug/deps/libcore-c4c2c332d92fdfa4.rlib" "/home/demilade/Documents/blog-blasterball/target/x86_64-unknown-uefi/debug/deps/libcompiler_builtins-d5d94858b09c8135.rlib" "/NXCOMPAT" "/LIBPATH:/home/demilade/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-uefi/lib" "/OUT:/home/demilade/Documents/blog-blasterball/target/x86_64-unknown-uefi/debug/deps/blasterball-1595e28c28203baa.efi" "/OPT:REF,NOICF" "/DEBUG" "/NODEFAULTLIB"
= note: rust-lld: error: <root>: undefined symbol: efi_main
warning: `blasterball` (bin "blasterball") generated 4 warnings
error: could not compile `blasterball` due to previous error; 4 warnings emitted
Linker errors this time. Now it's telling us it can't find our efi_main function. The linker is the big guy in charge of resolving names and symbols and determining some addresses. The linker saying there's no symbol efi_main must mean that no such symbol exists in the compiler's output.
To get rid of this error, we first need to understand why the symbol efi_main doesn't exist in the compiler output. It turns out that rustc doesn't just spit out function names the way it sees them in the code. It first does something called name mangling. That is, it adds a bunch of gibberish to the function name before putting it in the output. For example, the name of our efi_main function in the output can look something like: _ZN11blasterball8efi_main17ha8e4312f68056142E. This is done to implement namespaces, the idea that two functions with the same name can be in different modules. On the lower level that the compiler is generating code for, such luxuries do not exist. Name mangling is just a way of creating the illusion that they do.
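The gibberish isn't completely random. If you squint at the example symbol, you can see each part of the function's path prefixed by its length, followed by a hash. The sketch below rebuilds a symbol of that shape. It's only an illustration of the pattern, not how rustc is implemented: the function names are made up, the hash is passed in rather than computed, and the exact scheme is a compiler detail that can differ between versions.

```rust
/// Appends one length-prefixed component, e.g. "efi_main" becomes "8efi_main".
fn push_component(symbol: &mut String, component: &str) {
    symbol.push_str(&component.len().to_string());
    symbol.push_str(component);
}

/// Builds a symbol shaped like the mangled name in the text:
/// "_ZN", then every path component and the hash with length prefixes, then "E".
/// The hash is taken as given here; the real compiler computes it.
fn mangle_like_rustc(path: &[&str], hash: &str) -> String {
    let mut symbol = String::from("_ZN");
    for component in path {
        push_component(&mut symbol, component);
    }
    push_component(&mut symbol, hash);
    symbol.push('E');
    symbol
}

fn main() {
    let symbol = mangle_like_rustc(&["blasterball", "efi_main"], "ha8e4312f68056142");
    println!("{symbol}");

    // Two functions with the same name in different modules get different symbols.
    let in_game = mangle_like_rustc(&["blasterball", "game", "update"], "h0000000000000001");
    let in_menu = mangle_like_rustc(&["blasterball", "menu", "update"], "h0000000000000002");
    println!("{in_game}\n{in_menu}");
}
```

The game and menu modules and their hashes are invented for the demonstration. The point is that the module path ends up inside the symbol, so two functions named update never collide.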
To stop the compiler from mangling our efi_main function, we have to annotate it with the #[no_mangle] attribute:
#![no_std]
#![no_main]
#![feature(abi_efiapi)]

#[no_mangle] // NEW
extern "efiapi" fn efi_main(handle: *const core::ffi::c_void, sys_table: *mut u8) -> usize {
    0
}

#[panic_handler]
fn panic_handler(panic_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
At this point, building our project gives a bunch of warnings, but no errors.
Finally, no errors.
Running Our Game
We've finally built our project and now we want to see it in action. We could run it on our hardware, but it's much more convenient during development to do so in an emulator. For this project, we'll be using QEMU, a FOSS (Free and Open Source Software) emulator. You can get it from the QEMU website.
We also need to emulate UEFI firmware. To do this, install edk2-ovmf. On Fedora:
sudo dnf install edk2-ovmf
The next step is to find the OVMF_CODE.fd and OVMF_VARS.fd files. They will be in the installation directory of the edk2-ovmf files. I'm not sure where that will be on other systems, but on my Fedora Linux, it's /usr/share/edk2/ovmf/.
In the project root directory, run:
qemu-system-x86_64 \
-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd,readonly=on \
-drive if=pflash,unit=1,format=raw,file=/path/to/OVMF_VARS.fd \
-drive format=raw,file=fat:rw:target/x86_64-unknown-uefi/debug/
If it gives you a permission denied error, you may have to run it with sudo for root privileges:
sudo qemu-system-x86_64 \
-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd,readonly=on \
-drive if=pflash,unit=1,format=raw,file=/path/to/OVMF_VARS.fd \
-drive format=raw,file=fat:rw:target/x86_64-unknown-uefi/debug/
then enter your password.
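If retyping that command gets tiring, one option is a tiny helper program that assembles the same arguments and launches QEMU. This is just a sketch: the qemu_args function is a name I've made up, the flags are copied from the command above, and the /path/to/ placeholders still have to be replaced with wherever your OVMF files actually live.

```rust
use std::process::Command;

/// Builds the argument list for the QEMU invocation shown above.
fn qemu_args(ovmf_code: &str, ovmf_vars: &str, esp_dir: &str) -> Vec<String> {
    vec![
        "-drive".to_string(),
        format!("if=pflash,format=raw,unit=0,file={ovmf_code},readonly=on"),
        "-drive".to_string(),
        format!("if=pflash,unit=1,format=raw,file={ovmf_vars}"),
        "-drive".to_string(),
        format!("format=raw,file=fat:rw:{esp_dir}"),
    ]
}

fn main() {
    // Placeholder paths, exactly as in the command above.
    let args = qemu_args(
        "/path/to/OVMF_CODE.fd",
        "/path/to/OVMF_VARS.fd",
        "target/x86_64-unknown-uefi/debug/",
    );
    match Command::new("qemu-system-x86_64").args(&args).status() {
        Ok(status) => println!("qemu exited with {status}"),
        Err(err) => eprintln!("could not start qemu: {err}"),
    }
}
```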
After a few seconds, you should see the UEFI shell on screen.
At the shell prompt, type fs0:blasterball.efi, then press enter.
After you press enter, you will have noticed that nothing happened. Well, that's because our code doesn't do anything. It just returns.
But it ran.
And we understand all that led up to this point.
You should relax, celebrate a little.
Take Away
- Functions pass arguments and return values according to a pre-specified calling convention, which is a contract that tells function callers and callees what to expect in registers and memory.
- The efiapi calling convention is what the Rust compiler uses to generate code that conforms to UEFI's calling convention.
- The #[no_mangle] attribute is used to stop the compiler from turning our function name into gibberish.
- The #![no_main] attribute is used to tell the compiler to stop expecting a main function.
- The efi_main function takes two arguments: a handle and a system table, and returns a usize.
- The #![feature(feature_name)] attribute is used to enable an unstable feature.
Your code should be looking like this now:
Directory view:
blasterball/
| .cargo/
| | config.toml
| src/
| | main.rs
| .gitignore
| Cargo.lock
| Cargo.toml
main.rs contents:
#![no_std]
#![no_main]
#![feature(abi_efiapi)]

#[no_mangle]
extern "efiapi" fn efi_main(handle: *const core::ffi::c_void, sys_table: *mut u8) -> usize {
    0
}

#[panic_handler]
fn panic_handler(panic_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
In the Next Post
We will be printing hello world.
References
- https://en.wikipedia.org/wiki/Calling_convention
- https://github.com/tianocore/edk2/blob/master/OvmfPkg/README
- https://doc.rust-lang.org/unstable-book/index.html
- https://github.com/rust-lang/rust/issues/65815
- https://stackoverflow.com/questions/27454761/what-is-a-crate-attribute-and-where-do-i-add-it
- https://doc.rust-lang.org/reference/attributes.html
- https://en.wikipedia.org/wiki/Name_mangling
- https://wiki.osdev.org/OVMF
- https://wiki.osdev.org/QEMU