Demilade Sonuga's blog

Printing Hello World Again III

2022-11-25 · 61 min read

It's finally time to print "Hello World!" in our graphics mode.

Take a look at our FONT array in our font.rs file:

pub const FONT: [[[bool; 16]; 16]; 26] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z
];

You'll notice that something is missing. We have the descriptions to draw "HELLO" and "WORLD", but no " " and "!".

To draw those characters, we need to create descriptions for them.

Firstly, space is just, well, a space. It has no lines. In our font.rs, space will look like this:

0  - ________________
1  - ________________
2  - ________________
3  - ________________
4  - ________________
5  - ________________
6  - ________________
7  - ________________
8  - ________________
9  - ________________
10 - ________________
11 - ________________
12 - ________________
13 - ________________
14 - ________________
15 - ________________

In our Rust code:

// Description of a space
const SPACE: [[bool; 16]; 16] = [[false; 16]; 16];

As for our exclamation mark, it's just a straight line in the middle and a dot at the bottom:

0  - ________________
1  - ______1111______
2  - ______1111______
3  - ______1111______
4  - ______1111______
5  - ______1111______
6  - ______1111______
7  - ______1111______
8  - ______1111______
9  - ______1111______
10 - ________________
11 - ________________
12 - _____111111_____
13 - _____111111_____
14 - _____111111_____
15 - ________________

This is just one way of doing it. There are lots of other ways an exclamation mark could look. Try some of them out and have fun.

In font.rs, the description above will translate to:

// A description of an exclamation mark
const EXCLAMATION: [[bool; 16]; 16] = [
[false; 16],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false, false, false, false, false, false, true, true, true, true, false, false, false, false, false, false],
[false; 16],
[false; 16],
[false, false, false, false, false, true, true, true, true, true, true, false, false, false, false, false],
[false, false, false, false, false, true, true, true, true, true, true, false, false, false, false, false],
[false, false, false, false, false, true, true, true, true, true, true, false, false, false, false, false],
[false; 16]
];
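Typing out 256 booleans by hand is easy to get wrong. As one possible alternative, the sketch below builds the same description straight from the drawing above at compile time. The names glyph_from_rows and EXCLAMATION_FROM_ART are made up for this illustration and are not part of the font.rs used in the rest of this post.

```rust
/// Hypothetical helper (not in the original font.rs): turns 16 rows of
/// ASCII art into a glyph description. A b'1' marks a foreground pixel and
/// any other byte marks a background pixel.
const fn glyph_from_rows(rows: [&[u8; 16]; 16]) -> [[bool; 16]; 16] {
    let mut glyph = [[false; 16]; 16];
    let mut i = 0;
    // `for` loops aren't allowed in const fn, so we use `while`
    while i < 16 {
        let mut j = 0;
        while j < 16 {
            glyph[i][j] = rows[i][j] == b'1';
            j += 1;
        }
        i += 1;
    }
    glyph
}

/// The same exclamation mark as the drawing above, built at compile time.
/// A row that isn't exactly 16 bytes long is rejected by the compiler.
const EXCLAMATION_FROM_ART: [[bool; 16]; 16] = glyph_from_rows([
    b"________________",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"______1111______",
    b"________________",
    b"________________",
    b"_____111111_____",
    b"_____111111_____",
    b"_____111111_____",
    b"________________",
]);
```

The trade-off is a little extra machinery in exchange for descriptions that look like the drawings they came from.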

Adding our new character descriptions to our FONT array:

/* DELETED:
pub const FONT: [[[bool; 16]; 16]; 26] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z
];
*/

// NEW:
pub const FONT: [[[bool; 16]; 16]; 28] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z, SPACE, EXCLAMATION
];

Okay, now our font character set is complete. It seems like we're ready to print, but not quite. In our code, we'll need a print_str function, which we'll then call like so: print_str("Hello World!")

But how exactly is this print_str function going to work? We have the font descriptions of all the characters we want to print in an array. We have a pointer to the screen that we can use to draw. We have the screen width (number of pixels in a row) and the screen height (number of pixels in a column). Now, we need an algorithm, which when provided with this data, will draw the characters to the screen.

Stop for a moment and think about how you'll do this yourself.

Writing our print_str function

To print a string, we need to go over all the characters in the string, from left to right and print those characters one by one in the order we encounter them. From this information, we start our algorithm like this:

fn print_str(s)
    for each character c in s, from left to right
        print character c
        if there's still space on the row,
            move the screen's position to that space
        else
            move the screen's position to the next row
            if this next row is non existent, end the loop

But how exactly do we print a character?

From the previous post, we get an idea of how to do this. Printing a character is simply a matter of iterating (looping) over the character's descriptions and putting either a foreground-colored pixel or a background-colored pixel on the screen whenever the description says so.

So, a print_char function will look like this:

fn print_char(screen, font_description, curr_screen_pos)
    for i in 0..16
        for j in 0..16
            if font_description[i][j]
                screen.pixels[curr_screen_pos.row + i][curr_screen_pos.column + j] = yellow pixel
            else
                screen.pixels[curr_screen_pos.row + i][curr_screen_pos.column + j] = black pixel

Okay, this print_char procedure will print a character given the screen (screen), the current screen position (curr_screen_pos) and the character's font description (font_description) in the format we've specified all our font characters in our font.rs file.
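If you'd like to try the procedure out before touching the real framebuffer, here is a minimal sketch that draws into an ordinary in-memory grid instead. MockScreen, MOCK_WIDTH, MOCK_HEIGHT, YELLOW and BLACK are stand-ins invented for this example, and this Pixel is only a lookalike of the one in our main.rs.

```rust
// Assumed sizes for a small stand-in screen (not the real 640x480 one)
const MOCK_WIDTH: usize = 64;
const MOCK_HEIGHT: usize = 32;

// A lookalike of the Pixel used in the post, made comparable for testing
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pixel {
    red: u8,
    green: u8,
    blue: u8,
    reserved: u8,
}

// Foreground and background colors
const YELLOW: Pixel = Pixel { red: 255, green: 255, blue: 0, reserved: 0 };
const BLACK: Pixel = Pixel { red: 0, green: 0, blue: 0, reserved: 0 };

// Hypothetical stand-in for the screen: rows of pixels in ordinary memory
struct MockScreen {
    pixels: [[Pixel; MOCK_WIDTH]; MOCK_HEIGHT],
}

// Draws one 16x16 character with its top-left corner at pos (row, column)
fn print_char(screen: &mut MockScreen, font_description: &[[bool; 16]; 16], pos: (usize, usize)) {
    for i in 0..16 {
        for j in 0..16 {
            // Yellow where the description says true, black everywhere else
            screen.pixels[pos.0 + i][pos.1 + j] =
                if font_description[i][j] { YELLOW } else { BLACK };
        }
    }
}
```

Because the grid is just an array, you can print a character into it and then inspect individual pixels to check that they landed where you expected.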

But there's still a problem here. How do we go from the character c to its font description? To do this, we head back to our FONT array definition:

pub const FONT: [[[bool; 16]; 16]; 28] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z, SPACE, EXCLAMATION
];

Our letter 'A's description is at index 0. Our letter 'B's description is at index 1. Our letter 'C's description is at index 2. This increasing sequence of indexes goes on for 'D' all the way down to 'Z' which is at an index of 25. Our SPACE and EXCLAMATION come right after our alphabet, at indexes 26 and 27, respectively.

To map a character c to its correct font description, we need some kind of function, say char_to_font_index, which, when given a character c, returns the index of the character's description in the FONT array.

To conceive of such a function, we first need to take a look at the Unicode standard.

In the Unicode standard, letters and symbols are represented with numbers called code points. These numbers are guaranteed to be in the range 0..=1114111. A subset of these numbers, 55296..=57343, are called the surrogate code points and they are reserved for reasons that we don't need to know. Any code point that is not a surrogate code point is a Unicode scalar value.

In Rust, all characters are represented as Unicode scalar values. So characters are just numbers interpreted as letters depending on what the Unicode standard says the numbers correspond to. For example, in the Unicode standard, the character 'A' has a code point of 65. Since 65 is not in the range of surrogate code points, it's a Unicode scalar value. And this number is what Rust actually handles whenever the character 'A' is encountered in code.
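You can check this for yourself on your own machine. In the small sketch below, code_point and is_scalar_value are helper names picked just for this example. They only wrap the standard `as u32` cast and `char::from_u32`.

```rust
/// Returns the Unicode code point of `c` as a plain number.
fn code_point(c: char) -> u32 {
    c as u32
}

/// Returns true if `n` is a Unicode scalar value, that is, a code point in
/// 0..=1114111 that is not a surrogate (55296..=57343).
/// `char::from_u32` returns None for every other number.
fn is_scalar_value(n: u32) -> bool {
    char::from_u32(n).is_some()
}
```

For example, code_point('A') gives 65, while is_scalar_value(55296) is false because 55296 is the first surrogate code point.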

In the Unicode standard, the letter 'B' has a code point of 66, 'C' has a code point of 67, 'D' has a code point of 68 and 'E'..='Z' have code points of 69..=90. Similarly, the small letters 'a'..='z' have code points of 97..=122.

With this information, how we'll implement our char_to_font_index is becoming clear. We take the code point of the character and map that number to the appropriate index in the array. But how exactly will the mapping take place? What operations, exactly, will we carry out on the Unicode code points that will transform them to the appropriate index into our FONT array?

To figure this out, let's take a look at some mappings:

Character   Unicode code point   Index into our FONT array
A        ->        65          ->           0 == 65 - 65
B        ->        66          ->           1 == 66 - 65
C        ->        67          ->           2 == 67 - 65
D        ->        68          ->           3 == 68 - 65
E        ->        69          ->           4 == 69 - 65
...
X        ->        88          ->           23 == 88 - 65
Y        ->        89          ->           24 == 89 - 65
Z        ->        90          ->           25 == 90 - 65

From the little data above, it's clear that the operation we need to perform to map a Unicode code point to our FONT array index is to subtract 65 from the code point (for the characters 'A'..='Z').

After taking a closer look at the FONT array

// NEW:
pub const FONT: [[[bool; 16]; 16]; 28] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z, SPACE, EXCLAMATION
];

and the way the Unicode standard defines its code points, it becomes clear why code point - 65 is the transformation that has to take place.

The first character, 'A', has a code point of 65 and the index of its font description in the FONT array is 0. Character 'B' comes immediately after 'A' in the standard and has a code point of 65 + 1. It also comes immediately after 'A' in the FONT array and has an index of 1. The same goes for the other characters 'C' to 'Z'. The character's position in the alphabet is its index in the FONT array (assuming the characters in the alphabet are numbered from 0 to 25) and the characters are numbered in the Unicode standard at an offset of 65, since its first character, 'A', has a code point of 65. Subtracting this offset (65) from the code point will give the character's position in the alphabet, which is also its index into the FONT array.
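The offset idea can be sketched in a few lines of Rust. This is only an illustration of the subtraction for big letters. CAPITAL_OFFSET and capital_letter_index are names invented here, and returning an Option for anything outside 'A'..='Z' is my own choice for the example.

```rust
/// Code point of 'A', the offset at which the big letters start.
const CAPITAL_OFFSET: u32 = 65;

/// Maps a big letter 'A'..='Z' to its position in the alphabet (0..=25),
/// which is also its index into the FONT array.
/// Returns None for any other character instead of computing a bogus index.
fn capital_letter_index(c: char) -> Option<usize> {
    if c.is_ascii_uppercase() {
        Some((c as u32 - CAPITAL_OFFSET) as usize)
    } else {
        None
    }
}
```

So 'A' maps to Some(0), 'X' to Some(23) and 'Z' to Some(25), exactly as in the table of mappings.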

With this information, we can now pin down a procedure char_to_font_index that will take a character and return the character's index into the FONT array:

fn char_to_font_index(c)
    return c as code point - 65

This function above will map any character in the range 'A'..='Z' to the correct index of its font description in the FONT array. But there's another problem here. We don't have font descriptions for small letters 'a'..='z' and we want to print "Hello World!" which definitely consists of small letters.

There are a few things we could do to resolve this. We could define font descriptions for the small letters 'a'..='z'. If you want to do this, please go ahead. Another thing we could do: whenever you see a small letter, draw the equivalent big letter. That is, whenever you see small 'a', print big 'A', and so on. This is the approach I'll be using.

In the Unicode standard, 'a' has a code point of 97, 'b' has a code point of 98, 'c' has a code point of 99, and 'd'..='z' have code points of 100..=122. Similarly to 'A'..='Z's code points, 'a'..='z's code points are at an offset of 97, since adding 97 to the position of a letter in the alphabet will give the code point of the letter (for the small letters 'a'..='z').

65 + 32 == 97. So, subtracting 32 from a small letter code point will give the big letter code point. With this piece of information, we can modify our algorithm as follows:

fn char_to_font_index(c)
    if c as code point >= 97
        return char_to_font_index(c as code point - 32)
    else
        return c as code point - 65

This is just one way of handling this. There are other ways. Feel free to explore.
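As an example of one of those other ways, here is a small sketch that does the conversion in a separate step and only touches bytes that really are small letters. The function name to_big_letter is made up for this example. The upper bound of 'z' is an extra check of my own, so that bytes above 122, such as '{' (123), are left alone.

```rust
/// Converts a small letter's code point to the matching big letter's code
/// point by subtracting 32. Every other byte is returned unchanged.
fn to_big_letter(c: u8) -> u8 {
    // 'a'..='z' are the code points 97..=122
    if (b'a'..=b'z').contains(&c) {
        c - 32
    } else {
        c
    }
}
```

For single bytes this gives the same results as the standard library's to_ascii_uppercase method on u8, which you could also use directly.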

Now, our char_to_font_index can handle small and big English letters. But "Hello World!" still has space and exclamation mark characters. How do we handle those mappings?

Take a look at our font.rs again:

pub const FONT: [[[bool; 16]; 16]; 28] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z, SPACE, EXCLAMATION
];

We can see that SPACE and EXCLAMATION have indexes 26 and 27.

In the Unicode standard, the space character has a code point of 32 and the exclamation mark character has a code point of 33.

So, code point 32 -> index 26 and code point 33 -> index 27. Since these are just 2 cases, we can just create special cases for them in our algorithm:

fn char_to_font_index(c)
    if c as code point >= 97
        return char_to_font_index(c as code point - 32)
    else if c as code point == 32
        return 26
    else if c as code point == 33
        return 27
    else
        return c as code point - 65

And that's it for our char_to_font_index algorithm.
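The algorithm above assumes it only ever sees letters, spaces and exclamation marks. Given any other code point below 65, such as '?' (63), the subtraction of 65 would go below zero. If you'd rather have those cases spelled out, a variant along the following lines is one option. The name checked_font_index and the choice to return an Option are assumptions made for this sketch, not what the rest of this post uses.

```rust
/// Hypothetical variant of char_to_font_index: takes the code point of a
/// character and returns its index into the FONT array, or None if we have
/// no font description for it.
fn checked_font_index(c: u8) -> Option<usize> {
    match c {
        // Big letters: subtract the offset of 'A' (65)
        b'A'..=b'Z' => Some((c - b'A') as usize),
        // Small letters share the big letters' descriptions: offset of 'a' (97)
        b'a'..=b'z' => Some((c - b'a') as usize),
        // Space (32) and exclamation mark (33) come right after the alphabet
        b' ' => Some(26),
        b'!' => Some(27),
        // Anything else has no description in FONT
        _ => None,
    }
}
```

A caller could then decide what to do with None, for example drawing a space instead.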

Our complete print_str procedure now looks like this:

fn print_str(screen, s)
    initialize screen_pos <- (0, 0)
    for each character c in s, from left to right
        print_char(screen, FONT[char_to_font_index(c)], screen_pos)
        screen_pos.column += 16
        if screen_pos.column >= NO_OF_PIXELS_IN_A_ROW
            screen_pos.row += 16
            screen_pos.column = 0
        if screen_pos.row >= NO_OF_PIXELS_IN_A_COLUMN
            break

fn print_char(screen, font_description, curr_screen_pos)
    for i in 0..16
        for j in 0..16
            if font_description[i][j]
                screen.pixels[curr_screen_pos.row + i][curr_screen_pos.column + j] = yellow pixel
            else
                screen.pixels[curr_screen_pos.row + i][curr_screen_pos.column + j] = black pixel

fn char_to_font_index(c)
    if c as code point >= 97
        return char_to_font_index(c as code point - 32)
    else if c as code point == 32
        return 26
    else if c as code point == 33
        return 27
    else
        return c as code point - 65
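The position bookkeeping in print_str is the part that's easiest to get subtly wrong, so here it is pulled out on its own as a sketch. The function next_screen_pos is a name invented for this example, and the constant values assume the 640x480 mode we ask for in efi_main.

```rust
// Assumed to match the 640x480 graphics mode
const NO_OF_PIXELS_IN_A_ROW: usize = 640;
const NO_OF_PIXELS_IN_A_COLUMN: usize = 480;

/// Given the position (row, column) of the character that was just printed,
/// returns the position for the next character, or None if the screen is full.
fn next_screen_pos(pos: (usize, usize)) -> Option<(usize, usize)> {
    let (mut row, mut column) = pos;
    // Move one character (16 pixels) to the right
    column += 16;
    // No more space on this row: go to the start of the next one
    if column >= NO_OF_PIXELS_IN_A_ROW {
        row += 16;
        column = 0;
    }
    // No more rows: signal that printing should stop
    if row >= NO_OF_PIXELS_IN_A_COLUMN {
        None
    } else {
        Some((row, column))
    }
}
```

With these numbers, a row holds 640 / 16 = 40 characters and the screen holds 480 / 16 = 30 rows, so 1200 characters fit before the function returns None.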

Translating this to Rust:

// Prints the string s on screen. s may only consist of big and small
// English characters and "!" and " "
fn print_str(screen: *mut Screen, s: &str) {
// The initial screen position (row, column)
let mut screen_pos = (0, 0);
// Iterating over the character's code points
// We don't need to worry about them being bigger than
// u8::MAX because the only characters we're considering are 'A'..='Z', 'a'..='z' and
// "!" and " "
for c in s.as_bytes() {
print_char(screen, &FONT[char_to_font_index(*c)], screen_pos);
// Advance the screen position to the next position on the row
screen_pos.1 += 16;
// If there is no more space on the row
if screen_pos.1 >= NO_OF_PIXELS_IN_A_ROW {
// Advance to the next row
screen_pos.0 += 16;
// Start from the first space on this new row
screen_pos.1 = 0;
}
// If there are no more rows, stop looping
if screen_pos.0 >= NO_OF_PIXELS_IN_A_COLUMN {
break;
}
}
}

// Print the character described by font_description to the screen at position curr_screen_pos
fn print_char(screen: *mut Screen, font_description: &[[bool; 16]; 16], curr_screen_pos: (usize, usize)) {
for i in 0..16 {
for j in 0..16 {
if font_description[i][j] {
// Red and green is yellow (which we're using as our foreground color here)
unsafe {
(*screen).pixels[curr_screen_pos.0 + i][curr_screen_pos.1 + j] = Pixel {
red: 255,
green: 255,
blue: 0,
reserved: 0
};
}
} else {
// All 0s is black (which we're using as our background here)
unsafe {
(*screen).pixels[curr_screen_pos.0 + i][curr_screen_pos.1 + j] = Pixel {
red: 0,
green: 0,
blue: 0,
reserved: 0
};
}
}
}
}
}

// Takes the Unicode code point of a character in 'A'..='Z' or 'a'..='z' or "!" or " "
// and returns its index into the FONT array
fn char_to_font_index(c: u8) -> usize {
if c >= 97 {
// Small letters to big letters
char_to_font_index(c - 32)
} else if c == 32 {
// Space
26
} else if c == 33 {
// Exclamation mark
27
} else {
// FONT array index for big letters
(c - 65) as usize
}
}

And finally, we can print "Hello World!".

The translation to Rust above isn't exactly as the algorithm described. The string slice's (&str) as_bytes method returns a slice of the raw bytes of the string (&[u8]), and looping over that slice yields a reference to each byte. So in the loop, c is actually a reference to a byte (u8). Because we're only considering uppercase and lowercase English letters and " " and "!", we can be sure that each of their code points fits in one byte. So, we can be sure that c is always a reference to a valid code point.

When c is passed to char_to_font_index, it is first dereferenced (*c) because it is a reference to a u8 and not a u8 itself. Our Rust char_to_font_index takes the code point of the character, rather than the character itself.
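To see when the one-byte-per-character assumption holds, here is a small sketch you can run on your own machine. The function bytes_match_code_points is a name made up for this example. It compares what as_bytes gives with the code points that chars gives.

```rust
/// Returns true if every byte of `s` is the code point of exactly one
/// character of `s`, which is the case for plain English text.
fn bytes_match_code_points(s: &str) -> bool {
    let bytes: &[u8] = s.as_bytes();
    // A character outside the one-byte range takes up more than one byte,
    // so the two counts would differ
    if bytes.len() != s.chars().count() {
        return false;
    }
    for (byte, ch) in bytes.iter().zip(s.chars()) {
        // byte is a &u8, so it's dereferenced before the comparison
        if *byte as u32 != ch as u32 {
            return false;
        }
    }
    true
}
```

For "Hello World!" this returns true. For a string containing a letter like 'é' it returns false, because that letter is stored as two bytes, and a byte-by-byte print_str would try to draw each of those bytes as a separate character.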

Open up your main.rs file and throw the above functions into it (or whichever print_str function you wrote for this task).

Now modify your efi_main:

#[no_mangle]
extern "efiapi" fn efi_main(handle: *const core::ffi::c_void, sys_table: *mut SystemTable) -> usize {
let boot_services = unsafe { (*sys_table).boot_services };
let gop_guid = Guid {
first_chunk: 0x9042a9de,
second_chunk: 0x23dc,
third_chunk: 0x4a38,
other_chunks: [0x96, 0xfb, 0x7a, 0xde, 0xd0, 0x80, 0x51, 0x6a]
};
let mut gop: *mut core::ffi::c_void = core::ptr::null_mut();
let guid_ptr = &gop_guid as *const Guid;
let registration = core::ptr::null_mut();
let gop_ptr = &mut gop as *mut _;
let locate_gop_status = unsafe { ((*boot_services).locate_protocol)(
guid_ptr,
registration,
gop_ptr
) };

if locate_gop_status != 0 {
let mut string_u16 = [0u16; 22];
let string = "Failed to locate GOP\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}

let gop = gop as *mut GraphicsOutput;
let mode = unsafe { (*gop).mode };
let max_mode = unsafe { (*mode).max_mode };
let mut desired_mode = 0;
for mode_number in 0..max_mode {
let size_of_info = core::mem::size_of::<GraphicsModeInfo>();
let mut mode: *const GraphicsModeInfo = core::ptr::null_mut();
let query_mode = unsafe { (*gop).query_mode };
let query_status = (query_mode)(
gop,
mode_number,
&size_of_info as *const _,
&mut mode as *mut _
);
if query_status != 0 {
let mut string_u16 = [0u16; 19];
let string = "query_mode failed\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}

let horizontal_resolution = unsafe { (*mode).horizontal_resolution };
let vertical_resolution = unsafe { (*mode).vertical_resolution };
let pixel_format = unsafe { (*mode).pixel_format };
if horizontal_resolution == 640 && vertical_resolution == 480
&& pixel_format == PixelFormat::BlueGreenRedReserved {
desired_mode = mode_number;
break;
}
if mode_number == max_mode - 1 {
let mut string_u16 = [0u16; 32];
let string = "Couldn't find the desired mode\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}
}

let set_mode_status = unsafe { ((*gop).set_mode)(
gop,
desired_mode
) };

if set_mode_status != 0 {
let mut string_u16 = [0u16; 32];
let string = "Failed to set the desired mode\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}

let framebuffer_base = unsafe { (*mode).framebuffer_base };

let screen = framebuffer_base as *mut Screen;

/* DELETED:
let mut curr_row = 0;
let mut curr_col = 0;
for font_char in 0..FONT.len() {
for i in 0..16 {
for j in 0..16 {
if FONT[font_char][i][j] {
screen.pixels[curr_row + i][curr_col + j] = Pixel {
red: 255,
green: 255,
blue: 0,
reserved: 0
}
} else {
screen.pixels[curr_row + i][curr_col + j] = Pixel {
red: 0,
green: 0,
blue: 0,
reserved: 0
}
}
}
}
curr_col += 16;
if curr_col >= NO_OF_PIXELS_IN_A_ROW {
curr_row = curr_row + 16;
curr_col = 0;
}
if curr_row >= NO_OF_PIXELS_IN_A_COLUMN {
break;
}
}
*/


// Printing "Hello World!"
print_str(screen, "Hello World!");

0
}

Upon building and running:

Hello World Again

Take Away

  • Rust characters are Unicode scalar values

Code till now:

Directory view:

blasterball/
| .cargo/
| | config.toml
| src/
| | font.rs
| | main.rs
| .gitignore
| Cargo.lock
| Cargo.toml

font.rs contents

// ... Your font descriptions for letters 'A' to 'Z' and ' ' and '!'

pub const FONT: [[[bool; 16]; 16]; 28] = [
A, B, C, D, E, F, G, H, I, J, K, L, M, N,
O, P, Q, R, S, T, U, V, W, X, Y, Z, SPACE, EXCLAMATION
];

main.rs contents

// ... Before efi_main

#[no_mangle]
extern "efiapi" fn efi_main(handle: *const core::ffi::c_void, sys_table: *mut SystemTable) -> usize {
let boot_services = unsafe { (*sys_table).boot_services };
let gop_guid = Guid {
first_chunk: 0x9042a9de,
second_chunk: 0x23dc,
third_chunk: 0x4a38,
other_chunks: [0x96, 0xfb, 0x7a, 0xde, 0xd0, 0x80, 0x51, 0x6a]
};
let mut gop: *mut core::ffi::c_void = core::ptr::null_mut();
let guid_ptr = &gop_guid as *const Guid;
let registration = core::ptr::null_mut();
let gop_ptr = &mut gop as *mut _;
let locate_gop_status = unsafe { ((*boot_services).locate_protocol)(
guid_ptr,
registration,
gop_ptr
) };

if locate_gop_status != 0 {
let mut string_u16 = [0u16; 22];
let string = "Failed to locate GOP\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}

let gop = gop as *mut GraphicsOutput;
let mode = unsafe { (*gop).mode };
let max_mode = unsafe { (*mode).max_mode };
let mut desired_mode = 0;
for mode_number in 0..max_mode {
let size_of_info = core::mem::size_of::<GraphicsModeInfo>();
let mut mode: *const GraphicsModeInfo = core::ptr::null_mut();
let query_mode = unsafe { (*gop).query_mode };
let query_status = (query_mode)(
gop,
mode_number,
&size_of_info as *const _,
&mut mode as *mut _
);
if query_status != 0 {
let mut string_u16 = [0u16; 19];
let string = "query_mode failed\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}

let horizontal_resolution = unsafe { (*mode).horizontal_resolution };
let vertical_resolution = unsafe { (*mode).vertical_resolution };
let pixel_format = unsafe { (*mode).pixel_format };
if horizontal_resolution == 640 && vertical_resolution == 480
&& pixel_format == PixelFormat::BlueGreenRedReserved {
desired_mode = mode_number;
break;
}
if mode_number == max_mode - 1 {
let mut string_u16 = [0u16; 32];
let string = "Couldn't find the desired mode\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}
}

let set_mode_status = unsafe { ((*gop).set_mode)(
gop,
desired_mode
) };

if set_mode_status != 0 {
let mut string_u16 = [0u16; 32];
let string = "Failed to set the desired mode\n";
string.encode_utf16()
.enumerate()
.for_each(|(i, letter)| string_u16[i] = letter);
let simple_text_output = unsafe { (*sys_table).simple_text_output };
unsafe { ((*simple_text_output).output_string)(simple_text_output, string_u16.as_mut_ptr()); }
loop {}
}

let framebuffer_base = unsafe { (*mode).framebuffer_base };

let screen = framebuffer_base as *mut Screen;

print_str(screen, "Hello World!");

0
}

fn print_str(screen: *mut Screen, s: &str) {
let mut screen_pos = (0, 0);
for c in s.as_bytes() {
print_char(screen, &FONT[char_to_font_index(*c)], screen_pos);
screen_pos.1 += 16;
if screen_pos.1 >= NO_OF_PIXELS_IN_A_ROW {
screen_pos.0 += 16;
screen_pos.1 = 0;
}
if screen_pos.0 >= NO_OF_PIXELS_IN_A_COLUMN {
break;
}
}
}

fn print_char(screen: *mut Screen, font_description: &[[bool; 16]; 16], curr_screen_pos: (usize, usize)) {
for i in 0..16 {
for j in 0..16 {
if font_description[i][j] {
unsafe {
(*screen).pixels[curr_screen_pos.0 + i][curr_screen_pos.1 + j] = Pixel {
red: 255,
green: 255,
blue: 0,
reserved: 0
};
}
} else {
unsafe {
(*screen).pixels[curr_screen_pos.0 + i][curr_screen_pos.1 + j] = Pixel {
red: 0,
green: 0,
blue: 0,
reserved: 0
};
}
}
}
}
}

fn char_to_font_index(c: u8) -> usize {
if c >= 97 {
char_to_font_index(c - 32)
} else if c == 32 {
26
} else if c == 33 {
27
} else {
(c - 65) as usize
}
}

// ... Others

In the Next Post

We'll start fixing up our messy code.
