Demilade Sonuga's blog

The Global Descriptor Table IV

2023-02-01

The last thing we need to model is the segment selector. Model it yourself now.

Modeling The Segment Selector

Segment selectors are just 16-bit values that describe the location of a descriptor in the GDT, or in more simplified terms, the index in the GDT.

In gdt.rs:

// Index into a GDT
#[derive(Clone, Copy)]
#[repr(transparent)]
struct SegmentSelector(u16);

This is a simple wrapper around u16: a named u16 that will carry operations specific to segment selectors. The #[repr(transparent)] attribute guarantees that a SegmentSelector has exactly the same layout as a plain u16. We need that guarantee because the Intel manuals define a segment selector as a 16-bit value, and without the attribute the compiler makes no promise about the struct's layout.

We implement Clone and Copy because there is no reason why we shouldn't. It's just a u16 that should be able to be freely copied.

Bits 0..=1 of the segment selector hold the privilege level, and we're in the highest privilege level: 0. Bit 2 tells which descriptor table should be used: 0 for the GDT, which is what we want. From this info, we can be sure that the lower 3 bits of our segment selectors should always be 0.

The only thing we need to think about is the index in bits 3..=15. The CPU multiplies this index by 8 to get a byte offset: the number of bytes between the start of the GDT and the descriptor. That means the index we place here is simply the descriptor's index in the GDT's descriptors array. This works because the descriptors array is an array of 8-byte values. For example, in a GDT that looks like this: [null descriptor, system descriptor, non-system descriptor], the index of the non-system descriptor is 3, because the null descriptor takes up position 0 and the system descriptor takes up positions 1 and 2. 3 * 8 == 24, which is the correct byte offset of the non-system descriptor.
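To make the arithmetic concrete, here is a small sketch that walks a GDT layout and computes where each descriptor starts. The Kind enum and the slot_indices and byte_offset functions are hypothetical names used only for this illustration; they are not part of gdt.rs.

```rust
// Hypothetical helper types for illustration only (not part of gdt.rs).
#[derive(Clone, Copy)]
enum Kind {
    // Takes up one 8-byte position (the null descriptor is modeled this way too)
    NonSystem,
    // Takes up two 8-byte positions
    System,
}

// Returns the array index at which each descriptor in the layout starts
fn slot_indices(layout: &[Kind]) -> Vec<u16> {
    let mut next = 0u16;
    let mut indices = Vec::new();
    for kind in layout {
        indices.push(next);
        next += match kind {
            Kind::NonSystem => 1,
            Kind::System => 2,
        };
    }
    indices
}

// Number of bytes between the start of the GDT and the given position
fn byte_offset(index: u16) -> u16 {
    index * 8
}
```

For the layout [null descriptor, system descriptor, non-system descriptor], modeled as [NonSystem, System, NonSystem], slot_indices gives 0, 1 and 3, and byte_offset(3) is 24.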

To create a new SegmentSelector from an index, we can add a constructor:

impl SegmentSelector {
    // Create a segment selector from an index
    fn new(index: u16) -> Self {
        Self(index << 3)
    }
}

The bitwise left shift operation, <<, is just like the right shift operation, but instead of shifting bits to the right, it shifts them to the left. Shifting the index left by 3 bits produces a new 16-bit value with the index in bits 3..=15 and the lower 3 bits, bits 0..=2, as 0.
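As a quick check of the bit layout, the sketch below repeats the selector type on its own and adds three accessor methods that pull the fields back out. The accessors (index, table_indicator, privilege_level) are assumptions added for illustration; the project itself only needs new.

```rust
// A standalone copy of the selector type. The accessor methods are
// hypothetical additions for illustration, not part of gdt.rs.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(transparent)]
struct SegmentSelector(u16);

impl SegmentSelector {
    // Places the index in bits 3..=15 and leaves bits 0..=2 as 0
    fn new(index: u16) -> Self {
        Self(index << 3)
    }

    // Bits 3..=15: the index of the descriptor in the table
    fn index(self) -> u16 {
        self.0 >> 3
    }

    // Bit 2: which table to use (0 means the GDT)
    fn table_indicator(self) -> u16 {
        (self.0 >> 2) & 1
    }

    // Bits 0..=1: the privilege level
    fn privilege_level(self) -> u16 {
        self.0 & 0b11
    }
}
```

With this, SegmentSelector::new(1) holds the raw value 0x08 and SegmentSelector::new(2) holds 0x10; in both cases the table indicator and privilege level read back as 0.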

At this point, we essentially have all we need to create a working GDT. We need to move on to setting it up.

Setting Up The GDT

By "setting up the GDT", what is actually meant is "create the GDT and tell the CPU where it is". Creating the GDT is as simple as creating any other struct instance in Rust.

Before we do that, we first have to make GDT and all the other stuff we know we're going to be using directly public:

In gdt.rs

#[repr(C, align(8))]
// DELETED: struct GDT { /* ... Others */ }
pub struct GDT { /* ... Others */ } // NEW

impl GDT {
    // DELETED: fn new() -> Self {
    pub fn new() -> Self { // NEW
        /* ...Others */
    }

    // DELETED: fn add_descriptor(&mut self, descriptor: Descriptor) -> Result<(), &'static str> {
    pub fn add_descriptor(&mut self, descriptor: Descriptor) -> Result<(), &'static str> { // NEW
        /* ... Others */
    }
}

// DELETED: enum Descriptor { /* ...Others */ }
pub enum Descriptor { /* ...Others */ } // NEW

#[derive(Clone, Copy)]
#[repr(transparent)]
// DELETED: struct SegmentSelector(u16);
pub struct SegmentSelector(u16); // NEW

And just for convenience, to create the code and data segments:

impl Descriptor {
    // ...Others
    const CODE_SEGMENT: u64 = Self::SHARED | Self::EXECUTABLE | Self::IS_CODE;
    const DATA_SEGMENT: u64 = Self::SHARED;

    // NEW:
    pub fn code_segment() -> Self {
        Self::NonSystem(Self::CODE_SEGMENT)
    }

    pub fn data_segment() -> Self {
        Self::NonSystem(Self::DATA_SEGMENT)
    }
}

In main.rs

// ...Others
mod bitmap;
use bitmap::{FileHeader, DIBHeader, ColorTable, Color, Bitmap, draw_bitmap, erase_bitmap};
use core::fmt::Write;
// NEW:
mod gdt;
use gdt::{Descriptor, GDT};

#[no_mangle]
extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ...Others
    let screen = init_graphics_result.unwrap();
    
    init_screen(screen);
    
    let screen = get_screen().unwrap();

    boot_services.exit_boot_services(handle).unwrap();

    // NEW:
    // Creating a new GDT
    let mut gdt = GDT::new();
    gdt.add_descriptor(Descriptor::code_segment());
    gdt.add_descriptor(Descriptor::data_segment());

    game::blasterball(screen);

    // ...Others
}

Now, we've been able to create a GDT structure, but the CPU doesn't know where it is. To tell the CPU where the GDT is, we need to do 3 things:

  1. Disable interrupts.
  2. Load a structure that I will call the descriptor table pointer into a special register.
  3. Re-enable interrupts.

Disabling Interrupts

As mentioned in an earlier post, memory is not the only place computers store data. There are also registers: tiny, very fast storage locations inside the processor itself. The Intel x86 architecture has a number of special registers that store, not just any information, but important information that controls some aspect of the processor's operation.

One of these special registers is the flags register (EFLAGS, which x86_64 extends to the 64-bit RFLAGS). The flags register, on x86_64, holds a 64-bit value whose bits have special meanings, like the other structures we've been looking at. One of these bits, the interrupt flag, tells the CPU whether or not it should respond to interrupts.
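To picture what "one of these bits" means, here is a small model that treats the flags register as a plain u64. The bit position (bit 9) comes from the Intel SDM rather than from this post, and the function names are made up for this sketch. It only manipulates an ordinary integer; it does not touch the real register.

```rust
// Bit 9 of the flags register is the interrupt flag (per the Intel SDM).
const INTERRUPT_FLAG: u64 = 1 << 9;

// Returns true if this flags value says the CPU should respond to interrupts
fn interrupts_enabled(flags: u64) -> bool {
    (flags & INTERRUPT_FLAG) != 0
}

// Returns the flags value with the interrupt flag set to 0,
// leaving every other bit untouched
fn clear_interrupt_flag(flags: u64) -> u64 {
    flags & !INTERRUPT_FLAG
}

// Returns the flags value with the interrupt flag set to 1
fn set_interrupt_flag(flags: u64) -> u64 {
    flags | INTERRUPT_FLAG
}
```

For example, a flags value of 0x202 has the interrupt flag set; clearing it gives 0x002, and setting it again gives back 0x202.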

Our first step to loading the GDT is setting this interrupt bit in the flags register to 0, telling the CPU that it should not respond to interrupts, that is, disabling interrupts. This is because handling interrupts requires a working GDT to be set up, which is the whole point of us going through this GDT setup process.

To disable interrupts, we need a special Intel assembly instruction: cli. When the CPU executes cli, the interrupt bit in the flags register is set to 0 and the CPU stops responding to interrupts. You can read cli as "clear interrupt flag".

To reflect this in code, create a new file: interrupts.rs and throw this in:

use core::arch::asm;

// Tells the processor to stop responding to interrupts
pub fn disable_interrupts() {
    unsafe {
        asm!("cli");
    }
}

This function makes use of the asm! macro, which embeds assembly instructions directly in Rust code. It is meant for instructions that can't be expressed in Rust, like cli, or for hand-written assembly that is faster than what the Rust compiler would generate.

In main.rs

mod gdt;
use gdt::{Descriptor, GDT};
// NEW:
mod interrupts;

#[no_mangle]
extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ...Others

    let mut gdt = GDT::new();
    gdt.add_descriptor(Descriptor::code_segment());
    gdt.add_descriptor(Descriptor::data_segment());
    // NEW:
    interrupts::disable_interrupts();

    game::blasterball(screen);

    // ...Others
}

The Descriptor Table Pointer

To tell the processor where the GDT is, we need to create an 80-bit value:

Bits       Meaning
0..=15     Size of table - 1 (limit)
16..=79    Address of the table (base)

The first 16 bits of this value are the limit: the size of the table in bytes, minus 1.

The next 64 bits are the base: the address of the table. Just for your info, the address of the table (or any structure) is the address of its first byte. It is a 64-bit value because x86_64 is a 64-bit architecture.

We translate this to Rust as in:

// Tells the processor where a descriptor table is located
#[repr(C, packed)]
pub struct DescriptorTablePointer {
    // Size of the descriptor table - 1
    limit: u16,
    // The starting address of the descriptor table
    base: u64
}

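Adding packed to the #[repr(C)] attribute ensures that no padding is added anywhere in the struct. Without it, the compiler would insert 6 bytes of padding between limit and base to align the u64, and the struct would no longer match the 80-bit layout the processor expects.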

Throw it in gdt.rs.
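If you want to see the effect of packed for yourself, the sketch below compares the size of the struct with and without it, and lays the two fields out by hand as the 10 bytes the processor reads. PackedPointer, PaddedPointer and to_bytes are hypothetical names for this illustration, and the sizes assume a 64-bit target such as x86_64.

```rust
use core::mem::size_of;

// Same fields and attributes as DescriptorTablePointer
#[allow(dead_code)]
#[repr(C, packed)]
struct PackedPointer {
    limit: u16,
    base: u64,
}

// Same fields, but without packed (for comparison only)
#[allow(dead_code)]
#[repr(C)]
struct PaddedPointer {
    limit: u16,
    base: u64,
}

// Lays the fields out by hand: 2 bytes of limit followed by 8 bytes of base.
// x86 is little-endian, so the least significant byte comes first.
fn to_bytes(limit: u16, base: u64) -> [u8; 10] {
    let mut bytes = [0u8; 10];
    bytes[..2].copy_from_slice(&limit.to_le_bytes());
    bytes[2..].copy_from_slice(&base.to_le_bytes());
    bytes
}
```

On a 64-bit target, size_of::<PackedPointer>() is 10 (80 bits), while size_of::<PaddedPointer>() is 16 because of the padding needed to align base.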

This structure is what the processor uses to find the GDT, but how does the processor get hold of this structure? The answer is a special register called the GDTR (Global Descriptor Table Register). This register holds the GDT's base and limit, which the processor copies out of the descriptor table pointer when we load it. So, the process of the processor finding the GDT goes like this: descriptor table pointer -> GDTR -> GDT.

Loading the GDTR requires the use of an Intel assembly instruction: lgdt. It takes a single argument, which is the address of the descriptor table pointer in memory.

To get all this down into code, first, add a method, as_pointer, to GDT that creates the descriptor table pointer for the GDT:

// NEW:
use core::mem;

const MAX_NO_OF_ENTRIES: usize = 8;

impl GDT {
    pub fn as_pointer(&self) -> DescriptorTablePointer {
        DescriptorTablePointer {
            base: self as *const _ as u64,
            limit: (mem::size_of::<Self>() - 1) as u16
        }
    }
}

self is a reference to the current instance, so it is the address of the instance. core::mem's size_of function returns the size of a type in bytes.
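As a worked example of the limit arithmetic, the sketch below computes the limit for a table that holds nothing but 8-byte descriptors. limit_for is a hypothetical helper for this illustration; MAX_NO_OF_ENTRIES is the constant from gdt.rs.

```rust
use core::mem;

// The constant from gdt.rs
const MAX_NO_OF_ENTRIES: usize = 8;

// Limit (size in bytes - 1) of a table holding `entries` 8-byte descriptors
fn limit_for(entries: usize) -> u16 {
    (entries * mem::size_of::<u64>() - 1) as u16
}
```

With 8 entries the table is 64 bytes, so limit_for(MAX_NO_OF_ENTRIES) is 63. Keep in mind that size_of::<Self>() counts every field of the struct: if GDT holds anything besides the descriptors array (a next_index counter, say), the limit computed in as_pointer will be a bit larger than this.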

The function to actually load the GDT:

use core::mem;
use core::arch::asm; // NEW

// ...Others

impl GDT {
    // Loads the GDT in the GDTR register
    pub fn load(&self, pointer: &DescriptorTablePointer) {
        unsafe {
            asm!("lgdt [{}]", in(reg) pointer);
        }
    }
}

The in(reg) tells the compiler that it should put the value in pointer into any available register. The "{}" in the format string acts just like the one in a println!. It takes the value of the argument after the format string which, in this case, is the register the compiler chooses to put pointer into.
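Since lgdt can't be tried out in an ordinary program, here is a harmless sketch of the same operand mechanism that can be. add_five is a made-up function for this illustration; it uses inout(reg) rather than in(reg) because the instruction also writes its result back to the register. The non-x86_64 fallback is only there so the sketch builds elsewhere.

```rust
// Adds 5 to x using a single assembly instruction
#[cfg(target_arch = "x86_64")]
fn add_five(x: u64) -> u64 {
    use core::arch::asm;

    let mut value = x;
    unsafe {
        // The compiler picks a register for `value` and substitutes its
        // name for {0} in the template, much like a println! placeholder
        asm!("add {0}, 5", inout(reg) value);
    }
    value
}

// Plain Rust fallback for other architectures
#[cfg(not(target_arch = "x86_64"))]
fn add_five(x: u64) -> u64 {
    x + 5
}
```

Calling add_five(3) returns 8. If the compiler happens to choose rcx for value, the instruction it emits is add rcx, 5.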

In main.rs

#[no_mangle]
extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ...Others

    let mut gdt = GDT::new();
    gdt.add_descriptor(Descriptor::code_segment());
    gdt.add_descriptor(Descriptor::data_segment());
    interrupts::disable_interrupts();
    // NEW:
    let gdt_pointer = gdt.as_pointer();
    gdt.load(&gdt_pointer);

    game::blasterball(screen);

    // ...Others
}

Enabling Interrupts

As the last step in telling the processor where the GDT is, we need to re-enable interrupts. Similarly to the "clear interrupts" cli instruction, there is a "set interrupts" sti instruction.

In interrupts.rs

// Tells the processor to start responding to interrupts
pub fn enable_interrupts() {
    unsafe {
        asm!("sti");
    }
}

In main.rs

#[no_mangle]
extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ...Others

    let mut gdt = GDT::new();
    gdt.add_descriptor(Descriptor::code_segment());
    gdt.add_descriptor(Descriptor::data_segment());
    interrupts::disable_interrupts();
    let gdt_pointer = gdt.as_pointer();
    gdt.load(&gdt_pointer);
    // NEW:
    interrupts::enable_interrupts();

    // ...Others
}

The Rest of The Story

It seems like we're done with the GDT, but we aren't. Remember the segment selectors that tell where in the table a segment is located? So far, it seems like they aren't needed, so the question now is why they exist at all.

Well, the answer is that the CPU still needs to know which segments in the GDT we're using as our data segment and which ones we're using for the code segment. This is the reason for segment selectors, to give the CPU this information about where in a table the segments are located.

To give the CPU info on which segments we're using for what, we need to use special registers called segment registers. Segment registers are 16-bit registers that hold segment selectors.

There are 6 segment registers in total: CS, DS, SS, ES, FS, and GS. The only ones we'll be dealing with are the first 3. CS holds the segment selector for the code segment, DS holds the segment selector for the data segment, and SS is for the stack segment. As mentioned before, all this segment stuff is not relevant to us in any way, but we still need to set it up to get on with interrupts.

The stack segment was supposed to be used as the segment of memory to hold the stack, but we'll load it with the data segment, because we don't actually need a stack segment, but a valid selector has to be there.

Modeling The Segment Registers

We can model the segment registers as unit structs (structs with no data) and implement a trait for all of them that indicates that they are segment registers (just one way of doing this, there are other ways).

In gdt.rs

pub trait SegmentRegister {
    fn set(&self, selector: SegmentSelector);
}

// The data segment register
pub struct DS;

// The stack segment register
pub struct SS;

// The code segment register
pub struct CS;

A constant value can't be moved directly into a segment register. It first has to be placed in a 16-bit general-purpose register, then moved from there into the segment register. Apart from all these special registers used for controlling processor behavior, x86_64 also has a bunch of general-purpose registers that we can use for this purpose. One of these is rax, a 64-bit register. To access the lower 16 bits of rax, you refer to it as ax. So, the procedure for moving a segment selector into a segment register looks like this: ax <- selector, then segment register <- ax.

impl SegmentRegister for DS {
    fn set(&self, selector: SegmentSelector) {
        unsafe {
            asm!("mov ds, ax", in("ax") selector.0);
        }
    }
}

impl SegmentRegister for SS {
    fn set(&self, selector: SegmentSelector) {
        unsafe {
            asm!("mov ss, ax", in("ax") selector.0);
        }
    }
}

The in("ax") tells the compiler that selector.0 should be copied into the register ax before the instruction in the template runs.

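Setting the code segment register's value is a bit different. It requires doing something called a far return. In short, a far return pops both a return address and a code segment selector off the stack, so pushing our selector and the address of the next instruction, then doing a far return, reloads CS. The details beyond that are something we don't really need to worry about. It's just another one of those Intel assembly things that we need to touch to get on with the project.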

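impl SegmentRegister for CS {
    fn set(&self, selector: SegmentSelector) {
        unsafe {
            asm!(
                // Push the new code segment selector
                "push {sel:r}",
                // Push the address of the label below as the return address.
                // The label is 2 rather than 1 because newer Rust versions
                // reject numeric labels made up only of the digits 0 and 1
                "lea {tmp}, [2f + rip]",
                "push {tmp}",
                // Far return: pops the return address, then the selector into CS
                "retfq",
                "2:",
                sel = in(reg) selector.0,
                tmp = lateout(reg) _
            );
        }
    }
}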

I recommend you just copy and paste that. If you really want to understand far returns, check the references. I may (or may not) also explain it in a future post. I just don't think it's relevant to understanding the whole project.

Loading The Segment Registers

All that's left for us now is to load the segment registers with their appropriate selectors.

In main.rs

mod gdt;
// DELETED: use gdt::{Descriptor, GDT};
use gdt::{Descriptor, GDT, SegmentRegister}; // NEW

#[no_mangle]
extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ...Others

    let mut gdt = GDT::new();
    gdt.add_descriptor(Descriptor::code_segment());
    gdt.add_descriptor(Descriptor::data_segment());
    interrupts::disable_interrupts();
    let gdt_pointer = gdt.as_pointer();
    gdt.load(&gdt_pointer);
    interrupts::enable_interrupts();
    // NEW
    gdt::CS.set(?);
    gdt::DS.set(?);
    gdt::SS.set(?);

    // ...Others
}

Now, we have a problem. How do we get the segment selector for a segment? We could compute the index ourselves from the segments we've added, but that's error-prone. A much better way to do this will be to get the selector the moment it is added to the GDT.

In gdt.rs

impl GDT {
    // DELETED: pub fn add_descriptor(&mut self, descriptor: Descriptor) -> Result<(), &'static str> {
    pub fn add_descriptor(&mut self, descriptor: Descriptor) -> Result<SegmentSelector, &'static str> { // NEW
        match descriptor {
            Descriptor::NonSystem(value) => {
                // Is array full?
                if self.next_index >= self.descriptors.len() {
                    return Err("Not enough space for descriptor");
                }
                self.descriptors[self.next_index] = value;
                self.next_index += 1;
                // DELETED: Ok(())
                Ok(SegmentSelector::new(self.next_index as u16 - 1)) // NEW
            }
            Descriptor::System(higher, lower) => {
                // Is there enough space for a system descriptor?
                if self.next_index + 1 >= self.descriptors.len() {
                    return Err("Not enough space for descriptor");
                }
                self.descriptors[self.next_index] = lower;
                self.descriptors[self.next_index + 1] = higher;
                self.next_index += 2;
                // DELETED: Ok(())
                Ok(SegmentSelector::new(self.next_index as u16 - 2)) // NEW
            }
        }
    }
}

Now, once a new segment is added with the add_descriptor function, its selector is returned. We can now modify main.rs:

#[no_mangle]
extern "efiapi" fn efi_main(
    handle: *const core::ffi::c_void,
    sys_table: *mut SystemTable,
) -> usize {
    // ...Others

    let mut gdt = GDT::new();
    // DELETED: gdt.add_descriptor(Descriptor::code_segment());
    let cs = gdt.add_descriptor(Descriptor::code_segment()).unwrap(); // NEW
    // DELETED: gdt.add_descriptor(Descriptor::data_segment());
    let ds = gdt.add_descriptor(Descriptor::data_segment()).unwrap(); // NEW
    interrupts::disable_interrupts();
    let gdt_pointer = gdt.as_pointer();
    gdt.load(&gdt_pointer);
    interrupts::enable_interrupts();
    // NEW
    gdt::CS.set(cs);
    gdt::DS.set(ds);
    gdt::SS.set(ds);

    // ...Others
}
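To sanity-check the selectors that add_descriptor hands back, here is a stripped-down model of the GDT that runs on an ordinary machine. It is a sketch under assumptions, not the project's code: the Gdt struct, its field layout, and the choice to start next_index at 1 (leaving position 0 for the null descriptor) are guesses at what gdt.rs looks like, and the descriptor values passed in are arbitrary placeholders.

```rust
// Hypothetical, simplified model of the GDT for illustration only.
const MAX_NO_OF_ENTRIES: usize = 8;

#[derive(Clone, Copy, Debug, PartialEq)]
struct SegmentSelector(u16);

impl SegmentSelector {
    fn new(index: u16) -> Self {
        Self(index << 3)
    }
}

enum Descriptor {
    NonSystem(u64),
    System(u64, u64),
}

struct Gdt {
    descriptors: [u64; MAX_NO_OF_ENTRIES],
    next_index: usize,
}

impl Gdt {
    // Assumption: position 0 is the null descriptor, so we start at 1
    fn new() -> Self {
        Self { descriptors: [0; MAX_NO_OF_ENTRIES], next_index: 1 }
    }

    fn add_descriptor(&mut self, descriptor: Descriptor) -> Result<SegmentSelector, &'static str> {
        // Split the descriptor into the values to store, lower part first
        let (first, second) = match descriptor {
            Descriptor::NonSystem(value) => (value, None),
            Descriptor::System(higher, lower) => (lower, Some(higher)),
        };
        // A system descriptor needs 2 positions, a non-system one needs 1
        let needed = if second.is_some() { 2 } else { 1 };
        if self.next_index + needed > self.descriptors.len() {
            return Err("Not enough space for descriptor");
        }
        // The selector points at the first position the descriptor occupies
        let index = self.next_index;
        self.descriptors[index] = first;
        if let Some(higher) = second {
            self.descriptors[index + 1] = higher;
        }
        self.next_index += needed;
        Ok(SegmentSelector::new(index as u16))
    }
}
```

Under these assumptions, the first descriptor added gets the selector 0x08 (index 1) and the second gets 0x10 (index 2), which are the values that would end up in CS and DS/SS respectively.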

And that ends our GDT journey, for now.

Take Away

  • Registers are small high-speed memories in the processor.
  • Some registers can be used for anything and some have special purposes that control processor behavior.
  • Segment registers are what the processor uses to know which segment in the table should be used for what purpose.

For the full code, go to the repo

In The Next Post

We'll do a little refactoring.

References

  • https://wiki.osdev.org/GDT_Tutorial
  • Intel SDM, Volume 1, Appendix A: EFLAGS Cross-Reference
  • The Global Descriptor Table Register (GDTR): Intel SDM, Volume 3, section 2.4.1
  • https://doc.rust-lang.org/nomicon/other-reprs.html
  • Far Return: https://c9x.me/x86/html/file_module_x86_id_280.html

You can download the Intel Architecture Software Developer Manual (SDM) here