Demilade Sonuga's blog

All posts

The Global Descriptor Table III

2023-01-29 · 25 min read

If you came up with the descriptors for the code and data segments, I'm sure that they would be just long strings of digits the meaning of which is not immediately apparent. To remedy that, let's take a quick look at bitwise operations.

Bitwise Operations

Most computers represent numbers in their binary forms internally. For example, your computer's representation of the 8-bit number 3 will be electrical signals symbolized by 0b00000011, where 0s symbolize "low voltage" and 1 symbolizes "high voltage". The binary number 0b11 represents 3 because 11 is the binary equivalent of 3. If you're not familiar with this binary number conversion stuff, just Google it.

As you already know, there are several different operators which perform some operation on a number and produce a new number, like some kind of machine. Look at the addition operator (+): this takes 2 numbers, performs the mathematical operation of addition on them and outputs another number which is the result of that addition. Look at the equality operator (==): this takes 2 numbers (or anything else that can be compared), performs the logical operation of comparison and outputs a boolean value which is the result of that comparison.

Bitwise operators, on the other hand, take a number or two, perform operations on the numbers' bits (binary digits of its representation) and output a new number that is the result of those bit movements. Arithmetic and logical operations are purely mathematical, existing only in the mind (I mean, how do we know that 2 + 2 = 4 and not 5?) but bitwise operations can be viewed as mechanical. When we perform bitwise operations, we are literally moving the bits of a number.

The only 2 bitwise operations we'll be looking at now are the left-shift and the bitwise OR operators (because those are the only operators we'll be using for now).

The left-shift operator (<<) is a bitwise operator that takes two numbers a and b as input like so: a << b, shifts all of a's bits b times to the left and outputs the number whose bit representation corresponds to the result. For example, consider the 8-bit number 0b0000001, which is the number 1. The operation 1 << 3 shifts the bits one position to the left 3 times like so:

Shift Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
Original number 0 0 0 0 0 0 0 1
First Shift 0 0 0 0 0 0 1 0
Second Shift 0 0 0 0 0 1 0 0
Third Shift 0 0 0 0 1 0 0 0

After the first shift, bit 6 becomes bit 7, bit 5 becomes bit 6, bit 4 becomes bit 5, bit 3 becomes bit 4, bit 2 becomes bit 3, bit 1 becomes bit 2 and bit 0 becomes bit 1. Because there's nothing to replace bit 0, it defaults to 0 and because there is nowhere for bit 7 to be moved to (only space for 8 bits), it's no longer part of the number, like it's just thrown away.

This shifting process is repeated the number of times specified by the right-hand side of the << operator. From the operation above, the result of left shifting the number 1's bits 3 times is 0b00001000 and this is the binary equivalent of the number 8, so 1 << 3 == 8.

As for bitwise OR (|), this operator takes two numbers a and b as input like this: a | b, does a logical OR operation on bits whose positions correspond and outputs the number whose bit representation corresponds to the result.

The logical OR operation on binary digits is defined like this:

First Input Second Digit Output
0 0 0
1 0 1
0 1 1
1 1 1

The operation takes two bits and determines the output based on this table. As you can see, the output is 0 only if the two inputs are 0 and it's one if any of the bits are one. The idea comes from the fact that you can view a binary digit as a boolean where 1 represents true and 0 represents false. If you do this 1-true and 0-false substitution in this table, the reason for calling this the OR operation will make a lot of sense.

0 or 0 can be viewed as false or false which will give a false (0).

1 or 0 -> true or false -> true -> 1.

0 or 1 -> false or true -> true -> 1.

1 or 1 -> true or true -> true -> 1.

But this OR operation is for only single bits 1 and 0. For multiple bits, this operation can be extended as a bitwise operation in which the same single-bit OR will be applied to the bits of the input numbers whose positions correspond. For example, the bitwise OR of 1 | 2 == 0b00000001 | 0b00000010 is:

Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
First Number 0 0 0 0 0 0 0 1
Second Number 0 0 0 0 0 0 1 0
Output 0 0 0 0 0 0 1 1

In the bitwise OR, the single-digit OR is applied to bit 0 of the 2 input numbers to produce bit 0 of the output. It's applied again to bit 1 of the input numbers to produce bit 1 of the output, and so on. These single-digit OR operations continue with bits in corresponding positions of the two numbers until all bits have been ORed to produce the output.

This bitwise operator is going to be very useful to us because we can use it to set bits to 1 at positions where we want them to be 1. For example, suppose that you, for some reason, need an 8-bit number whose bit 0, bit 1, bit 4, and bit 7 must be 1 and all other bits 0. In such a scenario, bit shifting and bitwise ORing will come in very handy.

Firstly, we realize that 1 << n will result in a number whose bit n is 1 and all other bits 0. 1 << 3 => Original: 0b1 => First Shift: 0b10 => Second Shift: 0b100 => Third Shift: 0b1000, and it ends with bit 3 becoming a 1.

1 << 0 => Original: 0b1 => Left Shifted 0 times and it ends with bit 0 remaining a 1. 1 << 7 == 0b10000000, whose bit 7 is 1.

A bitwise OR takes numbers and outputs a number whose bit n will be set (== 1) only if any of the input numbers have bit n == 1. For example, 0b00000001 | 0b01110000 == 0b01110001. Bits 0, 4, 5, and 6 are set because at least one of the input numbers have those bits set.

Back to our needed 8-bit number whose bits 0, 1, 4, and 7 must be set (== 1). From the previous information, we can pin this down in Rust like so:

fn bit_stuff() {
let bit_0: u8 = 1 << 0;
let bit_1: u8 = 1 << 1;
let bit_4: u8 = 1 << 4;
let bit_7: u8 = 1 << 7;
let needed_8bit_number = bit_0 | bit_1 | bit_4 | bit_7;
}

That is, we just simply create four numbers that have bits 0, 1, 4 and 7 set, then bitwise or them together. The result is a single number whose bits 0, 1, 4 and 7 are set: our needed 8-bit number.

The Code and Data Segment Descriptors

The values of our descriptors are completely defined by bits that are supposed to be 1 at some positions and bits that are supposed to be 0 at other positions. For a descriptor to be valid, bit 47, the last bit of the access byte must be 1. We can indicate this in the code like so:

In gdt.rs

impl Descriptor {
const VALID: u64 = 1 << 47;
}

The base is all 0s, so the lower 24 bits and the upper 8 bits should be 0.

impl Descriptor {
const VALID: u64 = 1 << 47;
// NEW:
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
}

The granularity should be 1 (for 4Kib units) and the limit should be all 1s, for the biggest segments. The lower 16 bits of the limit are 0xffff and the upper 4 bits are 0xf.

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
// NEW:
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
}

For a 64-bit segment, bit 54 is supposed to be 0.

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
// NEW:
const BIT_SIZE: u64 = 0 << 54;
}

A 64-bit code segment must have bit 53 set.

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
const BIT_SIZE: u64 = 0 << 54;
// NEW:
const IS_CODE: u64 = 1 << 53;
}

If a segment is executable (like in the case of the code segment), bit 43, bit 3 of the access byte, must be set.

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
const BIT_SIZE: u64 = 0 << 54;
const IS_CODE: u64 = 1 << 53;
// NEW:
const EXECUTABLE: u64 = 1 << 43;
}

For a non-system segment, bit 44 (bit 4 of the access byte) must be 1.

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
const BIT_SIZE: u64 = 0 << 54;
const IS_CODE: u64 = 1 << 53;
const EXECUTABLE: u64 = 1 << 43;
// NEW:
const NON_SYSTEM_SEGMENT: u64 = 1 << 44;
}

For a code segment to be read or a data segment to be written to, bit 41 (bit 1 of the access byte) must be 1.

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
const BIT_SIZE: u64 = 0 << 54;
const IS_CODE: u64 = 1 << 53;
const EXECUTABLE: u64 = 1 << 43;
const NON_SYSTEM_SEGMENT: u64 = 1 << 44;
// NEW:
const READ_WRITE: u64 = 1 << 41;
}

All other bits not considered here are safe to remain 0 (verify by yourself).

Putting all these together, we can create our segment values:

impl Descriptor {
const VALID: u64 = 1 << 47;
const BASE_0_23: u64 = 0;
const BASE_24_31: u64 = 0;
const GRANULARITY: u64 = 1 << 55;
const LIMIT_0_15: u64 = 0xffff;
const LIMIT_48_51: u64 = 0xf << 48;
const BIT_SIZE: u64 = 0 << 54;
const IS_CODE: u64 = 1 << 53;
const EXECUTABLE: u64 = 1 << 43;
const NON_SYSTEM_SEGMENT: u64 = 1 << 44;
const READ_WRITE: u64 = 1 << 41;

// These bits are set by both the code and data segments
const SHARED: u64 = Self::NON_SYSTEM_SEGMENT
| Self::GRANULARITY
| Self::LIMIT_0_15
| Self::LIMIT_48_51
| Self::BASE_0_23
| Self::BASE_24_31
| Self::VALID
| Self::BIT_SIZE
| Self::READ_WRITE;

const CODE_SEGMENT: u64 = Self::SHARED | Self::EXECUTABLE | Self::IS_CODE;
const DATA_SEGMENT: u64 = Self::SHARED;
}

That's it for the code and data segment values. Instead of long strings of digits, we have values resulting from bitwise ORs which tell us the meaning of the bits we're setting and why we're setting them.

Take Away

  • A number m left-shifted n times (m << n) results in the number whose bit representation is the result of shifting m's bits once to the left position n times.
  • 1 << n results in a number whose bit n is set.
  • A bitwise OR between some numbers results in a number whose binary representation has bits set only in positions where at least one of the input numbers has their bits set.

For the full code, go to the repo

In The Next Post

We'll continue with the GDT.

References