Demilade Sonuga's blog

All posts

Box II

2023-03-30

In this post, we're going to implement a Box.

Implementation

The role of the Box is to allocate space for data on the heap, store that data, and release the heap space when the data is no longer needed. To do this, only two bits of information need to be kept track of: the pointer to the data on the heap and a pointer to the allocator (for deallocation when the data is no longer needed).

Create a new file boxed.rs and:

use crate::allocator::LinkedListAllocator;


pub struct Box<T> {
    ptr: *mut T,
    allocator: *mut LinkedListAllocator
}

Remember from the previous post that Box must be able to hold data of any type. We achieve that with the generic T, telling the compiler that ptr is a pointer to any type of data.

The allocator field, too, is needed for the allocation of data when a new Box is created and deallocation when the data is no longer needed.

For a refresher on references, lifetimes, and generics you can check the references.

Functions we're going to use to create and handle the Box:

  1. new for creating a new Box.
  2. into_raw for retrieving the underlying pointer.
  3. from_raw for creating a new Box from a pointer and allocator.

We'll also have to implement some useful traits: Drop, PartialEq, Deref, and DerefMut

Starting from new. To create a new Box, we need the data that will be stored on the heap and the reference to the allocator itself, which will be used for allocating and deallocating heap memory.

impl<T> Box<T> {
    // Creates a new heap allocated value
    pub fn new(val: T, allocator: *mut LinkedListAllocator) -> Box<T> {

    }
}

Now, in the function, we first attempt to allocate space for val on the heap with the given allocator. If it's successful, the new Box is created and returned. If it's not successful, a panic with an error message will result.

use crate::allocator::LinkedListAllocator;
use core::mem; // NEW
impl<T> Box<T> {
    pub fn new(val: T, allocator: *mut LinkedListAllocator) -> Box<T> {
        // Allocate heap memory for `val`
        let alloc_result = unsafe { (*allocator).alloc(mem::size_of::<T>(), mem::align_of::<T>()) };
        match alloc_result {
            // Heap allocation succeeded
            Some(ptr) => {
                // Interpret the pointer as a pointer to T
                let ptr = ptr as *mut T;
                // Move `val` into the memory just allocated for it
                unsafe { *ptr = val };
                Box {
                    ptr,
                    allocator
                }
            }
            // Heap allocation failed
            None => panic!("No enough space on the heap")
        }
    }
}

The into_raw function "consumes" the Box, returning the pointer to the data on the heap.

impl<T> Box<T> {
    // Consumes the box, returning the underlying pointer to the data
    pub fn into_raw(b: Box<T>) -> *mut T {
        use core::mem::ManuallyDrop;
        // Drop should not be called, so that deallocation won't take place
        let b = ManuallyDrop::new(b);
        b.ptr
    }
}

The from_raw function, the opposite of into_raw:

impl<T> Box<T> {
    // Creates a box from a raw pointer to a value already on the heap
    // and an allocator.
    // The caller has to ensure that `ptr` is pointing to a valid area on the heap
    pub unsafe fn from_raw<U>(ptr: *mut U, allocator: *mut LinkedListAllocator) -> Box<U> {
        Box {
            ptr,
            allocator
        }
    }
}

To handle deallocation, we implement the Drop trait. This trait has the method drop which is called whenever data goes out of scope. If you aren't familiar with this trait, check the references.

Our Box implementation of Drop is just a simple matter of calling the allocator's dealloc function.

impl<T> core::ops::Drop for Box<T> {
    fn drop(&mut self) {
        unsafe { (*self.allocator).dealloc(self.ptr as *mut u8, mem::size_of::<T>()); }
    }
}

Deref and DerefMut indicate structures that can be dereferenced. Deref for immutable dereferencing and DerefMut for mutable dereferencing.

For example, suppose struct Struct implements Deref but not DerefMut, then this is valid:

let s: Struct = Struct::new();
let derefenced_value = *s;

But this is not valid:

*s = Struct::new();

Because Deref is only for immutable dereferencing. The above will only be valid if DerefMut, too, is implemented.

The relevance of these structs to our Box is apparent. Box is actually a smart pointer, meaning it is a structure that behaves like a pointer but with other special capabilities. If you're not familiar with the idea of smart pointers, check the references.

As such, Box needs to be able to dereference to the underlying data as if it was a regular reference.

Implementing Deref and DerefMut:

impl<T> core::ops::Deref for Box<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        unsafe { &*self.ptr }
    }
}
impl<T> core::ops::DerefMut for Box<T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe { &mut *self.ptr }
    }
}

Now, whenever a dereference operation is performed on a Box, it is the references returned by these deref and deref_mut functions that will be dereferenced.

For convenience, we'll also implement PartialEq, the trait that indicates anything that can be compared using the relational operators.

To be able to compare a Box to another Box that holds values of the same type (between Box<T> and Box<T>):

use crate::allocator::LinkedListAllocator;
use core::mem;
use core::cmp::PartialEq; // NEW
impl<T: PartialEq> PartialEq<Box<T>> for Box<T> {
    fn eq(&self, other: &Box<T>) -> bool {
        unsafe { *self.ptr == *other.ptr }
    }
}

This implementation dereferences the pointers of the Boxes and compares the values behind them.

To be able to compare a Box to data of the type the Box is holding (between Box<T> and T):

impl<T: PartialEq> PartialEq<T> for Box<T> {
    fn eq(&self, other: &T) -> bool {
        unsafe { *self.ptr == *other }
    }
}

That's it for this bare minimum implementation. In the next post, we'll be testing this.

Take Away

For the full code, go to the repo

In The Next Post

We'll be testing the Box

References

  • Generics: https://doc.rust-lang.org/book/ch10-01-syntax.html
  • References: https://doc.rust-lang.org/book/ch04-02-references-and-borrowing.html
  • Lifetimes: https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html
  • Drop: https://doc.rust-lang.org/rust-by-example/trait/drop.html
  • Smart pointer: https://doc.rust-lang.org/book/ch15-00-smart-pointers.html