Demilade Sonuga's blog

All posts

Heap Allocator I

2023-03-18 · 13 min read

Now that we have a heap, we need an allocator to manage its memory. The main role of the allocator is to keep track of free memory. When we need heap memory, it's the allocator we ask. If there is enough memory, the allocator marks it taken and gives us a pointer to it. When we don't need heap memory anymore, the allocator marks it unused. So, the allocator is just a memory bookkeeper.

There are different ways that an allocator can be implemented. The approach we'll be using is a linked list allocator. For a more in-depth look at allocators and various ways of implementing them, check the references.

Linked List Allocator

This approach entails keeping a linked list of nodes each of which represents a region of free memory. When the allocator is initialized, it starts out with a single node that encompasses all the heap memory available. As chunks of memory are requested and released, more nodes are split, created, and merged.

For example, consider the following.

Suppose this is a heap.

      -----------------------------------------
Heap :|   |   |   |   |   |   |   |   |   |   |
      -----------------------------------------

And each empty box represents 1Kib of space. There are 10 empty boxes, so this is a heap of 10Kib space. For convenience, let's also suppose that each 1Kib block is addressed from 0 to 9. So the first 1Kib block on the left is at address 0, the next is at address 1 and the last on the right is addressed at 9.

Initializing a linked list allocator with this memory will result in a list of one node:

           ----------
Allocator: | 10 Kib | -> Null
           ----------

Since each node in the allocator's list represents a free memory region, the linked list above represents the whole 10Kib of available heap memory. Each node holds the size of the region and a pointer to the next region.

If you think about this, you'll notice that there's a problem here. How will we know which address the region starts from? The answer is we store each node in the region it represents.

In our example above, the heap memory starts at address 0 and ends at address 9. Since the node represents all the available heap memory, the node itself is stored at address 0. This way, information that describes available memory is kept in the available memory itself. To know the address of the memory region that a node describes, you check the address of the node itself.

Now, if we want to allocate 3Kib of memory, the first 3Kib of the heap will be split off for allocation. The allocator's list will now look like this:

      --------------------------------------------
Heap :|Used|Used|Used|   |   |   |   |   |   |   |
      --------------------------------------------

           ---------
Allocator: | 7 Kib | -> Null
           ---------

That is, the list node will be moved to address 3 to indicate that this is where the first free memory region starts from. And the size will be reduced to 7.

A 4Kib chunk request followed by a 2Kib chunk request:

      --------------------------------------------------
Heap :|Used|Used|Used|Used|Used|Used|Used|Used|Used|   |
      --------------------------------------------------

           ---------
Allocator: | 1 Kib | -> Null
           ---------

The first 3Kib is in use somewhere, the next 4Kib is in use in another place, and the next 2Kib is in use in another place, too.

Suppose that the 4Kib chunk is freed. This will lead to the creation of a new node because the regions before and after it are in use:

      ----------------------------------------------
Heap :|Used|Used|Used|   |   |   |   |Used|Used|   |
      ----------------------------------------------

           ---------    ---------
Allocator: | 4 Kib | -> | 1 Kib | -> Null
           ---------    ---------

Now, the first chunk of free memory starts at address 3 and is 4Kib in size. This is represented by the first node in the allocator's list, which is stored in that free memory region at address 3. Its next value is the address of the next free memory region which is address 9.

This next free memory region is 2Kib in size. This is represented by the second node in the allocator's list, which is stored in that free memory region at address 9.

Suppose that the first 3Kib are now freed. Since immediately after it is a free region, we don't create another node, but rather, we merge this new free region with the next one.

      -------------------------------------------
Heap :|   |   |   |   |   |   |   |Used|Used|   |
      -------------------------------------------

           ---------    ---------
Allocator: | 7 Kib | -> | 1 Kib | -> Null
           ---------    ---------

We do this because, if we had not done it, our list will tell us that we have free regions of sizes 3Kib, 4Kib, and 1Kib and if a request demanding 7Kib is made, it will not be satisfied even though we have enough memory for it.

Suppose that the 2Kib chunk is freed. Since there is a free region both before and after it, the three regions are merged to make up one big region.

      -----------------------------------------
Heap :|   |   |   |   |   |   |   |   |   |   |
      -----------------------------------------

           ----------
Allocator: | 10 Kib | -> Null
           ----------

The heap is completely free. And the node is at address 0.

From this example, we have seen almost everything we need to see to implement this allocator. The last thing we need to know is that allocated memory must be aligned for what it's being allocated for.

If you can't remember all this alignment stuff, check this post.

Remember that all values have an alignment. And the address that a value is stored at must be a multiple of its alignment. This implies that sometimes, we won't be able to use available memory.

Think about how you'll implement this yourself. In the next post, we'll be getting down an implementation in code.

Take Away

  • Heap allocators perform the bookkeeping work of managing heap memory.
  • Linked list allocators keep a linked list of available memory regions in the free memory itself.

In The Next Post

We'll be implementing a linked list allocator.

References