Understanding the Mechanism Behind malloc by Implementing a Simple Version

Anyone who has learned or used C is familiar with malloc. It allocates a contiguous block of memory, which is released with free when it is no longer needed. However, many programmers are unfamiliar with how malloc works underneath, and some even mistake it for a system call provided by the operating system or for a language keyword. In reality, malloc is an ordinary function in the C standard library, and its basic implementation is not complicated; any programmer with a basic understanding of C and operating systems can follow it.

This article explores the inner workings of malloc by implementing a simple version of it. Although this implementation is not as efficient as the existing C standard library implementations (such as glibc), it is much simpler than the real-world implementations, making it easier to understand. The key point is that this implementation follows the same principles as the actual one.

The article begins by introducing some fundamental concepts, such as how the operating system manages process memory and related system calls. Then, it gradually builds a simple implementation of malloc. For simplicity, the focus will be on the x86_64 architecture running Linux.

1. What is malloc

2. Preliminary Knowledge

2.1 Linux Memory Management

2.1.1 Virtual Memory Address and Physical Memory Address

2.1.2 Page and Address Composition

2.1.3 Memory Pages and Disk Pages

2.2 Linux Process-Level Memory Management

2.2.1 Memory Layout

2.2.2 Heap Memory Model

2.2.3 brk and sbrk

2.2.4 Resource Limits and rlimit

3. Implementing malloc

3.1 Toy Implementation

3.2 Formal Implementation

3.3 Legacy Issues and Optimization

4. Other References

1. What is malloc

Before implementing malloc, it's essential to define what it actually does. According to the C standard, malloc is declared in <stdlib.h> with the following prototype:

void* malloc(size_t size);

The purpose of this function is to allocate a contiguous block of memory in the system, with the following requirements:

- The allocated memory must be at least as large as the size parameter.

- The return value is a pointer to the starting address of the allocated memory.

- The addresses returned by multiple calls to malloc should not overlap unless the previously allocated memory is freed.

- malloc should complete the allocation and return quickly (it should not rely on an NP-hard algorithm, for example).

- The implementation should also provide functions for resizing memory (realloc) and freeing it (free).
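
The requirements above can be seen from the caller's side in a short sketch. It uses the standard library's own malloc, realloc, and free; the helper names make_squares and grow_squares are hypothetical and exist only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0, 1, 4, ..., (n-1)^2,
   or NULL if the allocation fails. The caller owns the memory. */
int *make_squares(size_t n) {
    int *a = malloc(n * sizeof *a);   /* at least n * sizeof(int) bytes */
    if (a == NULL) return NULL;
    for (size_t i = 0; i < n; i++) a[i] = (int)(i * i);
    return a;
}

/* Hypothetical helper: grows the array from old_n to new_n elements.
   realloc keeps the first old_n values; on failure the old array stays valid. */
int *grow_squares(int *a, size_t old_n, size_t new_n) {
    int *b = realloc(a, new_n * sizeof *b);
    if (b == NULL) return NULL;
    for (size_t i = old_n; i < new_n; i++) b[i] = (int)(i * i);
    return b;
}
```

A caller would typically write `int *a = make_squares(4);`, later `a = grow_squares(a, 4, 8);` (after checking for NULL), and finally `free(a);`.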

For more information about malloc, you can type `man malloc` in the terminal.

2. Preliminary Knowledge

Before implementing malloc, we need to understand some basics of Linux memory management.

2.1 Linux Memory Management

2.1.1 Virtual Memory Address and Physical Memory Address

Modern operating systems typically use virtual memory addressing. Each process sees its own large, private address space, and the operating system, together with the memory management unit (MMU), translates the virtual addresses a program uses into physical addresses. Physical memory is usually much smaller than the virtual address space, but programs can operate as if all of that space were available.

2.1.2 Page and Address Composition

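In modern systems, both virtual and physical memory are managed in fixed-size blocks called pages; a typical page size is 4096 bytes (4 KiB). An address therefore consists of a page number and an offset within the page: with 4096-byte pages, the low 12 bits are the offset and the remaining bits select the page. Working in pages lets the system map virtual memory to physical memory efficiently and only as needed.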
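
As a small illustration of how an address breaks down into a page number and an offset within that page, here is a sketch that assumes 4096-byte pages (so a 12-bit offset). The macro and function names are hypothetical; a real program should ask the system for the page size rather than hard-code it.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                            /* assumption: 4096-byte pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* The high bits of the address select the page. */
uint64_t page_number(uint64_t addr) {
    return addr >> SKETCH_PAGE_SHIFT;
}

/* The low 12 bits give the byte position inside the page. */
uint64_t page_offset(uint64_t addr) {
    return addr & (SKETCH_PAGE_SIZE - 1);
}
```

For example, the address 0x12345678 lies in page 0x12345 at offset 0x678.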

2.1.3 Memory Pages and Disk Pages

Physical memory can be viewed as a cache for disk storage. When a program accesses a page that is not currently resident in physical memory, a page fault occurs, and the kernel brings the page in (for example, by reading it back from disk) before the program continues.

2.2 Linux Process-Level Memory Management

2.2.1 Memory Layout

Understanding the relationship between virtual and physical memory helps us see how processes manage their memory. On a 64-bit Linux system, the virtual address space is divided into user space and kernel space.

2.2.2 Heap Memory Model

The heap is where most of the memory requested by malloc is allocated. The heap grows from lower to higher addresses, and the break pointer is used to track the current end of the heap.

2.2.3 brk and sbrk

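To grow the heap, a process moves the break pointer with brk or sbrk: brk sets the break to a given address, while sbrk moves it by a given increment and, on success, returns the previous break. Calling sbrk(0) simply returns the current break. On Linux, brk is the underlying system call and sbrk is a library function built on top of it. Either way, moving the break upward requests more memory from the OS.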
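
A minimal sketch of moving the break with sbrk follows. The wrapper names current_break and grow_heap are hypothetical, and the code assumes a Linux/glibc environment where sbrk is declared in <unistd.h>.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* sbrk(0) does not move the break; it only reports where it is. */
void *current_break(void) {
    return sbrk(0);
}

/* Moves the break up by n bytes. Returns the start of the newly
   available region (the old break), or NULL if the request fails. */
void *grow_heap(intptr_t n) {
    void *old = sbrk(n);
    return old == (void *)-1 ? NULL : old;
}
```

After a successful `grow_heap(64)`, the 64 bytes starting at the returned address are usable, and `current_break()` reports an address at least 64 bytes higher than before.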

2.2.4 Resource Limits and rlimit

Each process has resource limits, including the maximum amount of memory it can use. These limits can be retrieved and modified using the getrlimit and setrlimit system calls.
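
As a sketch of reading a limit, the helper below wraps getrlimit. The name read_limit is hypothetical; RLIMIT_DATA (the data segment limit, which bounds the heap grown through brk) and RLIMIT_AS (the total address space) are the resources most relevant here.

```c
#include <sys/resource.h>

/* Reads the soft and hard limit for one resource.
   Returns 0 on success and -1 on failure (errno is set by getrlimit). */
int read_limit(int resource, rlim_t *soft, rlim_t *hard) {
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0) return -1;
    *soft = rl.rlim_cur;   /* limit currently enforced */
    *hard = rl.rlim_max;   /* ceiling up to which the soft limit may be raised */
    return 0;
}
```

A value equal to RLIM_INFINITY means no limit is set for that resource.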

3. Implementing malloc

3.1 Toy Implementation

Before diving into a full implementation, we can create a simple toy version of malloc. This version is not suitable for real use but helps reinforce the concepts discussed earlier.
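
One possible toy version, sketched under the assumption that nothing else in the process moves the break, simply pushes the break up by the requested size and returns the old break. The name toy_malloc is hypothetical.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Toy allocator: every call grows the heap by exactly `size` bytes. */
void *toy_malloc(size_t size) {
    if (size > (size_t)PTRDIFF_MAX) return NULL;   /* too large for sbrk's argument */
    void *p = sbrk((intptr_t)size);                /* on success, returns the old break */
    if (p == (void *)-1) return NULL;
    return p;
}
```

Because this version records nothing about the blocks it hands out, there is no way to write a matching free or realloc; that is the gap the bookkeeping in the formal implementation fills.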

3.2 Formal Implementation

Now, let's move on to a more serious implementation of malloc. We'll start by defining the necessary data structures and then build up the functionality step by step.

3.2.1 Data Structure

We'll use a linked list to manage the heap. Each block consists of a header (metadata) followed by a data area. The header contains the size of the data area, a pointer to the next block, a flag indicating whether the block is free, and a "magic" pointer to the block's own data area, which can later be used to check that an address passed to free really belongs to a block.
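
The header described above might be declared as follows. This is a sketch: the names block_t, BLOCK_SIZE, and block_from_data are assumptions, and the data area is modeled as a C99 flexible array member so that it starts right after the header.

```c
#include <stddef.h>

typedef struct block {
    size_t size;          /* size of the data area in bytes */
    struct block *next;   /* next block in the list, or NULL */
    int free;             /* 1 if the block is free, 0 if in use */
    void *ptr;            /* "magic" pointer: always equals data */
    char data[];          /* first byte of the data area */
} block_t;

/* Size of the header alone, i.e. the distance from a block to its data. */
#define BLOCK_SIZE offsetof(block_t, data)

/* Steps back from a data pointer to the header in front of it. */
block_t *block_from_data(void *p) {
    return (block_t *)((char *)p - BLOCK_SIZE);
}
```

The address handed to the user is `b->data`, so free and realloc can recover the header with `block_from_data`.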

3.2.2 Finding the Right Block

To find an appropriate block, we'll use a first-fit algorithm. This means we search the list of blocks until we find one that is large enough and free.
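
A first-fit search over such a list can be sketched as below. The struct repeats the header layout described earlier so the example stands alone; the function name find_block and the `last` output parameter (which remembers the last block visited so the heap can be extended after it) are assumptions.

```c
#include <stddef.h>

typedef struct block {
    size_t size;          /* size of the data area in bytes */
    struct block *next;   /* next block in the list, or NULL */
    int free;             /* 1 if the block is free */
    void *ptr;            /* pointer to the data area */
    char data[];
} block_t;

/* Returns the first free block with at least `size` bytes of data,
   or NULL if none exists. Each block that is skipped is stored in *last. */
block_t *find_block(block_t *first, block_t **last, size_t size) {
    block_t *b = first;
    while (b != NULL && !(b->free && b->size >= size)) {
        *last = b;
        b = b->next;
    }
    return b;
}
```

First fit is simple and fast, but it takes the first block that is big enough, not the one that fits most tightly.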

3.2.3 Allocating New Blocks

If no suitable block is found, we'll extend the heap using the sbrk system call and create a new block at the end of the list.
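
Extending the heap could look like the sketch below, which asks sbrk for room for one header plus the data area and links the new block after `last`. The names are assumptions, and the code relies on a Linux/glibc sbrk.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

typedef struct block {
    size_t size;
    struct block *next;
    int free;
    void *ptr;
    char data[];
} block_t;

#define BLOCK_SIZE offsetof(block_t, data)

/* Creates a new in-use block of `size` data bytes at the current break.
   `last` is the current tail of the list, or NULL if the list is empty. */
block_t *extend_heap(block_t *last, size_t size) {
    if (size > (size_t)PTRDIFF_MAX - BLOCK_SIZE) return NULL;
    void *mem = sbrk((intptr_t)(BLOCK_SIZE + size));   /* old break on success */
    if (mem == (void *)-1) return NULL;
    block_t *b = mem;
    b->size = size;
    b->next = NULL;
    b->free = 0;
    b->ptr = b->data;
    if (last != NULL) last->next = b;
    return b;
}
```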

3.2.4 Splitting Blocks

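When a free block is reused for a smaller request, we may split it into two blocks if the leftover space is large enough to hold another header plus a minimal data area. The remainder stays on the list as a free block, which reduces wasted space inside allocated blocks and improves memory usage.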
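
A sketch of the split follows. The names can_split and split_block and the 8-byte minimum data area (MIN_DATA) are assumptions; `size` is expected to be a multiple of 8 already so that the new header stays aligned.

```c
#include <stddef.h>

typedef struct block {
    size_t size;
    struct block *next;
    int free;
    void *ptr;
    char data[];
} block_t;

#define BLOCK_SIZE offsetof(block_t, data)
#define MIN_DATA   8u   /* assumption: smallest data area worth keeping */

/* True if b can hold `size` bytes and still leave room for a new block. */
int can_split(const block_t *b, size_t size) {
    return b->size >= size && b->size - size >= BLOCK_SIZE + MIN_DATA;
}

/* Shrinks b to `size` data bytes and turns the remainder into a free
   block inserted right after b. Call only when can_split(b, size) holds. */
void split_block(block_t *b, size_t size) {
    block_t *rest = (block_t *)(b->data + size);
    rest->size = b->size - size - BLOCK_SIZE;
    rest->next = b->next;
    rest->free = 1;
    rest->ptr = rest->data;
    b->size = size;
    b->next = rest;
}
```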

3.2.5 Implementing malloc

With the above components in place, we can now implement the malloc function. It will search for a suitable block, split it if necessary, and return the address of the allocated memory.
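
Put together, the pieces might look like the following sketch. It is named my_malloc to avoid clashing with the library's malloc, rounds requests up to a multiple of 8 bytes, and assumes that nothing else in the process moves the break; all helper names are assumptions.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

typedef struct block {
    size_t size;
    struct block *next;
    int free;
    void *ptr;
    char data[];
} block_t;

#define BLOCK_SIZE offsetof(block_t, data)

static block_t *heap_base = NULL;   /* first block; NULL until the first call */

/* First fit: first free block that is large enough, else NULL. */
static block_t *find_block(block_t **last, size_t size) {
    block_t *b = heap_base;
    while (b != NULL && !(b->free && b->size >= size)) {
        *last = b;
        b = b->next;
    }
    return b;
}

/* Appends a new in-use block at the break. */
static block_t *extend_heap(block_t *last, size_t size) {
    void *mem = sbrk((intptr_t)(BLOCK_SIZE + size));
    if (mem == (void *)-1) return NULL;
    block_t *b = mem;
    b->size = size;
    b->next = NULL;
    b->free = 0;
    b->ptr = b->data;
    if (last != NULL) last->next = b;
    return b;
}

/* Keeps `size` bytes in b and makes the rest a separate free block. */
static void split_block(block_t *b, size_t size) {
    block_t *rest = (block_t *)(b->data + size);
    rest->size = b->size - size - BLOCK_SIZE;
    rest->next = b->next;
    rest->free = 1;
    rest->ptr = rest->data;
    b->size = size;
    b->next = rest;
}

void *my_malloc(size_t size) {
    if (size > (size_t)PTRDIFF_MAX - BLOCK_SIZE - 7) return NULL;
    size = (size + 7) & ~(size_t)7;            /* round up to a multiple of 8 */
    block_t *last = heap_base;
    block_t *b = find_block(&last, size);
    if (b != NULL) {                           /* reuse an existing free block */
        if (b->size - size >= BLOCK_SIZE + 8) split_block(b, size);
        b->free = 0;
    } else {                                   /* nothing fits: grow the heap */
        b = extend_heap(last, size);
        if (b == NULL) return NULL;
        if (heap_base == NULL) heap_base = b;
    }
    return b->data;
}
```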

3.2.6 Implementing calloc

The calloc function allocates memory and initializes it to zero. If the data area is kept a multiple of 8 bytes, we can do this efficiently by writing zeros 8 bytes at a time.
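
The zero-filling step can be sketched as below. To keep the example self-contained it is layered on the standard library's malloc as a stand-in for our own allocator; the name my_calloc, the rounding of the request to a multiple of 8, and the overflow check on number * size are additions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

void *my_calloc(size_t number, size_t size) {
    /* Refuse requests where number * size (plus rounding) would overflow. */
    if (size != 0 && number > (SIZE_MAX - 7) / size) return NULL;
    size_t total = (number * size + 7) & ~(size_t)7;   /* multiple of 8 */
    uint64_t *words = malloc(total);                   /* stand-in allocator */
    if (words == NULL) return NULL;
    for (size_t i = 0; i < total / 8; i++)
        words[i] = 0;                                  /* clear 8 bytes per step */
    return words;
}
```

Memory obtained this way is released with the matching free, here the standard library's.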

3.2.7 Implementing free

The free function marks a block as free and merges it with adjacent free blocks to reduce fragmentation. This involves checking the validity of the input address and managing the linked list of blocks.
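
A sketch of free with merging follows. It makes several assumptions that are not in the description above: a `prev` link is added to the header so a block can merge with its predecessor, a simplified append-only demo_alloc stands in for malloc, and the last block is handed back to the OS with brk when it becomes free. All names are hypothetical.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare brk and sbrk */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

typedef struct block {
    size_t size;
    struct block *next;
    struct block *prev;   /* assumption: back link so free can merge backward */
    int free;
    void *ptr;            /* equals data; used to validate addresses */
    char data[];
} block_t;

#define BLOCK_SIZE offsetof(block_t, data)

static block_t *first_block = NULL, *last_block = NULL;

/* Stand-in allocator: always appends a new block at the break. */
void *demo_alloc(size_t size) {
    if (size > (size_t)PTRDIFF_MAX - BLOCK_SIZE - 7) return NULL;
    size = (size + 7) & ~(size_t)7;
    void *mem = sbrk((intptr_t)(BLOCK_SIZE + size));
    if (mem == (void *)-1) return NULL;
    block_t *b = mem;
    b->size = size; b->next = NULL; b->prev = last_block;
    b->free = 0; b->ptr = b->data;
    if (last_block != NULL) last_block->next = b; else first_block = b;
    last_block = b;
    return b->data;
}

/* Heuristic check: p lies inside the heap and its header points back at it. */
int valid_addr(void *p) {
    if (p == NULL || first_block == NULL) return 0;
    if ((char *)p < first_block->data || (char *)p >= (char *)sbrk(0)) return 0;
    return ((block_t *)((char *)p - BLOCK_SIZE))->ptr == p;
}

/* Absorbs b's successor if it is free and directly adjacent; 1 if merged. */
static int merge_next(block_t *b) {
    block_t *n = b->next;
    if (n == NULL || !n->free || (char *)n != b->data + b->size) return 0;
    b->size += BLOCK_SIZE + n->size;
    b->next = n->next;
    if (b->next != NULL) b->next->prev = b; else last_block = b;
    return 1;
}

void my_free(void *p) {
    if (!valid_addr(p)) return;
    block_t *b = (block_t *)((char *)p - BLOCK_SIZE);
    b->free = 1;
    block_t *prev = b->prev;
    if (prev != NULL && prev->free && merge_next(prev)) b = prev;
    merge_next(b);
    /* If the free block is the tail and ends at the break, release it. */
    if (b->next == NULL && b->data + b->size == (char *)sbrk(0)) {
        if (b->prev != NULL) b->prev->next = NULL; else first_block = NULL;
        last_block = b->prev;
        brk(b);
    }
}
```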

3.2.8 Implementing realloc

realloc changes the size of an allocated block. Where possible, it reuses the existing block or merges it with an adjacent free block; otherwise, it allocates a new block, copies the data over, and frees the old block.
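
The three cases can be sketched as follows. To stay short, the example uses a simplified append-only demo_alloc and a demo_free that only marks a block free; both are stand-ins for the fuller malloc and free, and all names are assumptions.

```c
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

typedef struct block {
    size_t size;
    struct block *next;
    int free;
    void *ptr;
    char data[];
} block_t;

#define BLOCK_SIZE offsetof(block_t, data)

static block_t *last_block = NULL;

block_t *header_of(void *p) { return (block_t *)((char *)p - BLOCK_SIZE); }

/* Stand-in allocator: always appends a new block at the break. */
void *demo_alloc(size_t size) {
    if (size > (size_t)PTRDIFF_MAX - BLOCK_SIZE - 7) return NULL;
    size = (size + 7) & ~(size_t)7;
    void *mem = sbrk((intptr_t)(BLOCK_SIZE + size));
    if (mem == (void *)-1) return NULL;
    block_t *b = mem;
    b->size = size; b->next = NULL; b->free = 0; b->ptr = b->data;
    if (last_block != NULL) last_block->next = b;
    last_block = b;
    return b->data;
}

/* Stand-in free: only marks the block as reusable. */
void demo_free(void *p) { if (p != NULL) header_of(p)->free = 1; }

void *demo_realloc(void *p, size_t size) {
    if (p == NULL) return demo_alloc(size);
    if (size > (size_t)PTRDIFF_MAX - BLOCK_SIZE - 7) return NULL;
    size_t need = (size + 7) & ~(size_t)7;
    block_t *b = header_of(p);
    if (b->size >= need) return p;              /* case 1: already big enough */
    block_t *n = b->next;
    if (n != NULL && n->free && (char *)n == b->data + b->size &&
        b->size + BLOCK_SIZE + n->size >= need) {
        b->size += BLOCK_SIZE + n->size;        /* case 2: absorb free neighbor */
        b->next = n->next;
        if (last_block == n) last_block = b;
        return p;
    }
    void *q = demo_alloc(size);                 /* case 3: move to a new block */
    if (q == NULL) return NULL;                 /* old block is left untouched */
    memcpy(q, p, b->size);
    demo_free(p);
    return q;
}
```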

3.3 Legacy Issues and Optimization

While the current implementation is functional, there are many areas for improvement. For example, it could be made compatible with both 32-bit and 64-bit systems, or optimized for large allocations using mmap instead of sbrk. Additionally, maintaining multiple lists based on block sizes could significantly improve performance and reduce fragmentation.
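
As one illustration of the mmap idea, the sketch below serves a large request from its own anonymous mapping and records the mapping length in a small header so it can be unmapped later. The names big_alloc, big_free, and map_header_t are hypothetical, and a real allocator would also need a size threshold for choosing between this path and the sbrk-based heap.

```c
#define _DEFAULT_SOURCE   /* makes glibc define MAP_ANONYMOUS */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

typedef struct {
    size_t length;   /* total mapped bytes, header included */
    size_t unused;   /* keeps the header at 16 bytes */
} map_header_t;

/* Serves one large request from a private anonymous mapping. */
void *big_alloc(size_t size) {
    if (size > SIZE_MAX - sizeof(map_header_t)) return NULL;
    size_t length = size + sizeof(map_header_t);
    void *mem = mmap(NULL, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return NULL;
    map_header_t *h = mem;
    h->length = length;
    return h + 1;    /* user data starts right after the header */
}

/* Returns the whole mapping to the OS; 0 on success, -1 on failure. */
int big_free(void *p) {
    map_header_t *h = (map_header_t *)p - 1;
    return munmap(h, h->length);
}
```

Unlike memory obtained by moving the break, a mapping can be returned to the OS at any time, regardless of what was allocated after it.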

4. Other References

This article draws heavily from "A Malloc Tutorial" and other resources like "Computer Systems: A Programmer's Perspective." For further reading, consider exploring the Linux kernel's memory management documentation and the glibc implementation of malloc.
