## Some notes about dynamic memory allocation of multi-dimensional arrays

Posted: 07/07/2013 in C

(Note: In this post I’m assuming that you know what arrays and pointers are and that you already feel comfortable working with them.)

Many programming languages, such as Pascal, support multi-dimensional arrays. That is, data stored in this kind of array can be accessed by indicating its position in the same way we would indicate the position of a point in a geometric space: through a $[x_{1}, ..., x_{n}]$ notation, where $n$ is the number of dimensions of the space (and, by analogy, the number of dimensions of the array we are working with).

The point is that, even though the ANSI C specification speaks of multi-dimensional arrays (section 6.5.2.1, point 3), the C language doesn’t really support them. Instead, it uses an array-of-arrays approach: we don’t access an array element by separating the indices with commas, but by first accessing one array, then accessing a second one located at a particular index of the first, and so on until we reach the desired element. In this case we use the notation $[x_1]...[x_n]$.
A picture is worth a thousand words, so let’s see graphically how a data retrieval in a 2D array can be represented. In this case we have a 5×4 matrix of characters from which we want to retrieve the character ‘h’. C defines an array of columns, `a`, which has five columns: `a[0]`, `a[1]`, `a[2]`, `a[3]` and `a[4]`. Each column has four rows; that is, each column is an array as well. For example, the first column, `a[0]`, holds the elements (rows) `a[0][0]`, `a[0][1]`, `a[0][2]` and `a[0][3]`. In particular, the character ‘h’ is stored in the second element of one of these column arrays `a[i]`, that is, in `a[i][1]`.
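
The two-step access described above can be sketched as follows. This is only a minimal illustration: the helper name `element_at` and the macros `COLUMNS` and `ROWS` are hypothetical names that mirror the 5×4 example, and the matrix contents are left to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define COLUMNS 5   /* hypothetical: number of inner arrays */
#define ROWS    4   /* hypothetical: elements per inner array */

/* Hypothetical helper: reads one element from an array of COLUMNS
   arrays, each holding ROWS characters. */
char element_at(char a[COLUMNS][ROWS], size_t column, size_t row)
{
    assert(column < COLUMNS && row < ROWS);

    /* Step 1: select the inner array a[column]. */
    char *inner = a[column];

    /* Step 2: index into that inner array.
       Together the two steps are what a[column][row] does. */
    return *(inner + row);
}
```

Calling `element_at(a, 1, 3)` returns the same value as `a[1][3]`; the commas there separate function arguments, not array indices.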

The `malloc()` and `free()` functions

You need to keep C’s approach to representing multi-dimensional arrays in mind in order to understand how the `malloc()` function is used when dynamically allocating memory for them.

The prototype of the `malloc()` function is

`void *malloc(size_t size);`

It allocates a block of memory (specifically `size` bytes) on the heap and returns a pointer to the allocated memory. On error, this function returns `NULL`.
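
Since `malloc()` can return `NULL`, its result should be checked before it is used. Here is a minimal sketch of a checked allocation; the helper name `alloc_ints` is hypothetical, and the overflow guard is an extra precaution that I am assuming is wanted, not something `malloc()` does for you.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for `count` integers.
   Returns NULL if the size computation would overflow size_t
   or if malloc() itself fails. */
int *alloc_ints(size_t count)
{
    assert(count > 0);  /* malloc(0) is implementation-defined, so rule it out */

    /* Guard against count * sizeof(int) wrapping around. */
    if (count > SIZE_MAX / sizeof(int))
        return NULL;

    return malloc(count * sizeof(int));
}
```

The caller still has to test the returned pointer against `NULL` and pass it to `free()` when it is no longer needed.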

To avoid memory leaks while our C programs run, we have to free every memory block we are no longer going to use. This keeps the memory usage of our programs down and leaves enough memory available for later calls to `malloc()` to succeed. Fortunately, the C standard library provides a function called `free()`, whose prototype is

`void free(void *ptr);`

The `free()` function frees the memory space pointed to by `ptr`, which must have been returned by a previous call to `malloc()`.
It has no return value.
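
A common way to make `free()` harder to misuse is to clear the pointer right after releasing the block. The sketch below assumes a hypothetical helper called `release_ints`; it relies only on the standard guarantee that `free(NULL)` does nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the block that *ptr points to and then
   clears the caller's pointer, so it cannot be freed twice or
   dereferenced by accident. */
void release_ints(int **ptr)
{
    assert(ptr != NULL);

    free(*ptr);   /* free(NULL) is a no-op, so repeated calls are harmless */
    *ptr = NULL;
}
```

With this helper, calling `release_ints(&p)` twice in a row is safe, whereas calling `free(p)` twice on the same block is undefined behaviour.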

Here’s a very straightforward example: a program that dynamically allocates a new 1D array and frees it right afterwards:

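```c
#include <stdio.h>
#include <stdlib.h>

int main (void){

    int * blocks_of_data;
    int number_of_elements = 4;

    /* Reserve room for four integers on the heap */
    blocks_of_data = (int *) malloc (number_of_elements * sizeof (int));
    if (blocks_of_data == NULL){
        return 1;
    }

    /* Give the memory back as soon as we are done with it */
    free (blocks_of_data);

    return 0;
}
```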

I will compile and run this program under Valgrind, a tool for Linux that, among other things, helps to detect memory leaks. The report I get after the run is a successful result, because it says there are no memory leaks in the program. Now let’s remove the `free()` call from the previous code and repeat the step above to see Valgrind’s new diagnosis. That is, the result of compiling and running the program

```c
#include <stdio.h>
#include <stdlib.h>

int main (void){

    int * blocks_of_data;
    int number_of_elements = 4;

    /* Reserve room for four integers on the heap */
    blocks_of_data = (int *) malloc (number_of_elements * sizeof (int));
    if (blocks_of_data == NULL){
        return 1;
    }

    /* Now the "free (blocks_of_data)" is missing here. */

    return 0;
}
```

is that we have lost 16 bytes: the four integers we reserved through the variable `number_of_elements = 4`, multiplied by the size of one integer, which is four bytes on my system.
This time the memory leak was tiny, only 16 bytes, but it’s really important not to forget a `free()` call for every `malloc()` call, especially in serious programs that manage a more significant amount of data. Although the operating system reclaims all the leaked memory once the program ends, problems may arise before that, for example because the program runs out of memory or fragments the heap.

Let’s now see how to dynamically allocate a 2D array in two different ways: the first is more straightforward but less efficient, and the second is less obvious but more efficient. Take a look at the following piece of code:

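```c
#include <stdio.h>
#include <stdlib.h>

int main (void){

    int j = 0;

    int rows = 4, columns = 5;
    int **grid = NULL;

    /* Allocating the matrix: first the array of column pointers... */

    grid = (int **) malloc (columns * sizeof (int *));
    if (grid == NULL){
        return 1;
    }

    /* ...then one array of `rows` integers per column */

    for (j = 0; j < columns; j++){
        grid[j] = (int *) malloc (rows * sizeof (int));
        if (grid[j] == NULL){
            /* Undo the columns allocated so far before giving up */
            while (j > 0){
                free (grid[--j]);
            }
            free (grid);
            return 1;
        }
    }

    /* Time to free the matrix: every column first, the outer array last */

    for (j = 0; j < columns; j++){
        free (grid[j]);
    }
    free (grid);

    return 0;
}
```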

This is the point where C’s approach to multi-dimensional arrays becomes really clear. If you remember the first image of this post, the first row of that 5×4 grid was the array `a`. That `a` array is what I have renamed `grid` here.
Look now at line 13 of the code: `malloc()` is used just as before, when we only had a 1D array, which is to be expected now that we know C uses the concept of ‘array of arrays’.
So, after creating the first row for `grid`, we extend each element of `grid` into a new array: that is, `grid[0]` is going to be a new array (it will represent the first column of `grid`), `grid[1]` will be a new array as well, and so on. Thus, a for-loop that calls `malloc()` once per element of `grid` is needed here.

Once we understand the concept of array of arrays in C, it becomes obvious how the `free()` function should be used, but be careful: the order in which you free the memory does matter.

Let’s see schematically what is going on there. If you look at the code above, we call `free (grid)` only after we have called `free ()` for every element of the `grid` array, and there is an important reason to do so: if you call `free (grid)` first, you free only the `grid` array itself, that is, only the elements `a[0]` to `a[4]` of the picture. What does that mean? It means that you free only the pointers to the `a[i]` arrays (`i` being a number between `0` and `4`), namely `a[0]`, `a[1]`, `a[2]`, `a[3]` and `a[4]`, and not the arrays they point to. This has an important side effect: after you free `a`, you lose every pointer that references an `a[i]` array and you can’t free those arrays later. Therefore you will have memory leaks.
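
The freeing order discussed above can be wrapped in a pair of functions. This is a sketch under my own assumptions, not a standard API: the names `grid_create` and `grid_destroy` are hypothetical, and the rollback on a failed allocation is one possible policy among several.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the first `count` inner arrays and then
   the outer array. The outer array goes last because it holds the only
   copies of the inner pointers. */
void grid_destroy(int **grid, size_t count)
{
    if (grid == NULL)
        return;

    for (size_t j = 0; j < count; j++)
        free(grid[j]);
    free(grid);
}

/* Hypothetical helper: allocates `columns` arrays of `rows` integers.
   Returns NULL on failure, after releasing whatever was allocated. */
int **grid_create(size_t columns, size_t rows)
{
    assert(columns > 0 && rows > 0);

    int **grid = malloc(columns * sizeof *grid);
    if (grid == NULL)
        return NULL;

    for (size_t j = 0; j < columns; j++) {
        grid[j] = malloc(rows * sizeof *grid[j]);
        if (grid[j] == NULL) {
            grid_destroy(grid, j);  /* only j inner arrays exist so far */
            return NULL;
        }
    }
    return grid;
}
```

A caller would write `int **g = grid_create(5, 4);`, use `g[j][i]` as usual, and finish with `grid_destroy(g, 5);`.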

Finally, here is the second, more efficient method for dynamically allocating a 2D array:

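```c
#include <stdio.h>
#include <stdlib.h>

int main (void){

    int i = 0;

    int rows = 4, columns = 5;
    int **grid = NULL;

    /* Allocating the matrix: one block for the row pointers... */

    grid = (int **) malloc (rows * sizeof (int *));
    if (grid == NULL){
        return 1;
    }

    /* ...and one block for all the rows x columns elements */

    grid[0] = (int *) malloc (rows * columns * sizeof (int));
    if (grid[0] == NULL){
        free (grid);
        return 1;
    }

    /* Each row starts `columns` elements after the previous one */

    for (i = 1; i < rows; i++){
        grid[i] = grid[i - 1] + columns;
    }

    /* Time to free the matrix: the elements first, the pointers last */

    free (grid[0]);
    free (grid);

    return 0;
}
```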

This second method is more efficient because there are only two calls to `malloc()` (and, consequently, two calls to `free()`), no matter how large the matrix is. The other difference is that we now allocate `grid[0]` as one large 1D array of `rows x columns` elements; once we have done this, we just assign to each `grid[i]` the memory address of the previous row, `grid[i - 1]`, advanced by `columns` elements. In fact, where we write

`grid[i] = grid[i - 1] + columns`

the compiler is really performing

`* (grid + i) = * (grid + (i - 1)) + columns`.
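
The same layout can be packaged as a pair of functions. This is a minimal sketch: the names `grid_create_contiguous` and `grid_destroy_contiguous` are hypothetical, and I am assuming that `rows * columns * sizeof(int)` does not overflow `size_t`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: builds a rows x columns grid out of two blocks,
   one for the row pointers and one for all the elements. */
int **grid_create_contiguous(size_t rows, size_t columns)
{
    assert(rows > 0 && columns > 0);

    int **grid = malloc(rows * sizeof *grid);
    if (grid == NULL)
        return NULL;

    /* Assumption: rows * columns * sizeof(int) fits in size_t. */
    grid[0] = malloc(rows * columns * sizeof *grid[0]);
    if (grid[0] == NULL) {
        free(grid);
        return NULL;
    }

    /* Each row starts `columns` elements after the previous one. */
    for (size_t i = 1; i < rows; i++)
        grid[i] = grid[i - 1] + columns;

    return grid;
}

/* Hypothetical helper: releases both blocks, the elements first. */
void grid_destroy_contiguous(int **grid)
{
    if (grid == NULL)
        return;

    free(grid[0]);  /* the single block holding every element */
    free(grid);     /* the row pointers, freed last */
}
```

Because all the elements live in one block, `grid[i][j]` and `grid[0][i * columns + j]` refer to the same integer.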

You can see a practical application of this when I solve problem 11 of Project Euler (probably in my next post), where I will store in a matrix a set of numbers read from standard input.
