Dynamic arrays in C
by Michał Węglarz
Inspired by this viral video, I've decided to try to develop my own minimal example of a dynamic array in the C programming language. The goal of this exercise is to grow the basic understanding of fundamental data structures and build my own internal toolset of useful code snippets that I might revisit and reuse in the future.
The standard array
The regular array of five integers in C looks like this:
int arr[5] = {1, 2, 3, 4, 5};
It is a fixed-size collection of data items of the same type. The size of an array cannot be changed once it is declared. This limitation, which doesn't exist in higher-level languages, may be frustrating, but it's not difficult to overcome by implementing our own dynamic array.
Representing an array as a struct
The implementation starts with a basic struct, that wraps the pointer to the
first item in the array, and specifies fields
such as the current length (the actual number of meaningful items currently stored in the
array) and the total capacity.
typedef struct {
int *items;
size_t length;
size_t capacity;
} IntArray;
This struct is enough to represent the dynamic array consisting of a variable number of integers.
I can now initialize it, ensuring all members are zeroed by default:
IntArray numbers = {0};
Adding an item to the array
I'll create a new function responsible for adding a new item to the array and call it
IntArray_push. It takes a pointer to the IntArray as
a first parameter and the item to insert as a second one.
void IntArray_push(IntArray *arr, int item)
{
arr->items[arr->length] = item;
arr->length += 1;
}
The function inserts the item at the end of the array and adjusts the length appropriately.
It is missing some crucial parts, though. Before I can actually proceed with adding to the array, there's a series of necessary questions I have to answer first.
What if the array is already at its maximum capacity?
The array is full when its length is equal to the capacity.
if (arr->length == arr->capacity) {
In such a case, it's necessary to expand the array.
This expression also evaluates to true when both length and capacity are 0,
which
occurs on the first push after initialization.
What to do when the array's capacity is zero?
I have to manually set the base capacity to some value.
if (arr->capacity == 0) arr->capacity = 256;
It doesn't really matter what value I choose here, but it's a good idea to think about the real-world datasets I'm going to work with and select the capacity based on that. As a general rule of thumb, the less frequently we get to resize it later on, the better for performance. I want to reduce the number of necessary resizing operations, which may happen to be more costly than I'd wish for.
In an ideal world, all of my data structures would have a static size known at the compile time.
How do I ensure the array has sufficient capacity for the new item?
If the base capacity has already been specified in the previous step, but I get to the point where the new item I want to insert won't fit into the array, I need to manually adjust the capacity. As it's easy to guess, the new capacity has to be sufficiently larger than the previous one. Here I decide to double it:
arr->capacity *= 2;
Similarly to the previous case, the value of two is somewhat arbitrary—other values could be fine too. The point is to find such a number that will fulfill the minimum requirements for the new value to fit in the array, without increasing the number of future resizing operations too much.
It's important to note that this only updates the capacity value—it doesn't yet resize the memory allocated for the array.
How do I actually resize the array memory?
I'm going to use realloc to request a new amount of memory, equal to the updated
capacity multiplied by the size of each item.
int *new_ptr = realloc(arr->items, arr->capacity * sizeof(*arr->items));
realloc returns the pointer to the beginning of newly allocated memory.
I am going to make sure the allocation went successfully and terminate the program if the new pointer isn't a valid memory address.
if (new_ptr == NULL) {
fprintf(stderr, "Memory reallocation failed\n");
exit(EXIT_FAILURE);
}
Otherwise, if all goes well, I can safely update the array pointer:
arr->items = new_ptr;
It is worth noting here that realloc can also be used for initial memory
allocation (when arr->items is NULL), since realloc(NULL, size)
behaves like malloc(size).
Putting it together
Now that I've got the necessary pieces, the entire code required to safely push to the array and resize it, in case of insufficient capacity, looks like this:
void IntArray_push(IntArray *arr, int item)
{
if (arr == NULL) {
fprintf(stderr, "Array pointer is NULL\n");
exit(EXIT_FAILURE);
}
if (arr->length == arr->capacity) {
if (arr->capacity == 0)
arr->capacity = 256;
else
arr->capacity *= 2;
int *new_ptr = realloc(arr->items, arr->capacity * sizeof(*arr->items));
if (new_ptr == NULL) {
fprintf(stderr, "Memory reallocation failed\n");
exit(EXIT_FAILURE);
}
arr->items = new_ptr;
}
arr->items[arr->length] = item;
arr->length += 1;
}
I've thrown in a NULL check at the beginning, just to be safe. The base capacity
initialization
happens
inside the if
statement—this triggers on the first push when both length and capacity are 0,
which is when realloc does the initial allocation.
Retrieving an item from the array
The IntArray_get helper function allows me to safely retrieve an item from the array based
on
its
index by using pointer arithmetic. It includes a NULL check and verifies that the
provided index is valid, preventing out-of-bounds
access.
If the index is invalid, instead of dealing with undefined behavior, I want to fail gracefully, by returning a value that would clearly tell me that something went wrong (e.g. -1 as an error indicator), or even terminate the whole program.
int IntArray_get(IntArray *arr, int index)
{
if (arr == NULL) {
fprintf(stderr, "Array pointer is NULL\n");
return -1; // or exit(EXIT_FAILURE)
}
if (index < 0 || index >= arr->length) {
fprintf(stderr, "Invalid array access at index %d\n", index);
return -1; // or e.g. exit(EXIT_FAILURE);
}
return *(arr->items + index);
}
Usage
int main()
{
IntArray numbers = {0}; // Initialize the dynamic array
for (int i = 0; i <= 267; i++) {
IntArray_push(&numbers, i); // Fill the array with values
}
for (int i = 0; i < numbers.length; i++) {
int item = IntArray_get(&numbers, i); // Retrieve values from the array
printf("%d\n", item);
}
int item = IntArray_get(&numbers, 567); // Invalid array access at index 567
printf("%d\n", item); // -1
return 0;
}
Handling different data types
So far, I've implemented a dynamic array that can store elements of type int. But what if
I want to use any other type? Since C doesn't have a built-in feature like TypeScript's
generics
or C++ templates, do I have to
manually redefine all of the above whenever I want to store a different type of elements?
Manual approach
I can manually copy the code above, and replace only the
appropriate pieces with the new type, e.g. replace int with float. It is
not the "cleanest" solution, as I impose on myself a fair amount of manual
labor of copying and pasting the same code and ultimately end up with some repetitive fragments.
It is the most straightforward solution, though. It allows me to pragmatically focus only on the data
I'm
actually dealing with, rather that prematurely introduce complexity that I might not even make use of.
typedef struct {
int *items;
size_t length;
size_t capacity;
} IntArray;
void IntArray_push(IntArray *arr, int item) { ... }
int IntArray_get(IntArray *arr, int index) { ... }
typedef struct {
float *items;
size_t length;
size_t capacity;
} FloatArray;
void FloatArray_push(FloatArray *arr, float item) { ... }
float FloatArray_get(FloatArray *arr, int index) { ... }
Macros
Another option is to use C macros. They provide a way to work with different data types, although they don't provide type safety and can become difficult to write, understand, and debug. Additionally, since macros don't return values like functions, the most convenient (but a little awkward) approach is to pass an output variable as a parameter.
#define ARRAY_PUSH(arr, item) \
do { \
if ((arr)->length == (arr)->capacity) { \
if ((arr)->capacity == 0) \
(arr)->capacity = 256; \
else \
(arr)->capacity *= 2; \
void *new_ptr = realloc((arr)->items, (arr)->capacity * sizeof(*(arr)->items)); \
if (new_ptr == NULL) { \
fprintf(stderr, "Memory reallocation failed\n"); \
exit(EXIT_FAILURE); \
} \
(arr)->items = new_ptr; \
} \
(arr)->items[(arr)->length] = (item); \
(arr)->length += 1; \
} while (0)
#define ARRAY_GET(item, arr, index) \
do { \
if ((index) >= 0 && (index) < (arr)->length) { \
item = *((arr)->items + (index)); \
} else { \
fprintf(stderr, "Invalid array access at index %d\n", (index)); \
item = (typeof(item))(-1); \
} \
} while (0)
#define DEFINE_ARRAY(type, name) \
typedef struct { \
type *items; \
size_t length; \
size_t capacity; \
} name
int main()
{
DEFINE_ARRAY(float, GenericFloatArray);
GenericFloatArray numbers = {0};
for (int i = 0; i <= 510; i++) {
ARRAY_PUSH(&numbers, i + 0.5f);
}
for (int i = 0; i < numbers.length; i++) {
float item;
ARRAY_GET(item, &numbers, i);
printf("item: %f\n", item);
}
float item;
ARRAY_GET(item, &numbers, 921); // Invalid array access at index 921
printf("item: %f\n", item); // -1
return 0;
}
Summary
While there are already many different solutions to the problem of dynamic arrays available online, with many solid and well-tested implementations, I still find it a good learning exercise to "reinvent the wheel" a little and go through the process of writing this code myself. It's certainly easier than one might think, and with this code in place, I'll be happy to reuse it in future projects.
The entire code from this blog post can be found here.
Related resources: