├── LICENSE ├── README.md └── diagrams ├── a_points_b.svg ├── array_2d.svg ├── array_strings.svg ├── arrays_diagram.svg ├── housing_db.svg ├── memory_sections.png ├── pointer_to_pointer.svg ├── realloc.svg └── visualise_memory.svg /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Soura Mandal 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Guide to Memory Management and Debugging in C 2 | 3 | By Soura Mandal and Jonah Meggs 4 | 5 | *DISCLAIMER: This guide is intended to provide a summary of some key concepts surrounding memory management in C, to help you gain some background and context, and help you debug memory errors in your program. 
This guide does not go into significant depth and simplifies some concepts for the sake of clarity, so links to further reading will be provided where necessary. This guide is also unofficial, and not directly affiliated with UNSW.* 6 | 7 | 8 | # Contents 9 | 10 | - [Introduction](#introduction) 11 | - [Pointers](#pointers) 12 | - [Arrays](#arrays) 13 | - [Variable Scope and Passing by Reference](#variable-scope-and-passing-by-reference) 14 | - [Some Context: Stack vs Heap Memory](#some-context-stack-vs-heap-memory) 15 | - [When should memory be dynamically allocated?](#when-should-memory-be-dynamically-allocated) 16 | 17 | - [Memory Allocation and Management](#memory-allocation-and-management) 18 | - [Malloc](#malloc) 19 | - [Free](#free) 20 | - [Calloc](#calloc) 21 | - [Realloc](#realloc) 22 | - [How it works + Time Complexity](#how-it-works--time-complexity) 23 | - [Other Functions](#other-functions) 24 | - [Strings + String Literals](#strings--string-literals) 25 | - [Mallocing Structs](#mallocing-structs) 26 | - [Freeing a Struct](#freeing-a-struct) 27 | - [Mallocing a 2D Array](#mallocing-a-2d-array) 28 | - [Freeing a 2D Array](#freeing-a-2d-array) 29 | - [Mallocing an Array of Strings](#mallocing-an-array-of-strings) 30 | - [Complex Example - Creating a Student Housing Database in C](#complex-example---creating-a-student-housing-database-in-c) 31 | 32 | - [Debugging Memory Errors](#debugging-memory-errors) 33 | - [Common Memory Errors](#common-memory-errors) 34 | 35 | 36 | # Introduction 37 | 38 | A good way to visualise memory is as a [contiguous](https://www.merriam-webster.com/dictionary/contiguous) sequence of bytes, each with their own memory address: 39 | 40 | ![Visualise Memory](diagrams/visualise_memory.svg) 41 | 42 | ## Pointers 43 | 44 | - If you have a variable `var` in your program, `&var` will give you the address that the variable is actually located at within your computer's memory. 
45 | - A pointer is a special type of variable that can store memory addresses. 46 | - Pointers are declared using an asterisk following the data type - i.e. `int *ptr;` means that 'ptr' is a pointer to an integer (it stores the memory address of an integer) 47 | - Pointers can be dereferenced to access the value located at that memory address. This is done by prepending an asterisk to the pointer variable. 48 | 49 | It's important to note that pointers themselves (just like any other variable) take up some space in memory, and so have their own memory address. 50 | 51 | Consider the following example where the pointer **`a` points to (stores the memory address of) the variable `b`.** 52 | 53 | ![A points to B](diagrams/a_points_b.svg) 54 | 55 | ```c 56 | #include <stdio.h> 57 | int main() 58 | { 59 | int b = 5; 60 | printf("b: %d\n", b); 61 | printf("address of b: %p\n", (void *)&b); 62 | 63 | int *a = &b; 64 | printf("address stored in a: %p\n", (void *)a); 65 | printf("value retrieved after dereferencing a: %d\n", *a); 66 | 67 | printf("address of a: %p\n", (void *)&a); 68 | 69 | return 0; 70 | } 71 | 72 | output: 73 | $ b: 5 74 | $ address of b: 1008 75 | $ address stored in a: 1008 76 | $ value retrieved after dereferencing a: 5 77 | $ address of a: 0002 78 | 79 | // NOTE: The addresses above are simplified. In practice, a memory address is a hexadecimal number and is often much longer, something like '0x00001c088' 80 | ``` 81 | 82 | 83 | > 💡 A good way to read pointer declarations is by reading them backwards - i.e. `int **g;` → `g * * int` → g is a pointer to a pointer to an integer 84 | 85 | 86 | 87 | ## Arrays 88 | 89 | In memory, an array is just a contiguous sequence of bytes that is long enough to hold a specified number and type of element. For example, to declare an array `int nums[5];` which can hold 5 integers (which are each typically 4 bytes long), you would need ***5 x 4 = 20*** bytes of contiguous memory. 
90 | 91 | - The green bytes represent the single block of contiguous memory, the blue brackets show how each integer would be positioned and stored in each slot/index of the array 92 | 93 | ![Arrays Diagram.svg](diagrams/arrays_diagram.svg) 94 | 95 | There are some things to note here: 96 | 97 | - The actual array variable itself - in this case, `nums` - is actually a pointer to the first element of the array (i.e. stores the address of the beginning of the contiguous block of memory) 98 | - The array may just be a single contiguous block, but because this array is declared with type `int` (which is 4 bytes long), the square-bracket access notation will treat every 4 bytes starting from the beginning as one array element. 99 | - This is what allows the statement `nums[2]` to access the third element of the array - it starts from the beginning and moves in two 4-byte increments to reach the address of the third element. 100 | 101 | ## Variable Scope and Passing by Reference 102 | 103 | *Scope* is the area of the program where an item (be it variable, constant, function, etc.) that has an identifier name is recognized. It defines where in the program the variable can be used or referenced. 104 | 105 | When a variable is passed as an argument into a function, the function receives a *copy* of that variable that is *local to the function's scope*. This means that you can't modify the value of the original variable by directly passing it into a function. 106 | 107 | Instead, to be able to modify the value stored in the original variable, the function must receive a pointer to the variable - this is called **passing by reference**. This works because now the function has the address of the original variable - remember that a pointer is just a memory address. 
108 | 109 | ```c 110 | void addToInt(int num) { 111 | num += 1; 112 | } 113 | 114 | void addToIntByRef(int *num) { 115 | *num += 1; 116 | } 117 | 118 | int main() { 119 | int x = 5; 120 | addToInt(x); // only the VALUE stored inside x is passed into the function 121 | printf("after addToInt, x = %d\n", x); 122 | 123 | addToIntByRef(&x); 124 | printf("after addToIntByRef, x = %d\n", x); 125 | } 126 | 127 | Output: 128 | $ after addToInt, x = 5 129 | $ after addToIntByRef, x = 6 130 | ``` 131 | 132 | So we can dereference a pointer within a function and change the value at that address. 133 | 134 | **What about changing what a pointer itself points to?** 135 | 136 | To do this, *a pointer to that pointer* must be passed as an argument to the function. Otherwise, the local copy of the pointer will be pointing to something different, but the original pointer will stay the same. 137 | 138 | Consider the following example: 139 | 140 | ![Pointer to Pointer](diagrams/pointer_to_pointer.svg) 141 | 142 | **Alternatively,** the pointer to change can be within a struct, and a pointer to this struct can be passed to the function to achieve the same goal. 143 | 144 | - This is the reason why trees and linked lists often have a `listRep` or `treeRep` struct which holds the `head` or `root` pointer - this structure allows those pointers to be changed by passing a pointer to the treeRep or listRep structure to the modifying function. 
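The pointer-to-pointer idea above can be sketched in code. This is a minimal illustration, not taken from the original guide: the function names `repoint_copy`, `repoint` and `demo_repoint` are hypothetical.

```c
#include <stdio.h>

// Receives a COPY of the caller's pointer: re-pointing it only changes the
// local copy, so the caller's pointer is unaffected.
void repoint_copy(int *ptr, int *new_target) {
    ptr = new_target;
    (void)ptr; // the change is discarded when the function returns
}

// Receives a pointer TO the caller's pointer: dereferencing it once reaches
// the caller's pointer itself, which can then be pointed somewhere else.
void repoint(int **ptr, int *new_target) {
    *ptr = new_target;
}

// Hypothetical demo: returns the value p points to after both calls.
int demo_repoint(void) {
    int a = 1;
    int b = 2;
    int *p = &a;

    repoint_copy(p, &b);
    printf("after repoint_copy, *p = %d\n", *p); // p still points to a

    repoint(&p, &b);
    printf("after repoint, *p = %d\n", *p);      // p now points to b

    return *p;
}
```

Calling `demo_repoint()` from `main` shows that only the version taking `int **` changes where the caller's pointer points.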
145 | 146 | **What about modifying elements of an array by passing it into a function?** 147 | 148 | Since an array passed into a function is passed as a pointer to the start of its contiguous sequence of memory, **it can be passed directly into a function as an argument, and its elements can be accessed and modified.** 149 | 150 | Both of the following functions (which aim to set every element in an integer array to 3) are equivalent and valid: 151 | 152 | ```c 153 | void makeAllThrees(int arr[], int length) { 154 | for (int i = 0; i < length; i++) { 155 | arr[i] = 3; 156 | } 157 | } 158 | 159 | // alternative spelling of the same function (a real program can only contain one of the two definitions) 159 | void makeAllThrees(int* arr, int length) { 160 | for (int i = 0; i < length; i++) { 161 | arr[i] = 3; 162 | } 163 | } 164 | ``` 165 | 166 | 167 | 168 | ## Some Context: Stack vs Heap Memory 169 | 170 | Your program’s memory is divided up into a number of different sections, usually arranged like so: 171 | 172 | ![Memory Sections](diagrams/memory_sections.png) 173 | 174 | Working from the bottom (address 0x00000000) up: 175 | 176 | - **Code** includes the machine code or instructions that define your program. 177 | - **Data** includes any hardcoded data, for example strings that you define like `"Hello World!\n"` 178 | - **Static** includes any global variables your program uses. 179 | - The **heap** is a dynamic memory region, used by malloc/calloc/realloc. It grows towards the stack. 180 | - The **stack** is another dynamic memory region, used for local variables and function call information. It grows towards the heap. 181 | 182 | 183 | > 💡 A *stack overflow* is when the stack grows beyond the space available to it (in this simplified picture, when the stack and the heap collide) 184 | 185 | 186 | We are mostly concerned with the two dynamic memory regions, the heap and the stack. 187 | 188 | - **Stack memory** is managed in a strict last-in, first-out order. It’s where relevant information about your program goes: which functions are called, what variables you created, and some more information. 
This memory is managed by the program/operating system and **not by the developer - you don't need to worry about allocating/freeing this memory.** 189 | - **Heap memory** is often used to allocate memory whose size may need to change during runtime, which is supposed to exist as long as the developer wants. That said, **it’s the developer’s job to control the usage of the memory on the heap**. We call this Dynamic Memory. 190 | 191 | You’re placing things on the heap every time you use `malloc`, `calloc`, `realloc` etc to allocate memory for something. 192 | 193 | Any other call that goes like `int i;` or `char str[100];` is placing that in stack memory. 194 | 195 | - Because the program automatically manages stack memory, any variables allocated on the stack do not need to be manually freed. 196 | - However, because the developer takes on the responsibility of any memory allocated on the heap, they **must manually free it** once it is done being used, otherwise **memory leaks** will occur. 197 | 198 | 199 | > 📘 [**Source & Further Reading**](https://www.freecodecamp.org/news/understand-your-programs-memory-92431fa8c6b/) 200 | 201 | 202 | ## When should memory be dynamically allocated? 203 | 204 | ### Size 205 | After memory is allocated for something in the stack, that memory cannot be resized. Hence, the size that is specified at compile time will be the maximum size of the memory for the duration of the program. 206 | 207 | - For singular numerical variables (`int`*,* `char`, `double`*, etc*) stack memory is fine to use - there shouldn't be a need to change the type or size of a variable like this. 208 | - Similarly, if the maximum size of an array is known, and you are certain that the array won't have to change size, declaring this using stack memory is fine as well: 209 | 210 | ```c 211 | int i = 0; 212 | double k = 6.54; 213 | char letter = 'a'; 214 | 215 | // e.g. 
if the maximum length of a possible string will always be 10 216 | // characters, then allocating a character array like this is fine (one slot 217 | // extra for the terminating null '\0' character that ends the string): 218 | char word[11]; 219 | ``` 220 | 221 | However, you need to use dynamic memory when: 222 | 223 | - You cannot determine the correct fixed size of memory to use at compile time, 224 | - You need to resize the allocated memory during runtime, 225 | - You want to build data structures without a fixed upper size (such as linked lists, trees, graphs etc), 226 | - Or you want to allocate a *very* large object 227 | 228 | For example, consider a program that reads words in from a text file and converts them into a linked list of words - you don't know how many words there are, nor do you know the length of the longest word at compile time. 229 | 230 | This is a situation where dynamic memory allocation is necessary - to allocate a new list node for each word read in, and to allocate a character array to store each word. 231 | 232 | ### Lifetime 233 | The other major reason to use dynamic memory allocation is that you get control over the **lifetime** of the allocated memory. 234 | 235 | If you initialise a stack-allocated variable within a function, it will be local to the function's scope, and its memory will be automatically freed at the end of the function call. Returning a pointer to such a variable is invalid, because the memory it points to no longer belongs to the variable once the function has returned. 236 | 237 | If you manually allocate memory for a variable, then the contents of that memory will be usable until explicitly freed, and can be returned, referenced and modified by other parts of your code as needed. 238 | 239 | This is another reason why nodes for linked lists and other dynamic data structures are manually allocated. 
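As a rough sketch of why list nodes are heap-allocated, the function below creates a node that stays valid after the function returns. The `struct node` type and the `new_node` name are assumptions made for this illustration, not part of the original guide.

```c
#include <stdlib.h>

// Hypothetical list node used only for this illustration.
struct node {
    int value;
    struct node *next;
};

// The node lives on the heap, so it remains valid after new_node() returns
// and stays usable until the caller passes it to free().
// Returns NULL if the allocation fails.
struct node *new_node(int value) {
    struct node *n = malloc(sizeof(*n));
    if (n == NULL) {
        return NULL;
    }
    n->value = value;
    n->next = NULL;
    return n;
}
```

Had the node been a local `struct node n;` inside the function, returning `&n` would hand the caller the address of memory that is no longer reserved for it.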
240 | 241 | 242 | > 📘 [**Source and Further Reading**](https://stackoverflow.com/questions/18217525/why-or-when-do-you-need-to-dynamically-allocate-memory-in-c) 243 | 244 | # Memory Allocation and Management 245 | 246 | ## Malloc 247 | 248 | **The C library function `void *malloc(size_t size)` allocates a requested amount of memory in the heap, and returns a pointer to the start of that memory (or NULL if the memory cannot be allocated).** 249 | 250 | The general way to use malloc - for instance, to allocate enough memory for an array of 5 integers - is as follows: 251 | 252 | ```c 253 | int *nums = malloc(5 * sizeof(int)); 254 | ``` 255 | 256 | To understand this malloc call better, let's write it out in plain English: 257 | 258 | > *Allocate a block of contiguous memory which is 5 times the size of an integer type (4 bytes). 259 | Return a pointer to the start of this memory, storing it in the integer pointer variable called 'nums'.* 260 | > 261 | 262 | In memory, this array allocation can be visualised just like it was in the 'Arrays' section above. 263 | 264 | The main differences being that: 265 | 266 | - Stack allocated arrays are declared with an integer between square brackets like `int nums[5];`, while a malloced array is defined as a pointer variable `int *nums = ...` . Remember that in both cases, `nums` is a pointer to the start of the contiguous memory 267 | - The stack allocated array cannot be resized during runtime, while the malloced array can be resized using `realloc()` (more on this soon) 268 | 269 | ```c 270 | // initialising all values to 0 271 | for (int i = 0; i < 5; i++) { 272 | nums[i] = 0; 273 | } 274 | // accessing and modifying 275 | nums[3] = 5; 276 | printf("%d\n", nums[3]); // prints 5 277 | ``` 278 | 279 | Each call to `malloc()` is considered to be ***O(1)*** time complexity. 
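Because `malloc` returns NULL when the memory cannot be allocated, it is worth checking the result before using it. The following is a minimal sketch of that check; the helper name `make_filled_array` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: allocates an array of `count` ints, each set to `value`.
// Returns NULL if count is 0 or if the allocation fails.
// The caller is responsible for calling free() on the result.
int *make_filled_array(size_t count, int value) {
    if (count == 0) {
        return NULL;
    }
    int *nums = malloc(count * sizeof(*nums));
    if (nums == NULL) {
        fprintf(stderr, "malloc failed\n");
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        nums[i] = value;
    }
    return nums;
}
```

Writing `sizeof(*nums)` instead of `sizeof(int)` keeps the size correct even if the element type of `nums` is changed later.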
280 | 281 | 282 | > 📘 [**Tutorialspoint Page**](https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm) 283 | 284 | #### Returning an Array from a Function - Stack vs Heap Allocated 285 | 286 | Only heap-allocated arrays (i.e. malloced arrays) can be validly returned from a function. Stack allocated arrays (i.e. using square-bracket notation) within a function are local to the function's scope, and will not be able to be used after returning. 287 | 288 | The reason for this is described in the [lifetime](#lifetime) section above. 289 | 290 | ```c 291 | // invalid return 292 | int *return_stack_array() { 293 | int arr[10]; 294 | for (int i = 0; i < 10; i++) { 295 | arr[i] = i; 296 | } 297 | return arr; 298 | } 299 | 300 | // valid return 301 | int *return_heap_array() { 302 | int *arr = malloc(10 * sizeof(int)); 303 | for (int i = 0; i < 10; i++) { 304 | arr[i] = i; 305 | } 306 | return arr; 307 | } 308 | 309 | int main() { 310 | int* arr_p; 311 | 312 | // undefined behaviour 313 | arr_p = return_stack_array(); 314 | printf("%d\n", arr_p[9]); 315 | 316 | // correct 317 | arr_p = return_heap_array(); 318 | printf("%d\n", arr_p[9]); 319 | free(arr_p); 320 | 321 | return 0; 322 | } 323 | ``` 324 | > 📘[**Further Reading**](https://stackoverflow.com/questions/68522620/returning-an-array-from-function-in-c) 325 | 326 | ## Free 327 | 328 | **The C library function `void free(void *ptr)` deallocates the memory previously allocated by a call to calloc, malloc, or realloc.** 329 | 330 | This is fairly straightforward - any memory that has been allocated on the heap must be freed properly, or memory leaks will occur. 331 | 332 | A good rule of thumb for this is that there **must be one `free` call for every `malloc/calloc` call in your program.** 333 | 334 | - For example, this means that **if you call `malloc/calloc` in a loop for 10 iterations, you must call `free` 10 times** for each of the variables that the allocated memory was assigned to. 
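The loop case described above can be sketched as follows. This is an illustrative example only; the function name `allocate_and_free_all` and the constant `NUM_BLOCKS` are made up for it.

```c
#include <stdlib.h>

#define NUM_BLOCKS 10

// Allocates NUM_BLOCKS separate blocks in a loop, then frees each of them.
// Returns the number of blocks that were successfully allocated (and freed).
int allocate_and_free_all(void) {
    int *blocks[NUM_BLOCKS];
    int allocated = 0;

    for (int i = 0; i < NUM_BLOCKS; i++) {
        blocks[i] = malloc(sizeof(int)); // one malloc call per iteration
        if (blocks[i] == NULL) {
            break; // stop early if an allocation fails
        }
        *blocks[i] = i;
        allocated++;
    }

    for (int i = 0; i < allocated; i++) {
        free(blocks[i]); // one matching free call per successful malloc
    }
    return allocated;
}
```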
335 | 336 | Usage: 337 | 338 | ```c 339 | int main() { 340 | int *nums = malloc(5 * sizeof(int)); 341 | // doing something with the array, etc 342 | free(nums); // one malloc call, so one free call 343 | } 344 | ``` 345 | 346 | > 📘 [**Tutorialspoint Page**](https://www.tutorialspoint.com/c_standard_library/c_function_free.htm) 347 | 348 | 349 | 350 | ## Calloc 351 | 352 | **The C library function `void *calloc(size_t nitems, size_t size)` allocates a requested amount of memory in the heap, sets all of the allocated memory to zero, and returns a pointer to the start of that memory (or NULL if the memory cannot be allocated).** 353 | 354 | This produces the same result as mallocing an array, and then iterating over it and setting every element to 0. 355 | 356 | Usage: 357 | 358 | ```c 359 | /* 360 | In void *calloc(size_t nitems, size_t size): 361 | nitems − This is the number of elements to be allocated. 362 | size − This is the size of each element. 363 | */ 364 | int main() { 365 | int *nums = calloc(5, sizeof(int)); 366 | 367 | // Which produces the same result as doing (a second variable is used here so that both versions compile together): 368 | int *nums2 = malloc(5 * sizeof(int)); 369 | for (int i = 0; i < 5; i++) { 370 | nums2[i] = 0; 371 | } 371 | 371 | free(nums); 371 | free(nums2); 372 | } 373 | ``` 374 | 375 | Using calloc to allocate memory for a struct containing primitive numeric types and pointers will, on virtually all modern platforms, ensure that all the numeric fields are set to 0 and all pointers are set to NULL. 376 | 377 | 378 | > 📘 [**Tutorialspoint Page**](https://www.tutorialspoint.com/c_standard_library/c_function_calloc.htm) 379 | 380 | ## Realloc 381 | 382 | **The C library function `void *realloc(void *ptr, size_t size)` attempts to resize the memory block pointed to by `ptr` that was previously allocated with a call to `malloc`, `calloc` or `realloc`. It returns a pointer to the newly allocated memory, or NULL if the request fails.** 383 | 384 | As stated above, realloc is what allows dynamic memory to be resized at runtime. This cannot be done with stack allocated memory. 
385 | 386 | Usage: 387 | 388 | ```c 389 | /* 390 | In void *realloc(void *ptr, size_t size): 391 | ptr − This is the pointer to a memory block previously allocated with malloc, calloc or realloc to be reallocated. 392 | size − This is the new size for the memory block, in bytes. 393 | */ 394 | int main() { 395 | // the array currently has a length of 5 396 | int *nums = malloc(5 * sizeof(int)); 397 | // after reallocing, it has a length of 10 398 | nums = realloc(nums, 10 * sizeof(int)); 399 | } 400 | 401 | // NOTE: realloc() returns a pointer to the new memory, 402 | // which is why 'nums = realloc(...' is needed 403 | ``` 404 | 405 | ### How it works + Time Complexity 406 | 407 | Realloc works by trying to extend the currently allocated memory in-place, by appending the next few adjacent bytes to the allocation. 408 | 409 | However, the bytes adjacent to the current memory may already be allocated by something else. 410 | 411 | - In this case, realloc has to allocate the specified size of memory in an entirely new location in memory, and then **copy over all the elements in the old location to the new location one by one**. 412 | 413 | ![realloc](diagrams/realloc.svg) 414 | 415 | **Therefore, `realloc()` has a worst case time complexity of ***O(N)*****. 
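Since `realloc` returns NULL when the request fails, and the original block is not freed in that case, assigning the result straight back to the only pointer to the block would leak it on failure. A common way to guard against this is to go through a temporary pointer. The sketch below assumes `new_len` is greater than 0; the helper name `resize_int_array` is hypothetical.

```c
#include <stdlib.h>

// Hypothetical helper: resizes *arr so that it can hold new_len ints.
// Assumes new_len > 0. Returns 1 on success and 0 on failure.
// On failure, *arr is left unchanged and still valid, so the caller can
// keep using it or free it.
int resize_int_array(int **arr, size_t new_len) {
    int *tmp = realloc(*arr, new_len * sizeof(**arr));
    if (tmp == NULL) {
        return 0; // a failed realloc does not free the original block
    }
    *arr = tmp; // the block may have moved, so update the caller's pointer
    return 1;
}
```

A caller would use it as `if (!resize_int_array(&nums, 10)) { /* handle the failure */ }`, keeping `nums` valid either way.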
416 | 417 | 418 | > 📘 [**Tutorialspoint Page**](https://www.tutorialspoint.com/c_standard_library/c_function_realloc.htm) 419 | 420 | > [**Source for O(N) Behaviour**](https://en.cppreference.com/w/c/memory/realloc) 421 | 422 | ## Other Functions 423 | 424 | - **memset -** [https://www.tutorialspoint.com/c_standard_library/c_function_memset.htm](https://www.tutorialspoint.com/c_standard_library/c_function_memset.htm) 425 | - **memcpy -** [https://www.tutorialspoint.com/c_standard_library/c_function_memcpy.htm](https://www.tutorialspoint.com/c_standard_library/c_function_memcpy.htm) 426 | - **strcpy -** [https://www.tutorialspoint.com/c_standard_library/c_function_strcpy.htm](https://www.tutorialspoint.com/c_standard_library/c_function_strcpy.htm) 427 | - **strdup -** [https://www.tutorialspoint.com/strdup-and-strdndup-in-c-cplusplus](https://www.tutorialspoint.com/strdup-and-strdndup-in-c-cplusplus) 428 | - **strtok -** [https://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm](https://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm) 429 | 430 | ## Strings + String Literals 431 | 432 | In C, a string is just an array of characters, terminated by a null character (`'\0'`). 433 | 434 | **Stack Allocation** 435 | 436 | ```c 437 | int main() { 438 | // stack allocation of a string (character array) that can hold 3 letters (one extra for the null character at the end) 439 | int n = 3; 440 | char word[n + 1]; 441 | // storing the string "abc" as ['a', 'b', 'c', '\0'] 442 | word[0] = 'a'; 443 | word[1] = 'b'; 444 | word[2] = 'c'; 445 | word[3] = '\0'; 446 | 447 | // some convenient ways to do the same thing as above are (named word2 and word3 here, because a variable can only be declared once per scope): 448 | char word2[4] = "abc"; 449 | // OR 450 | char word3[4] = {'a', 'b', 'c', '\0'}; 451 | } 452 | ``` 453 | 454 | Each letter of the stack allocated string can be modified, but the string cannot be resized because it is not dynamically allocated. 
455 | 456 | **String Literals** 457 | 458 | A string literal is a constant **read-only** string, whose letters cannot be modified after it is declared. String literals live in the **data** section of memory. The string can only be passed as a function argument, or printed, or have its individual letters accessed. String literals are declared using a character pointer, and do not need manual memory allocation. 459 | 460 | ```c 461 | int main() { 462 | char *word = "Hello World"; 463 | 464 | // the following will work 465 | printf("%s\n", word); 466 | printf("%c\n", word[1]); 467 | 468 | // this won't work (undefined behaviour, will likely segfault) 469 | word[1] = 'b'; 470 | } 471 | ``` 472 | 473 | **Heap Allocation** 474 | 475 | ```c 476 | int main() { 477 | int n = 3; 478 | // allocating memory for a string that can hold 3 letters (one extra for the null character at the end) 479 | char *word = malloc((n + 1) * sizeof(char)); 480 | 481 | // unlike stack allocation, only the following will work here to store "abc" 482 | word[0] = 'a'; 483 | word[1] = 'b'; 484 | word[2] = 'c'; 485 | word[3] = '\0'; 486 | // OR 487 | strcpy(word, "abc"); 488 | 489 | // this **is not correct,** it'll just make the pointer *word* point to a string literal 490 | // and lose the reference to the malloced memory (it is commented out here, because the free() call below would otherwise be given a string literal instead of the malloced memory) 491 | // word = "abc"; 492 | 493 | // remember to free the memory once it has been used 494 | free(word); 495 | } 496 | ``` 497 | 498 | > 📘[**Further Reading**](https://www.codingame.com/playgrounds/14213/how-to-play-with-strings-in-c/what-is-a-string-in-c) 499 | 500 | ## Mallocing Structs 501 | 502 | Allocating memory for a struct in C is often used for data structure nodes such as those in linked lists/trees/graphs. 
503 | 504 | ```c 505 | #define MAX_NAME 20 506 | 507 | int main() { 508 | // declaring the struct type 'student', whose name is max 20 letters in length, has an age and id 509 | struct student { 510 | char name[MAX_NAME]; 511 | int age; 512 | int id; 513 | }; 514 | 515 | typedef struct student *Student; // defines that writing 'Student' is the same as writing 'struct student *' 516 | 517 | //stack allocated struct 518 | struct student s1 = { 519 | .age = 20, 520 | .id = 1 521 | }; 522 | strcpy(s1.name, "John"); 523 | 524 | // heap allocated struct 525 | // s2 is a pointer to a student struct 526 | Student s2 = malloc(sizeof(struct student)); 527 | strcpy(s2->name, "Bob"); // s2->name is equivalent to doing (*s2).name 528 | s2->age = 20; 529 | s2->id = 1; 530 | 531 | // the following two are INCORRECT, because they are only initialising 532 | // enough memory for POINTERS, not enough for the struct itself 533 | Student s3 = malloc(sizeof(Student)); // sizeof(pointer to the struct) 534 | Student s4 = malloc(sizeof(Student *)); // sizeof(pointer to the pointer to the struct) 535 | 536 | // the following is CORRECT, because s5 is a pointer to the student, 537 | // but the asterisk in front of it in 'sizeof(*s5)' dereferences the pointer 538 | // type to obtain the 'struct student' type 539 | Student s5 = malloc(sizeof(*s5)); 540 | } 541 | ``` 542 | 543 | In the above example, only the struct itself had to be malloced, because all the fields inside the struct were stack-allocated. 
544 | 545 | However, **when a struct has fields inside which need to be heap allocated, the struct and each of those fields must be allocated separately:** 546 | 547 | ```c 548 | int main() { 549 | 550 | struct student { 551 | char *name; 552 | int age; 553 | int id; 554 | double *course_marks; // an array of marks for each course the student has done 555 | }; 556 | 557 | typedef struct student *Student; 558 | 559 | // first, allocating memory for the struct itself - this does not allocate memory for the 'name' and 'course_marks' fields 560 | Student s1 = malloc(sizeof(struct student)); 561 | 562 | s1->age = 20; 563 | s1->id = 1; 564 | 565 | // allocating memory for the 'name' string before assigning a value to it 566 | int name_len = 4; 567 | s1->name = malloc((name_len + 1) * sizeof(char)); 568 | strcpy(s1->name, "John"); 569 | 570 | // allocating memory for the 'course_marks' array before assigning a value to it 570 | // (the elements are doubles, so the size of each element is sizeof(double), not sizeof(int)) 571 | int num_courses = 7; 572 | s1->course_marks = malloc(num_courses * sizeof(double)); 573 | s1->course_marks[0] = 96; 574 | s1->course_marks[1] = 87; 575 | // ... 576 | } 577 | ``` 578 | 579 | ### Freeing a Struct 580 | 581 | When freeing a struct, any dynamic memory allocated within that struct must be freed first, and then the struct itself must be freed. 582 | 583 | ```c 584 | free(s1->name); 585 | free(s1->course_marks); 586 | free(s1); 587 | ``` 588 | 589 | ## Mallocing a 2D Array 590 | 591 | A 2D array can be treated as an array of arrays - that is, it is an array where each element contains a pointer to another array. Therefore, allocating memory for a 2D array involves **a single allocation of memory** for the array that will hold the other arrays, **then a for loop** to individually allocate multiple arrays. 
592 | 593 | For example, to allocate memory for a 4 rows x 6 columns array: 594 | 595 | ```c 596 | // stack allocation 597 | int main() { 598 | int arr2d[4][6]; 599 | } 600 | 601 | // heap allocation 602 | int main() { 603 | int nrows = 4; 604 | int ncols = 6; 605 | 606 | int **arr2d = malloc(nrows * sizeof(int *)); 607 | for (int i = 0; i < nrows; i++) { 608 | arr2d[i] = malloc(ncols * sizeof(int)); 609 | } 610 | 611 | // access 2D array elements using arr2d[row][col] 612 | arr2d[2][3] = 7; 613 | arr2d[0][4] = 42; 614 | arr2d[3][5] = 13; 615 | } 616 | ``` 617 | 618 | ![Array 2D](diagrams/array_2d.svg) 619 | 620 | NOTE: Pay attention to the differences between the *sizeof(...)* statements in the code above 621 | 622 | ### Freeing a 2D Array 623 | 624 | Again, every `malloc/calloc` call must have a corresponding `free` call. In this case, the multiple integer arrays that were malloced in the *for* loop should first be freed, followed by the single array of pointers. 625 | 626 | - This can’t be done the other way around, because if the array of pointers is freed first, the references to the multiple integer arrays will be lost. 627 | 628 | ```c 629 | for (int i = 0; i < nrows; i++) { 630 | free(arr2d[i]); 631 | } 632 | free(arr2d); 633 | ``` 634 | 635 | ### Mallocing an Array of Strings 636 | 637 | Since each string is an array of characters, mallocing an array of strings is similar to a 2D integer array. 
638 | 639 | For example, mallocing an array of strings that can hold 4 words, each with a max length of 5 characters: 640 | 641 | ```c 642 | int main() { 643 | int nwords = 4; 644 | int max_letters = 5; 645 | 646 | char **words = malloc(nwords * sizeof(char *)); 647 | for (int i = 0; i < nwords; i++) { 648 | words[i] = malloc((max_letters + 1) * sizeof(char)); 649 | } 650 | 651 | 652 | strcpy(words[0], "earth"); 653 | strcpy(words[1], "air"); 654 | strcpy(words[2], "fire"); 655 | } 656 | ``` 657 | 658 | ![Array of Strings](diagrams/array_strings.svg) 659 | 660 | ## Complex Example - Creating a Student Housing Database in C 661 | 662 | There are 3 floors with 4 apartment units each in a college apartment building. Each unit can house multiple students. 663 | 664 | We will be modelling this as a 2D array of linked lists of students. Each array index represents one apartment unit, and the linked list of students represents the students assigned to that unit: 665 | 666 | ![Student Housing Database](diagrams/housing_db.svg) 667 | 668 | Consider the student struct being defined as: 669 | 670 | ```c 671 | struct student { 672 | char *name; 673 | int age; 674 | double *course_marks; 675 | struct student *next; 676 | }; 677 | 678 | typedef struct student *Student; // writing 'Student' is the same as writing 'struct student *' 679 | ``` 680 | 681 | After looking at the given scenario, it can be seen that we need to malloc a few things: 682 | 683 | - A 2D array of pointers to student structs 684 | - A new student struct for each student that will be added to the list 685 | - *name* and *course_marks* arrays for each student struct 686 | 687 | In scenarios like this, it is best to abstract repeated malloc calls into functions. This keeps things clean, prevents duplicate code, and allows for parameters to be passed in. 
688 | 689 | ```c 690 | Student NewStudent(char *name, int age, int num_courses); 691 | Student **Create2DArray(int nrows, int ncols); 692 | void ListAppend(Student *list, Student new); // takes a pointer to the head pointer, so that an empty list's head can be changed 693 | void FreeAllData(Student **units); 694 | 695 | int main() { 696 | int num_floors = 3; 697 | int num_units = 4; 698 | 699 | // an array of arrays of pointers to student structs --> hence, Student ** (remember that 'Student' 700 | // is the same as 'struct student *') 701 | 702 | Student **units = Create2DArray(num_floors, num_units); 703 | 704 | ListAppend(&units[0][1], NewStudent("John", 23, 3)); 705 | ListAppend(&units[0][1], NewStudent("Sarah", 21, 2)); 706 | ListAppend(&units[0][1], NewStudent("Bob", 18, 2)); 707 | // ... 708 | ListAppend(&units[2][3], NewStudent("Bart", 25, 2)); 709 | // ... etc 710 | } 711 | 712 | Student **Create2DArray(int nrows, int ncols) { 713 | Student **arr2D = malloc(nrows * sizeof(Student *)); 714 | for (int i = 0; i < nrows; i++) { 715 | // calloc zeroes the row, so every unit starts out as an empty (NULL) list 715 | arr2D[i] = calloc(ncols, sizeof(Student)); 716 | } 717 | 718 | return arr2D; 719 | } 720 | 721 | // As mentioned in the *Mallocing Structs* section above: for each struct, 722 | // memory must be allocated for the struct itself, and then any dynamic fields within the struct. 723 | 724 | Student NewStudent(char *name, int age, int num_courses) { 725 | Student stu = malloc(sizeof(struct student)); 726 | 727 | int name_len = strlen(name); 728 | stu->name = malloc((name_len + 1) * sizeof(char)); 729 | strcpy(stu->name, name); 730 | 731 | stu->age = age; 732 | stu->course_marks = malloc(num_courses * sizeof(double)); 733 | stu->next = NULL; 734 | 735 | return stu; 736 | } 737 | ``` 738 | 739 | # Debugging Memory Errors 740 | 741 | The primary tool used to debug memory errors is 'valgrind', a programming tool for memory debugging, memory leak detection, and profiling. 

The most common Valgrind command to run is:

```bash
valgrind --leak-check=full ./
```

To be able to properly understand Valgrind output, check out the [official UNSW CSE guide](https://www.cse.unsw.edu.au/~learn/debugging/modules/valgrind/).

---
---

It is also highly recommended that you check out all the [official UNSW CSE debugging documentation](https://www.cse.unsw.edu.au/~learn/debugging/) - it contains extremely useful information on valgrind, gdb, as well as the general process of debugging.

---
---

## Common Memory Errors

[https://www.cprogramming.com/tutorial/memory_debugging_parallel_inspector.html](https://www.cprogramming.com/tutorial/memory_debugging_parallel_inspector.html)

**Invalid Memory Access / Invalid Write**

This error occurs when a read or write instruction references unallocated or deallocated memory (for example, you try to access or modify something that has already been freed).

Example

```c
char *str = malloc(25 * sizeof(char));
free(str);
// Invalid write to deallocated memory in heap
strcpy(str, "Hello World");
```

```bash
$ valgrind --leak-check=full ./

Invalid write of size 4
   at 0x10917E ...
```
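
A defensive pattern that makes this mistake easier to catch, sketched here as an assumption rather than something this guide prescribes: reset the pointer to `NULL` right after `free`, and have writes check for `NULL` first. The helper names `copy_if_valid` and `demo_write_after_free` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: copies src into dst only if dst is non-NULL and has
// room for src plus its terminating '\0'. Returns true if the copy happened.
bool copy_if_valid(char *dst, size_t capacity, const char *src) {
    assert(src != NULL); // precondition: the source string must exist
    if (dst == NULL || strlen(src) + 1 > capacity) {
        return false;
    }
    strcpy(dst, src);
    return true;
}

// Hypothetical demo: returns true when the write before free succeeds
// and the write after free is rejected.
bool demo_write_after_free(void) {
    char *str = malloc(25 * sizeof(char));
    if (str == NULL) {
        return false;
    }

    bool before = copy_if_valid(str, 25, "Hello World"); // buffer is live

    free(str);
    str = NULL; // the old address must never be used again

    bool after = copy_if_valid(str, 25, "Hello World"); // rejected: str is NULL
    return before && !after;
}
```

Setting the pointer to `NULL` only protects that one variable. Any other pointer holding the same address still dangles, so Valgrind remains the reliable way to find these errors.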

**Memory leaks**

Memory leaks occur when memory is allocated but never released. Enough leaks will eventually cause the application to run out of memory, resulting in premature termination.

Example

```c
char *str = malloc(512 * sizeof(char));
// str is never freed, so its 512 bytes are leaked
return;
```

```bash
$ valgrind --leak-check=full ./

HEAP SUMMARY:
    in use at exit: 512 bytes in 1 blocks
  total heap usage: 1 allocs, 0 frees, 512 bytes allocated
512 bytes in 1 blocks are definitely lost in loss record 1 of 1
   at 0x4843839: malloc (in /usr/libexec/valgrind/vgpreload_mem...)
   by 0x10915E: main (in /home/program)
...
LEAK SUMMARY:
   definitely lost: 512 bytes in 1 blocks
...
```
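
The fix is to pair every successful allocation with a `free` on each path out of the function. A minimal sketch, using a hypothetical function `greeting_length` that is not part of this guide:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical function: builds "Hello, <name>" in a heap buffer and returns
// its length, or 0 if the allocation fails. Every return that follows a
// successful malloc is preceded by free, so no path leaks the buffer.
size_t greeting_length(const char *name) {
    assert(name != NULL); // precondition: a name must be supplied

    size_t size = strlen("Hello, ") + strlen(name) + 1; // + 1 for '\0'
    char *str = malloc(size * sizeof(char));
    if (str == NULL) {
        return 0; // nothing was allocated, so there is nothing to free
    }

    strcpy(str, "Hello, ");
    strcat(str, name);
    size_t len = strlen(str);

    free(str); // release the buffer before returning
    return len;
}
```

Running a program built around a function like this under Valgrind should show as many frees as allocs in the heap summary.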

**Missing Allocation/Double Free**

This error occurs when `free` is called on memory that has already been freed, or on a pointer that was never returned by an allocation function.

Example

```c
char *str = malloc(20 * sizeof(char));
free(str);
free(str); // results in an invalid deallocation
```

```bash
$ ./

free(): double free detected in tcache 2

$ valgrind --leak-check=full ./

Invalid free() / delete / delete[] / realloc()
   at 0x484621F: ...
```
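
Because `free(NULL)` is defined to do nothing, one way to make a repeated release harmless is to free through a pointer to the pointer and reset it to `NULL`. This is a sketch under that assumption; the helper name `release` is hypothetical and not part of this guide.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper: frees the block that *ptr points to and then resets
// *ptr to NULL. It takes a char ** (pass by reference) so that the caller's
// own pointer is updated, not a local copy of it.
void release(char **ptr) {
    assert(ptr != NULL); // precondition: must be given the address of a pointer
    free(*ptr);          // free(NULL) is a no-op, so a second call is safe
    *ptr = NULL;
}
```

With this helper, calling `release(&str)` twice in a row frees the memory once and does nothing the second time. It does not help if two different pointer variables hold the same address, so each allocation should still have a single clear owner.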

**Other Common Memory Errors**

- trying to modify read-only memory (e.g. string literals)
- attempting to access a field of a null pointer (e.g. `curr->next` when `curr` is `NULL`)
- neglecting to pass by reference, so the function only changes its local copy
- mallocing only enough space for a pointer to a string/struct instead of the string/struct itself
- trying to write past the bounds of an array
- losing a reference to allocated memory so it cannot be freed
- accidentally doing pointer arithmetic when you meant to add to the dereferenced value
- forgetting to free a string allocated by `strdup`
- calling `strcpy` when the destination does not have enough space for the source string and its `\0`
- using `strtok` improperly (e.g. on a string literal, since `strtok` modifies the string it is given)

-------------------------------------------------------------------------------- /diagrams/a_points_b.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: a memory table with the columns "name of variable", "memory address" and "content", illustrating `int b = 5; int *a = &b;`. The variable `b` stores 5, and `a` stores 1008, the address of `b`, so `a` points to `b`.]
-------------------------------------------------------------------------------- /diagrams/array_2d.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: a dynamically allocated 2D array with 4 rows and 6 columns. Legend: one cell style = storing a pointer to an array, the other = storing an integer. The cells written by the code below hold 42, 7 and 13.]

int main() {
    int nrows = 4;
    int ncols = 6;
    int **arr2d = malloc(nrows * sizeof(int *));
    for (int i = 0; i < nrows; i++) {
        arr2d[i] = malloc(ncols * sizeof(int));
    }
    // access 2D array elements using arr2d[row][col]
    arr2d[2][3] = 7;
    arr2d[0][4] = 42;
    arr2d[3][5] = 13;
}
-------------------------------------------------------------------------------- /diagrams/array_strings.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: an array of pointers, each pointing to a separate character array holding "earth\0", "air\0" and "fire\0". Legend: one cell style = storing a pointer to an array, the other = storing a character.]
-------------------------------------------------------------------------------- /diagrams/arrays_diagram.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: memory drawn as a row of 1-byte cells at memory addresses 1000 to 1023. `int nums[5];` occupies 20 bytes of memory, enough to store 5 integers. Every 4 bytes starting from the beginning is one array element. `nums = &nums[0] = 1004` and `&nums[2] = 1012`.]
-------------------------------------------------------------------------------- /diagrams/housing_db.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: a 2D array of apartment units, where each unit points to a linked list of student structs. Legend: one cell style = stores a pointer to an array, the other = stores a pointer to a struct.]

Students shown:
- name: John, age: 23, course_marks: [94.2, 91, 53.1]
- name: Sarah, age: 21, course_marks: [33.5, 71]
- name: Jane, age: 34, course_marks: []
- name: Steve, age: 27, course_marks: [64.5]
- name: Daphne, age: 20, course_marks: [95, 100, 98.7]
- name: Bob, age: 18, course_marks: [43.3, 78.9]
- name: Bart, age: 25, course_marks: [51.3, 76.8]

e.g. Bart lives in floor 2, unit 3. John, Sarah and Bob live in floor 0, unit 1.
-------------------------------------------------------------------------------- /diagrams/memory_sections.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/D0D0123/MemoryGuideC/8f4cf995ea44da5ce2861c2519877e1ac3ae7054/diagrams/memory_sections.png -------------------------------------------------------------------------------- /diagrams/pointer_to_pointer.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: two attempts at changing where a pointer points. `int* b` points to 3 and `int* g` points to 7.]

Aim: Make b store the address stored in g (make b point to the same thing as g)

Attempt 1, called as changePointer(b, g):
void changePointer(int *pointer, int *other_address) {
    pointer = other_address;
}
Step 1. Function gets the argument b as a local copy.
Step 2. Function attempts to make b point to g, but only the local copy of the pointer changes.

Attempt 2, called as changePointer(d, g), where `int** d` points to b:
void changePointer(int **pointer_to_pointer, int *other_address) {
    *(pointer_to_pointer) = other_address;
}
Step 1. Function gets the argument d as a local copy.
Step 2. Function dereferences d to access the original pointer b, and then makes b store the address stored in g.
-------------------------------------------------------------------------------- /diagrams/visualise_memory.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 |
[Diagram: memory drawn as a row of 1-byte cells at memory addresses 1000 to 1023. Groups of cells are labelled "Enough to store a char", "Enough to store an int/float" and "Enough to store a double".]
--------------------------------------------------------------------------------