├── .gitignore ├── Makefile ├── README.md ├── gc.cpp └── gctests.cpp /.gitignore: -------------------------------------------------------------------------------- 1 | gctests 2 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | CXX=g++ # substitute clang++ if you prefer 2 | 3 | gctests: gctests.cpp gc.cpp 4 | $(CXX) -std=c++11 -O3 -o $@ gctests.cpp 5 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # GC in 50 lines 2 | 3 | Jason Orendorff 4 | [Nashville Systems Programming Day](http://nashvillecode.org/) 5 | August 16, 2014 6 | 7 | ## Intro 8 | 9 | All the stuff in your program has to be represented in memory somehow: 10 | code, data, arguments, variables. 11 | The task of finding a place for all that stuff 12 | and keeping it organized is called **memory management**. 13 | 14 | Memory management is about knowing where to put things. 15 | And since your computer’s memory is a finite resource, 16 | memory management is also about freeing up memory so it can be reused. 17 | 18 | As an application programmer, you don’t do memory management 19 | for every single object in your program. 20 | But somebody has to. 21 | The system does it for you. Right? 22 | 23 | How does that work? 24 | 25 | Well, there are several ways. 26 | But the usual approach these days is garbage collection. 27 | There’s some software in the system that automatically takes objects 28 | that your program isn’t using anymore and reclaims that memory. 29 | The tricky part, of course, is figuring out which objects 30 | aren’t being used. 31 | 32 | It turns out garbage collection is of a piece with allocation 33 | and other memory management tasks. Not independent. 34 | The GC we show today will include its own allocator. 35 | You’ll see why. 
36 | 37 | 38 | 39 | ## The live-coding part 40 | 41 | ### API 42 | 43 | When we design software, it’s best to start with the public parts. 44 | 45 | In fact, a great way is to start with tests, 46 | and the GC we’re looking at today has 47 | [a bunch of tests](https://github.com/jorendorff/gc-in-50-lines/blob/master/gctests.cpp), 48 | written in C++. 49 | If you look at the tests, you’ll see what public features 50 | the garbage collector needs to provide. 51 | 52 | But we can also just figure out the feature set from scratch. 53 | What public features does a garbage collector provide to the user? 54 | 55 | Just two things. 56 | 57 | First, it lets you create new objects. 58 | That’s called allocating memory. 59 | Whenever you make an array, a function, any kind of object 60 | in your higher-level language of choice, 61 | the garbage collector makes that possible. 62 | 63 | So in C, the API for creating a new object might look like this: 64 | 65 | Object* allocate() { 66 | ??? 67 | } 68 | 69 | Details to be filled in later. C code calls this function, 70 | and it returns a pointer to a freshly allocated object. 71 | 72 | It’s not this function’s responsibility 73 | to populate the object’s fields. 74 | Your constructor or initialize method will do that. 75 | This is called before the constructor, 76 | and all it does is find some memory that’s not already being used, 77 | set it aside, and return the address. 78 | 79 | ==> Now. What’s the other feature a garbage collector has to expose? 80 | 81 | Wrong! 82 | 83 | I’m guessing you said something like 84 | “free memory” or “deallocate memory” or “perform GC”. 85 | 86 | That’s not a feature. 87 | Rather, it could be; we could expose a function called 88 | `gc()`—Java has something called `System.gc()`—but 89 | that’s not something users want. 90 | What users want is to allocate objects and do stuff with them. 91 | *GC should just happen automatically,* as needed, 92 | and that’s how our system will work. 93 | 94 | OK. 
So I did say there were two things a GC allocator has to expose. 95 | The second one is this: 96 | It has to give the user a way to protect objects from being collected. 97 | 98 | ==> What do you think happens if an object gets garbage-collected 99 | while your program is still using it? 100 | 101 | Yeah, in short that would be bad. 102 | 103 | So then, 104 | 105 | ==> How does the garbage collector know which objects are garbage 106 | and which are not? 107 | 108 | You’ve probably never once thought about this, 109 | because in all garbage-collected languages, 110 | the garbage collector knows about every variable. 111 | 112 | That is, when you go to run some JavaScript or Erlang or C# code, 113 | all global and local variables are stored in some kind of data structure 114 | and the GC contains code to walk that data structure 115 | so that the GC can examine every variable. 116 | If it sees a variable that points to a `Pony` object, it says, 117 | aha, that `Pony` is still in use; I’d better not collect it. 118 | Every garbage-collected language stores information *somewhere* 119 | about all variables 120 | for the benefit of the collector. 121 | 122 | C++ does not. 123 | 124 | C++ has no feature “list all the variables” which the GC can use 125 | to inspect variables. 126 | 127 | So to keep things as simple as possible, 128 | what we’ll do is make a single variable 129 | 130 | Object* root = nullptr; 131 | 132 | we’ll call it “root”, for reasons that will become clear later, 133 | and this variable will be special to the GC. 134 | 135 | `nullptr`, by the way, is one of the C++ ways to spell null. 136 | 137 | It starts out null, but the program can store 138 | a pointer to any object in this variable, 139 | and when GC happens, *that* object will be safe. 140 | It won’t be collected. 141 | And of course if that object contains a reference to another object, 142 | well, the program may access that object too --- 143 | so we also leave *that* object alone. 
And so on. 144 | 145 | Every other object is fair game to be collected. 146 | 147 | This means the application has one rule that it must follow. 148 | Whenever it calls allocate(), any Object it cares about, 149 | any Object it is ever going to use again, 150 | better be reachable from root. 151 | 152 | ==> Why? 153 | 154 | Because everything else is going to be blown away. 155 | 156 | So these are the two public things that a GC provides. 157 | A way to allocate objects, 158 | and a way to protect those objects from being collected. 159 | 160 | Object* root = nullptr; 161 | 162 | Object* allocate() { 163 | ??? 164 | } 165 | 166 | Actually there is just one more thing that our GC will provide, 167 | and that’s a setup function. But we’ll get to that later. 168 | 169 | 170 | ### Object 171 | 172 | Next, let’s define what an object is. 173 | 174 | struct Object { 175 | 176 | We should probably start simple. 177 | The simplest thing that you could possibly build a program with is this: 178 | 179 | struct Object { 180 | Object* head; 181 | Object* tail; 182 | }; 183 | 184 | An object is simply a record containing two pointers to other objects. 185 | 186 | A typical dynamic language actually has many types of objects of varying sizes, 187 | and all kinds of bells and whistles around objects and object allocation. 188 | 189 | But you’d be surprised how much you can do with just two pointers. 190 | If you’ve used Lisp, you probably recognize this as a cons cell. 191 | You can build linked lists and binary trees out of it. 192 | A function in Lisp is also easily implemented as 193 | a garbage-collected record containing two pointers: 194 | one pointer to the code for the function, 195 | and one pointer to the environment, 196 | that is, all the variables the function closes over. 197 | 198 | So let’s go with this for now. 199 | 200 | OK. We will expose one more function in our garbage collector, 201 | and that’s just a setup function to set everything up. 
202 | 203 | void init_heap() { 204 | ??? 205 | } 206 | 207 | This completes the API. 208 | This is everything that’s public, 209 | everything the application is allowed to see and use. 210 | 211 | struct Object { 212 | Object* head; 213 | Object* tail; 214 | }; 215 | 216 | Object* root = nullptr; 217 | 218 | void init_heap() { 219 | ??? 220 | } 221 | 222 | Object* allocate() { 223 | ??? 224 | } 225 | 226 | Everything we add from now on is “private”: implementation details. 227 | 228 | 229 | 230 | ### Allocation, the fast path 231 | 232 | All right. 233 | Now is the part where we start implementing the internals of the GC. 234 | And first of all I want to show you where all these Objects that we’re going to manage 235 | are coming from. 236 | 237 | const int HEAP_SIZE = 10000; 238 | Object heap[HEAP_SIZE]; 239 | 240 | We’ll just declare one huge global array of ten thousand objects. 241 | Or ten million, whatever. It doesn’t matter. 242 | This means it’s the operating system’s job 243 | to give us one big slab of memory. 244 | Operating systems are good at that. 245 | *Our* job is to parcel this out to the application, one Object per allocate() call, 246 | and to recycle the Objects that aren’t being used anymore. 247 | 248 | So how does allocation work? 249 | There are a great many ways we could do this. 250 | The design I’m going to show you uses what’s called a *free list*. 251 | The key insight here is that 252 | if there’s some memory that the application isn’t currently using, 253 | that memory’s just sitting there, 254 | and the system can use it for whatever it wants. 255 | And we will. 256 | 257 | ==> Who here knows what a linked list is? 258 | 259 | OK. What we’re going to do is make a linked list 260 | of all the Objects in memory that have *not* been allocated. 261 | Initially all the Objects in this array will be in one big 262 | linked list.
263 | 264 | void init_heap() { 265 | for (int i = 0; i < HEAP_SIZE; i++) 266 | add_to_free_list(&heap[i]); 267 | } 268 | 269 | And when you call allocate, it’ll basically just 270 | remove any object from the freelist and return it to you. 271 | 272 | Now I want you to do this part for me. 273 | 274 | Here’s the pointer to the first object in the free list: 275 | 276 | Object* free_list = nullptr; 277 | 278 | When the process starts up, the free list is empty, 279 | and init_heap() is going to fill it up for us. 280 | We know what a linked list looks like: 281 | (draw boxes) 282 | Now we need the code for that. 283 | 284 | void add_to_free_list(Object* object) { 285 | ??? 286 | } 287 | 288 | How can we do that? How about this? 289 | 290 | void add_to_free_list(Object* object) { 291 | free_list = object; 292 | } 293 | 294 | ==> What’s wrong with this? 295 | 296 | That’s not adding to the free list, that’s clobbering the free list. 297 | 298 | void add_to_free_list(Object* object) { 299 | object->tail = free_list; 300 | free_list = object; 301 | } 302 | 303 | Right. That’s all! 304 | If you think about this from a performance perspective, 305 | it’s one read from memory, 306 | because you’re reading a global variable, 307 | and two writes. 308 | Reading from local variables and arguments is essentially free. 309 | 310 | OK. The cool thing now is that our allocate() function 311 | is essentially nothing more or less than the exact opposite 312 | of the code we just wrote! 313 | 314 | Object* allocate() { 315 | Object* p = free_list; // grab the first free object 316 | free_list = p->tail; 317 | return p; 318 | } 319 | 320 | This is two reads and a write, because symmetry. 321 | 322 | And that’s it. 
323 | 324 | struct Object { 325 | Object* head; 326 | Object* tail; 327 | }; 328 | 329 | const int HEAP_SIZE = 10000; 330 | Object heap[HEAP_SIZE]; 331 | Object* free_list = nullptr; 332 | Object* root = nullptr; 333 | 334 | void add_to_free_list(Object* object) { 335 | object->tail = free_list; 336 | free_list = object; 337 | } 338 | 339 | void init_heap() { 340 | for (int i = 0; i < HEAP_SIZE; i++) 341 | add_to_free_list(&heap[i]); 342 | } 343 | 344 | Object* allocate() { 345 | Object* p = free_list; // grab the first free object 346 | free_list = p->tail; 347 | return p; 348 | } 349 | 350 | A full garbage collecting memory management system 351 | in just 25 lines of code. 352 | Pretty cool huh? 353 | Thank you for coming. 354 | 355 | 356 | ### When to do GC 357 | 358 | So obviously there is just one thing missing from this system. 359 | It’ll work—we can actually build a program that uses this, 360 | and call allocate(), and get objects, and run code—but 361 | eventually the free list becomes empty. 362 | 363 | ==> Then what will happen? 364 | 365 | In allocate(), we need to detect that the free list is empty. 366 | 367 | ==> Is there a handy way to do that? 368 | 369 | Yes, the free_list variable will be null. 370 | 371 | So if it’s null, that would be a good time to do garbage collection, 372 | try and free up some memory. 373 | 374 | Object* allocate() { 375 | if (free_list == nullptr) { // out of memory 376 | // do gc 377 | } 378 | Object* p = free_list; 379 | free_list = p->tail; 380 | return p; 381 | } 382 | 383 | Of course it’s always possible that the application 384 | simply requires more memory than we’ve got. 385 | Maybe we do garbage collection and nothing shakes loose. 386 | 387 | ==> And then what? 388 | 389 | Well, that means you’re out of memory! 390 | There are several things you could do here. 391 | You could ask the operating system for some more memory. 392 | You could throw an out-of-memory exception. 
393 | 394 | Since I’m trying to keep things simple, 395 | I’ll just have `allocate()` return a null pointer here. 396 | This is bad because it means the application has to check the return value 397 | every time it calls `allocate()`. 398 | You never know when the system is going to run out of memory. 399 | But for our toy system, it’ll be fine. 400 | 401 | Object* allocate() { 402 | if (free_list == nullptr) { // out of memory 403 | // do gc 404 | if (free_list == nullptr) // still out of memory! 405 | return nullptr; // give up :( 406 | } 407 | Object* p = free_list; 408 | free_list = p->tail; 409 | return p; 410 | } 411 | 412 | 413 | ### How to do GC 414 | 415 | We have reached the garbage collection portion of the program. 416 | 417 | The kind of GC I’m going to show you is very simple. 418 | It’s called a mark and sweep GC. 419 | And it works like this... 420 | *(explanation)* 421 | 422 | To do this, we need an extra bit per object called the “mark bit”. 423 | For simplicity’s sake, let’s just stick it on the Object. 424 | 425 | struct Object { 426 | Object* head; 427 | Object* tail; 428 | bool marked; 429 | }; 430 | 431 | ==> How much memory does this cost? 432 | 433 | OK. So our scheme is going to be, 434 | first, make sure the mark bit is set to false for all objects, 435 | second, mark all objects that the application is still using, 436 | third, collect all objects that aren’t marked. 437 | 438 | First part first: 439 | 440 | for (int i = 0; i < HEAP_SIZE; i++) // 1. clear all mark bits 441 | heap[i].marked = false; 442 | 443 | Then the mark part. I’m going to make a function to do this for us, 444 | so we only supply the root and it does the whole “flood fill” part. 445 | 446 | mark(root); // 2. mark phase 447 | 448 | Lastly, the sweeping part. That’s easy too: 449 | 450 | for (int i = 0; i < HEAP_SIZE; i++) // 3. sweep phase 451 | if (!heap[i].marked) 452 | add_to_free_list(&heap[i]); 453 | 454 | With that, our allocate function is complete! 
455 | 456 | Object* allocate() { 457 | if (free_list == nullptr) { // out of memory, need gc 458 | for (int i = 0; i < HEAP_SIZE; i++) // 1. clear all mark bits 459 | heap[i].marked = false; 460 | mark(root); // 2. mark phase 461 | for (int i = 0; i < HEAP_SIZE; i++) // 3. sweep phase 462 | if (!heap[i].marked) 463 | add_to_free_list(&heap[i]); 464 | if (free_list == nullptr) // still out of memory! 465 | return nullptr; // give up :( 466 | } 467 | Object* p = free_list; // grab the first free object 468 | free_list = p->tail; // remove it from the free list 469 | return p; 470 | } 471 | 472 | All we need is this mark function that implements the mark phase. 473 | You pass it one object, the root, 474 | and it has to set the mark bit on that object 475 | so that the sweep phase knows not to collect it. 476 | 477 | void mark(Object* obj) { 478 | obj->marked = true; 479 | } 480 | 481 | ==> What’s wrong with that? 482 | 483 | void mark(Object* obj) { 484 | mark(obj->head); 485 | mark(obj->tail); 486 | obj->marked = true; 487 | } 488 | 489 | ==> What’s wrong with that? 490 | 491 | void mark(Object* obj) { 492 | if (obj == nullptr) 493 | return; 494 | mark(obj->head); 495 | mark(obj->tail); 496 | obj->marked = true; 497 | } 498 | 499 | ==> What’s wrong with that? 500 | 501 | void mark(Object* obj) { 502 | if (obj == nullptr || obj->marked) 503 | return; 504 | mark(obj->head); 505 | mark(obj->tail); 506 | obj->marked = true; 507 | } 508 | 509 | ==> What’s wrong with that? 510 | 511 | void mark(Object* obj) { 512 | if (obj == nullptr || obj->marked) 513 | return; 514 | obj->marked = true; 515 | mark(obj->head); 516 | mark(obj->tail); 517 | } 518 | 519 | And here’s where we stand now. 
520 | 521 | struct Object { 522 | Object* head; 523 | Object* tail; bool marked; 524 | }; 525 | 526 | const int HEAP_SIZE = 10000; 527 | Object heap[HEAP_SIZE]; 528 | Object* free_list = nullptr; 529 | Object* root = nullptr; 530 | 531 | void add_to_free_list(Object* object) { 532 | object->tail = free_list; 533 | free_list = object; 534 | } 535 | 536 | void init_heap() { 537 | for (int i = 0; i < HEAP_SIZE; i++) 538 | add_to_free_list(&heap[i]); 539 | } 540 | 541 | void mark(Object* obj) { 542 | if (obj == nullptr || obj->marked) 543 | return; 544 | obj->marked = true; 545 | mark(obj->head); 546 | mark(obj->tail); 547 | } 548 | 549 | Object* allocate() { 550 | if (free_list == nullptr) { // out of memory, need gc 551 | for (int i = 0; i < HEAP_SIZE; i++) // 1. clear all mark bits 552 | heap[i].marked = false; 553 | mark(root); // 2. mark phase 554 | for (int i = 0; i < HEAP_SIZE; i++) // 3. sweep phase 555 | if (!heap[i].marked) 556 | add_to_free_list(&heap[i]); 557 | if (free_list == nullptr) // still out of memory! 558 | return nullptr; // give up :( 559 | } 560 | Object* p = free_list; // grab the first free object 561 | free_list = p->tail; // remove it from the free list 562 | return p; 563 | } 564 | 565 | We’re sitting at about 48 lines right now, which means we’re not done. 566 | I suppose we could add some documentation. 567 | 568 | // gc.cpp - A simplistic GC in 50 lines of code (use init_heap, allocate, and root) 569 | 570 | There’s still one lovely little bug in here, which is that 571 | if you allocate an object and root it, 572 | then allocate more objects without rooting them until GC occurs, 573 | GC fails to collect anything! 574 | 575 | ==> Why? How can we fix that? 576 | 577 | (The bug is that the first `allocate()` call returns an object that entrains the entire free list. 578 | The fix is to null out `p->tail` before returning.) 579 | 580 | 581 | 582 | ## Toy GC vs.
real GC 583 | 584 | Now that we’ve seen a garbage collector in just 50 lines of code, 585 | the question arises, are all garbage collectors this simple? 586 | 587 | The answer is no... garbage collectors are some of the most complex 588 | software we make. A GC can be thousands of lines of code. 589 | 590 | Why is this garbage collector not serious? 591 | What’s wrong with it? 592 | 593 | * **Objects are just two pointers.** 594 | This is obviously pretty limiting, but now that we have it working, 595 | you can add whatever other fields you want to `struct Object`. 596 | You can add an int here, and a float there, 597 | whatever your application needs, 598 | and the GC will continue to work just fine. 599 | So that’s not really a problem. 600 | 601 | * **All objects have to be the same size.** 602 | This is a problem. 603 | Real applications have objects of different sizes, obviously. 604 | A common solution to this is for the garbage collector 605 | to have several “size classes”, 606 | that is, instead of the heap being one big array of objects of the same size, 607 | and one big freelist, 608 | the heap would contain a few different arrays 609 | for objects of different sizes. 610 | Each different size also gets its own freelist. 611 | 612 | (Note that “size classes” are not the same thing as the things 613 | you declare using the `class` keyword in object-oriented languages.) 614 | 615 | There are other approaches, too. 616 | 617 | * **The mark bit wastes 63 bits per object.** 618 | If that doesn’t sound like a lot, 619 | that just means you’re not a systems programmer. 620 | A real implementation might have separate pages just for mark bits, 621 | with the bits all packed together, 8 bits per byte, no waste. 622 | 623 | Or it might be possible to find a spare bit in the object. 624 | For example, we’ve got two pointer fields, and it turns out 625 | not every bit of a pointer value is really used on x86-64.
626 | So there are maybe 38 bits in this struct that are always zero. 627 | You could use one of those as the mark bit. 628 | 629 | * **The heap size is fixed.** 630 | A real program would start out by saying, hello operating system, 631 | can I have 16 megabytes please? I want to make some objects. 632 | And that would become your initial heap. 633 | 634 | Then, as the application runs, and the heap fills up, 635 | and the GC notices that pretty much everything is live 636 | and it’s not able to collect a lot of objects each time GC happens, 637 | it can just ask the operating system for more memory. 638 | 639 | * **Marking is implemented recursively.** 640 | This means if you create a long linked list, 641 | the next GC will overflow the stack and you’ll crash. 642 | 643 | (Very deep recursion causes crashes in C++.) 644 | 645 | Fixing this would be a good exercise 646 | if you really want to cement all this in your memory. 647 | There’s a standard trick for converting recursive algorithms to iterative ones, 648 | using an explicit stack; 649 | alternatively, there’s a nonobvious trick that involves reversing pointers. 650 | 651 | * **There is only one root.** 652 | 653 | Real garbage collectors have a “root set” 654 | consisting of *all* the objects 655 | currently reachable from outside the heap. 656 | That is, the root set contains all local variables 657 | and all global or static variables 658 | that point to objects. 659 | 660 | But local variables are created and go out of scope all the time. 661 | How does the garbage collector keep track of that? 662 | 663 | The answer is, they have to integrate with the compiler or interpreter 664 | of the programming language they serve. 665 | The language and the garbage collector literally have to coordinate 666 | just so the GC has a way to compute the root set. 667 | 668 | C++ is a huge pain because the C++ compiler 669 | does not coordinate with GC at all. 670 | Why would it? C++ doesn’t have any built-in GC. 
671 | So you really don’t want a big C++ codebase using a GC, 672 | because if the compiler won’t integrate with the GC, 673 | how do you get a root set? 674 | Guess what. If the compiler won’t do it for you, 675 | user code has to tell the GC what it’s doing, 676 | which variables it’s using. 677 | This means you end up using smart pointer classes 678 | for all pointers from application code to GC-allocated objects. 679 | It’s an incredible pain, 680 | but this is what V8 does, this is what Firefox does. 681 | 682 | * **There’s no instrumentation.** 683 | A real GC is full of code to measure its own performance 684 | as it’s being used in a real program. 685 | How often is GC happening? How long does it take? 686 | How many objects are reclaimed each time? 687 | 688 | This information is really useful 689 | when you’re trying to make GC as smooth and fast as possible. 690 | 691 | * **It’s slow.** 692 | OK, now we get to the nitty-gritty. 693 | 694 | This is what’s called a stop-the-world garbage collector. 695 | This allocate function is normally very fast, 696 | but whenever it determines that GC is needed, 697 | it walks the entire live object graph, 698 | then reads and writes every single object in the entire heap. 699 | So there will be long GC pauses. 700 | The bigger you make the heap, the longer the pauses. 701 | 702 | This GC only has ten thousand objects in the heap. 703 | On my laptop, this GC takes up to 100 microseconds to run. 704 | That’s really fast. That’s a tenth of a millisecond. 705 | But now scale it up. 706 | Say we had a larger heap, with two million objects in it. 707 | GC would be 20 milliseconds. 708 | If you do that, no matter how rare it is, 709 | in a game that’s trying to maintain 60 frames per second, 710 | you’ve just dropped a frame. 711 | Now think about the same GC running on your phone. 712 | 713 | Performance is the reason GC is a whole field of study, 714 | there are books about it, 715 | at Mozilla we have a GC team. 
716 | How do we make this fast? 717 | 718 | So there are a couple of techniques that are 719 | kind of the current state of the art. 720 | 721 | *Incremental GC* spreads out the work so it doesn’t happen all at once. 722 | This doesn’t actually make your system faster, but 723 | the user doesn’t notice GC pauses 724 | if each pause is individually very short. 725 | 726 | *Generational GC* is harder to explain. 727 | It takes advantage of the weird fact that in a modern language, 728 | most objects are extremely short-lived. 729 | This means if you focus on just the most recently allocated objects, 730 | often you can reclaim a bunch of memory 731 | without having to mark and sweep the entire heap. 732 | That saves a lot of time. 733 | 734 | These two techniques can be used together. 735 | Both are well-understood 736 | and have many high-quality implementations. 737 | But it’s tricky stuff. 738 | Some amazing cleverness is required to get these techniques to be 739 | both correct and fast. 740 | 741 | Thanks for taking this trip through a simple garbage collector with me. 742 | 743 | I like this example because it’s short but it’s also packed with cool stuff, 744 | you’ve got linked lists, recursion, pointers, graph algorithms— 745 | it’s rather wonderful that all these beautiful ideas turn out to be so useful. 746 | 747 | 748 | ## Running the code 749 | 750 | You can get the code and build it like this: 751 | 752 | git clone git@github.com:jorendorff/gc-in-50-lines.git 753 | cd gc-in-50-lines 754 | make 755 | ./gctests 756 | 757 | It works. 758 | But the magic is in the reading and thinking, 759 | and not so much the running. 
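
## Exercise sketch: marking without recursion

The “Marking is implemented recursively” item above suggests converting `mark()` to use an explicit stack. Here is one minimal sketch of that idea, not the version in `gc.cpp`. It assumes the same `Object` layout as `gc.cpp`; the name `mark_iterative` and the use of `std::vector` as the stack are my own choices for illustration. A real collector would probably avoid `std::vector` here, since it allocates from the C++ heap in the middle of a GC.

```cpp
// Sketch only: an explicit-stack replacement for the recursive mark().
// Object matches gc.cpp; mark_iterative is a hypothetical name.
#include <vector>

struct Object {
    Object* head;
    Object* tail;
    bool marked;
};

// Set the mark bit on obj and on everything reachable from it,
// using a vector as the work stack instead of the C++ call stack.
void mark_iterative(Object* obj) {
    std::vector<Object*> pending;   // objects we still have to visit
    if (obj != nullptr)
        pending.push_back(obj);

    while (!pending.empty()) {
        Object* p = pending.back();
        pending.pop_back();
        if (p->marked)
            continue;               // already visited: handles cycles and sharing
        p->marked = true;
        if (p->head != nullptr)
            pending.push_back(p->head);
        if (p->tail != nullptr)
            pending.push_back(p->tail);
    }
}
```

To try it, you could swap the `mark(root)` call in `allocate()` for `mark_iterative(root)`. The depth of the object graph then costs heap memory for the vector rather than call-stack frames, so a very long linked list should no longer be able to overflow the stack during GC.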
760 | -------------------------------------------------------------------------------- /gc.cpp: -------------------------------------------------------------------------------- 1 | // gc.cpp - A simplistic mark-and-sweep garbage collector in 50 lines of C++ 2 | 3 | struct Object { 4 | Object* head; 5 | Object* tail; 6 | bool marked; 7 | }; 8 | 9 | const int HEAP_SIZE = 10000; 10 | Object heap[HEAP_SIZE]; 11 | Object* root = nullptr; // compile with -std=c++11 to get 'nullptr' 12 | Object* free_list = nullptr; 13 | 14 | void add_to_free_list(Object* p) { 15 | p->tail = free_list; 16 | free_list = p; 17 | } 18 | 19 | void init_heap() { 20 | for (int i = 0; i < HEAP_SIZE; i++) 21 | add_to_free_list(&heap[i]); 22 | } 23 | 24 | void mark(Object* p) { // set the mark bit on p and all its descendants 25 | if (p == nullptr || p->marked) 26 | return; 27 | p->marked = true; 28 | mark(p->head); 29 | mark(p->tail); 30 | } 31 | 32 | Object* allocate() { 33 | if (free_list == nullptr) { // out of memory, do GC 34 | for (int i = 0; i < HEAP_SIZE; i++) // 1. clear mark bits 35 | heap[i].marked = false; 36 | mark(root); // 2. mark phase 37 | free_list = nullptr; // 3. sweep phase 38 | for (int i = 0; i < HEAP_SIZE; i++) 39 | if (!heap[i].marked) 40 | add_to_free_list(&heap[i]); 41 | if (free_list == nullptr) 42 | return nullptr; // still out of memory :( 43 | } 44 | Object* p = free_list; 45 | free_list = free_list->tail; 46 | p->head = nullptr; 47 | p->tail = nullptr; 48 | return p; 49 | } 50 | -------------------------------------------------------------------------------- /gctests.cpp: -------------------------------------------------------------------------------- 1 | // gctests.cpp - Very rudimentary test program for gc.cpp. 2 | 3 | // Start by including the file containing the code we're testing. 4 | // 5 | // This is weird. Ordinarily you wouldn't #include a .cpp file directly. 
The 6 | // C++ way is to use a header file (we would call it "gc.h") that includes only 7 | // the public API. But the whole point here is to keep the code as tiny as 8 | // possible, and directly including the .cpp file lets us shave off a couple 9 | // lines. 10 | // 11 | // (Also, some of these tests use HEAP_SIZE which is a private detail of the 12 | // garbage collector -- we are sort of busting abstractions here anyway.) 13 | #include "gc.cpp" 14 | 15 | // ## The GC API 16 | // 17 | // Here are the API features we're getting with that #include: 18 | // 19 | // struct Object { // The type of all objects. 20 | // Object* head; // Two fields, pointers to other Objects. 21 | // Object* tail; 22 | // ... // (and a private field used only by the GC) 23 | // }; 24 | // Object* allocate(); // Function that returns a pointer to a fresh new Object, 25 | // // or null if we're out of memory. 26 | // Object* root; // Public variable (!) used to protect an Object from GC. 27 | // 28 | // Note that there is a tricky rule about how to use allocate() and root. 29 | // allocate() will occasionally perform GC, which will wipe out all objects 30 | // that our program isn't using. But how does it know if we're using an object 31 | // or not? Here is the rule: 32 | // 33 | // Whenever we call allocate(), it MAY wipe out all objects 34 | // that are not reachable from `root`. 35 | // 36 | // Therefore our program MUST make sure all objects we care about are 37 | // reachable from `root` BEFORE each call to allocate(). 38 | // 39 | // This tricky rule isn't necessary in normal programming languages like JS or 40 | // C#, because those languages run in virtual machines that automatically keep 41 | // track of all local variables and even temporary results that might be used 42 | // in the future of the program. So when it's time to do GC, the virtual 43 | // machine is prepared to answer the question "which objects does this program 44 | // still have a reference to?" 
45 | // 46 | // C++ isn't like that. In optimized builds, it typically doesn't do any 47 | // bookkeeping at all on local variables. The garbage collector therefore has 48 | // no way to ask C++ "what's the root set?" i.e. "which objects does our 49 | // program still have a reference to?" The only way to proceed is for our 50 | // program to *tell* the GC what the root set is, and that's what `root` 51 | // represents. 52 | 53 | #include 54 | #include 55 | #include 56 | 57 | #ifdef NDEBUG 58 | #error "This program uses assert for testing. There's no point building it with assertions disabled." 59 | #endif 60 | 61 | // Test that the GC can at least allocate two objects. 62 | void test_can_allocate_twice() { 63 | // Allocate one object. 64 | Object* obj1 = allocate(); 65 | assert(obj1); 66 | 67 | // Now we're about to allocate another object. This is the first time the 68 | // API rule comes into play: if we do not make sure obj1 is reachable from root, 69 | // then our second call to allocate() would be permitted to perform GC 70 | // and reclaim obj1. In this case we don't want that to happen. 71 | root = obj1; 72 | 73 | // Allocate a second object. Since obj1 is the root, obj2 must be a 74 | // different pointer. 75 | Object* obj2 = allocate(); 76 | assert(obj2); 77 | assert(obj2 != obj1); 78 | 79 | // Set root to null, indicating that there is no root object anymore. 80 | // Every test will do this to clean up after itself. It means "I'm not 81 | // using any objects anymore; consider them all garbage." 82 | root = nullptr; 83 | } 84 | 85 | // Test that the object pointed to by root is not collected and reused. 86 | void test_root_is_not_recycled() { 87 | // Create one object and make it the root. 88 | root = allocate(); 89 | assert(root); 90 | 91 | // Subsequent allocations never return root. 
92 | for (int i = 0; i < HEAP_SIZE * 2; i++) { 93 | Object* tmp = allocate(); 94 | assert(tmp != root); 95 | } 96 | 97 | root = nullptr; 98 | } 99 | 100 | // Helper function to allocate and populate an Object all in one go. Only call 101 | // this if you're sure allocation will succeed. If the heap is full and every 102 | // object is reachable, you'll get an assertion failure. 103 | Object* new_object(Object* head, Object* tail) 104 | { 105 | Object* obj = allocate(); 106 | assert(obj); 107 | obj->head = head; 108 | obj->tail = tail; 109 | return obj; 110 | } 111 | 112 | // Test allocate()'s behavior when the heap is full and every Object is 113 | // reachable. 114 | void test_full_heap() { 115 | // Fill up the heap by allocating HEAP_SIZE objects. 116 | root = nullptr; 117 | for (int i = 0; i < HEAP_SIZE; i++) 118 | root = new_object(nullptr, root); 119 | 120 | // The whole heap is reachable. Now allocate() should return null every 121 | // time it's called. 122 | for (int i = 0; i < 4; i++) 123 | assert(allocate() == nullptr); 124 | 125 | root = nullptr; 126 | } 127 | 128 | // Test allocate()'s behavior when the heap is only almost full. 129 | void test_nearly_full_heap() { 130 | // Make the heap nearly full by allocating (HEAP_SIZE - 1) objects. 131 | root = nullptr; 132 | for (int i = 0; i < HEAP_SIZE - 1; i++) 133 | root = new_object(nullptr, root); 134 | 135 | // Now the entire heap is reachable except for one Object. We should be 136 | // able to call allocate() successfully, repeatedly. It returns that one 137 | // object every time it's called! 138 | Object* last = allocate(); 139 | assert(last); 140 | for (int i = 0; i < 10; i++) 141 | assert(allocate() == last); 142 | 143 | root = nullptr; 144 | } 145 | 146 | // Helper function used by some of the tests below. Force garbage collection 147 | // to happen at least once. 148 | void force_gc() { 149 | // Many GCs expose an API to force GC to happen. Ours doesn't. 
The only 150 | // way to force GC is to allocate objects until we run out of memory, 151 | // making sure to keep the original root rooted throughout. 152 | Object* orig_root = root; 153 | while (Object* obj = allocate()) { 154 | obj->tail = root; 155 | root = obj; 156 | } 157 | 158 | // When we get here, GC has already happened at least once, and the heap is 159 | // completely full---every Object is allocated and reachable from the root. 160 | 161 | // Now put the root set back how it was before, and allocate() one more 162 | // time. This forces GC to happen again, collecting all the garbage 163 | // objects we created above. 164 | root = orig_root; 165 | allocate(); 166 | } 167 | 168 | // Test that objects reachable from root->head or ->tail are not collected. 169 | void test_reachable_objects_not_collected() { 170 | Object* obj1 = root = allocate(); 171 | assert(root); 172 | Object* obj2 = root->head = allocate(); 173 | assert(root->head); 174 | Object* obj3 = root->tail = allocate(); 175 | assert(root->tail); 176 | Object* obj4 = root->head->head = allocate(); 177 | assert(root->head->head); 178 | Object* obj5 = root->head->tail = allocate(); 179 | assert(root->head->tail); 180 | 181 | force_gc(); 182 | 183 | assert(root == obj1); 184 | assert(root->head == obj2); 185 | assert(root->tail == obj3); 186 | assert(root->head->head == obj4); 187 | assert(root->head->tail == obj5); 188 | 189 | root = nullptr; 190 | } 191 | 192 | // Test that the GC is not confused by an object that contains pointers to 193 | // itself. 194 | void test_root_self_references() { 195 | // Create a root object that contains pointers to itself. 196 | root = allocate(); 197 | assert(root); 198 | root->head = root; 199 | root->tail = root; 200 | 201 | force_gc(); 202 | 203 | // After GC, the root object should be unchanged. 
204 | assert(root->head == root); 205 | assert(root->tail == root); 206 | 207 | root = nullptr; 208 | } 209 | 210 | // Test that the GC is not confused by cycles in the reachable object graph. 211 | void test_root_cycle() { 212 | // Set up obj1 (the root) and obj2 to point to each other. 213 | Object* obj1 = allocate(); 214 | assert(obj1); 215 | root = obj1; 216 | Object* obj2 = new_object(obj1, obj1); // obj2 points to obj1 217 | obj1->head = obj2; // and vice versa 218 | obj1->tail = obj2; 219 | 220 | force_gc(); 221 | 222 | // After GC, the two objects are unchanged. 223 | assert(obj1->head == obj2); 224 | assert(obj1->tail == obj2); 225 | assert(obj2->head == obj1); 226 | assert(obj2->tail == obj1); 227 | 228 | root = nullptr; 229 | } 230 | 231 | // Test that the GC is not confused by cycles that are garbage. 232 | void test_unreachable_cycle() { 233 | // Make a cycle. 234 | Object* obj1 = allocate(); 235 | root = obj1; 236 | Object* obj2 = allocate(); 237 | obj2->tail = obj1; 238 | obj1->tail = obj2; 239 | 240 | // Make the cycle unreachable. 241 | root = nullptr; 242 | 243 | // Allocation should eventually recycle both objects. 244 | bool recycled1 = false, recycled2 = false; 245 | for (int i = 0; i < HEAP_SIZE; i++) { 246 | root = new_object(nullptr, root); 247 | if (root == obj1) 248 | recycled1 = true; 249 | if (root == obj2) 250 | recycled2 = true; 251 | } 252 | assert(recycled1); 253 | assert(recycled2); 254 | 255 | root = nullptr; 256 | } 257 | 258 | int main() { 259 | init_heap(); 260 | 261 | test_can_allocate_twice(); 262 | test_root_is_not_recycled(); 263 | test_full_heap(); 264 | test_nearly_full_heap(); 265 | test_reachable_objects_not_collected(); 266 | test_root_self_references(); 267 | test_root_cycle(); 268 | test_unreachable_cycle(); 269 | 270 | // Each test contains assertions that abort on failure, so if we get here, 271 | // all assertions passed.
272 | puts("Tests passed."); 273 | return 0; 274 | } 275 | --------------------------------------------------------------------------------
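The rooting rule described in the gctests.cpp comments (everything we care about must be reachable from `root` before each call to `allocate()`) may be easier to see in a small standalone sketch. The collector below is only a hypothetical mark-and-sweep stand-in written for this illustration, not necessarily what gc.cpp does; the `HEAP_SIZE` value of 64, the `marked` field name, and the helpers `build_list` and `list_length` are all assumptions introduced here.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in collector; the real gc.cpp may differ in every detail.
struct Object {
    Object* head;
    Object* tail;
    bool marked;  // assumed name for the private GC field
};

const int HEAP_SIZE = 64;     // assumed small value, chosen for illustration
Object heap[HEAP_SIZE];       // static storage, so `marked` starts out false
Object* free_list = nullptr;  // unused cells, linked through `tail`
Object* root = nullptr;       // the single root the program must maintain

void add_to_free_list(Object* obj) {
    obj->tail = free_list;
    free_list = obj;
}

// Call once before the first allocate().
void init_heap() {
    for (int i = 0; i < HEAP_SIZE; i++)
        add_to_free_list(&heap[i]);
}

// Mark everything reachable from obj. The `marked` check stops cycles.
void mark(Object* obj) {
    if (obj == nullptr || obj->marked)
        return;
    obj->marked = true;
    mark(obj->head);
    mark(obj->tail);
}

// Rebuild the free list from every cell that was not marked.
void sweep() {
    free_list = nullptr;
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].marked)
            heap[i].marked = false;  // live: clear the mark for the next GC
        else
            add_to_free_list(&heap[i]);
    }
}

// Returns a fresh Object, or null if every cell is reachable from root.
Object* allocate() {
    if (free_list == nullptr) {
        mark(root);
        sweep();
        if (free_list == nullptr)
            return nullptr;
    }
    Object* obj = free_list;
    free_list = obj->tail;
    obj->head = nullptr;
    obj->tail = nullptr;
    return obj;
}

// Hypothetical helper: build an n-cell list while obeying the rooting rule.
// The list built so far is always reachable from root before allocate() runs.
bool build_list(int n) {
    root = nullptr;
    for (int i = 0; i < n; i++) {
        Object* cell = allocate();  // may trigger GC; root keeps the list alive
        if (cell == nullptr)
            return false;           // heap is full of reachable objects
        cell->tail = root;
        root = cell;
    }
    return true;
}

// Hypothetical helper: count the cells reachable by following `tail`.
int list_length(Object* obj) {
    int n = 0;
    for (; obj != nullptr; obj = obj->tail)
        n++;
    return n;
}
```

With this sketch, a caller would run `init_heap()` once, then `build_list(10)`; the list should survive any number of later unrooted `allocate()` calls because it stays reachable from `root`. Holding the list only in a local variable instead would leave it open to being reclaimed by the next collection.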