Monday, December 31, 2012

3...2..1... Collect the Trash!

Ok so y'all probably remember wayyyyy back in October when I was discussing writing a garbage collector. Well, I really don't expect you to remember that. "You" being the pronoun referring to my two subscribers (one being a stalker). Ok, yeah I flatter myself, in reality, it's probably not a stalker of physical attraction, rather a psychotic blood thirsty individual. *waves*

Well, why do I want to dance with a garbage collector? Isn't that the evil-bastard-child/antithesis to region based memory management (RBMM), what I have been working on the past two years or so? Nope! Garbage collection can be a compliment. See, RBMM works by analyzing a program at compile-time (my system uses a gcc-plugin). From this analysis the compiler can insert memory allocation and reclaim calls (think malloc/free) at compile time. My system automatically free's memory, based on which objects point to which other objects. These groupings of objects can come from the same region of memory. A region cannot be reclaimed until all objects in it are no longer needed. Then, in one quick swoop, the memory of the region (the space that all the objects take-up) can be reclaimed. This fast memory reclaim is a key benefit of RBMM, as there is no need to visit each object and reclaim its memory individually.

Unfortunately, not all object lifetimes can be decided at compile time, such as global variables, or values pointed to/aliased by globals. This means that some objects cannot be reclaimed based on a compile-time static analysis. If these undecidable-lifetime objects are in a region, then that region lives for the entire lifetime of the program. Static analysis cannot force a region to die, unless it can be 100% sure that the region contents (objects) are never needed again. Global nastyness creates region bloat, as some objects might be reclaimable, and not others. As I previously mentioned, a region cannot be removed until ALL of its objects are no longer alive. If a global variable pops into the region, then that region now lives forever. Like a virus.

The above issue means that a garbage collector can actually benefit an RBMM system. That is what I have been working on: a combining of RBMM and garbage collection. This is not unique, and has been done before; however, the way we implement our collector is unique, but that is not discussed in this post.

Disclaimer: Grammar nuts please note that this is a blog post, so I am immediately relieved of any misuse of narrative modes. This is a design for my PhD and something that I have worked on with my advisers quite a bit (the "we"). The code is mine.

What I want to discuss in this post is how we locate stack variables. Stack variables form a subset of the root-set. The root-set being a collection of variables that are live at the time of garbage collection. If any data is accessible through the root-set then that object is also alive and must survive garbage collection. Else, the memory for data/objects that is not reachable can be collected. I will not go into details, or even the collector type I have chosen (e.g. mark-sweep, or copy, etc). Instead, I just want to talk about finding the stack variables; as I said a sub-set of the root set. Globals and registers also can be part of the root-set, but this post is only about stack variables </redundancy>.

When the garbage collector kicks-in the collector needs to start somewhere. It starts by scanning the root-set. Well, how does the collector get this root set? Specifically, what this post is concerned with is obtaining a sub-set of the roots, the stack variables. If you are familiar with stack-frames or activation records, the values there in the stack must be scanned. If any of those variables point to objects dynamically allocated (e.g. heap variables or variables from a region), then such objects must survive collection, and their memory cannot be recycled.

I have implemented a compiler modification (gcc-plugin) which provides hints to the garbage collector as to how to trace the stack of the program. This is how the collector gets the subset of the root-set, the stack variables. Actually, it is so cool and nifty I warranted this entire blog post to the notion.

My aim is to provide, at compile-time, a table that can be read by the garbage collector at runtime. During compile-time, as the source-code is being processed my plugin locates all function calls in the program. At the return of each function call a label is inserted (think assembly goto/label). This label is unique for each callsite, and makes up the key into a table that I am building. This key points to information about the function that the label is inside. For the example just below, the three labels are unique, but all point to the same function information. Information about my_function(), which specifies what local variables it has (x, y, z), if the variables are pointers, and how big the variables are.

void my_function(void)
    int x;
    float y;
    double z;



The function above has three callees (foo, bar, baz). At each call site, my plugin inserts a label. Now, why do I insert labels? Well, at runtime we need a way of associating the return point for each function call (foo, bar, baz). At compile time I do not have any addresses to use, hell, addresses are created by the OS when the executable is loaded; we haven't even finished compiling the source code yet! The solution is to insert a label which can be used at compile time to build a table of stack-variable information. The loader will fix-up the addresses, and at runtime the labels will have the correct address that the function call returns to. Recall that each time a function is called, the return instruction pointer RIP or return address will be stored on the stack. This value will be the same as the respected label. For instance, consider that my_function() is being executed, and the line which calls bar() is executed. When bar() gets called, the return point, which now happens to also match __label1, is placed onto the stack. And when bar() returns, the return point is now the same as __label1. The only thing we have done so far is insert labels. No big deal. Note: I doubt anyone except for maybe one person has read this far, if you do, email me, you kick so much ass, that next time we meet, I will buy you coffee produced from the most lavish and rare of coffee seeds: unicorn digested coffee. I will then serenade you with a ear-piercing song and proclaim you Captain Badassery and dance around bear-chested wearing a Burger-King crown and throw pennies at your feet.

As mentioned, the labels are entries into a table that contains function information (variables, stack size, etc). When the garbage collector kicks-in, it looks at the stack and locates the return point. That value is a key into a function table my plugin generated, remember the RIP coincides with the added labels. The collector uses the RIP/label as a key to get function information, and then processes the variables (figuring out which heap values can be accessed though the local variables in that function). Once that function is scanned, its caller function is scanned, and so on... it's turtles all the way down... until the collector hits main(). Then this portion of garbage collection is complete.

TLDR; Anyways, the above diatribe is an expose into the use of labels for providing a means of accessing stack variables, and providing a compile-time generated way of performing a runtime stack trace.

Have an awesome New Year and may all of your garbage be collected properly!



  1. Am I the stalker or the other reader? Not sure. This being the internet, I could be a dog.

    Anyway, this sounds super cool/hard, and I can't wait to hear more about it. And get some of that unicorn coffee.

  2. Nah, you are not the stalker I had in mind. But I am pleased you read this. It humbles me. And yes, pennies will be tossed at your feet.