Been teasing some results out of the RBMM (region-based memory management) stuff I have been working on. I am really excited, not so much about the data, but about the mad LaTeX table I just cranked out.
-Matt
Wednesday, August 24, 2011
Saturday, August 20, 2011
"It's good to warm my bones beside the fire"
Pink Floyd? Yeah? Bueller? Bueller? That's right, 757. Be prepared: grand master pimptastic from Down Under is planning to do it up in the 757 for a few weeks. Booked a flight home from August 31st to September 16th.
Home, home again
I like to be here when I can
When I come home cold and tired
It's good to warm my bones beside the fire
Far away across the field
The tolling of the iron bell
Calls the faithful to their knees
To hear the softly spoken magic spells
-- Pink Floyd
Thursday, August 18, 2011
Non-deterministic Memory Management
So when the implementation mandates that all returned memory be zeroed... it should be. In other words, if the compiler obtains memory via a routine that does not zero it, a la malloc(), the compiler should wrap the allocator and zero the data itself, or use calloc(). Otherwise, the result can be a non-deterministic executable: whatever happens to be sitting in the freshly allocated block leaks into the program's behavior.
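To make that concrete, here is a minimal C sketch (a hypothetical example, not anything from my plugin) showing how an allocator that does not zero memory can make a program's behavior depend on whatever garbage is already on the heap, while calloc() or an explicit memset() removes that source of non-determinism:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* malloc() makes no promise about the contents of the returned block.
     * Reading it before writing is formally undefined behavior in C,
     * which is exactly the problem: 'flag' may be zero on one run and
     * non-zero on another. */
    int *flag = malloc(sizeof *flag);
    if (!flag)
        return 1;
    if (*flag)                                  /* non-deterministic branch */
        puts("garbage was non-zero");

    /* If the language/implementation mandates zeroed memory, the compiler
     * or runtime must guarantee it, e.g. by using calloc() ... */
    int *zeroed = calloc(1, sizeof *zeroed);
    if (!zeroed)
        return 1;
    printf("calloc'd value: %d\n", *zeroed);    /* always 0 */

    /* ... or by wrapping malloc() and zeroing explicitly. */
    int *wrapped = malloc(sizeof *wrapped);
    if (!wrapped)
        return 1;
    memset(wrapped, 0, sizeof *wrapped);
    printf("wrapped value: %d\n", *wrapped);    /* always 0 */

    free(flag);
    free(zeroed);
    free(wrapped);
    return 0;
}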
I wonder, since Bell's theorem suggests that even if we account for all the local hidden variables, we still cannot account for the non-determinism seen in quantum mechanics. Heisenberg's Uncertainty Principle, then, seems to say that, yes, despite what makes sense to me, the universe does have non-deterministic properties. So, I suppose we could bubble this up to the compiler world and memory management (this reduction might be grossly wrong) and suggest that non-determinism really does exist if a memory management routine, such as malloc, can induce non-deterministic behavior in an executable.
-Matt
Labels: Bell, Compiler, Heisenberg, Memory Management, Non-determinism, Universe
Friday, August 12, 2011
On Self Organization and Elevators
Major scientific breakthrough. I was just on the elevator with a bunch of technical students (computer science, possibly engineering). It seemed as if we all optimized the space in the elevator as we entered it, as we all lined up around the border of the car. I wonder, do people from other disciplines happen to organize themselves differently in elevators? Do particle physicists happen to distribute themselves evenly throughout the elevator car?
-Matt
Wednesday, August 10, 2011
On Compilers, Insanity, and Global Variables
According to Einstein, the definition of insanity goes something like the following: "doing the same thing over and over again and expecting different results." Compilers should always produce the same code given the same input. However, I have noticed something odd, and not for the first time. I rely on gcc's predicate function to tell my analysis whether a variable I am looking at is "global" or not. I have started to notice that I can run the compile once and the variable is not global (which is how the source code defines it), yet I can run the same compile again and the variable is global?! The compiler is making me second-guess my sanity. How is this possible? Well, I know gcc can generate performance/profiling data, which, aside from timestamps, is really the only non-deterministic thing in gcc I can imagine. I am not sure whether the cause of this insanity is in the front-end or the middle-end. I have md5-summed my plugin and it is the same in both the "compiler thinks it's global" and the "compiler thinks it's not global" cases. Hmmmmmmmmmm...
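For reference, the kind of check I mean looks roughly like the sketch below. I am assuming GCC's is_global_var() predicate from tree.h here (essentially TREE_STATIC || DECL_EXTERNAL); the surrounding pass boilerplate and exact include set are omitted and vary by GCC version.

/* Sketch only: assumes GCC's internal is_global_var() predicate;
 * pass registration and most includes are omitted. */
#include "gcc-plugin.h"
#include "tree.h"

int plugin_is_GPL_compatible;

static void
note_globalness (tree var)
{
  const char *name = DECL_NAME (var)
                     ? IDENTIFIER_POINTER (DECL_NAME (var))
                     : "<unnamed>";

  /* This is the answer that appears to change between otherwise
     identical compiles.  */
  if (is_global_var (var))
    fprintf (stderr, "%s: global\n", name);
  else
    fprintf (stderr, "%s: not global\n", name);
}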
-Matt
Sunday, August 7, 2011
Unification for Region Inference
So I figured it would be a reasonably good idea to write about my research into region-based memory management (RBMM). Mainly, this post is targeted towards explaining how my gcc-plugin finds objects that are candidates for region'd memory. But before I get too far ahead of myself, let me quickly introduce RBMM.
RBMM is a method of memory management that allows a program's dynamic memory usage to be managed automatically at runtime. Through static analysis at compile time, region calls can be emitted into the code of the program being compiled. At runtime, these region calls (e.g. create region, allocate memory from region, delete region, etc.) are executed. Some might term this a form of compile-time garbage collection, and that is not entirely incorrect; however, RBMM can be faster than garbage collection. Most garbage collection methods require the execution of the program to be suspended (extending the total runtime of the application) so that dynamically allocated data that is no longer in use can be freed. In contrast, regions place dynamically allocated data with identical lifetimes into the same chunk of memory. Therefore, freeing the memory associated with a region is really fast: just kill the region and, boom, all memory in that region is reclaimed. This is much faster than visiting each object individually, and it does not require halting the program's execution. Consider a linked list. We could put the whole list into one region, allowing us to free all the nodes at once when the list is no longer needed, and possibly enhancing cache locality as well.
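To make the linked-list example concrete, here is a tiny sketch of what region-style allocation looks like from the program's point of view. The region_create/region_alloc/region_destroy API and the bump-allocator implementation are hypothetical, purely for illustration; this is not the runtime interface my plugin actually emits calls to, and a real RBMM runtime would grow regions in pages, handle alignment, and so on.

#include <stdlib.h>
#include <stddef.h>

/* Minimal, hypothetical region: one fixed-size bump allocator. */
typedef struct {
    char  *base;
    size_t used;
    size_t capacity;
} region_t;

static region_t *region_create(size_t capacity)
{
    region_t *r = malloc(sizeof *r);
    if (!r) return NULL;
    r->base = malloc(capacity);
    if (!r->base) { free(r); return NULL; }
    r->used = 0;
    r->capacity = capacity;
    return r;
}

static void *region_alloc(region_t *r, size_t size)
{
    if (r->used + size > r->capacity)
        return NULL;                 /* a real runtime would grow the region */
    void *p = r->base + r->used;
    r->used += size;
    return p;
}

static void region_destroy(region_t *r)
{
    free(r->base);                   /* every object in the region dies here */
    free(r);
}

struct node { int value; struct node *next; };

/* All list nodes share one lifetime, so a single region_destroy()
 * reclaims the whole list -- no walking the list to free() each node. */
static void build_and_drop(int n)
{
    region_t *r = region_create((size_t)n * sizeof(struct node));
    struct node *head = NULL;
    if (!r) return;

    for (int i = 0; i < n; i++) {
        struct node *nd = region_alloc(r, sizeof *nd);
        nd->value = i;
        nd->next  = head;
        head      = nd;
    }
    /* ... use the list ... */
    region_destroy(r);
}

int main(void)
{
    build_and_drop(1000);
    return 0;
}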
The approach I am taking is to study RBMM from the perspective of the gcc compiler. My RBMM implementation uses a gcc-plugin which analyzes the program being compiled and then emits the region code.
The plugin operates in two primary passes: the first (analysis) pass analyzes the program and generates data which the second (transformation) pass later uses to emit, into the program, the code which executes the region calls (create/alloc/delete). The first piece of the analysis pass looks for variables that are pointers, which could possibly be used for accessing dynamic/heap memory. The gcc compiler provides us with a generic intermediate representation which is language (C, C++, Ada, etc.) agnostic. This intermediate representation is called GIMPLE. Further, my passes operate on (analyze and transform) the GIMPLE code, which is presented in static single assignment (SSA) form.
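As a rough sketch of what the analysis pass's inner loop looks like at the GIMPLE/SSA level: it walks every statement of every basic block and picks out assignments whose left-hand side has pointer type. This uses GCC-internal APIs roughly as they existed around GCC 4.6; pass registration and the exact include set are omitted here and depend on the GCC version.

/* Sketch of the analysis pass's statement walk (GCC internals,
 * boilerplate omitted).  Not the actual plugin source. */
static unsigned int
rbmm_analyze (void)
{
  basic_block bb;
  gimple_stmt_iterator gsi;

  FOR_EACH_BB (bb)
    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
      {
        gimple stmt = gsi_stmt (gsi);

        if (is_gimple_assign (stmt))
          {
            tree lhs = gimple_assign_lhs (stmt);
            if (POINTER_TYPE_P (TREE_TYPE (lhs)))
              {
                /* Pointer assignment: unify the lhs with any pointer
                   operands on the right-hand side (see below).  */
              }
          }
        else if (is_gimple_call (stmt))
          {
            /* Allocation calls (malloc/new) are handled immediately;
               other calls are deferred until the first pass finishes.  */
          }
      }
  return 0;
}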
The first pass visits each statement in the program. In the case of an assignment statement, we look at the value being assigned to (the left-hand side). If that value is a pointer, our analysis takes note of it. Any variables on the right-hand side of the assignment which are used for calculating the left-hand side value have some association with it. This association we call a unification: the left-hand side is unified with the variables on the right-hand side. But, as mentioned, we only care about pointers, so only the right-hand side pointers are associated with the left-hand side. The same goes for the case where the left-hand side is being set from a memory allocation call (malloc/new). For example, assume we are analyzing the following statement: foo = somepointer; In this case we would add both 'foo' and 'somepointer' to a single unification piece. In the case of complex data structures, such as 'foo.bar = somepointer;', our analysis would unify 'foo' and 'bar', and then unify that piece (foo and bar) with 'somepointer'.
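Under the hood, this "put these variables into the same piece" operation is essentially a union-find (disjoint-set) structure. Here is a small standalone sketch of that idea, not the actual plugin code; plain integer ids stand in for GCC's trees/SSA names.

#include <stdio.h>

/* Standalone union-find sketch of unification: each pointer variable
 * gets an integer id, and unifying two variables merges their sets. */
#define MAX_VARS 128

static int parent[MAX_VARS];

static void uf_init(void)
{
    for (int i = 0; i < MAX_VARS; i++)
        parent[i] = i;
}

static int uf_find(int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];   /* path halving */
        x = parent[x];
    }
    return x;
}

/* "Unify" two variables: after this they belong to the same piece,
 * and will end up sharing a region. */
static void uf_union(int a, int b)
{
    parent[uf_find(a)] = uf_find(b);
}

int main(void)
{
    enum { FOO, BAR, SOMEPOINTER, OTHER };
    uf_init();

    /* foo.bar = somepointer;  =>  unify(foo, bar); unify(foo, somepointer); */
    uf_union(FOO, BAR);
    uf_union(FOO, SOMEPOINTER);

    printf("foo and somepointer in the same piece? %s\n",
           uf_find(FOO) == uf_find(SOMEPOINTER) ? "yes" : "no");
    printf("foo and other in the same piece? %s\n",
           uf_find(FOO) == uf_find(OTHER) ? "yes" : "no");
    return 0;
}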
(pssst... hey... if you read this far, send me an email or leave a really lame comment)
In the case that our analysis comes across a function call (a callee), except for memory allocation calls as mentioned above, we have to defer the analysis of that statement until the first pass has completed. Once that analysis is complete, we have unification data specific to each function, which we use as we revisit each function call statement that we previously deferred. This portion of the analysis now becomes interprocedural, as it requires knowledge of the other functions in the program which are being called. In other words, after we have intraprocedural data for each function, we can use that information interprocedurally to analyze function calls. When our analysis comes across a deferred function call, we can query our stored analysis data and get the unification information about the callee (the thing being called). From this data we can tell whether a function takes a pointer and whether that function will need to have regions passed to it. Therefore, we can unify the arguments we pass to the function being called, as well as any left-hand side value which is being set from the callee's return value. This unification is based on the callee's data, which we are guaranteed to have after the first pass has completed. A rough sketch of this deferral follows below.
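Building on the union-find idea above, here is a sketch of the deferred call-site handling (hypothetical names, not the plugin's actual data structures): during the first pass we just record each call site; afterwards, once every callee's summary exists, we unify the actual arguments with the callee's formal parameters and the receiving variable with the return value. For simplicity the sketch assumes one global id space shared by caller and callee; the real analysis maps callee unification classes into the caller's.

/* Hypothetical sketch of deferred interprocedural unification.
 * Relies on the union-find uf_union() from the sketch above. */
void uf_union(int a, int b);

#define MAX_CALLS 64
#define MAX_ARGS   8

struct callee_summary {
    /* Unification class id of each formal parameter and of the
       return value, from the callee's own first-pass data. */
    int formal_class[MAX_ARGS];
    int return_class;
    int nformals;
};

struct deferred_call {
    struct callee_summary *callee;   /* guaranteed valid after pass 1 */
    int actual_arg[MAX_ARGS];        /* caller-side variable ids */
    int lhs;                         /* variable receiving the return value, or -1 */
};

static struct deferred_call deferred[MAX_CALLS];
static int n_deferred;

/* Pass 1: just remember the call site. */
static void defer_call(struct deferred_call c)
{
    deferred[n_deferred++] = c;
}

/* After pass 1: every callee summary exists, so unify actuals with
 * formals, and the lhs with the callee's return value. */
static void resolve_deferred_calls(void)
{
    for (int i = 0; i < n_deferred; i++) {
        struct deferred_call *c = &deferred[i];
        for (int a = 0; a < c->callee->nformals; a++)
            uf_union(c->actual_arg[a], c->callee->formal_class[a]);
        if (c->lhs >= 0)
            uf_union(c->lhs, c->callee->return_class);
    }
}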
Once the unification process has completed, the transformation pass can rock. But ultimately that is a discussion for another post.
tl;dr Put things together
-Matt
Thursday, August 4, 2011
Milestone, 666 of Them!
So today I hit 666 scrobbles/hits/plays of Slayer on my last.fm account! This is a really neat number: although not prime, it is the sum of the first 7 primes [1], and 7 is a prime too. (This also happens to be my 66th post on this blog.)
Sources:
[1] http://en.wikipedia.org/wiki/666_(number)
EDIT(6Aug2011): Whoops! I made a mistake: 666 is the sum of the first seven prime SQUARES, not the sum of the first seven primes. Sorry!
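For the record, the corrected claim checks out: 2² + 3² + 5² + 7² + 11² + 13² + 17² = 4 + 9 + 25 + 49 + 121 + 169 + 289 = 666.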
-Matt