Pages

Saturday, August 31, 2013

Cheers Mate

Man! The past three years (a little more) have been amazing. I have met some of the brightest and nicest people on the planet. So, I am moving back to the US. In fact, I sit here now in the Auckland, New Zealand airport tapping this out. It's sad to leave, but I can't wait to start a new adventure. Matt 3.0.

I left with a nice bang though! Last night was the monthly Ruxmon security/hacker talks in Melbourne, and I delivered a talk: "An Organized Talk on Disorder: Defeating Address Randomization to Make the Universe Boring." I must say that it was one of the best talks I have ever delivered. I talked about a technique I discovered to defeat address space layout randomization. Anyways, I have not publicly published anything on this... yet.

I love Melbourne and Ruxmon, and having made some of the best friends there, I will miss the place. But in three years, I only saw two kangaroos in the wild, no snakes, no drop bears, one big-ass monster spider, and killed a single jar of Vegemite. Good on me, mate.

-Matt

Wednesday, August 14, 2013

Non-deterministic Trash Pick-up, or Why a Garbage Collector Might Influence Itself

Determinism is a sexy property for software verification. Given a known input, a deterministic program should consistently produce the same output. This makes testing trivial: simply match the program's output against what has previously been determined to be the correct output. If the two outputs match, *bang bang* we passed output verification and the PHB is pleased. This, of course, assumes that the program is free of any side effects that would produce non-pure/non-deterministic output. For a deterministic program, any unexpected output is a clear sign of a bug. For other types of programs, non-deterministic output can result from the program relying on clock values, random values, or other input that can be hard to replicate in testing (such as mouse or other device interrupts). However, a proper verification setup should have a way to statically set such variables so that the program's output can be properly tested. For instance, seeding the program's random number generator to a known value during testing.

I was recently surprised to find one of my memory management test cases producing non-deterministic output, even though the program is trivial, relies on few external libraries, and has no obvious random or non-deterministic input. Actually, I discovered this a while back, but recently revisited it. Simply put, I was gathering data for my thesis work on region-based memory management (RBMM) and found that the garbage collector I was comparing my RBMM system against kept producing different collection counts (the number of times the collector executed during my test case's run). Both garbage collection and RBMM are forms of automatic memory management, and I wanted to see how well my RBMM system compares to the more common garbage collection. My research is based on implementing an RBMM system in Go and then comparing my work against the stable garbage collector that Go ships with.

Anyways, I was able to replicate the effect I had previously witnessed in a simpler test (below) using the Go runtime provided by gcc-4.8.1. Remember that Go is garbage collected, thus its garbage collector was producing different collection counts (NumGC below) for each execution of the test.

package main

import "runtime"

func main() {
    for i := 0; i < 10000000; i++ {
        f := new(float64)
        *f++
    }

    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    println("NumGC: ", m.NumGC)
}

Why would Go's collector be producing non-deterministic collection counts? This program obviously does the same thing every execution, and generates the same amount of garbage each execution. Well, a couple of things could be happening. Perhaps Go's collector was looking at overall system resources, not just those of the test case, when deciding to collect. Or, perhaps it uses a random value which might affect collection. As it turns out, the latter was the case. Go, by default, enables some statistics/profiling collection during runtime. What tipped me off was that I grepped for "rand" within the runtime source and found the following in its memory allocation function "runtime_mallocgc", located in gcc-4.8.1/libgo/runtime/malloc.goc. Note that I modified the whitespace some:

if(!(flag & FlagNoProfiling) && (rate = runtime_MemProfileRate) > 0) {
    if(size >= (uint32) rate)
        goto profile;
    if((uint32) m->mcache->next_sample > size)
        m->mcache->next_sample -= size;
    else {
        // pick next profile time
        // If you change this, also change allocmcache.
        if(rate > 0x3fffffff) // make 2*rate not overflow
            rate = 0x3fffffff;
        m->mcache->next_sample = runtime_fastrand1() % (2*rate);
    profile:
        runtime_setblockspecial(v, true);
        runtime_MProf_Malloc(v, size);
    }
}
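The sampling logic above can be modeled in a few lines of Go (a simplified sketch, not the runtime's actual code; the `sampler` type and `newSampler` helper are my own names): each allocation spends down a randomly chosen byte budget, and when the budget runs out, the allocation is profiled and a new budget is drawn.

```go
package main

import (
    "fmt"
    "math/rand"
    "time"
)

// sampler models the runtime's next_sample bookkeeping.
type sampler struct {
    rng        *rand.Rand
    rate       int // average bytes between samples (the MemProfileRate)
    nextSample int // bytes remaining until the next profiled allocation
    profiled   int // number of allocations recorded so far
}

func newSampler(rate int) *sampler {
    return &sampler{rng: rand.New(rand.NewSource(time.Now().UnixNano())), rate: rate}
}

func (s *sampler) alloc(size int) {
    if s.nextSample > size {
        s.nextSample -= size
        return
    }
    // Profile this allocation and draw the next sample point uniformly
    // from [0, 2*rate), so the spacing averages out to rate bytes.
    s.profiled++
    s.nextSample = s.rng.Intn(2 * s.rate)
}

func main() {
    s := newSampler(512 * 1024) // Go's default MemProfileRate
    for i := 0; i < 10000000; i++ {
        s.alloc(8) // one float64 per loop iteration, as in the test above
    }
    fmt.Println("profiled allocations:", s.profiled)
}
```

Because the sample points are drawn at random, the exact set of allocations that get the extra profiling bookkeeping differs on every run, and in the real runtime that bookkeeping is itself memory pressure the collector responds to.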

So their profiler is having an effect on the amount of memory pressure on the garbage collector, causing the collector to kick in more often than it would in a non-profiled runtime. So, what does this mean? Well, if you do not need profiling support (and you probably do not in production), then disable the profiler! I did just that and then executed my simple test multiple times. Bingo! With the profiler disabled, I achieved consistent, predictable garbage collection counts from Go's collector. To disable the profiler, simply add this line to your main():

    runtime.MemProfileRate = 0

Setting the profiling rate to zero stops the runtime from collecting allocation metadata. Otherwise, that metadata collection will influence the garbage collector, which is probably not what you want in a production executable (or while performing verification/testing).

-Matt