I had a rather interesting bug the other day.

On Xbox, we're hooking and overriding everything we can in terms of memory allocation, both to get an overview of what is happening in the retrofitted-from-PC title, and to tweak and better control it. Since everything is statically linked, and there's support for this kind of stuff from the console, there's no need for nasty DLL injection hacks and other complicated stuff; you hook it, you start the app, and even before the first line of main executes you've seen tons of allocs fly by just from the CRT kicking itself up and running.
So we make a heap when we see the first request coming in, and direct some requests to that heap. The strange bug we saw was that data would be created, the app would continue booting for a while, and then suddenly the code decided to create the data again. Something like this:
    struct heap_t
    {
        heap_t() : heap(0) {}
        void create() { heap = make_xbox_heap(); }
        void *heap;
    };

    heap_t the_heap;

    void do_alloc()
    {
        if (!the_heap.heap)
            the_heap.create(); // ***
        ... // alloc from the_heap
    }
As said, create() works fine, heap != 0, it's all good... and then suddenly we start over and reach that line marked with *** again.
After some head scratching and going through the usual suspects (multithreading? out-of-bounds overwrite?), I thought of the C++ requirements for global objects. The spec says that these objects need to be constructed properly before execution first gets to any code in the same cpp file. There's no guarantee that all the global objects would, for example, be constructed (i.e. have had their constructor called) before the first line of main() is reached. There's also no guarantee about the order in which global objects from different files are constructed; all objects in one single cpp file will be constructed in top-to-bottom order, but the objects in a.cpp may be created before those in b.cpp, or after -- it's up to the implementation.
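The one ordering guarantee mentioned above, top-to-bottom within a single cpp file, can be sketched as follows. The names tracer and construction_log are made up for this illustration; the log lives in a function-local static so that it exists whenever the first global constructor touches it.

```cpp
#include <string>

// Hypothetical helper: the log is created on first use, so it is valid
// no matter which global constructor calls this first.
std::string& construction_log()
{
    static std::string log;
    return log;
}

// Hypothetical type whose constructor records that it ran.
struct tracer
{
    explicit tracer(char id) { construction_log() += id; }
};

// All three live in the same cpp file, so they are constructed in the
// order they are defined: a, then b, then c.
tracer first('a');
tracer second('b');
tracer third('c');
```

Split the three tracer objects over several cpp files and the relative order between files is no longer something you can rely on.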
So... could it be that somehow the constructor got called twice? Is some other cpp file doing something weird with extern heap_t the_heap? Putting a breakpoint in the constructor reveals that it is called only once... and this after we reached ***.
Ok, now the pieces fall together; what's happening is that we're so low in CRT-alloc-hook land that we've managed to reach this code before the compiler/CRT had a chance to go through the list of constructors to call for this file, properly initing the_heap via heap_t().
So... how come we didn't get crap in the_heap.heap and nicely executed the *** line rather than carrying on with a corrupt heap handle? It's nicely zero, even in non-Debug builds...
Answer: because the_heap is a global. Out with the spec again: every object with static storage duration is zero-initialized before any other initialization takes place, whether or not its type has a constructor. Note that heap_t is not a POD, or Plain Old Data(structure): a POD has no virtuals and no user-declared constructors, and heap_t declares one. So the zeroing doesn't replace the constructor; it's a separate, earlier step, and heap_t() still has to run at some later point.
So... did we just get very lucky (or unlucky) that the first alloc callback happened to come after the zeroing, but before the constructor call? And where does that zeroing actually happen, seeing as nobody wrote a memset?
My guess is that there's no memset at all, but rather the compiler just laid out all globals without a constant initial value in a separate data section, and that data section is set up zero'd and all by the image loader.
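The "zero first, real initialization later" behaviour can be observed in a small sketch. It uses plain ints rather than heap_t to keep things simple and well defined, and the names make_handle, early_snapshot and late_handle are invented for this example. The volatile read is only there to stop the compiler from turning the initializer into a compile-time constant, which it would otherwise be allowed to do.

```cpp
// Hypothetical stand-in for a constructor: the volatile read forces the
// value to be computed at run time.
int make_handle()
{
    static volatile int source = 42;
    return source;
}

extern int late_handle;   // defined further down in this file

// Initialized first. At this point late_handle's initializer has not run
// yet, so this copies the zero-filled static storage.
int early_snapshot = late_handle;

// Initialized second, at run time.
int late_handle = make_handle();
```

By the time main() runs, early_snapshot holds 0 and late_handle holds 42: the early reader saw the zeroed storage, just like the alloc hook did with the_heap.heap.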
So to sum up:
- the loader grabs zero'd memory from the OS for the_heap, the app starts, the first alloc comes in;
- we see a zero handle, and create a heap;
- we service any further alloc requests from that heap;
- at some point, flow reaches this file through regular (non-callback) means;
- the compiler/CRT sees this and makes sure heap_t() is called, which sets the handle back to zero;
- on the next alloc the handle is zero again, so we create another heap. Great, we just managed to leak an entire heap :)
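The sequence above can be replayed in a small simulation. This is only a sketch: make_fake_heap, heap_handle, run_late_constructor and simulate_boot are made-up stand-ins (the real code uses make_xbox_heap() and heap_t), and the late constructor call is modelled as an ordinary function so the order of events is explicit.

```cpp
// Hypothetical stand-ins for the real heap and its bookkeeping.
static int heaps_created = 0;
static char fake_heaps[8];

void* make_fake_heap()
{
    return &fake_heaps[heaps_created++];
}

void* heap_handle;   // static storage, so it is zero before any code runs

void do_alloc()
{
    if (!heap_handle)
        heap_handle = make_fake_heap();
}

// Stands in for heap_t() running late: it resets the handle to zero.
void run_late_constructor()
{
    heap_handle = nullptr;
}

int simulate_boot()
{
    heaps_created = 0;       // start from the state the loader gives us
    heap_handle = nullptr;

    do_alloc();              // first CRT alloc: handle is zero, heap #1 is created
    do_alloc();              // handle is set, no new heap
    run_late_constructor();  // the global constructor finally runs
    do_alloc();              // handle is zero again: heap #2 is created, heap #1 leaks
    return heaps_created;
}
```

simulate_boot() returns 2: one heap that is still in use, and one that nobody holds a handle to anymore.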
God I love C++ :)
For a bedtime story, it's rather action-packed, don't you think? I know my kids won't sleep after reading them this.