No matter what fancy allocator you come up with, memory allocation will always be expensive. In order to reduce memory allocations I have been using stack-allocated containers for the past four-five years and I think it has worked really well, so I thought I’d write a post about them here. These methods have been used in all our games (Sprinkle, Granny Smith, Smash Hit as well as our new project).

Using containers is really convenient and often necessary. Take the array for example:

MyArray<Object> array;

Everybody’s got one. The problem with these are that you need to allocate memory when putting stuff in them. In many cases you know on beforehand approximately how many objects you want to put in there, reducing it to a single allocation:

MyArray<Object> array;
array.reserve(50);

But if this code is in a function that is called frequently it would be much nicer if those first fifty objects were reserved on the stack instead of the heap, like this:

MyArray<Object, 50> array;

Now, the array object would have built-in storage for fifty objects and still be able to grow beyond that using regular heap allocations. Great! Or? What if you want to pass such an array by refence to a method? Say, a string splitting method:

void splitString(const String& del, MyArray<Object, 50>& result);

The obviously problem is that the output array now has to be stack allocated for exactly fifty objects. If we try to pass another array it won’t compile, because C++ requires arguments with matching template arguments:

MyArray<Object, 8> result;
String str="this is a test";
str.split(" ", result);

Compiler error!

This effectively kills the entire charm of using template arguments for the array size. So how can we design a container class that can be passed around by reference while still offering stack allocation with a template argument?

What I’ve done is to create a base class, without stack allocation that can be used as a regular array:

template<class T>
class MyArray
{
    ...
    void* mData;
};

And then a subclass for the stack-allocated version:

template<class T, int size>
class MyArrayStack : public MyArray
{
    MyArrayStack()
    {
        mData = mStackStorage;
    }
    ...
    char mStackStorage[size*sizeof(T)];
};

The problem now becomes resizing the array at the base class, because the base class won’t know wether the data storage is heap or stack allocated, and we don’t want to have virtual methods and a vtable for such a lightweight class. Therefore, the base class needs to be aware of the subclass and avoid freeing memory when it is stack-allocated.

Fortunately this is not very complicated, since the stack allocation will always be placed immediately after the array object itself in memory, so we can just look at the pointer when resizing. If the storage pointer is right after the object itself in memory we have a stack allocation:

void resize(int newSize)
{
    ... allocate heap memory and copy over contents from mData
    if (mData != ((char*)this)+sizeof(MyArray))
        free(mData);   
}

Now we can modify the string splitting method to accept the base class:

MyArrayStack<String, 8> result;
str.splitString(" ", result);

or

MyArrayStack<String, 128> result;
str.splitString(" ", result);

Compiler happy!

So the next big question would be – is this safe? There is still a small (well microscopic) chance that a non-stack allocated array object lines up right at the end of the last stack frame, and at the next memory adress is our heap-allocated storage data for it. In practice though this won’t happen, because in standard memory allocators there is a header before the allocated data. Even if you use your own allocator without a header, this won’t happen, because the memory area it is using would still need to be allocated by the OS, which will put a header in front of it.

I’m currently using this for arrays, sets, hash tables as well as memory streams, memory buffers and a few other things. It has been incredibly useful and probably one of the most important optimizations I have ever added to my framework. This effectively avoids memory allocations altogether without the hassle of using the typical “maxSize” for method arguments. I’m currently not using it for string objects (which all have a fixed stack allocated storage that depends on the project), but I’m kind of tempted to refactor that.