Here's a hopefully illustrative example of a special purpose memory manager: The...

Here's a hopefully illustrative example of a special purpose memory manager: The naive bitmap

Take a fixed-size pool of memory and divide it into a fixed number of equally-sized regions. Say, if your game's entities take up a maximum of 32 bytes of memory each, and you are ok with imposing a hard limit of 256 entities in the world at once, you could allocate an 8 kilobyte memory buffer and manage it yourself. Already there are apparent benefits in this scheme in reducing memory fragmentation (all of the game entities are in this 8 kilobyte region, and not scattered all throughout memory) and thus cache coherency (it's quicker to sequentially access memory in a small region of memory than it is to "hop around" all over the place).

How might we manage this memory? We really just need a way of allocating regions ("I want to create a new entity, please give me a free region to store it in") and freeing them when we're done with them (enemy died or whatnot). Because we have a fixed number of fixed-size regions, it's very straightforward (well under a 100 lines of C) to use a bitmap for doing so. Well, bitmap is the common terminology for this data structure, but I prefer to think of it as a binary set: you have a bunch of bits in RAM, and a 1 bit at some integer index indicates that the index is in the set, and a 0 bit indicates that it isn't. For example, if you have the byte 00010011, then 0, 1, and 4 would be in the set, while 2, 3, 5, 6, and 7 would not. [EDIT] Er, I guess "set of integers under some limit" is a more accurate description, but I like the way that "binary set" sounds.

So, dividing the 256 regions of the example by 32 bits in an x86 machine word, you would need 8 machine words to store this binary set, which take up 32 bytes total. All we need to keep track of is whether or not each region is free, so it's a suitable data structure for this purpose.

Most modern machine architectures have instructions for finding the index of the first set (1) bit in a machine word in a single, very quick operation.[1] So, we can use that to our advantage by representing a free region as a set bit, and an allocated region as a clear bit. Then, when you need to allocate a new entity, we can very quickly iterate over the bitmap like so: for each machine word, skip it if it is equal to 0 (all bits are clear, ie the region is fully allocated anyway), and otherwise call find-first-set to find the index of the next free region. Clear that bit (to indicate that it is now in use), and return the index of that bit to the caller (who can then use it like an index into an array, or convert it to a pointer). Freeing regions is even simpler: You just set the bit in the bitmap corresponding to the index in the memory buffer.

We also get another very useful operation for free with this scheme: Iterating over every object in the set. Say, we want to call "update" on every entity in our buffer. Make a copy of the binary set, and invert it, so that now a set bit indicates an allocated region. Then, it's very simple to create an iterator that can traverse every entity in existence: Call find-first-set to find allocated entities, call their update function, then clear the associated bit in the inverted binary set. Keep doing so until you reach the end of the set.

That's a very simple scheme to implement, but you can only really use it when it fits your memory allocation patterns well, or when you can tailor your memory allocation patterns to fit the memory management scheme. In a game, it can be acceptable to impose a hard limit on, say, the number of entities on screen (you could easily have a rule somewhere that stops spawning enemies if a certain limit has been reached, or start despawning less important entities when memory gets tight), but this probably wouldn't fly if you were maintaining a data structure for handling website requests: "Sorry, this site only supports up to 1K concurrent users, please come back later"

Now, think about the kind of changes you might have to make to make this manager more "flexible": How could you make it handle regions of various sizes? How could you make it handle growing the memory buffer (so that you can create more than the "maximum" number of entities)? How would you make it handle various types of objects that just happen to take the same amount of space? How about handling extremely large swaths of memory? Can we reasonably predict how the memory manager will be used in the program, or is that entirely out of our hands? The closer you get to malloc's use case, the less useful this data structure becomes, and by the same token, you should be starting to see how a manager that is expected to perform as best as possible under arbitrary usage patterns can be sub-optimal for a specialized use case.

Lastly, it's perfectly acceptable to use different memory management techniques at different levels of memory granularity. Consider that that is essentially what you'd be doing if you malloc a buffer that you then manage with the bitmapped memory allocator I described.

[1] https://en.wikipedia.org/wiki/Find_first_set