While working on Catacombs Plus, I ran into a weird issue: starting a new run after ending one would crash the runner with no error message.

The bug in question.

The best thing to do in this situation, really, was to submit a bug report… unfortunately, I didn’t want to send my entire project over, and I didn’t know enough about the cause of the bug to make a minimal reproducible example. So, I decided to try and figure it out myself.

My first thought was to throw it into the GameMaker debugger. While I suspected this was a bug in the runner’s C++ causing the game to go boom, it was still worth testing whether the runner has more checks or error handling in debug mode. Turns out that not only does the debugger not affect error handling, but error handling is actually controlled by the gml_release_mode function, which I don’t use (so all error handling is on).

However, the run log gave us some hints…

Enacting reset hack
Going to proc room
Game controller created

C:\ProgramData/GameMakerStudio2/Cache/runtimes\runtime-2023.8.2.152/windows/x64/Runner.exe exited with non-zero status (-1073741819)
elapsed time 00:00:05.6052134s for command "C:\ProgramData/GameMakerStudio2/Cache/runtimes\runtime-2023.8.2.152/bin/igor/windows/x64/Igor.exe" -j=20  -options="C:\Users\Sean\AppData\Local\GameMakerStudio2\GMS2TEMP\build.bff" -v -- Windows Run started at 11/03/2023 01:55:41
FAILED: Run Program Complete
For the details of why this build failed, please review the whole log above and also see your Compile Errors window.

So, to make sense of this log, I have to explain how things work. GameMaker uses rooms, which are basically levels. Rooms can be persistent, meaning the objects and state of the room will remain even when the game switches to another room. Catacombs has two main rooms, the menu room and the main game room. The main room is persistent, which makes pausing simple, as objects in a persistent room do not update while the game is in another room.

When resetting the run, I want to reset everything in the main room. The easiest way to do this is to make the room no longer persistent, and then reload the room. There are two problems with this, however: you can only change a room’s persistence while in that room, and you can’t switch to the room you’re already in. To work around this I implemented a hack: when you reset your run, the game sets a flag and sends you from the menu room back to the main room. The game controller then notices this flag, unsets the room persistence, and sends you to a different room, whose sole purpose is to send you back to the main room.

So, that explains the debug messages I left. But there’s another clue in that exit code. Read as an unsigned 32-bit value, -1073741819 has the hexadecimal form 0xC0000005, which is the Windows error code for an access violation, also known as a segmentation fault. Basically, the application is trying to access invalid memory. As this is happening during a reset, it’s likely trying to access an object that no longer exists.
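The conversion is easy to check by hand. Here is a small Python sketch that reinterprets the signed exit code as an unsigned 32-bit value; the helper name to_unsigned32 is made up for illustration.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF

exit_code = -1073741819  # the value reported in the run log
status_hex = f"0x{to_unsigned32(exit_code):08X}"
print(status_hex)  # 0xC0000005, the access violation status code
```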

But how could this be happening? If everything’s reset, why would the game be accessing things that don’t exist?

To figure this out we’re going to need a debugger that can understand the native code the runner is blowing up in. For this I’ll use WinDbg, Microsoft’s standalone debugger. Simply run the game and attach WinDbg to it, and we’ll get some juicy data when the game crashes.

The stack trace is usually the first place to look - it shows the functions in which the crash happened.

Ah. That’s not very useful. The Windows runner has no debug symbols, meaning there’s no mapping of a function’s address to its name. But not all hope is lost - the macOS runner does have these symbols, and macOS even shows us a full crash report, no debugger needed!

And it gives us another hint: the game is trying to compute the bounding box of an object, but it appears that object doesn’t exist, so it goes boom.

We’re close now, but we still don’t know what code is checking that bounding box. GameMaker code runs in a virtual machine, rather than as native code, so the stack trace before that point is just “the runner is running your code”. Not useful.

Fortunately, GameMaker has a feature called the YoYo Compiler, which translates GML into C++. This C++ gets compiled and linked against the rest of the runner code, producing a new executable file. With debugging symbols! Let’s throw it into WinDbg…

And we can see where in my code the crash begins: a function called lights_tick. So it’s to do with the lighting engine: lights_tick is a function that runs every frame to update the state of each light. Let’s take a look at that function…

function lights_tick() {
    struct_foreach(global.lights, function(i, light) {
        if (light.object != noone) {
            light.pos = [
                light.object.bbox_left + (light.object.bbox_right - light.object.bbox_left) / 2,
                light.object.bbox_top + (light.object.bbox_bottom - light.object.bbox_top) / 2
            ];
        }
    });
}

Lights are stored in a global variable: a struct mapping each light’s ID to its data. Each light may have an object associated with it. If it does, lights_tick will set the position of that light to the centre of that object, calculated from the bounding box.

Now we have all the clues needed to figure out the crash. As I said, lights are stored in a global variable - and global variables always persist across rooms. The mistake is now fairly obvious: I was not resetting the lights variable between runs. So the game kept updating lights from the previous run, which reference objects that no longer exist, and that made it crash.

I fixed it by resetting the lights struct when the game controller was created. The longest diagnoses have the simplest solutions…
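To make the failure mode concrete, here is a rough model of it in Python rather than GML. It is only an analogy under stated assumptions: GameObject, start_run and the alive flag are invented for this sketch, and where the real runner segfaulted, the model raises an exception instead.

```python
class GameObject:
    """Stand-in for a GameMaker instance; `alive` models whether it still exists."""
    def __init__(self, left, top, right, bottom):
        self.bbox = (left, top, right, bottom)
        self.alive = True

    def centre(self):
        if not self.alive:
            # The real runner crashed at this point; the model raises instead.
            raise RuntimeError("bounding box requested for a destroyed object")
        left, top, right, bottom = self.bbox
        return (left + (right - left) / 2, top + (bottom - top) / 2)

room_objects = []  # objects owned by the current room
lights = {}        # plays the role of global.lights: survives room reloads

def lights_tick():
    """Move every light to the centre of its attached object."""
    for light in lights.values():
        if light["object"] is not None:
            light["pos"] = light["object"].centre()

def start_run(reset_lights):
    """Reload the room: destroy the old objects, then create a new lit object."""
    for obj in room_objects:
        obj.alive = False
    room_objects.clear()
    if reset_lights:
        lights.clear()  # the fix: drop lights left over from the previous run
    torch = GameObject(0, 0, 10, 20)
    room_objects.append(torch)
    lights[len(lights)] = {"object": torch, "pos": None}

start_run(reset_lights=False)
lights_tick()                  # first run: the only light points at a live object

start_run(reset_lights=False)  # second run without the fix
try:
    lights_tick()
    crashed = False
except RuntimeError:
    crashed = True             # the stale light from the first run is still ticked

start_run(reset_lights=True)   # third run with the fix applied
lights_tick()                  # only the new light remains, so this succeeds
```

In the model, the second run fails because the registry still holds the first run’s light, whose object was destroyed when the room was reloaded. Clearing the registry at the start of a run removes the stale entry.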

The weird part is that I later realised that if I passed the object’s ID, instead of the object’s struct [1], the runner would handle the error properly. I suspect it might be due to it being a reference to a struct contained in a struct? I attempted to make a minimal example project that does the same thing, but even there the error was handled properly.

I’m not really sure what actually caused it to segfault like this, but at least I fixed it.

  1. GML is a weird language. Referring to an object (such as via self) gets you a struct containing that object’s fields. If you want an actual reference to the object itself, you need self.id. See this page in the manual. This trips me up constantly.