Friday, July 25, 2014

Testing Games

How do you go about testing a game?  What makes a great game tester?  What is different about trying to improve the quality of a game compared to other software?  How can game studios large and small ship titles that turn out to have serious bugs?  Don't they test properly?


These are serious questions for game developers, from indies like me up to larger organisations; and also for the game-playing public, who spend their good coin on titles they expect to actually work.  The screenshot above, showing a main character model collapsing under what looks like broken physics/ragdoll, is from a video posted by dubesor on the Steam Community site; dubesor found a number of pretty hilarious and also quite serious bugs in the game "Cognition" by Phoenix Online Studios.  I recommend checking out dubesor's post for the video.

I played Cognition on iPad, and I ran into bugs as soon as I started playing - Erica's gun turned into a big black square right at the beginning of the game, behaviour consistent with a missing sprite for a billboard.  Cognition was a great game with some innovative ideas, beautifully realised in artwork, animation and dialog.  Yet I found bugs, and others found different bugs - so how can a studio release a game with problems that are so obvious and widespread?

I'm going to take a stab at these questions and hopefully say something useful about game testing.  But first I want to say that Software QA is an enormous area, with hundreds of great books already in print, so I can't possibly give any kind of general coverage of the topic in this brief rant.  If you seriously want to be a great QA Engineer, go read some of those books - Michael Feathers' "Working Effectively with Legacy Code", about how to get testing into an already running project, is a great place to start.

Even the more specific subject of Games QA is massive, and fraught with differences of approach.  Instead, what I want to do is raise three points that get us thinking more deeply about these questions, and point the way to some answers.

Developers Should Test What Works


It's a developer's job to test what works, not every edge case.  Ask someone early in their career as a developer to put on their QA hat and test the following function:

// Given the width and height of a sprite rect, return its area
int getArea(int width, int height);

They'll probably say (especially if they didn't write that code) something like this:

Oh, right - we need to test what happens when width is the smallest it can possibly be, and also the largest; and the same for height - yeah, and let's test all the combinations of those.  Right - that is a good start.
Fair enough, but that is not a "good start".  Where you must start, in my opinion, is by doing what I call "testing the man page".  On Unix-like systems you can take a function and run the command-line tool man (short for manual) to see what it is supposed to do.  For example, here is a screenshot of what the man page for fmaxf(x, y) looks like on my computer:


Pretty specific, isn't it?  If a function does not do what its man page advertises, then it has failed.  So first we need to make sure it at least does what is written on the outside of the box.  If you have a bunch of fancy ideas about testing the limits of the data types passed in, or passing random data into it, that is fine - but you have not done your job as a test engineer if you have not tested that the function does what it is supposed to do when passed typical end-use values.
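To make that concrete, here is a minimal sketch of what "testing the man page" of fmaxf might look like.  Everything asserted below comes straight from the documented behaviour: return the larger of the two arguments, and if one argument is a NaN, return the other.

#include <assert.h>
#include <math.h>
#include <stdio.h>

// Test only what the man page for fmaxf(x, y) advertises:
// the larger of the two arguments, with NaNs ignored.
int main(void)
{
    assert(fmaxf(2.0f, 3.0f) == 3.0f);     // returns the larger value
    assert(fmaxf(3.0f, 2.0f) == 3.0f);     // argument order doesn't matter
    assert(fmaxf(-1.0f, -2.0f) == -1.0f);  // negatives work as documented
    assert(fmaxf(NAN, 5.0f) == 5.0f);      // one NaN: the other is returned
    printf("fmaxf does what's written on the box\n");
    return 0;
}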

Well, there's no man page for our getArea function above, but there is a comment, which shows what the author intended it to do.  First off, it's pretty clear that the function is meant to give the area of a sprite's rect, and for that it makes little sense for width or height to be negative.  Also, given that the rect belongs to a sprite, we can guess at typical values - 640x480 and so on.

What's more, if you wrote a test that passed INT_MAX for both of those values, the result would overflow and produce garbage (signed integer overflow is undefined behaviour in C).  You might conclude that your testing had found a fatal flaw; but unless you plan to have sprites tens of thousands of pixels across, that is probably not the case, and your test is probably not valid.  For the professed use case of the function, int is probably fine - maybe uint would have been better - so the limits of the data type are not really a problem.
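Put together, a "man page" test for getArea might look like the sketch below.  Note the body of getArea is my guess - all we were given above is the declaration and the comment:

#include <assert.h>

// Given the width and height of a sprite rect, return its area
// (assumed implementation; only the declaration appears above)
int getArea(int width, int height)
{
    return width * height;
}

int main(void)
{
    // Typical end-use values for a sprite rect - the "man page" cases.
    assert(getArea(640, 480) == 307200);  // a full-screen sprite
    assert(getArea(32, 32) == 1024);      // a small tile
    assert(getArea(1, 1) == 1);           // smallest sensible rect

    // Deliberately NOT tested: getArea(INT_MAX, INT_MAX).  Signed
    // overflow is undefined behaviour in C, but no sprite is ever
    // that big, so the test would tell us nothing useful.
    return 0;
}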

The thing about functions like this is that there is often fancy caching, optimisation and various other things going on under the hood which we should not need to know about - all we need to know is that the function does what is advertised.  My point is that the man page - the box description - should be your starting point; the other stuff is typically not worth the time up-front, and sometimes of no use at all.

As developers we understand this at a pretty fundamental level, so when we have to test our own stuff we usually start by checking that we have at least done our job: that what we produced performs its brief when asked.  Should we be testing a lot of corner cases and boundary values?  Are developers falling down on the job when they don't try hard to break what they have built?  No - the exhaustive, suite-based testing provided by complete unit, system and integration testing practices is a different endeavour from software development.

If you ask a developer "have you done X", where X is some bit of software functionality, the answer will reflect whether they have written the code and tried it out on its core use case.  Doing full QA on X is a very different thing from this kind of "developer testing", which really isn't part of QA at all.  A good QA engineer starts with the core use cases and goes from there.

What this means for the game-playing public is that if you find a bug on your particular platform, operating system release and engine version that seems incredibly obvious to you, the question to ask is not "how could the developers have released this!" but something more like "how could the publishers not have paid for exhaustive QA covering my configuration!"

I'll just say one more quick thing about this topic before moving on: random testing is also not a good start.  In the refined air of security theory there is a practice known as "fuzzing", which involves blasting large amounts of random data at an application (typically a web app with an HTTP front end) in the hope of getting it to break.  There are also suites of security testing tools that employ fuzzing in combination with very large data buffers, unicode characters and a range of other malicious data intended to break the application.  If an app can be broken then it can likely be exploited, and that is bad in something that holds your credit card details or personal information.
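For the curious, the core of a fuzzer is almost embarrassingly simple.  Here is a minimal sketch; parsePacket is a made-up stand-in for whatever input-handling code is under attack:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Hypothetical stand-in for the code being fuzzed - in practice a
// parser, decoder or network handler with real logic inside.
static int parsePacket(const unsigned char *data, size_t len)
{
    (void)data;
    (void)len;
    return 0;
}

int main(void)
{
    unsigned char buf[4096];
    srand((unsigned)time(NULL));

    // Blast random garbage at the target.  A crash here - caught by
    // the harness or the OS - means the input handling can be broken.
    for (int run = 0; run < 100000; run++) {
        size_t len = (size_t)(rand() % sizeof(buf));
        for (size_t i = 0; i < len; i++)
            buf[i] = (unsigned char)(rand() % 256);
        parsePacket(buf, len);
    }
    printf("survived 100000 random inputs\n");
    return 0;
}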

This kind of "penetration testing" or "white hat attacks" are not where you start in games QA testing.  I'm not even sure if there really is a place for this near the end of QA testing, even in the most well resourced games publisher's QA house.  Games QA is about making sure that the central game play works when taken down the path that players experience on their regular hardware.


An Aside: Aren't Developers Supposed to Write Tests?


As a Software Engineer the answer to this is "Yes, always!"  And here, "tests" means:

  • unit tests
    • that is, per-function tests, like our tests for getArea() above
  • integration tests
    • designed to catch problems when we check our code into the project
  • system tests
    • designed to run on real hardware, with networks and inputs and so on
Doing these three things properly is the only way to ensure that critical software works.  Mobile operating systems, production software, commercial projects - you don't know whether they're actually working unless you do all of this.


But in games and apps, the project has no spec as such; there is no framework or system underneath us - we're writing user or player experience right on top of a UI layer.  Testing would involve taking screenshots and checking that they "look right".  If the development schedule and project budget allow time for it, then sure - off we go and write those tests.  But in games development, at the layer close to the player, these types of developer testing are often just not feasible.
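For a sense of what that kind of test involves, here is a sketch of a "golden image" comparison - render a frame, then count how many pixels stray too far from a known-good capture.  The buffers and tolerance here are illustrative; a real test would grab the framebuffer and load a reference image from disk:

#include <stdio.h>
#include <stdlib.h>

// Count pixels where any RGBA channel differs from the golden image
// by more than `tolerance` - small renderer differences shouldn't
// fail the whole test.
static int countBadPixels(const unsigned char *frame,
                          const unsigned char *golden,
                          int width, int height, int tolerance)
{
    int bad = 0;
    for (int p = 0; p < width * height; p++) {
        for (int c = 0; c < 4; c++) {  // RGBA: 4 bytes per pixel
            int i = p * 4 + c;
            if (abs((int)frame[i] - (int)golden[i]) > tolerance) {
                bad++;
                break;  // one bad channel is enough to flag the pixel
            }
        }
    }
    return bad;
}

int main(void)
{
    // Illustrative stand-ins for a rendered frame and its golden capture.
    static unsigned char frame[64 * 64 * 4];
    static unsigned char golden[64 * 64 * 4];

    int bad = countBadPixels(frame, golden, 64, 64, 8);
    printf(bad == 0 ? "screen looks right\n" : "screen differs!\n");
    return bad != 0;
}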

In framework or library code, yes: developers write tests.  In production systems you must write unit tests as part of development, and the budget has to be made to stretch to cover them.

But if you're writing an app or a game, or some other consumer software, it's pretty tough to justify the expense of writing unit tests to check that a screen looks right, let alone setting up integration and system testing.

As a software engineer I would always write tests for every piece of software; but as an indie game developer, writing code that gets thrown away five times before it reaches production, writing tests is just not justifiable.  Hence as game developers we rely largely on QA testers in our process.


Software Testing is an Adventure Game, not a Combat Game


I've worked with some great QA guys in my time as a software engineer.  The very best of them could work across an incredible array of devices with ease, juggle multiple OS versions on those devices, install our new updates carefully so as not to leave artifacts from previous installs that could corrupt the results, and then deliver timely, accurate reports of exactly where I had screwed up in my work.

The QA team's job is a tough one, and at times it's thankless.  But it's also essential.  Here's how to think of QA: it's like an adventure game, where you have to go into every room, find every object in every drawer, behind every picture that swings out, and under every trapdoor, before you can say you've won.  In a combat game you beat the boss with that final swing of your sword and you're done, even if you cut straight through the instance without doing any side quests.

Some QA guys I worked with, when they started out, would come to us triumphantly with something they'd managed to break - perhaps in an interesting way, like the Cognition bug above - and wave it around proudly.  We'd give them a pat on the back, smile a wry grin and give them their moment.  The QA guys who earned the most respect, though, were the ones who produced a long list of issues big and small, from all parts of the project's functionality - and who found ways to reach and test that functionality even when weird or distracting bugs made it hard to get to.

So if you're recruiting for great QA engineers, remember this: you want adventure game players with an eye for the long game, not boss killers who think it's time for a big-noting break every time they down a boss.

QA is Hard(TM) but Every Bit of QA Done Makes Things Better


Finally, my third and last point: don't be discouraged.  If you're a player and you still see buggy games, or if you're a developer or studio producer and it feels like you'll never make the game perfect, remember that the perfect can be the enemy of the good.  Delivering and publishing are essential, and there's a very real truth to the idea that the best software is the software that actually shipped, compared to the theoretical amazing software that didn't.

I hear student developers and those working on their vanity projects say things like:
But what is the point?  No testing is going to find every bug!  You can test yourself blue in the face and still miss bugs!  What good are QA procedures if they don't stop the problems they're supposed to fix?
Well, here's some tough news to hear: nothing is going to stop all your bugs.  If you are writing software (and games are software) then you are writing bugs.  The only way to make no bugs is to make no software.
Why have laws against murder if people keep murdering?  Why have democracy and voting if corrupt politicians keep getting in and feathering their own nests?  Why exercise and eat healthy food if we're just going to die anyway?
The answer is in the title of this section: we do what we can to make things better.  Every incremental improvement moves us closer to where we want to be.  That is what QA is about; that is what being a tester is about.  You don't get to kill the boss, you don't get to save the world - you just get to make things better than they might otherwise have been.

If great testers and awesome QA procedures mean that when you ship your game, the number of players hitting awful bugs is one fewer, then it was worth something.  That one player gets a great experience from your game, and plays it the way it was meant to be played.