Thursday, October 25, 2007

Playtesting is "Sovereign"

I've been known to say about game design that "Prototypes are sovereign", that you haven't really designed a game until you have a playable prototype. That's because, until the game is played, you just cannot really know what you've got. But I would be just as right to say "playtesting is sovereign".

When you design a game, you try to see in your "mind's eye" how the game is going to work, but until you play it, you simply cannot know what is going to work and what is not. The first few times you play, many things will change (provided, of course, that you're willing to make changes, which is a major requirement of a game designer).

Granted, more experienced designers can foresee weaknesses and eliminate them before reaching the prototype stage. But we're interested here in teaching game design, so this is addressed to inexperienced designers.

Let's clarify something right now. I am talking about playtesting to improve gameplay, not testing to squash programming bugs. The latter is what is often meant by "testing" when people talk about electronic games, and this testing takes place late in the development cycle, when the gameplay and appearance are set in stone (because it's too late to make major changes). This bug testing ("Quality Assurance") is aimed at making sure the game works the way it is supposed to, not at whether the way it's supposed to work is good or not. "Bug testing" essentially does not exist in non-electronic games, although it is important (and often forgotten) to test the production version of a game, as converting the prototype into the published version can introduce its own set of problems. (For example, the boxes on the Population Track on the FFG Britannia board are really too small for the purpose; this new version of the board evidently was not actually tested.)

So: here I'm talking about playtesting the gameplay and assorted details (such as user interface) that strongly affect gameplay.

There are three stages to playtesting: solo playtesting (also called "alpha"), local playtesting ("beta"), and "blind" playtesting (also part of the "beta" stage). (In electronic games, often the in-house testing is all called "alpha", and outside testing is called "beta".)

Few non-video games are meant to be played alone. Yet in solo playtesting, the designer plays the game solitaire, playing all the sides independently as best he can. At this stage the designer is trying to get the game to a state where other playtesters have a good likelihood of enjoying it, and of playing it through to the end. At the solo stage the designer might try a portion of the game and then stop because something isn't working, or because he has a better idea. When asking other people to play a game I would never stop a game in the middle, or try something that might be so bad I'd want to stop, though I know of designers who think nothing of doing this.

Most video games can be played alone, and if there's a more-than-one-player component, it's usually impossible for the designer to play several sides by himself.

At the local playtesting stage, people are asked to play the game through, usually in the presence of the designer when it is a non-electronic game. Almost always, at the beginning of this beta testing I do not have a full set of rules, I just have notes about how to play, and some of the details are in my head. (This is a big reason why it is much quicker to design a non-electronic game. With an electronic game all the "rules" must be settled precisely before the programming of the prototype can be completed. The programming is the equivalent of the rules of the non-electronic game.) As local playtesting goes on, I make a rough set of rules, then finally write a full set of rules.

As the local playtests occur, I write down notes about what I see and hear, and especially about answers to questions that need to be incorporated into the full rules. By the time I have a full set of rules, I usually refer to the rules for detailed questions, to see if the rules cover that question and whether it is easy to find that information.

The third stage is "blind" testing, where someone is given the game and must play it without any intervention from the designer. This is a test of the rules, somewhat akin to "bug testing". Are the rules clear enough that people can play the game from the rules? What questions do the blind testers come up with, and how can the rules be improved as a result? Unfortunately, nowadays people are often poor rules-readers, so I advocate electronic tutorials to help people learn how to play a game.

I know from experience with published games, especially Britannia, that there will ALWAYS be people who misread rules, sometimes willfully. 99% clarity of detail is about the best you can get using the English language.

In a sense, electronic games can jump to "blind" testing quickly, because by their very nature these games hide the rules from the players, enforcing them through the programming. This is an advantage of electronic games over non-electronic, that no one needs to read and understand a set of rules.

Game design, when taken to completion, is highly interactive. Playtesting sets good games apart from bad, and playtesting is (or should be) interactive. In a separate post I list some of the things you must look for while doing beta testing.

There is no doubt that the last 20% of refinement of a game takes 80% of the designer's time. Playtesting is time-consuming, tweaking rules is time-consuming. In the non-electronic world, often a "developer", another person, does much of this testing and tweaking. I personally strongly prefer to do this myself, even though it is much less fun than creating new games, because I don't want someone else "screwing up" my game. (See for some of my experiences.)

Even when you don't intend to change the rules, rewriting them introduces unintended consequences (as evidenced by the Britannia Second Edition rules rewrite by FFG--and apparently having no testing of the new version of the rules compounded the problem). When you rewrite to change a rule, the repercussions are often larger. So a remarkable amount of testing is needed.

In the electronic world it is difficult to quickly and cheaply make big changes in a prototype. This is one of the problems that all makers of electronic games face, and a major reason why some electronic games are not very good. By the time the development studio has a playable prototype, it is too late in the schedule to make the changes that playtesting reveals are necessary.

At some point during playtesting of a game, the designer must decide if "there's something in it" (as I put it): if the game is really good enough that people might play it, like it, and buy the finished version of it. There are really two times when this should happen: once during solo playtests (alpha testing), and again during playtesting by others (beta testing). The "something in it" point in solo playtesting is an indicator that the game is about ready for others to play. The "something in it" point in beta testing comes from observing people playing the game and their reactions during and after playing.

Usually I need to tweak a game quite a bit from its state at the end of solo play, before I can reach the "something in it" stage of beta testing. Sometimes there doesn't seem to be anything in it during beta testing, and I set it aside for further thought. Sometimes I realize, from solo playing, that there isn't "something in it", at least not yet, so I set it aside at that point.

I strongly suspect that novice designers rarely understand these stages. Their egos become involved, and they assume that because they took the time to make the game, and it's their idea, there must be something in it. In extreme cases, the "designer" thinks he has "something in it" when all he has is an idea, that is, when he has virtually nothing at all. The number of people who think they've successfully designed a game, yet haven't playtested it at all, is remarkable. Playtesting is the meat of successful design, not the end. (I confess that I don't think of "development" as a process separate from design.)

So how do you recognize when there's "something in" a game? That's hard to say, unfortunately. Surveys or written feedback won't necessarily reveal it. In alpha testing, the "something in it" stage is a gradual realization, coming from observing my own thought processes as I play. My games are, almost without exception, strategy games. When I "see" myself thinking hard about the strategies, and liking the options, then I may think there's something in it.

In my case, in beta testing when spontaneously (without any urging) people say "I'd buy this game", I know I've got something. However, this is rare, and I don't remember anyone ever saying that about Britannia, or Dragon Rage, or Valley of the Four Winds, but they have all been quite popular. Perhaps better, if people want to play the game again, in this day of the "cult of the new" when hardly anyone plays a game twice in the same session, there may be "something in it".

I am very low-key in beta playtesting, preferring to watch reactions of people rather than try to solicit opinions, in part because people (being polite for the most part) won't say negative things even when asked. I also try not to play, as 1) the designer playing in a game tends to skew results and 2) when I play, I do a worse job of playing, and a worse job of evaluating the playtesting, than if I did either alone. As I'm that strange sort of person who enjoys watching games as much as playing, why play?

I do not "inflict" a game on players until I think it is good enough to be OK to play, that is, I've reached that first "something in it" stage. Evidently some other designers playtest with other people very early: not me. My playtesters play games to have fun, not as an obligation, and most are not hard-core boardgamers, so I do what I can to make sure the game MIGHT be fun before I ask them to play.

As I said, playtesters tend to be polite. It's hard to find out what they really think. I am skeptical that a feedback sheet will make a difference. Rather, I sometimes try the "Six Hats" method (devised by Edward de Bono) when playtesting; specifically I'll ask players successively to put on their black hat (the judge), then the red hat (intuition and emotion) to see how they assess a game, and then the yellow hat (the positive side of assessing an idea) to see what they like about a game. With local playtesters I sometimes ask them to think of ways to make the game better (the green hat). Google "de Bono" or "Six Hats" for more information.

Tuesday, October 23, 2007

Things to watch for when playtesting

I'm repeating this from my teachgamedesign blog. It was done in haste, so I'll probably think of additional items.

Length. A game is always longer to new players, of course. But if it takes too long for new players, will they play again? Length is of course quite dependent on how much players enjoy what is happening in the game. The boardgame Civilization can take 8 to 12 hours, but those who love the game don't find that time weighs upon them.

Downtime. Downtime is the time people must wait while someone else is taking a turn. This can be a problem even in a turn-based electronic game. Do people get bored waiting for their turn?

Is the game balanced? Even if the game is symmetric (all players start with identical situations), is there an advantage to playing first (or last)? Chess is symmetric except for who moves first, but moving first is a big advantage.
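If you record who won each playtest, a quick tally by turn order can hint at a first-move (or last-move) advantage. The following is only a minimal sketch: the function name `win_rate_by_seat` and the ten-game log are made up for illustration, and with so few games the numbers are suggestive at best.

```python
from collections import Counter

def win_rate_by_seat(winners, seats):
    """Return the share of games won from each seat.

    winners: one entry per playtest, giving the seat of the winner
             (1 = moved first, 2 = moved second, and so on).
    seats:   number of players in each game.
    """
    wins = Counter(winners)
    total = len(winners)
    return {seat: wins[seat] / total for seat in range(1, seats + 1)}

# Hypothetical log of ten two-player playtests (illustrative data only).
log = [1, 1, 2, 1, 1, 2, 1, 1, 1, 2]
rates = win_rate_by_seat(log, seats=2)
print(rates)
```

In a balanced two-player game each seat should win roughly half the time over many plays. A persistent tilt toward one seat is a reason to look harder at turn order, not proof by itself.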

Dominant Strategy. Look for any dominant strategy ("saddle point"). This is a strategy that is so good that a player who wants to win must pursue it; or a strategy so good that some will pursue it, yet that strategy renders the game less than entertaining. For example, in a Euro-style 4X game I've designed, one player found that by getting together a sufficiently large force, along with certain technology research, he could completely dominate other players who weren't pursuing the identical strategy. I want the game to offer a variety of ways to success, so I had to change the rules fairly extensively. This is why it is very important to have testers who are dynamite game players, so that they'll find these strategies during testing, rather than have someone find them after the game is published. I'm lucky that I have one such player, and that I can be such a player myself when I put my mind to it.
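If playtest notes record which broad strategy each player followed, the idea of a dominant strategy can be checked roughly against the results. This sketch assumes a hypothetical table of observed win rates between three invented strategy labels ("military", "trade", "research"); it looks for a strategy that does strictly better than every alternative no matter what the opponent is doing. Real playtest data is rarely this tidy, so treat it as an illustration of the idea rather than a tool.

```python
def find_dominant(payoff):
    """Return the strictly dominant strategy, or None if there is none.

    payoff[a][b] is the observed win rate of strategy a against strategy b.
    A strategy is strictly dominant if, against every opposing strategy,
    it scores higher than each alternative would.
    """
    names = list(payoff)
    for candidate in names:
        if all(
            payoff[candidate][opp] > payoff[other][opp]
            for other in names if other != candidate
            for opp in names
        ):
            return candidate
    return None

# Hypothetical win rates from playtests (illustrative numbers only).
payoff = {
    "military": {"military": 0.50, "trade": 0.80, "research": 0.75},
    "trade":    {"military": 0.20, "trade": 0.50, "research": 0.55},
    "research": {"military": 0.25, "trade": 0.45, "research": 0.50},
}
print(find_dominant(payoff))
```

When such a check names a strategy, as it does for "military" in this invented table, that is the signal to revisit the rules so that more than one path to success remains viable.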

Analysis paralysis. Are there too many things to watch for or keep track of, or too many choices, so players either freeze up or give up on figuring out what is the best thing to do? There are always "deliberate" (slow) players, the question is, is everyone slow or frustrated?

Rules difficult to grasp. What do the players find hard to grasp? (In my prototype Age of Exploration, players had trouble grasping the difference between movement of units and placement of units. I used the same distinction in an abstract stones-and-hexes prototype, and no one had a problem with it.) Even if, after playing, players "get it", it might be necessary to change something. (In AoE I changed the rules extensively to recast/eliminate the distinction.)

What do players tend to forget? This isn't quite the same thing as what's difficult to grasp. Some rules just don't stick in people's minds. Is there anything you can do about it? Is there some play aid to help people remember?

What do players not bother to use? Some rules exist but no one uses them. If the threat of using them is not making a difference in the game, then perhaps you should eliminate the option. For example, in my hex-and-stones game Law and Chaos I originally allowed people to move a piece rather than place one. This happened rarely, as it was usually better to place another piece and increase the number on the board. So I eliminated the possibility, except as an "optional rule".

Here are some items added from comments on boardgamegeek:

Adequate control. Do the players feel that they can exert a measure of control over what happens in the game? Remember, any (strategic) game is a series of challenges and actions in response to those challenges. (Harmony)

Horns of a Dilemma. On the other hand, are there enough plausible decisions in a play to make the players think, but not so many that "analysis paralysis" sets in. Even in a simple game, if a player can do only two of five possible actions in a turn, is there tension here or are the plays obvious? As one commenter put it, do the players sometimes feel "so much to do, so few actions"?

Player interaction. Do the players have to take the plays of other players into account? Yes, some games are virtually multi-player solitaire, and some players are happy with this. But most players want to be able to affect other players with their moves.

Taking it to the Max. Can extreme behavior within the rules break the game? Sure, if someone pursues a bad strategy, they'll lose. The question is, is there some extreme strategy that results in an unfair game?

Components and Play Aids. Do the physical parts of the game help play flow smoothly, or does something need to be changed? Is there too much record-keeping? How can it all be simplified?

Stages of play. You probably learn this in alpha/solo testing, if you do solo testing (which I strongly recommend). Are there identifiable stages in the game, especially ones where the typical run of play changes? E.g., in chess there are the opening, the middlegame, and the endgame. Pieces are deployed in the opening, mix it up in the middlegame, and so forth. An exploration game has the expansion period followed by consolidation and then (usually) conflict. Etc.

Player interest/"fun". What part(s) of the game seem to be most interesting to the players? I'm not in favor of trying to figure out "fun", because fun comes from the people who are playing more than from the game design itself. And there are many games that I wouldn't call "fun" (including Britannia) that are nonetheless interesting and even fascinating.

Finally, remember Antoine de Saint-Exupéry's maxim: “A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.”