Playtesting versus science
…there isn’t much resembling a science for designing the abstract game features, or at least not one that is well-known and accepted. Even some of the better-known designers such as Daniel Cook and Raph Koster seem to consider their work to be more about casting an enlightened eye over trial-and-error, relying on play-testers to tell them what is fun. While nobody would seriously argue that you don’t need some sort of play-testing – just like graphics programming requires the programmer to actually look at what is being rendered – it seems a bit defeatist to assume that it’s not theoretically possible for a knowledgeable enough designer to be able to create a compelling game experience without needing to have others try it first.
via The importance of abstraction « Tales from the Ebony Fortress.
I’ve certainly made games that were fun right off the bat. It’s an exhilarating experience when it happens — though arguably, I played them in my head before playing them in code or on paper, in my first prototype. But I have definitely gotten prototypes to fun before showing them to other people. In fact, I generally don’t show them to other people until I get them to some semblance of fun.
So sure, it’s possible, and we don’t need to be defeatist about it.
What I have never done is gotten them to be as fun as they can be without someone else’s eyes on them.
I suspect this isn’t any different from any other creative medium; writers need editors, theater needs rehearsals, etc. Workshopping and dry runs are classic tools used in the arts for centuries, regardless of how much we manage to turn art from craft into science.
18 Responses to “Playtesting versus science”
I should probably clarify that I was referring to the various Twitter exchanges from the other day – unfortunately they are quite awkward to cite inline in a blog post!
No disagreement with anything you’ve said here.
Playtesting is science. It’s forming a hypothesis (this system will be fun), conducting an experiment, and measuring the results.
It seems like you’re lamenting that there isn’t yet a fully-developed and accepted theory of fun in game design? Which is logical given how new it is. We’re obviously making progress (Raph’s book). As we develop better ways to measure the results (better than sales and critics), the science of fun will continue to expand.
Ben, I figured you were but I couldn’t resist answering in more detail than what fit on Twitter either. 🙂
I do think we are further along to a science than many think we are. But maybe that is mostly in my head and I need to share more of it to get sanity checked. 🙂
We have the first read through in choral writing where we discover the phrases and intervals rendered pleasingly by the machine but a struggle for the voice. Then we have to decide which is right.
Sometimes it is easier to compose as a solitary writer. Sometimes it is best to sketch out a melody, lyrics and chords then give the players all the choices about the rendering. With the exception of classical composition, I find it better to sketch then give it to the players who will each add their personal styles. I don’t know if game design is analogous or ever can be. I suppose it can. It seems that it would have to be for a game of some complexity and then playtesting would resolve disputes among the team.
Anthony, assuming you’re replying to me, I’m not so much lamenting that there isn’t yet such a theory, but more that the current trajectory of game development seems to be diverging from ever forming one – partly because computerised games are drifting away from games in the traditional and formal sense, and partly because many good developers talk as if games are intrinsically impossible for someone to design without playtesting.
I’m not against releasing early, iterating rapidly, and testing often – it makes sound business sense, both at the start to establish your product and at the end to apply the polish – but most artists and craftspeople can create a solid piece of work without needing to get lots of external feedback during the creation process. Relying on that external feedback can end up being a crutch, replacing the up-front thought that a game designer would otherwise have to provide. Measuring results isn’t usually the hard part of science – coming up with the theories in the first place is. Experimental physicists can’t get very far without theoretical physicists, for example.
In my experience, that’s largely a myth. Songwriting and playwriting in the professional world don’t work like that. Both have a prototyping phase and a rendering-to-final-product phase that includes plenty of feedback: from other musicians and singers in the first case, and from actors, directors, dancers and choreographers in the second. Not being a game designer, I have no opinion there, but having written code, I know that feedback is built in from the interpreter or compiler up front. The myth of the lone hacker has been the source of remarkably bad decisions in systems design. Theorizing without specific experience is largely daydreaming, and while that has its place, it’s usually just the first step toward the real work. I favor projects where a series of small but focused experiments ground the design or the work.
The most critical decision is picking the members of a team and getting feedback from supporting resources instead of dissolving ones. It is all too common for feedback of the wrong kind to stifle a project and one of the reasons startups succeed where established entities fail.
The professional world isn’t necessarily the same as the creative world. Most musicians don’t take their new song out to play in a few concerts and rewrite it based on the reaction. Similarly most artists don’t show off their painting when it is half done. The work is created by the artists with little or no exposure of it to the audience in the process, because the artist has a strong vision of what they’re setting out to create and, more importantly, adequate knowledge of how best to do it. Sure, the work might well have been more popular had they shown it off at each stage and taken feedback from the floor, but not necessarily better. In fact many would argue that’s commerce taking precedence over art, making the work worse. Why do some game developers feel it should be the other way around?
This is not the same as there being no feedback at all, since the artist may collaborate with other artists, or use tools to check their work (eg. a musician listening to a finished mix in the studio), but there is certainly no assumption that the input of the eventual listener/player/consumer is required to create something worthwhile.
I don’t think that’s true. It’s easy to spot a checkmate situation in Chess but it’s significantly more complex to learn how to bring one about. Of course, the amount of man-hours invested in ‘work’ compared to ‘theory’ is usually higher, but often that is an unfortunate side-effect of being ignorant of the theory. If you don’t know the precise route to a solution, then of course you have to resort to working hard until you find one by accident. That’s not a good or desirable thing.
That’s where I disagree. It’s the observation and measurement that drives the theory, not the other way around. The telescope opened the doors for modern astronomy, the microscope for cellular biology. The Wright Brothers were successful at manned flight not because they sat around and thought more, but because they came up with better ways to collect data, and iterated more on their designs.
It seems to me that the more complex a game is, the more essential it is to playtest extensively.
Complex systems rapidly reach the point where it’s impractical to predict all potential outcomes. No matter how strong your theory and math, you need that field data to sniff out the most obvious flaws… and the subtle ones might take months or years in full production to catch.
A non-computer example would be Magic: The Gathering. There have been a number of times in its history when a relatively straightforward card combined with another straightforward card has produced a combo so powerful that it dominated the tournament scene until banned or rotated out. The process of rotating sets out of standard tournament formats is itself an attempt to limit the number of combos to a manageable level (and to sell more cards, of course).
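To put rough numbers on why predicting every interaction gets impractical, here is a minimal sketch that counts the unordered two-card and three-card combinations in a card pool. The pool sizes and the `interaction_counts` helper are made up for illustration; they are not real Magic: The Gathering figures.

```python
from math import comb


def interaction_counts(pool_size: int, max_combo: int = 3) -> dict[int, int]:
    """Count distinct unordered card combinations of each size from 2 to max_combo."""
    return {k: comb(pool_size, k) for k in range(2, max_combo + 1)}


# Hypothetical pool sizes, chosen only to show how quickly the counts grow.
for pool in (100, 500, 2000):
    print(pool, interaction_counts(pool))
```

Pairs grow roughly with the square of the pool size and triples with the cube, so shrinking the legal pool cuts the number of combos to check far faster than it cuts the number of cards.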
It’s certainly possible to construct a Tic-Tac-Toe or Solitaire game that is a “compelling game experience” with no more playtesting than the author provides himself. Zynga is getting quite fat and sassy cranking out titles with little more meat than that. Speaking as a player, I’ll stick with the messy, complex world simulations, thanks.
I don’t actually think that is accurate. As specific examples:
Comedians develop their new material in small clubs before committing it to the large venues; many are famous for engaging in this process diligently prior to going on big tours.
It is common for musicals and theater to start out by road testing a new play “in the boonies” prior to attempting to premiere it for real, and it is normal for songs to be added and dropped.
Touring musicians regularly test new songs out on audiences, and add and remove them from set lists based on the audience reactions.
Writers commonly have “first readers” who are non-writers who provide initial feedback on early drafts.
Granted, these are generally in the more performative branches of the arts, but it’s still a very common pattern. I can think of analogous practices in fields such as architecture and the like, which perhaps supports your comment about commerce…
Play testing isn’t really about finding out if people find the game fun. That’s a huge oversimplification. If it isn’t fun the designer should have a clue already. The point is to refine the game further and see if there are ways to reconfigure the game to increase the fun. This would be similar to a musician trying a few different approaches to a musical passage, an author changing the story outline or text, or many of the other examples above.
When I’ve conducted play tests, I’m not just listening to what they are saying or reading what they are writing, I’m watching their body language and their eyes to pick up on information that might not be spoken or written down.
I tend to like paper and pencil prototypes (or early simple prototypes) because adding art/animation/music often increases the fun, but that fun is less about the game design and more about the experience. Not that the experience isn’t what you are selling at the end of the day, but as a designer I want to focus on the game mechanics.
The art of the game maker should be not that of the guitarist, but that of the guitar maker. The work is utilitarian, and its success hinges on its utility. You can’t finish a fine instrument without handing it to a master player and listening to what he or she has to say.
“…relying on play-testers to tell them what is fun.”
To be clear, this misstates how I see play testing. The designer builds something that they think is fun. Then they verify the actual impact of the design by play testing with real people. Based off the results, they identify broken areas and opportunities. Then they try again. This is a completely different process than having someone tell you what is fun.
One of the delightful metaphors that came out of that particular Twitter conversation was this: Game design is code that runs on human hardware.
However, the human hardware of our players is not like the hardware that runs digital code. It is a messy, highly complex and poorly documented mix of biology, psychology and group dynamics. We sort of know how to write the source code, but more often than not, we don’t understand the platform well enough to predict exactly how it will run.
So we design in the same way that an inexperienced programmer does. We write something, see what it does and then tweak it bit by bit until it works. Or alternatively, we copy and paste big swathes of old code, reusing patterns because they work but not really understanding why.
Now, I dream as much as anyone of finding a great standardized language for writing down our source code. Raph has done some great work with identifying the social and mathematical systems driving our black boxes. Stephane Bura keeps identifying patterns that yield specific emotions. I dabble with concepts like skill trees. Joris Dormans is doing fascinating work diagramming using Petri nets and feedback loops. These steps help us talk about game designs using crisper language.
Yet there seem to be limits on how well we’ll be able to write designs that are guaranteed to run in a desired fashion on human hardware. I think we’ll never be able to train an unaided designer to think through all classes of design and come up with a 100% reproducible result. For certain designs, there will always be catastrophic ‘bugs’ that require iteration.
It is a spectrum. Some trivial classes of design are certainly solvable in the game designer’s head. A choose-your-own-adventure doesn’t seem to have all that many places to go wrong. But other classes are inherently tricky: games with deep skill trees, multiplayer games with emergent social properties, novel interfaces, and many new-to-the-world game systems. We simply don’t have a good enough simulation of the human brain to know how those design rules will (quite literally) play out.
Is this comparable to other media? I paint and write. In my head I can simulate enough of how others will react that only about 10-20% of the reactions surprise me. Empathy of this sort seems a natural capability of the human brain. It isn’t perfect and high quality results are still immensely difficult. But functional results? I’m pretty damned sure that I can draw a picture of a cat that you identify as a cat. Our human simulation fidelity isn’t perfect but it is massively better than in games (or politics or economics or religion or organizational psychology or other complex human systems).
But when I create a sport that requires 10 people from all sorts of different backgrounds and play styles to come together and interact in a unique complex fashion all while building skills and staying motivated over a period of 10+ years, I suspect I’ll get a broken design 99.99% of the time the first time out the gate. Natural empathy doesn’t yield a predictive model of how this crazy mix of human hardware works. I have only the crudest of models that suggest how 10 people will interact. Even worse, my vaunted creative instincts are repeatedly proven to be highly biased. I know both too much (of skills) and too little (of motivations and history). So I turn to empirical play testing because it helps verify the reality of complex systems that our hardwired responses have difficulty discerning.
Play testing isn’t going to go away until we ‘solve’ the human hardware our games run upon. (A worthy goal, btw) Until then, it is fundamental to making games in a way that often is secondary in other creative forms of expression.
take care
Danc.
Raph:
Yeah, you make a good point. But this is where I see a distinction (which I suppose is more of a continuum) between getting feedback on what you’re still working on, and getting feedback on something you already made and feel is complete. I feel a skilled practitioner should be able to complete a work in relative isolation with a high degree of confidence that it is good, without needing to show it to the intended audience at all during creation. I don’t submit that this is desirable or optimal, just that it should be possible. In other words, the comedian’s ‘small club’ material should already be ‘good’.
Yukon Sam:
Nice analogy! But, why do you think it holds true? Why is a game intrinsically utilitarian – I don’t think interactivity alone qualifies it. I would say that a typical game is still largely about a blend of aesthetics and formal rules and both would appear to be things a lone designer with enough skill could formulate in isolation.
Danc:
Totally agreed – but surely that holds true for anything with subjective appeal, many of which we like artists to engage in explicitly without audience feedback. We may not understand the hardware but that doesn’t mean we should shy away from trying to form a model of it. Yourself and Raph are the last people I would suggest are doing such a thing given what you’ve written on the matter, but when you talk about 0% of your game designs yielding a good game without iterative playtesting by others, I think that might send a signal to others that good games simply cannot be ‘designed’ in the formal sense, which is worrying.
Yes, some of us do. We take the songs out and test them on audiences the same way comedians test material. We may move some parts, remove some parts, add extra solos, etc. or drop the song altogether. Some things that work in the studio don’t work live. We also play around with set orders, orchestration, instrumentation. I can’t think of any creative endeavor with an audience where the audience doesn’t affect the piece if it is repeatable, except executions.
The lone hacker is mostly a myth for works of any complexity.
Interesting. With my music, I would never change a song based on what the listeners think – to me, that goes against what I think writing your own music is about. Dropping a song from the set, fine, but altering the composition, no.
Writing for hire of course – whether musically or with programming – is a different story, in that it is craft and not art. But motivations aside, it’s still perfectly possible and perfectly common to write a piece without needing or wanting external feedback, and most artists I know work that way when not doing commissioned pieces, making what they want to make, in isolation. Feedback comes between pieces rather than during one. So I’m pretty sure it’s not a myth!
I said mostly a myth. Do it long enough and in enough situations and one does learn to improve the work because of feedback particularly in live performance. In the studio, unless one is doing everything, there are layers of influence. James Taylor was recently talking about this saying that he wrote the song but that once in the studio, each player made up their own part. In the studio, that is also the way I like to work. It improves the rendition enormously to let the specialist do that.
Now can I create and complete a work on my own without feedback? Certainly. I demo that way and some works never get beyond that demo stage. What I mean about it being mostly a myth is that it isn’t the way the best work I’ve done proceeded. Read up on songwriting done in say Nashville and you’ll find that co-writing isn’t just common, it’s preferred. Typically it makes the work better. Again, for works of complexity. I don’t know how this works for games.
I took it that you are saying artists don’t do this. I’m saying that in the main, they do. Painters? No clue. Probably less, it being a hand-eye art form in execution. Feedback here is likely between works, as in what is liked, what sells, etc.
There’s a paper to be written about the Venn diagram for arts, crafts, games and virtual worlds from the perspective of utility and interactivity.
As a life-long gamer, I love games of all shapes and sizes. But I reserve special fondness for games that treat me as a collaborator rather than a consumer. The more room I have to design and engineer the experience for myself and the people who interact with me, the more deeply invested I am in the game.
And that’s one of my favorite aspects of MMOs and virtual worlds. Not only can the physical environment be altered by players (sometimes), but the underlying structure and rules can be modified in accordance with both our expressed opinions and our behaviors (as verified by data mining and observation). The game is not static; it is, in a very real sense, a continuous playtest of new concepts and systems in which the players are not passive consumers but active members of the development team.
And that’s why I’m much more impressed with a designer who actively engages with the community than one who has an inspired vision that he refuses to compromise. The latter may create great art or even a great game, but the former is more likely to create a great world.
So make me a Stradivarius, let me play it for a bit, and then listen when I tell you that the bridge needs more arch and the tension is too loose. Players can and will do things with your creation beyond your imagination… if you enable them.