Tuesday, March 16, 2010

Footsteps – Informal Game Sound Study

Lately, I've been taking stock. Not the usual “What have I done with my life?” or “Where is everything headed?” (although those questions perpetually tumble around my brain stem); I somehow found myself obsessed with the minute details of movement sound and system design. If you're working in games today, chances are good that you've recorded, implemented, or designed systems for the playback of character footsteps and Foley at some point during the course of your career. It's even more likely that you've played a game where, at some point during your experience, footstep sound wrestled your focus away from the task at hand and demanded your listening attention.

Yet, let it be said, all footsteps are not created equal – which seems obvious: given that no two games are exactly the same, neither should their footsteps or the way in which they are implemented be (necessarily) the same. At the end of the day, as content creators, we should be slaves to the games we are helping to make and not showboating unnecessarily in our own art by accentuating or spending time on things that have little consequence outside of our own satisfaction; however, for a sound type that may be heard for countless hours across every level in a game, surely footsteps deserve more than a passing thought. (Or maybe I'm trying to justify my current obsession!)

Now that that's been said, I want to take a moment to recognize the ubiquity of footsteps across almost every genre – they're everywhere! It's with the utmost care and delicacy that this simple aspect of sound must be handled so as to lend itself to player immersion, lest the veil of realism drop and expose the pixels on the screen as what they truly are: groups of flashing lights to stimulate the visual cortex. Oops...SPOILER!

*A Squid Eating Dough in a Polyethylene Bag Is Fast and Bulbous, Got Me?

So it is with this soft eared attention to detail that we will delve into the details of movement design from a systems perspective, and try to carve out some common practices. When jumping (hah) into the throng of movement oriented sounds the core components boil down to: Footsteps and Foley.

Once upon a time it was enough to provide a small number of step-only audio files that could be sequenced or randomized in time with the foot falls of the player on screen. In the not so distant past, it was not uncommon to deliver these same steps with a portion of Foley movement embedded into each to help add character to the movement. As memory and polyphony budgets grew, the ability to split these elements into separate assets that could be layered and recombined at runtime by the audio engine became a reality that many have been quick to adopt.

Now that we have these elements separated and playing back independently, we are free to swap out material and clothing types without redelivering our step-only content for each outfit and surface material type. We can keep our content footprint (lol) down by not having to create every combination of step and outfit, and the flexibility we gain opens up a greater diversity when these elements are shuffled on top of each other during character movement. Coupled with the ability to randomize pitch, volume, and low pass filter between a set of values, this content now becomes a seemingly endless array of diversity for any given action.
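To make the layering idea concrete, here is a minimal sketch of what a step layer plus a Foley layer might look like when triggered together. Everything in it is an assumption for illustration: the `Layer` class, the file names, and the randomization ranges are hypothetical and not taken from any particular engine or game.

```python
import random
from dataclasses import dataclass


@dataclass
class Layer:
    """One sound layer: a pool of files plus randomization ranges (all hypothetical)."""
    files: list
    pitch_range: tuple = (0.95, 1.05)         # playback-rate multiplier
    volume_range: tuple = (-3.0, 0.0)         # attenuation in dB
    lowpass_range: tuple = (8000.0, 20000.0)  # filter cutoff in Hz

    def trigger(self, rng):
        """Pick a file and roll its playback parameters."""
        return {
            "file": rng.choice(self.files),
            "pitch": rng.uniform(*self.pitch_range),
            "volume_db": rng.uniform(*self.volume_range),
            "lowpass_hz": rng.uniform(*self.lowpass_range),
        }


# Steps are authored once per surface, Foley once per outfit.
STEPS = {
    "gravel": Layer(["step_gravel_01", "step_gravel_02", "step_gravel_03"]),
    "wood": Layer(["step_wood_01", "step_wood_02", "step_wood_03"]),
}
FOLEY = {
    "leather": Layer(["foley_leather_01", "foley_leather_02"]),
    "plate": Layer(["foley_plate_01", "foley_plate_02"]),
}


def footfall(surface, outfit, rng=random):
    """Return the two layered playback events for a single footfall."""
    return [STEPS[surface].trigger(rng), FOLEY[outfit].trigger(rng)]


events = footfall("gravel", "plate", random.Random(0))
```

With this split, adding a new outfit means adding one Foley pool rather than re-authoring every surface, so the content grows roughly as surfaces plus outfits instead of surfaces times outfits.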

Developing in parallel with these audio aspects were refinements in animation, increased subtlety of character movement, and increased depth of player control: all of which helped unlock a level of detail necessary to bring Footsteps and Foley towards a model approaching reality. Games that are now reaching for a level of realism have been set free to further granularize their sound content sets: adding step types beyond walk or run, allowing for the changing of clothing types, increasing the number and types of movements available to the player at a given time, and expanding the list of surface materials a character may be able to move across.

Where once a game may have been able to get away with only walk, run, jump and land, it's now not uncommon to see scuff, stealth, crawl, or crouch – not to mention the breakdown of heel/toe, in addition to the Foley layering. Once you factor in the recent spate of free running/Parkour flavored games which integrate various hand grabs, slides, and other acrobatic feats, it's easy to see that the fidelity we're dealing with demands careful thought at every step (heh) of the audio pipeline. From the Audio Director defining the style, the Technical Sound Designers defining the systems, Foley actors portraying these movements, Sound Editors granularizing the files, and Audio Implementers wiring up animations for sound – the potential to infuse the sound with variation and detail that will lend itself to a greater belief in the actions portrayed on screen is immense.

Sometimes writing about sound can be like dancing about architecture, and so I've assembled a series of game videos that focus specifically on footsteps and Foley within the current generation of games in an attempt to provide perspective. While some might cry foul at the removal of music in the examples presented, the goal was not to assess footstep mix related issues (which could fill up a whole other study) but to allow for a distanced, abstracted, critical listening experience. In games where as the player you find yourself running for extended periods with only ambient sound and your perpetual footsteps to keep you company, it makes solid sense to focus on the persistent ticking of heels and toes whilst traversing the landscape. If, as the player character, you would never find yourself walking or creeping stealthily between places, then the complete lack of subtlety for this aspect of footstep sound can usually be forgiven.

A comparison of the three titles whose core gameplay is centered around the player's ability to deftly traverse the environment turned out to present a stunning example of aesthetic choices in sound design, as well as showcasing a high level of detail inherent within each of the system designs. While they all focus on player movement and Foley in a similar way, each communicates a style that I feel services the visual side of the equation in a complementary way. Where Assassin's Creed 2 is the most fluid and “real” sounding, the added iconic/earconic elements in the Prince of Persia Foley (recognizable – for example – in the identifiable hand grabbing the ring sound) really lent a sense of satisfaction to a successful execution of this action.

While I didn't grab any footage from the original Assassin's Creed for comparison, this article with lead audio designer Mathieu Jeanson of Ubisoft Montréal from Mix Magazine exposes the specifics utilized in the original:

“the footstep system uses more than 1,500 original recorded samples. We managed 22 surface materials, with 14 different step intentions — sneak, walk, run, jump, land, pivot, et cetera — including three to eight variations for each intention per surface.”
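As a quick sanity check on those numbers, the arithmetic can be sketched as below. It assumes every surface and intention pairing is actually populated, which the quote doesn't state, so treat the bounds as a rough envelope rather than a description of the shipped content.

```python
# Figures taken from the quote above.
SURFACES = 22
INTENTIONS = 14
MIN_VARIATIONS, MAX_VARIATIONS = 3, 8

# Assumption: every surface/intention pair has its own set of files.
combinations = SURFACES * INTENTIONS
fewest_files = combinations * MIN_VARIATIONS
most_files = combinations * MAX_VARIATIONS

print(combinations, fewest_files, most_files)  # 308 924 2464
```

The quoted “more than 1,500” samples sits inside that 924 to 2,464 range, which works out to roughly five variations per pairing on average under the same assumption.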

In contrast, Mirror's Edge's movement sound was delivered stylistically in complement to its grand visual design – stark, clean, and clearly bristling with crispy audio detail. From an interview at Gamasutra with Owen O'Brien, senior producer on DICE's Mirror's Edge:

 “In most games, footsteps are a pretty simple thing to add, but running and moving was so integral to Mirror's Edge that we had to create a huge library of footsteps and a system to manage them. We had them for different speeds, different surfaces, different landings; the list goes on and on.”


The squad or group based gameplay of Dragon Age and Mass Effect introduces its own set of challenges when it comes to movement sounds. In support of the player character, you may have additional party members following your every move. This means as they trail behind you in close proximity, their footsteps can be heard in a chorus of movement mirroring your progression through the level. This can be good and bad, depending. It's likely that, in order to combat the repetition of playing the same sound files twice, the content requirements demand an increased number of variations to keep things from sounding too similar. Also, it becomes critical that the AI for your group members be intelligent enough to keep pace smoothly without the walk, stop, walk stutter which presents itself when the player is walking slowly. Hearing the AI react poorly in this situation is just as bad as watching an AI walk endlessly into a wall – both expose the underlying shortcomings of the systems behind the action and contribute to a lost sense of immersion.
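One way to fight that repetition, beyond simply adding files, is to let the whole party draw from a shared pool that remembers what was played most recently. The sketch below is only an illustration of that idea; the `SharedStepPool` class and file names are hypothetical, and I have no knowledge of how either game actually handles this.

```python
import random
from collections import deque


class SharedStepPool:
    """Hands out step files while skipping the ones used most recently,
    no matter which character asked for them."""

    def __init__(self, files, history=2, rng=None):
        if history >= len(files):
            raise ValueError("history must be smaller than the number of files")
        self.files = list(files)
        self.recent = deque(maxlen=history)  # last few files played by anyone
        self.rng = rng or random.Random()

    def next_file(self):
        # Only files outside the recent history are eligible.
        candidates = [f for f in self.files if f not in self.recent]
        choice = self.rng.choice(candidates)
        self.recent.append(choice)
        return choice


pool = SharedStepPool(
    ["stone_01", "stone_02", "stone_03", "stone_04"],
    history=2,
    rng=random.Random(0),
)
for character in ["player", "companion_a", "companion_b", "player"]:
    print(character, pool.next_file())
```

The longer the history, the more files you need per surface before the choices start to feel forced, which is one reason group-based games may end up wanting more variations in the first place.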

In a list of suggestions for making Mass Effect 2 a better experience, one player cited “less generic footsteps” as a possible solution:

“I never realized how important it is to hear the footsteps of your character while you are stomping around in a game. To be more precise: how important it is to hear those footsteps reflect the surface that you are walking on....With the amount of walking that you do in Mass Effect, the lack of variety can really get on your nerves. Even changing the pitch of the sound somewhat for different surfaces would improve things dramatically.”

The same could be said about many games where a small number of surface types are repeated throughout. Aside from the battle against the grey-ification of our game worlds, as sound professionals we may be able to help by suggesting variances to surface types that would be reflected in the sound of player movement.

One area where Dragon Age succeeds greatly is in changing out Foley movement sounds based on the type of armor equipped on both the player and any following party members. By enabling the layering technique mentioned above, BioWare was able to communicate the change of armor through sound, and helped add a level of audio feedback to their system. This feedback, while seemingly inconsequential, enables the player to feel connected to their decisions when reflected back at them not just visually in the armor displayed on-screen, but by the resulting sound when moving. The small touches of detail help to sell the player's role in defining the soundscape in their game, and give a greater sense of involvement in character progression.


The first person perspective brings with it a certain level of disconnect. The oft-cited “floatiness” inherent with the inability to actually see your feet as the player can be difficult to anchor – with any sense of physicality – to the environment. Optimistically, this is a perfect opportunity for sound to swoop in and save the day by providing an audible feedback representation of the action on screen. While the immersive aspect of the first person viewpoint in games is widely debated, I don't think anyone would argue with the addition of footstep sounds as providing a necessary connection to the game world.

When I first assessed the footsteps in Crackdown I commented in this thread at the Game Audio Forum that “while I think (the footsteps) sounded ok when running/ running fast...the slow walk did have too much thunk to sell the finesse of the movement.” Thanks to a revelatory response by Tom Todia at Engine Audio, I was able to peel back the academic wax that had accumulated in my ears and remind myself that the game is all about grand, larger-than-life gestures. I can imagine spending almost no time in consideration of actually walking during normal gameplay in Crackdown, which I think highlights what was likely a creative decision that the developer needed to make regarding what was (more) important: the addition of walk steps or increased variations of giant explosions!

In a blog post by Raymond Usher at Gamasutra, he discusses footsteps and “the expectation (from the development team) that we 'need' footsteps” and how that might take focus away from other areas that may be more important.

In a discussion of advanced Modern Warfare 2 perks, the subject of changing the footstep sounds as a part of gameplay is overviewed:

“These advanced perks provide a small secondary benefit, usually of little consequence, to the perk’s main function. For instance, the first game had a perk called “Dead Silence” that muffled the sound of that player’s footsteps. While theoretically useful for stealthy players, in practice it was easily outclassed by every other perk in that tier. In the sequel, silent footsteps became a secondary effect of the pro version of “Ninja” (invisibility to heartbeat sensors).”

Another interesting example I stumbled across in this group comes in the form of Fallout 3 Audio Director Mark Lampert's decision to include a slider in the menu options specifically for adjusting the footstep volume. Where it has been common to see volumes for Music, Sound, and Voice, this ability to adjust the footstep sound in relation to the rest of the mix to suit user preference is an interesting addition.


In games that focus on melee combat, or are otherwise consumed with bringing across multiple systems while catering significantly towards cutting a path through a sometimes endless horde of baddies, the role of footsteps is often downplayed by design amid wall to wall carnage. Which is to say that detailed player movement sound is certainly not the focus for any sustained length of time. For each character in Star Wars: The Force Unleashed we had about 10-16 file variations per step type randomly shuffled across 20 materials for walk, run, and land. In Conan we had 4 file variations per foot across 9 materials and 2 step types. There is no magic number when it comes to deciding at what point you have enough variation; often it becomes a complicated equation of space vs. perceived need for diversity.
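To give the space side of that equation a rough shape, here is a back-of-the-envelope sketch. It reads the figures above as variations per material per step type, and assumes two feet for the Conan count; the 400 ms file length, 48 kHz sample rate, and 16-bit mono uncompressed format are pure assumptions for illustration, not what either game shipped with.

```python
def file_count(variations, materials, step_types, feet=1):
    """Number of step files if every material/step-type pair is populated."""
    return variations * materials * step_types * feet


def footprint_mb(files, duration_ms=400, sample_rate=48000, bytes_per_sample=2):
    """Rough uncompressed mono size in megabytes (10**6 bytes).
    Every default here is an assumption made for illustration."""
    total_bytes = files * duration_ms * sample_rate * bytes_per_sample / 1000
    return total_bytes / 1_000_000


tfu_low = file_count(10, 20, 3)       # 600 files per character
tfu_high = file_count(16, 20, 3)      # 960 files per character
conan = file_count(4, 9, 2, feet=2)   # 144 files

for label, files in [("TFU low", tfu_low), ("TFU high", tfu_high), ("Conan", conan)]:
    print(f"{label}: {files} files, about {footprint_mb(files):.1f} MB uncompressed")
```

Real budgets would be far smaller once compression and shorter files enter the picture, but the relative gap between the two approaches is the point of the exercise.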


In the current generation we have 10-20 times the memory of last gen, our tools are more intuitive, and our understanding of the process is greater. What I've come to find in my assessment and experience is that despite these abilities, there seem to be two schools of thought regarding footsteps in the current generation, and Lost Odyssey is currently my poster child for the second:

1. Maximum variation: Both at the file level (lots of steps) and through randomized pitch and volume.
2. Iconic (Earconic?): The “right” footsteps all of the time with minimal variation.

I think the idea of maximum variation is covered pretty well above regarding our current capabilities, but the idea of Iconic footsteps might still be a bit vague. Essentially the idea is that, as an aesthetic choice, a minimal number of footstep variations is chosen because they exemplify the particular character and step type without the need for randomness or variation. The thinking seems to be that if exactly the right (for example) two footsteps for a given step type are chosen, then the action can succeed with minimal variation and the sound of the character can be defined iconically by the sound instead of bowing to a perceived reality.

While the choice to go Iconic for footsteps could be due to limitations of RAM, I'm finding it difficult to believe that this is the case, and so I've started to think that it is a conscious design choice. I'm also trying hard not to label it a "last gen" technique and chalk it up as a hangover from a time when you were lucky to get 2MB of RAM for the entire level of a game. I would like to believe that this is a choice that some designers are making, instead of a response to limitations – especially because the pervasive presence of footsteps throughout the entirety of a game makes them hard to ignore, so how they sound is certainly a decision that must be consciously made.

I have no info about the technical choices and tradeoffs that were made in the creation of the footsteps for Lost Odyssey, but it's easy to hear that they have implemented a smaller, more iconic set of footsteps for each step and material type with little to no pitch or volume randomization. This is especially apparent when climbing up and down ladders. That said, they sound exceedingly appropriate and are designed to a high level of detail.

(The interactive bell ringing playground equipment is also a must see/hear!)

At the end of the day, it should be about building appropriate systems that support well designed content in order to best portray the action on screen as a way to sell the player's role in the environment. Whether this comes with or without fast bulbous squids, it all boils down to the same thing: we're all slaves to the game.

Until next time!

*That's right, The Mascara Snake. Fast and bulbous! Also, a tin teardrop!
*Thank you Captain

Special thanks to the contributors to the Game Audio Forum post that fueled my fire and our special guests on Episode 2 of the Game Audio Podcast where we discussed some of these things.


I ended up spinning this into a talk given at GDC 2011 and further reprised it at my local IGDATC meeting.

Here's the video:

Concept Art © Aaron Armstrong and extra special thanks for consoles, insight, and game saves.
