Racial Balance in Real Time Strategy Games

Hey folks,

Today I’ll be discussing racial balance in real time strategy games. I’ll explain how I think about the subject and address some myths that frequently pop up in balance discussions.

What is racial balance?

Figure: The Aligulac Balance Report

When players think about perfect racial balance, they typically define it as the better player winning the vast majority of games against an inferior player, regardless of the races that the two players selected. This definition is occasionally combined with an opposition to RNG mechanics, meaning that a player should win all games against players of lesser skill. Another common way of looking at it is to say that two players of equal skill should trade wins and losses at a 50:50 ratio, again regardless of the races that they select.

While these sound like good approaches, they suffer from a number of flaws that make them unworkable.

One is that they overgeneralize the concept of skill. Real time strategy games demand a host of distinct skill sets that grow and develop individually. Each individual game will emphasize particular skills over others, meaning that the overall skill level of any given player will have an uneven impact on their ability to win any single game.

In other words, I might have excellent macro fundamentals, but my unit control might be relatively weak. This means I might be able to beat a good player who plays a defensive macro style, but I might lose to an overall weaker player who does an all-in that demands very strong unit control from my end. It’s difficult to draw a balance conclusion from this.
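As a rough illustration of that example, here is a toy model in Python. Everything in it is an assumption made for the sketch: the two skill axes, the 0-10 ratings, the `macro_weight` parameter and the linear blend are hypothetical and are not a claim about how any real game or rating system works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    name: str
    macro: float         # hypothetical 0-10 rating for economy and production
    unit_control: float  # hypothetical 0-10 rating for micro during fights


def effective_skill(player: Player, macro_weight: float) -> float:
    """Blend the two skills by how much this particular game leans on macro."""
    return macro_weight * player.macro + (1 - macro_weight) * player.unit_control


def predicted_winner(a: Player, b: Player, macro_weight: float) -> str:
    """Return the name of the player with the higher effective skill."""
    score_a = effective_skill(a, macro_weight)
    score_b = effective_skill(b, macro_weight)
    return a.name if score_a >= score_b else b.name


me = Player("me", macro=9, unit_control=4)
macro_player = Player("macro_player", macro=8, unit_control=8)    # stronger overall
all_in_player = Player("all_in_player", macro=4, unit_control=7)  # weaker overall

# A long defensive macro game leans mostly on macro fundamentals.
print(predicted_winner(me, macro_player, macro_weight=0.9))
# An early all-in leans mostly on unit control.
print(predicted_winner(me, all_in_player, macro_weight=0.2))
```

With these made-up numbers, the same player beats the stronger opponent and loses to the weaker one, which is why a single "overall skill" figure is a poor basis for a balance conclusion.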

Another flaw with these approaches is that they overgeneralize match-ups. Even if we were to find two players at precisely the same skill level, this doesn’t mean that they will trade wins and losses at a 50:50 ratio. Perhaps one player’s style fits better with the current meta; perhaps the match-up actually requires an uneven differential in skill, such as a Terran player needing better multi-tasking and a Protoss player needing better decision making; perhaps one player was just having a bad day and couldn’t focus.

Aside from these problems, there’s also a more practical issue. These approaches are very general, which means they really only serve to identify generally obvious balance problems. If I play against players of an “equal” skill level and find that Hydralisks are too strong, and you play against players of an “equal” skill level and don’t find this issue, how do we reconcile this difference? Aside from consensus, there isn’t a good way to provably determine whether a balance problem really exists.

A better way to think about balance is to emphasize individual games over generalities and optimal play over equal play. Here’s how I define a balance problem:

Assume two players A and B playing races X and Y. There exists a racial imbalance between X and Y if, in a game between A and B, the player that lost cannot be reasonably expected to improve their play significantly enough to have reliably won the game.

This definition focuses its attention on a measurable balance problem that can be replicated. Rather than trying to equalize the number of mistakes each player made, it concentrates instead on the more tractable question of whether a player can reasonably improve themselves out of a problem.

The advantage of this approach is that it does not require a general theory justifying why an imbalance exists – it can be identified with only a single game and consistently verified by replaying the actions in that game.
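The definition can be written down as a small predicate. This is only a sketch under stated assumptions: the `Change` record, the numeric `effort` estimates and the `effort_budget` threshold are hypothetical stand-ins for the designer's judgment of what is "reasonable", which the definition deliberately leaves open.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Change:
    description: str
    effort: float  # hypothetical estimate of how hard this fix is for the player


def is_reasonable(changes: Sequence[Change], effort_budget: float) -> bool:
    """True if the loser could plausibly make all of these changes."""
    return sum(change.effort for change in changes) <= effort_budget


def indicates_imbalance(minimum_changes: Sequence[Change], effort_budget: float) -> bool:
    """A game points to an imbalance when even the minimum set of changes
    needed to win is more than the loser can reasonably be asked to make."""
    return not is_reasonable(minimum_changes, effort_budget)


# Hypothetical review of one lost game: two fixes that are easy to practice.
fixable_loss = [
    Change("scout the opponent's base before committing to an expansion", 1.0),
    Change("avoid getting supply blocked during the first attack", 0.5),
]
print(indicates_imbalance(fixable_loss, effort_budget=3.0))
```

The interesting work is hidden in how `effort` and `effort_budget` are chosen; the code only makes explicit that the test is applied to one reviewed game at a time.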

The designer can even apply it directly to find an issue:

  1. The designer suspects there’s a racial balance problem between races X and Y.
  2. The designer finds two players, A and B, of roughly equal skill level with the two races.
  3. The designer has the players play a game.
  4. The designer reviews the actions of the losing player and identifies the minimum necessary set of changes required to enable that player to have won the game.
  5. Steps 3 and 4 are repeated, with the players working on the identified changes, until the minimum necessary set of changes required to win the game becomes unreasonable.
  6. The win ratio at this point is examined. If it deviates significantly from 50:50, a balance patch is introduced.
  7. Steps 3 through 6 are repeated until a near 50:50 win ratio is achieved.
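Step 6 leaves open how to judge whether a win ratio deviates "significantly" from 50:50. One conventional option, offered here as an assumption rather than as part of the process above, is an exact two-sided binomial test against a 50% win rate. The function names and the `alpha=0.05` threshold are choices made for this sketch.

```python
from math import comb


def two_sided_binomial_p_value(wins: int, games: int) -> float:
    """Exact two-sided p-value for the hypothesis that the true win rate is 50%."""
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need games > 0 and 0 <= wins <= games")
    # With a 50% win rate, P(k wins) = C(games, k) / 2**games, so comparing
    # the binomial coefficients is the same as comparing the probabilities.
    observed = comb(games, wins)
    # Add up every outcome that is at least as unlikely as the observed one.
    at_least_as_extreme = sum(
        comb(games, k) for k in range(games + 1) if comb(games, k) <= observed
    )
    return at_least_as_extreme / 2 ** games


def deviates_from_even(wins: int, games: int, alpha: float = 0.05) -> bool:
    """True if the record is unlikely under a 50:50 match-up at level alpha."""
    return two_sided_binomial_p_value(wins, games) < alpha


# Hypothetical records for race X against race Y over 100 near-optimal games.
print(deviates_from_even(55, 100))
print(deviates_from_even(65, 100))
```

A test like this only says that a record is unlikely under an even match-up. It does not say why, so it complements the game-by-game review in step 4 rather than replacing it.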

To reiterate: we focus on concrete instances – individual games – rather than generalities. We strive for a roughly 50:50 ratio at or close to the optimal level of play rather than just an “equal” level of play.

Considerations

The principle at the heart of this approach is the same principle at the foundation of well-balanced games – a high skill ceiling. Players are expected to take reasonable steps to improve their way out of losing situations instead of immediately pointing to a balance problem. This means that most balance issues are resolved on their own as the meta-game develops and the best strategies are found and perfected.

Astute readers will notice that this hinges on the meaning of reasonable. How much can we reasonably expect a player to improve in order to turn a losing situation around? That depends on how high we set our skill ceiling. As long as players are noticeably below it, we should be able to identify a reasonable set of mistakes that they can work on in order to win games.

Games like StarCraft II take this to its theoretical limit by explicitly designing for an unreachable skill ceiling. This means that even the very best professionals’ balance complaints can be rebuffed if the designer can convince themselves that these players can still substantially improve.

I think this is one of the reasons that widely perceived balance problems in the early game tend to be patched out faster than similar problems in the late game. The Adept nerf at the beginning of Legacy of the Void is a good example. Its window of imbalance in the PvT match-up occurred very early in the game. The number of potential mistakes in the early game is inherently smaller than in the late game, meaning that it’s much easier to theorize or physically demonstrate that a balance problem exists. Eventually, the Terran player’s minimum set of required changes would become so vast and focus on such minute details that it would become unreasonable to expect them to improve their way out of this situation.

This points to a flaw in this approach – and, for that matter, the general approach of balancing through a high skill ceiling. It’s very time-consuming to balance the late game with this process because of the sheer number of mistakes that will be made by this point in the game. There’s a trade-off here for the designer. The upside is that the large number of mistakes means there will almost always be enough opportunities for a player to improve their way out of a losing situation. The downside is that the time required to optimize the match-up in order to demonstrably prove an imbalance will be very long – this period of “wait-and-see” can leave a bad taste in players’ mouths.

The Ultralisk in Legacy of the Void is a good example. Blizzard waited for months as players optimized the TvZ match-up in order to see if Ultralisks were a genuine balance problem. But because of how visually jarring this interaction was – seeing Marines do virtually no damage to a single unit – it generated resentment within the Terran player base.

As a result, the designer needs to strike a balance between this optimization process – often referred to as “letting the meta settle” – and theorycrafting what the meta will look like once near-optimal play has been achieved. This enables them to proactively identify and test small changes in order to smooth off the rough edges in the gameplay experience.

(For what it’s worth, I think this is why Blizzard focuses so heavily on the feedback from Korean professionals. It’s not so much that they’re better at the game – it’s that their professionalized approach to practicing means that it’s much easier to systematically identify balance issues).

Myths of racial balance

Balance is always a hot-button issue in any real time strategy game, and over the years I’ve come across common lines of thinking that address the problem in a fundamentally flawed way. I call these myths – let’s go through them.

Myth: Balancing for the best means an imbalanced experience for everyone else

I’ve seen this argument made since the early days of Age of Empires 2. At the time, it was claimed that professional playtesters didn’t accurately reflect the skill level of the average player, and therefore produced racial balance that only worked for the very best. This argument has lived on to this day, focusing now on the feedback that professional players give. If the developers only balance the game around the issues that top players experience, so the thinking goes, the game won’t be balanced for anyone else.

The problem with this train of thought can be found directly in our definition of balance. A problem does not exist if a player can reasonably improve their way out of it. The best players have already demonstrated that it’s possible to play the game better than everyone else, and therefore it’s always reasonable to expect others to resolve a perceived “balance issue” simply by improving their level of play.

In fact, this myth gets it backward – the truth is that professionals are the only players who are substantially impacted by balance problems. The more professionalized the game, the more likely it is that players at the top are reaching the physical and mental ceiling on how optimally the game can be played. Even if the minimum set of changes required to win a particular game is small, it’s conceivable that no human could ever resolve these issues without making additional mistakes somewhere else. Only professional players will run into this, meaning that the designers have a critical obligation to balance the game around their feedback.

Some readers may counter that this sounds like a circular argument – we disprove this myth by stating a definition that’s mutually exclusive with it. The key point is that the definition’s emphasis on optimal play is essential. If we limit ourselves merely to “equal” play, it’s very difficult to determine whether a loss was caused by a genuine balance problem or a difference in skill. “Whose mistakes were ‘more severe’?” is an unanswerable question. Furthermore, it’s very challenging to compare the impact of a balance problem with the impact of any given gameplay mistake.

The burden of proof is on those who want to demonstrate that a perceived imbalance exists. It’s very straightforward to show that losses outside of near-optimized play could have been reasonably avoided by fixing some simple mistakes; it’s very difficult, if not outright impossible, to demonstrate that a balance issue was a stronger contributing factor.

Myth: Good players understand balance better than casual players

This is a common line of thinking in many activities – people assume that someone who’s good at something understands it better than someone who’s not so good at it. This is not always the case, especially in a game as execution-focused as StarCraft.

Remember, there are two ways in which we can identify a balance problem. The first is to optimize one’s level of play until a problem can’t be resolved without an unreasonable or impossible increase in skill level. The second is to theorycraft about the game’s systems and predict what will become a balance problem once the meta has settled and play styles have been optimized.

Professional players are very good at doing the former – it’s their job. They’re also usually very good at the latter. Because they define and drive the meta rather than simply following it, professionals frequently have a deeper understanding than most of how a game’s systems work. This means they can more reliably predict what will become an issue before everything has been optimized.

Casual players don’t even attempt to follow the meta or play competitively, so they can’t conceivably take the first approach. But their unorthodox play styles allow them to encounter situations that better players often do not, giving them a unique insight into the way a game’s systems interact.

This might seem like a minor point because any serious imbalances will always be rooted out by professionals. But we need to think about how balance problems get introduced in the first place. Sometimes they stem from “minor fixes” introduced to “improve quality of life” or “add variety”. The Japanese civilization dominated Age of Empires 3 in WCG 2008 because it was overbuffed by a series of “minor gameplay improvements”. I think the overbuff to the Hydralisk in Patch 3.8 follows a similar pattern. It’s a unit that most good players rarely used after it fell out of the ZvP meta, so very few of them proactively provided negative feedback concerning its changes.

Casual players are often well-positioned to identify these kinds of problems ahead of time because they get themselves into situations that are far outside of the current meta. It’s not to say that all or even most of their feedback is inherently valuable – just to say that you can find a surprising amount of insight if you read enough of it.

The players in-between – “good but not great players” – often have the least valuable insight into balance. There’s little incentive for these players to understand how the game really works – the oft-repeated advice to this group is to copy strong build orders, focus on clean mechanics and work on macro fundamentals. This is a great way to get better at the game, but a terrible way to understand it better. Players in this category don’t have the raw skill to show balance problems through optimized games, yet they also have no incentive to understand the game well enough to appropriately theorycraft about it.

My point isn’t to claim that good-but-not-great players are incapable of understanding balance; it’s that their incentives drive them to follow the meta rather than thinking about why the meta is the way it is. As a result, there’s no good reason to place extra weight on their feedback based purely on their ranking.

Last word: Balance issues are often a symptom of systemic design problems

So far, we’ve proposed an approach for effectively identifying and resolving a balance problem which concludes with a series of numerical adjustments. What’s left is discussing the root causes of these issues – the design patterns that make balance tweaks more or less likely.

A high or unreachable skill ceiling is a good example of a design pattern that reduces the need for balance updates. It ensures that there are lots of opportunities to improve one’s way out of a losing situation.

Highly binary mechanics are an example of a design pattern that frequently necessitates updates. The range of outcomes in a binary mechanic, such as the Mothership Core’s Pylon Overcharge or the God Powers in Age of Mythology, is inherently more limited than something more incremental. This puts more pressure on the designers to balance the interaction perfectly because the range for improvement is also more limited.

An overemphasis on melee units is another good example of a mechanic that’s hard to balance. Melee units have a comparatively harder time finding damage because it’s difficult to use positioning to gain an advantage – one way or another, the unit needs to be adjacent to the opposing army in order to engage with it. Ranged units can stand at the periphery and pick off units one by one, allowing players with good control to trade efficiently and prevent small disadvantages from snowballing. This puts more pressure on the designer to balance a melee unit perfectly in order to make it work, and it’s worth noting that no new melee units were introduced by either the Heart of the Swarm or Legacy of the Void expansion packs.

(I previously commented that the Hellbat is an exception to this, but a commenter correctly pointed out that it has 2 range).

It’s important to look past specific issues and see if they’re a symptom of an underlying systemic problem. The strength of Zerg in early Legacy of the Void prompted calls for nerfs to all sorts of things, from Adrenal Glands to Tissue Regeneration. In the end, the problem was more systemic; the underlying map pool heavily favored Zerg. The race was even buffed in subsequent patches.

Conclusion

Racial balance in real time strategy games is always a tricky topic to discuss. Players get emotionally invested in making sure their race is treated fairly by the developers. That’s often a good thing – it ensures an open, vibrant discussion about issues. Every viewpoint will eventually find an advocate, making it much easier for the developers to identify the real problem.

I hope you’ve taken away a new approach to looking at this issue, or at least have some additional food for thought. Thank you for reading and I’ll see you next time.

(P.S. If you enjoyed this article, please consider following me on Twitter to receive regular content updates and checking out my game-design-focused Twitch or YouTube channels.)

3 comments

  1. I’m just happy to see that SC2 is the most balanced it’s ever been. Feels like we’re going in the right direction.

  2. […] While StarCraft benefits from being a 1v1 game, that isn’t enough. For instance, if it were badly balanced, it would drive players to exploit the current meta rather than deeply study the game in its entirety. Balancing properly is a separate and very hard problem, but it needs to be done right in order to achieve skill-outcome convergence. I won’t go into that topic in detail in this article, but I’ve written about it at length. […]
