Balancing novices and experts

Feb 06, 2014

Once again, another question that came in via Quora. The issue at hand is, what do you do to balance experts and novices in a game? Especially if there are persistent elements like leaderboards in the design, which tend to cement experts towards the top?

This is a big issue as games become more persistent and emphasize multiplayer aspects more heavily. Single-player games now swim in a soup of constantly connected profiles with all sorts of achievement and expertise data, effectively rendering them all multiplayer via the addition of a metagame. And we should not forget: the average player is below average; or to be more precise, the median player will have a win-loss record that is lower than the mean or average win-loss record, because the high-skill players win a disproportionate percentage of the match-ups. This results in the mode for the win-loss record curve being “loss.” (For more on how Pareto curves manifest in this sort of persistent environment, I refer you to my 2003 talk on “Small Worlds” [PDF]).
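To make the "median below mean" point concrete, here is a minimal sketch. It assumes a toy population of eight novices and two experts playing a full round robin, and it assumes a Bradley-Terry win model (a player of skill a beats a player of skill b with probability a / (a + b)). Both the skill numbers and the choice of model are illustrative assumptions, not something taken from real match data.

```python
from statistics import mean, median

# Hypothetical population: eight novices and two experts.
skills = [1.0] * 8 + [5.0] * 2


def win_probability(a: float, b: float) -> float:
    """Assumed Bradley-Terry chance that skill a beats skill b."""
    return a / (a + b)


def expected_win_rate(i: int) -> float:
    """Expected share of wins for player i across a full round robin."""
    others = [s for j, s in enumerate(skills) if j != i]
    return sum(win_probability(skills[i], s) for s in others) / len(others)


rates = [expected_win_rate(i) for i in range(len(skills))]
print(f"mean   = {mean(rates):.3f}")    # mean   = 0.500
print(f"median = {median(rates):.3f}")  # median = 0.426
```

Every match has one winner and one loser, so the mean win rate is pinned at 50%. The two experts soak up the surplus, which leaves the typical (median) player with a losing record.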

This sort of accumulated record of expertise can serve as a huge disincentive to participate. Novices will look at high ratings and consider the game hopeless. Nobody likes feeling inadequate. And of course, once in an actual game session in any sort of competitive scenario, it is rare for the match to actually be between perfectly matched opponents. It doesn’t even take a significant skill gap for an accumulated win-loss record between a novice and a ninja squirrel to begin to look pretty dismal. And of course, in skill-based systems that lack infrastructure, people can try to hide their ratings — that’s the basis behind being a pool shark.

There is no known way to solve this issue. In fact, balancing arbitrary teams, for example, is an NP-Hard problem. Fortunately, there’s a pretty standard grab-bag of tricks to ameliorate the issue:

  1. Handicapping before starting a match. In general, this is done by providing the less skilled player with material advantages in the form of greater assets. In grammar terms, these assets might manifest as…
    • A larger amount of whatever currencies the game supports (more money, or time to make a move). This could even go as far as providing a conceptually “infinite” amount of that currency; for example, it is common for novice players in an MMORPG to have “infinite armor class” or whatever to give them invulnerability from other players’ attacks until they are no longer of a very low level. This is usually actually implemented as a flag, but the principle remains.
    • More assets. For example, in chess we remove assets from the more skilled players, as when we “spot the new player a rook.” In grammar terms, an asset often actually translates into a verb.
    • More verbs. For example, we might permit a novice to use wildcards in poker, while not allowing the advanced player to do so.
    • More information. We might adjust things like fog of war, visibility of opponent’s hands, or the like. In tabletop gaming, this often manifests as outright giving a new player advice whilst they play.
  2. Pairing up competitors of comparable estimated skill. The basic premise here is to avoid putting the less skilled in a match with the highly skilled. This ranges from simple stuff like using levels in cumulative-character games, to advanced systems like Elo ratings used in chess and other competitive scenarios.
    • Levels in an RPG-style system tend to suffer from the fact that actual skill is not reflected. Instead, you rely on gating use of verbs, currencies, etc, based on pure devotion to the game. This works well if your game admits of relatively little skill (e.g., emphasizes in-game assets as the metrics of power). It works poorly in a skill-based scenario. And of course, a cumulative character system by definition causes rich-get-richer syndrome.
    • Ratings work better in that scenario, but of course suffer from some flaws. Standard Elo systems don’t account for inactivity, so people can choose not to participate to try to keep their ratings up.
    • Leagues are more or less a combo of the above two: a crude approximation of rating, that you advance through via a levelling process.
  3. Dynamic difficulty adjustment systems. These are more or less described as “handicaps that occur during play.” The intent behind these systems is to keep the competition tight and close.
    • Give more benefits to people who are losing. It’s most visible in racing games, which often improve your car’s handling when you are losing.
    • Give more penalties to people who are winning. In kart-style racers, it’s common for the leader to get hit with more negative pick-ups. In other racers, AI cars will start driving worse to allow you to catch up.
  4. Randomness. In general, a more random game will equalize the playing field, reducing the value of skill. The random drops in a kart racer are one obvious example, but consider classic children’s board games, which are heavily reliant on dice. Snakes and Ladders uses this mechanism to put even very young children on an even footing.
  5. Upsetting permanent tracking. This is to prevent “rich get richer” syndrome, and can be seen in how championships cycle. If the Super Bowl’s results carried over year on year, it would be less and less likely that a team that had never won before would climb in the standings. Classic high score tables have this issue; at any given venue, a machine’s top scores would basically freeze into place, possibly even consisting of a single individual’s many high-scoring games.
    • Wipes are how sports handle it. After a tourney cycle, standings are reset. On MUDs, it was common to do periodic player wipes as the upper ranks not only calcified, but became dull as everyone reached caps and the game economy started to break under the impact of mudflation. It’s also what unplugging the arcade cabinet could do!
    • Time-bounded and social-tie-bounded leaderboards. This means leaderboards that show best this week, best among your friends, etc. This is the commonest means of handling this issue these days. This also works against “rich get richer” syndrome. However, it really is just a delaying tactic; given time, it still falls prey to the overall issues.
  6. Erosion. Some types of persistent game require the advanced player to participate on a regular basis, or their accumulated rewards atrophy. Sometimes the atrophy is in the form of attacks being permitted against them while offline; we see this a lot in persistent world strategy games. Sometimes we see systems that literally consume resources while the player is not around; examples have included “rent” systems in MMORPGs and MUDs, withering or disaster events in social games, and many more.
  7. Multiple ladders. Offering a wide array of ways to consider yourself a winner, or many separate possible achievement ladders to climb, mitigates the shame factor in doing poorly in any given one, while maximizing the opportunity for each player to achieve glory. (Please, do see Jonathan Baron’s classic “Glory and Shame” essay and presentations!)
    • Multiplicity of goals. Players will usually invent these if they are not provided (cf speedruns as a classic example), but things like secrets, achievement or badge systems, narrative branches and completionist rewards, and so on are all alternate goals for the same game system, permitting that many more separate ways to compare players.
    • Roles are a powerful way to allow players to feel special, even if not actually on separate achievement ladders. Classes in an RPG or positions on a team are one way to do this. Typically roles all aim towards one purpose or goal.
    • Orthogonality is even more powerful. If you create diverse activities within one game framework which have completely different goals, effectively different games to play, then you can accommodate a much wider array of play styles and player psychologies. This then permits a far greater percentage of participants to feel valued. Crafters versus fighters in an MMO; modders versus players in a shooter…
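For the rating-based pairing in item 2, a minimal sketch of the standard Elo arithmetic looks like the following. The K-factor of 32 is one common choice and is an assumption here, as are the helper names and the `closest_opponent` matchmaking rule, which simply picks whoever is nearest in rating. Note that nothing in it accounts for inactivity, which is exactly the flaw mentioned above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B (1 = win, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both new ratings after one game; k is an assumed K-factor."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def closest_opponent(player: str, ratings: dict) -> str:
    """Hypothetical matchmaking rule: pick the nearest-rated other player."""
    others = (name for name in ratings if name != player)
    return min(others, key=lambda name: abs(ratings[name] - ratings[player]))


# A 1200-rated novice facing a 1600-rated expert.
print(f"{expected_score(1200, 1600):.3f}")  # 0.091
novice, expert = update(1200, 1600, score_a=1.0)  # the novice pulls off an upset
print(f"{novice:.1f} {expert:.1f}")  # 1229.1 1570.9

ratings = {"novice": 1200, "regular": 1350, "ninja squirrel": 1600}
print(closest_opponent("novice", ratings))  # regular
```

A 400-point gap gives the novice roughly a one-in-eleven expected score, which is why matching on rating matters: most such games are decided before they start.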

None of these techniques are perfect. But you can go a long way by using any one, or indeed all of them, within one game.
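To show how lightweight some of these can be, here is a minimal sketch of the time-bounded and social-tie-bounded leaderboards from item 5. The `(player, points, timestamp)` record layout, the function name, and the sample data are all hypothetical. It keeps only each player's best score, so one individual cannot fill the whole table.

```python
from datetime import datetime, timedelta


def leaderboard(scores, now, window=timedelta(days=7), friends=None, top=10):
    """Rank each player's best recent score.

    scores:  iterable of (player, points, timestamp) tuples (assumed layout)
    window:  how far back a score still counts
    friends: optional set of names to restrict the board to
    """
    cutoff = now - window
    best = {}
    for player, points, when in scores:
        if when < cutoff:
            continue  # too old for this board
        if friends is not None and player not in friends:
            continue  # outside the social circle
        best[player] = max(points, best.get(player, points))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:top]


now = datetime(2014, 2, 6)
scores = [
    ("ann", 900, datetime(2014, 1, 1)),  # the calcified all-time champion
    ("bob", 400, datetime(2014, 2, 3)),
    ("cat", 650, datetime(2014, 2, 5)),
    ("cat", 500, datetime(2014, 2, 4)),
]
print(leaderboard(scores, now))                   # [('cat', 650), ('bob', 400)]
print(leaderboard(scores, now, friends={"bob"}))  # [('bob', 400)]
```

Widening `window` far enough brings the old champion back to the top, which is the "delaying tactic" caveat in practice.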

  14 Responses to “Balancing novices and experts”

  1. Hi Raph,

    Definitely a perennial problem, but did I miss it or did you leave out the meta-problem? Which is, when you give inferior players (and lord knows I’m one of them) artificial advantages you risk the ire of the more skilled by effectively taking the value of their skill away. They can, perhaps rightly, see that as depriving them of something they worked to achieve.

  2. It is an issue, but mostly for the Killer player type I think. Most other types won’t mind especially if the trade off is clearly signposted.

  3. As Raph points out, the skill distribution is lopsided: there are a lot more below-average players than above-average ones. Even if a designer’s efforts to level the playing field drive away a certain proportion of the elite players, if they attract a similar proportion of lower-skill players there’s a good chance it will still be a net win.

  4. The danger of handicapping and DDA (aka negative feedback loops), though, is that you have to design them to avoid sandbagging. In a Mario Kart game, best position is often #2, right behind #1, so that when the first place gets hit with the inevitable blue shell you can zoom ahead for the win… but that requires that you race suboptimally.

    Pairing players of similar skill using rankings/ratings is a deep topic that deserves an article all its own, but suffice to say that if you go with something like Elo that attempts to show your true skill level, you’re screwed, because players don’t feel a sense of advancement unless they genuinely get better at the game, which happens too slowly compared to what they’re used to in leveling systems; but if you go with a pure play-to-advance system, you’re still screwed, because you’re pairing players of same experience level but not the same skill level, which doesn’t solve the original problem; and if you use a league system, you’re totally screwed, because it’s like driving a car with a single dial that’s your current speed plus trip odometer, you have this useless hybrid stat that doesn’t really tell you anything and can’t be used for much of a practical purpose.

    Randomness can help, but there are two things to watch out for. One is when “random” actually makes it easier for a skilled player to win (consider something like headshots in an FPS, which seem kind of random since it’s such a small target that you should really only be able to do it by accident, so it’s like this random factor that sometimes you’ll just happen to get a kill by shooting in the right general area… except expert players can actually exploit this to get a greater number of headshots so it becomes much more likely that an unskilled player will get headshotted all the time). The other is how this affects your rating system: Elo doesn’t account for luck (“Garry Kasparov never had to go ten turns before drawing his first Pawn”) so a game where an expert is going to just automatically lose some percentage of games (some to low-ranked players) is not just going to piss off your expert players on a regular basis, but is also going to mean your ratings aren’t a true reflection of player skill, unless you modify the system to mathematically compensate for the luck factor (this can be done but it helps if you know how luck-based your game is).

    Rating resets are VERY game-dependent. In a highly skill-based game like Chess, a reset doesn’t really make much sense; the top players this month are going to be the top players next month because player skill doesn’t change at that fast a rate, so all this does is prevent players from tracking their skill over time; having regular tournaments might work better (the winner of a tournament can change from one month to the next, at least they’re more volatile than persistent ratings due to fewer games played so greater chance of an upset), especially if those carry some kind of time-limited prestige (“Current Title Holder” badge on player profile for the next month, etc.). Professional sports are perhaps not the best analogy here, since you’re talking mainly about TEAM sports, which also change their teams around a bit; for solo sports like Tennis or Golf you do tend to see more of a Chess-like lifetime record/rating that doesn’t reset between seasons.

    Erosion (a special case of rating resets), I never liked much as its own solution, because I just see it as a design band-aid to fix a broken metagame element in the rating system. If your system disincentivizes players to actually play the game, my first reaction would be to fix the underlying broken system, not to layer an erosion mechanic on top of it to force the issue. At best, you then have a situation where players still view playing the game as an undesirable threat to their standing, and will do as little of it as possible to stay above the erosion line… but the whole time, you’re still subconsciously training your best players to think of playing your game as a bad thing, and I don’t think we generally want that 🙂

    Multiple ladders are also tough to pull off and game-dependent; I can see it being much easier to do in an MMO than, say, a fighting game. Even then, too many parallel leaderboards and you run the risk of diluting the glory of being on one of them: everyone’s the best at something, everyone gets a trophy for showing up.

  5. All your points are great! I was just trying for a catalog though… I suspect a lengthy article or even a book could be written about each one of these. 🙂

    In general I think not every one of these works for every given game, but often the *principle* behind one can provide an analogous mechanic.

    Even one on one sports do resets on the tournament level. Basically you end up with two metrics. Take tennis — you have rankings, feeding in to seeding. And you have Wimbledon. There is room for two kings of the hill, which are sometimes but not always the same person.

    Similarly baseball manages orthogonality via the wide array of stats it tracks….

    Erosion is definitely out of favor.

  6. If you’re going for an exhaustive catalog, I think you missed one: lie. There’s no actual requirement that a player actually play another person, only that they think that they are. Using Blizzard’s Hearthstone as an example, which is ideally set up for it, the following is true:

    1. There is no tracking of matches down to the name vs. name level. So there’s no way you can figure out who’s exactly fighting who. (Application of #5 Upsetting permanent tracking)
    2. There is a ranking system for players, but for two of the three player modes, it’s completely hidden. (Application of #2 Pairing up competitors of comparable estimated skill.)
    4. In the Arena mode, the playing field is leveled using a random deck generator. (Application of #4 Randomness)
    5. There’s limited communication between players. As it stands now, there’s only 8 “emotes” you can do. (New – Apply “magic”.)

    The Arena mode costs actual money ($1.99) to play. Because everyone is even, you’d expect to see 3-3 win/loss ratio (rounded.) But, there is an award scale for up to 9 wins, which in the zero sum universe of Pareto, means that for one person to achieve it, three other (units of) people end up with 0 wins for their $1.99. Given that lopsidedness, that is essentially a griefing situation which is bad for player retention. Furthermore, since that’s one of the two ways that Hearthstone monetizes, that’s not OK.

    Given the above, isn’t the “right” thing to do to swap in an AI player once you’ve figured out that a given person is dominating in the Arena? There’s no way that the player could *know* they’re playing against an AI, because another human could be simulated quite effectively using the already implemented Expert-AI. Using that sleight of hand, they could be easily made to *think* that they are playing against another person – and that’s all that matters.

    I have no inside knowledge that this is happening in Hearthstone, but I am pretty darn convinced that this is happening in World of Tanks using a different application of “magic.” Instead of straight-up deceit, they’re applying the same “lack of information” technique to encourage players to believe that their opponent got off a lucky shot, whereas the opposing player actually used a golden shell purchased with real money. I read a rather convincing analysis here:

    By limiting the information available to the player, you end up with a What You See Is All There Is (WYSIATI) situation and, with a gentle nudge from us, the player will willingly form the conclusion which is most favorable to them, and continue to play the game (for at least a while…)

    Isn’t that the whole point of this catalog? To list how to reduce the risk that players rotate out because they lost?

    In this day and age, where everyone actually has to be “above average” for us to effectively monetize games, application of the principles of magic is required and should be added to your list, even though it isn’t strictly “fair.”

  7. Interesting article, and I was really interested in the slides from the 2003 GDC. Too bad it wasn’t recorded for YouTube, something I think is great for historical purposes and context. Slides never do a talk justice.

    Many of your in-brief citations didn’t make it to the resources at the end. Pity. Given the passage of time and lack of ‘talk’ it would have been nice to follow up parts via the references.

  8. It actually may well have been at least audio recorded. But there’s a lot of the older audio recordings which are not on the GDCVault yet.

  9. Mr. Smedley says they are working on a new MMO that will have Star Wars Galaxies fans very excited. Any chance you might get involved in that project? The Star Wars Galaxies crafting/harvesting/merchant systems were absolutely amazing and the best ever presented in an MMO.

  10. I am not involved with it and don’t know anything about it.
