While implementing the proposal “New users choose beginner/intermediate/advanced and drop ranked game restrictions”, a question has come up about never-used ratings categories: should these follow the parent ratings category (with a high deviation)?
(For more background, feel free to have a look at the discussion in a related draft pull request.)
The situation
Concretely, say there are three players, A, B, and C, and they each play 1000 (or 10000) games on OGS after creating accounts.
- A plays an even distribution of live/blitz/correspondence and 9x9/13x13/19x19.
- B plays only live 19x19 games.
- C starts with one experimental game in each category, then plays only live 19x19 games for the rest.
After these 1000 (or 10000) games:
- Player A will have accurate ratings in all categories.
- Players B and C will have accurate ratings for “live 19x19”, “live”, “overall”, with ~1000 (or ~10000) games of history.
- Players B and C will have provisional ratings in the other categories, such as “correspondence 9x9”, based on either 0 games (B) or 1 game played at the very beginning (C).
The question
The question at hand: what should player B’s provisional rating for “correspondence 9x9” games (a never-used ratings category) be?
- Their original starting rank? (E.g., 1500 ±350 until a 9x9 correspondence game is played.)
- Their current overall rating? (Overall rating ±350 until a 9x9 correspondence game is played.)
(Out of scope, for now: what should player C’s provisional rating for “correspondence 9x9” games be? Assume, for now, that it’ll be the status quo, whatever rating they earned with their single game in that category. This is presumably close to their original starting rank.)
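To make the two options concrete, here is a minimal sketch. All names (`Rating`, `provisional_rating`, `follow_overall`) are hypothetical and are not taken from the OGS codebase; the 1500 and 350 values come from the options above, and the 1100 ±65 overall rating for player B is made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

STARTING_RATING = 1500.0     # "original starting rank" example from the options above
STARTING_DEVIATION = 350.0   # the "±350" from the options above


@dataclass
class Rating:
    rating: float
    deviation: float
    games: int = 0


def provisional_rating(
    category: Optional[Rating],
    overall: Rating,
    follow_overall: bool = True,
) -> Rating:
    """Pick the rating to use for a category (hypothetical helper).

    A category with at least one game keeps whatever rating it earned
    (status quo, i.e. player C). A never-used category (player B) gets
    either the overall rating or the original starting rating, both
    with the full starting deviation.
    """
    if category is not None and category.games > 0:
        return category
    base = overall.rating if follow_overall else STARTING_RATING
    return Rating(base, STARTING_DEVIATION, games=0)


# Player B: made-up overall rating, never played correspondence 9x9.
b_overall = Rating(1100.0, 65.0, games=1000)
option_overall = provisional_rating(None, b_overall)                         # second option
option_starting = provisional_rating(None, b_overall, follow_overall=False)  # first option
```

The only difference between the options is the `base` value; the deviation is the same either way, so the category remains provisional until games are actually played in it.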
Proposed change: system guesses the rating as accurately as it can
IMO, taking player B’s overall rating makes the most sense as the starting point. The primary goal of the ratings system is to help players find fun/fair games, and the overall rating is the system’s best guess for their strength in a category they have never played. Since it will come with a high deviation, it’ll still adjust quickly as they play a few games.
I think this works well regardless of how accurate player B’s starting rank was.
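To illustrate why a high deviation means quick adjustment, here is a rough sketch using the single-game Glicko-1 update. This is an assumption for illustration only: the ±350 default suggests a Glicko-style system, but the real OGS calculation may differ in its details, and the function names and example numbers here are made up.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd: float) -> float:
    """Glicko-1 attenuation factor for an opponent's deviation."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def glicko1_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; score is 1.0 for a win, 0.0 for a loss.

    Returns the new (rating, deviation) for the player.
    """
    g_opp = g(opp_rd)
    expected = 1 / (1 + 10 ** (-g_opp * (r - opp_r) / 400))
    d2_inv = Q**2 * g_opp**2 * expected * (1 - expected)
    new_var = 1 / (1 / rd**2 + d2_inv)
    new_r = r + Q * new_var * g_opp * (score - expected)
    return new_r, math.sqrt(new_var)


# Same win against the same opponent (made-up numbers: 1100 ±60),
# once with a provisional deviation and once with a settled one.
provisional = glicko1_update(1100, 350, 1100, 60, 1.0)
settled = glicko1_update(1100, 60, 1100, 60, 1.0)
print("provisional:", provisional)
print("settled:    ", settled)
```

Under these assumptions, the provisional rating moves many times further per game than the settled one, and its deviation shrinks with every result, so a wrong initial guess in either direction gets corrected within a handful of games.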
Concerns with making a change here
@anoek brought up three concerns, and rather than discussing them between just the two of us, he suggested we open it up to the forum.
I’ve quoted his concerns and included my own responses (but keep in mind I’m writing this so it’ll have my bias…).
1. The difference between player B and player C is “not fair”.
Specifically, if you have two people that are about the same rank, one plays a bunch of games with only one setting then it lifts all of their ratings, where as if the other has played even just one game in some other category, now they’re effectively hampered and “have to play more” to get to the same place as the first player. This is a weird incentive that seems like we should avoid, we want people to play whatever they want to play whenever they want to play.
I agree this incentive would be unfortunate. I’m (perhaps naively) optimistic that few players would actively shun game categories to “save” their ratings for later, but it’s possible that it could distort things somewhat.
IMO, we should get accurate ratings for player B if we know how, even if we don’t immediately have a solution to improve the accuracy of player C’s ratings. IOW, the benefit here outweighs the risk.
2. Ranking up is part of the fun.
The act of ranking up is part of the enjoyable and rewarding part of the experience, short cutting that process isn’t necessarily desirable.
Agreed, but in practice I don’t think this would cut that process short.
IMO, when you play a game in a never-before-used category, you’re initially grinding through the “what’s my strength?” process. You want this grind to be as short as possible so that you can get to the fun part, which comes when you’re getting fair games (not when you’re beating up beginners or losing to experts).
Note that player B’s starting rank isn’t necessarily weaker than their overall rank (the grind might be “up” or “down”). E.g., the notional starting rank might be 6kyu (joined before the beginner/intermediate/advanced split) or 2kyu (clicked “advanced” because they’re better than their friends; hopefully a rare case, but there will probably be a few). Then they play their 1000 (or 10000) games, settling in at 18kyu (or whatever).
3. The ratings are split into categories for a reason.
People have legitimately different strengths for different settings, sometimes by multiple stones. It’s probably true that starting them off in their fallback rank is closer to their true strength though, so maybe this isn’t really a valid concern.
I agree with both parts. IMO, there’s nothing to worry about there, because we’re just looking for a best guess.
Questions for the forum!
What do others think? Any other concerns?