2021 Rating and rank adjustments

aesalon · February 20, 2021, 11:12am

Yes, possible and likely ideal, but this is a data-driven system, and the extreme tail of players combined with the fact that handicap games are virtually nonexistent makes it difficult.

As an aside, I do wonder if handicap being disabled by default might actually hurt new player retention since they are losing too often.

gennan · February 20, 2021, 11:19am

Well, if your data is good enough for the range 15k-20k, you can just extrapolate the formulae downward. There is no reason to believe that there will be some hidden discontinuity below 20k.

Yes, I feel that not playing handicap games misses great opportunities when learning go. It’s a great feature of the game.

Lys · February 20, 2021, 11:50am

I think it’s a great idea.
We already have some stats from the first round of the Salt Cat Mad Handicap tournament, and the second round just finished.
Both were played under the old rating system.
The third round just started with the new ratings, so we’ll compare them eventually.

But a new tournament with lots of players would be fine to gather even more data.

gennan · February 20, 2021, 1:22pm

I don’t play correspondence, but I’m definitely up for some experimental live games with OGS DDK players to test high handicaps on smaller boards. But before I do, I wish someone would answer this question:

gennan · February 20, 2021, 6:28pm

I get the impression that nobody knows, so I did some research.

Apparently, there is a known bug in OGS’ automatic handicap calculation on 9x9 (this might explain why nobody knows). So whatever handicap is calculated now, it is not what OGS intended. So better set it manually (but it makes me wonder how those OGS 9x9 handicap tournaments can work with broken automatic handicap).

I could not find how OGS’ automatic handicap 9x9 should be computed. AFAIK there is no universal standard for this, but the AGA has a standard and there are several variants used in Europe. I picked a few and plotted a comparison chart:

To make that chart, I used 2 x 6.5 komi for a full handicap stone to allow combinations of handicap stones + komi in each curve. The 6.5 komi value is what KataGo considers fair for 9x9 even games. The numbers on the vertical axis correspond to “full” handicap. So 1 handicap means 2 stones with black giving 6.5 komi.
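For anyone who wants to reproduce that vertical axis, here is a minimal sketch of the conversion as I read it from the description above. The function name and the convention that "1 stone" means Black simply moves first are my own assumptions; only the 6.5 komi and the 2 x 6.5 points per full stone come from the post.

```python
# Assumption: one "full" handicap stone is worth 2 x 6.5 = 13 points on 9x9,
# as stated in the post. The rest is an illustrative sketch.
FAIR_KOMI = 6.5
POINTS_PER_STONE = 2 * FAIR_KOMI


def full_handicap(stones: int, komi: float) -> float:
    """Convert (black stones, komi given to White) to "full" handicap units.

    stones=1 means Black simply moves first (no extra stones).
    Negative komi means reverse komi (White gives points to Black).
    """
    if stones < 1:
        raise ValueError("stones must be at least 1 (Black moves first)")
    return (stones - 1) + (FAIR_KOMI - komi) / POINTS_PER_STONE


print(full_handicap(1, 6.5))  # even game -> 0.0
print(full_handicap(2, 6.5))  # the "1 handicap" example above -> 1.0
print(full_handicap(1, 0.0))  # Black moves first, no komi -> 0.5
```

With this convention, half a handicap corresponds to Black moving first without komi, which is why stones and komi can be mixed on a single curve.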

In this chart you can see:

The AGA is the most generous with handicap: 1 stone per 4 ranks.
Next is Baarle (NL) youth club system with 1 stone per 6 ranks (it has a staircase shape, because it doesn’t use komi).
Then comes BGA/Cambridge system with 1 stone per 7 ranks.
And finally, Nijmegen (NL) is the stingiest with handicap: only 1 stone per 7.5 ranks (but that’s still close to the Cambridge system).
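The slopes listed above can be compared with a rough straight-line sketch. This is only an approximation under my own assumptions: it ignores any offsets the real tables may have and it does not reproduce Baarle's staircase shape. The dictionary and function names are hypothetical.

```python
# Ranks per full handicap stone, taken from the list above.
RANKS_PER_STONE = {
    "AGA": 4.0,
    "Baarle": 6.0,
    "BGA/Cambridge": 7.0,
    "Nijmegen": 7.5,
}


def approx_full_handicap(rank_diff: float, system: str) -> float:
    """Straight-line approximation: rank difference divided by the slope."""
    return rank_diff / RANKS_PER_STONE[system]


# Compare the four systems for a 12-rank difference.
for name in RANKS_PER_STONE:
    print(f"{name:14s} {approx_full_handicap(12, name):.2f}")
```

For a 12-rank gap this gives 3 full stones under AGA but only about 1.6 under Nijmegen, which is the spread the chart shows.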

wenmorj · February 20, 2021, 8:29pm

Concur about handicap games, especially for beginners. When I first started playing (IRL, way before computers)… handicap was essential, but basically a guess (if I won last time, I got fewer handicap stones). Now that data and computers can figure out the proper handicap… OGS does not promote its use. What a backward system.

meili_yinhua · February 20, 2021, 9:24pm

Is it disabled by default? I can’t check custom games cuz I’ve since modified them, but for automatch settings it says “Default is enabled” [screenshot, 2021-02-20]

Now I personally set my automatch settings to prefer no handi (and I’d imagine a lot of my opponents do), but that’s not because of OGS’s default settings, but an active choice of mine.

benjito · February 20, 2021, 9:27pm

On blitz, default is disabled. Not sure why the discrepancy

meili_yinhua · February 20, 2021, 9:30pm

ah, I forget blitz is a format ppl use sometimes. It makes sense for me intuitively to not handi blitz, but I can’t explain why… other than maybe that blitz games already allow quite a few upsets just due to time pressure

Feijoa · February 20, 2021, 10:30pm

It would be nice if the bug would get fixed. But if tournaments can work without handicaps, they can probably work with whatever limited handicaps we have now.

When my tournament starts in about an hour, we can collect at least 20 datapoints on rank difference vs. handicap and komi for 9x9. Of course Salt Cat will reveal many more.

Feijoa · February 21, 2021, 4:08am

Okay, it has started, generating a fun range of handicaps up to 8 stones with negative 5.5 komi. The six games with a rank difference less than 4 (including mine, sadly) have no handicap (the so-called “1”) and a 3.5 komi. Only two participants immediately resigned, so the handicaps must be fair enough

I don’t understand some of the assignments. Consider these two games:

Game A is 24k vs 4k, a difference of 20, or 20.4 if you go by the full decimal ranks. Handicap of 6.

Game B is 14k vs 6d, a difference of 19 or 18.7. Handicap of 7.

Why would Game B have a higher handicap than Game A?
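For reference, the whole-rank differences quoted above can be checked with a small sketch. The helper names are hypothetical, and it assumes the usual convention that 1k and 1d are one rank apart; it works on whole ranks only, not the decimal ranks.

```python
def rank_to_number(rank: str) -> int:
    """Map '24k' -> -24, ..., '1k' -> -1, '1d' -> 0, '6d' -> 5."""
    value, kind = int(rank[:-1]), rank[-1].lower()
    if kind == "k":
        return -value
    if kind == "d":
        return value - 1
    raise ValueError(f"unknown rank: {rank!r}")


def rank_difference(weaker: str, stronger: str) -> int:
    """Number of whole ranks separating the two players."""
    return rank_to_number(stronger) - rank_to_number(weaker)


print(rank_difference("24k", "4k"))  # Game A -> 20
print(rank_difference("14k", "6d"))  # Game B -> 19
```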

teapoweredrobot · February 21, 2021, 7:45am

I noticed from the last two games in the table that the three rank difference game has no stones and Komi of 3.5 (instead of 5.5, right?). But the four rank difference game has 2 stones and 0.5 Komi, which is a massive jump!

teapoweredrobot · February 21, 2021, 7:47am

Game B also has 2.5 reverse Komi. It seems that the system considers a couple of points of Komi to be worth a stone or two maybe?

[Edit: just realised I’m making no sense as the Komi makes it worse… Ignore me]

gennan · February 21, 2021, 11:26am

It looks like OGS gave AGA handicap for 22 ranks difference in game A (6 stones and 0.5 komi) and AGA handicap for 25 ranks difference in game B (7 stones and 2.5 reverse komi). AGA 9x9 handicap is already the toughest out there for white, but OGS 9x9 handicap seems to be even tougher (but this might be a bug?).

With Cambridge 9x9 handicap, game A would be about 3 stones and 5 reverse komi, while game B would be about 3 stones and 1.5 reverse komi.

With Baarle 9x9 handicap, both games would be 4 stones without komi.

My feeling is that AGA 9x9 handicap is too easy for black. If OGS really uses that (or an even more generous handicap system) for 9x9 ratings, I expect that the OGS 9x9 ratings of weaker players will become inflated relative to their OGS 19x19 ratings.

Lys · February 21, 2021, 1:48pm

Feijoa · February 21, 2021, 4:01pm

This was a good clue. I submitted an issue to GitHub with more details and my guess about what’s going on…that handicapping is still using the old ranks!

FritzS · February 22, 2021, 8:41pm

I agree with you. Your current rating is absolutely not correct.
I know that because I looked at some of your games.

But you have to consider that the rating system does not look at your games!
It only looks at your results.
The rating system ignores that out of your three wins, one was against a bot which obviously has bugs and one was a win because of timeout.

The fact that your current rating is not correct is not because the rating system is flawed. It’s because of statistics and the fact that you have not played many games yet.

Enjoy playing against a stronger opponent next time where you have the opportunity to learn something. Expect to lose and celebrate if you live at least in some area of the board with a couple of points.

With your results so far and your current rank there is absolutely no basis for blaming the rating system.

FritzS · February 22, 2021, 9:17pm

It’s very unfortunate that the debate got so heated, because at the very foundation the question is a very interesting one.

Aligning 1d with AGA and EGF and changing the ranks so that one rank difference corresponds to one handicap stone absolutely is a strategy you can find many good arguments for.

@david265’s gut feeling that a rating system should be designed so that ratings are a stable estimate of playing strength, one that should not be changed and should be comparable - ideally across the globe - is also very reasonable.

As you can’t do both at the same time, you have to pick one of the options. Which you did and which is ok.

Having said that, my personal take is that changing what a certain rank represents in terms of playing strength is probably the boldest decision you can take when it comes to managing a rating system. It’s even bolder if the resulting ranks after the change differ even more from other major rating systems on the planet than before the change - which seems to be the case.

I would absolutely appreciate it if you guys would continue to be open to the second option of changing the system: Trying to stay as close as possible to other major ranking systems on the planet (for all ranks) and instead tweaking handicap stones and komi accordingly so that players with different ranks can play truly even games.

Actually: You could even let the user (via a per-user setting) decide how she wants the Glicko rating to be mapped to kyu/dan ranks! I’m not saying this would be a good idea, but it would absolutely be possible as the ranks are just a mapping from Glicko, right?
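To illustrate what I mean by "just a mapping", here is a minimal sketch. The logarithmic form, the constants, and the function name are all illustrative assumptions on my part, not OGS's actual mapping.

```python
import math


def rating_to_rank_label(rating: float, scale: float, offset: float) -> str:
    """Map a rating to a kyu/dan label via rank = scale * ln(rating / offset).

    Hypothetical mapping: the formula and constants are placeholders.
    A numeric rank of 30 is labelled 1d, 29 is 1k, and so on.
    """
    rank = math.floor(scale * math.log(rating / offset))
    if rank >= 30:
        return f"{rank - 29}d"
    return f"{30 - rank}k"


# The same underlying rating shown through two hypothetical per-user mappings.
print(rating_to_rank_label(1500, scale=23.0, offset=500))
print(rating_to_rank_label(1500, scale=20.0, offset=400))
```

The underlying rating stays the same in both calls; only the label shown to the user changes.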

Keep up the good work!

Vsotvep · February 22, 2021, 9:27pm

I don’t think this is the case. The current ranking system is very close to AGA and EGF in the higher ranks, and is similarly “steep” to the systems used by the Chinese and Japanese associations (a bit less so, even, so in between the Western and the Eastern associations). Furthermore, the EGF is working on an improved rating system that would bring the EGF ranks closer to how it now is on OGS (not with the goal of being close to OGS, but independently). This would of course have the effect that OGS ranks are close to the ranks of one of the major rating systems on the planet (though it could be argued that OGS itself is major enough to make its own decisions).

I don’t believe a rank can ever be comparable across the globe, since the spread between the ranking systems is simply too large, thus there may always be people unsatisfied with how different their rank is from what they expect locally.

Then again, ranks are really only useful as a predictor between players in the same pool. The clearer and better such prediction happens, the more useful the rating system. If I’m perfectly ranked in e.g. the AGA, but it turns out that playing an opponent 3 stones stronger with 3 stones handicap is actually not a fair match, then we can only use the relative ordering of players by rank, but to compute the amount of handicap we’d need to do something more complicated than only look at the difference in rank. Of course, that generally doesn’t happen, which makes tournaments with handicap in practice a bit unfair.

I really like this idea! We could even offer automatic conversion to certain other servers / associations.

david265 · February 22, 2021, 9:53pm

If everyone can determine their own mapping to a rank, then it would be impossible for OGS to automatically find a partner for a game, or to determine the appropriate number of handicap stones. At least, as a number of folks have pointed out, the current system supports these two features well.

My final opinion is that there should be at least three rankings for each person: one each for 9x9, 13x13, and 19x19 games. Even when the rankings happen to be the same numerically, the computation of the rankings must be very different, since a stone has such a different effect on each size board. In my case, I would have a reasonable 9x9 ranking given my playing strength, which is not the current 9k. I would also have “?” as my rating for the two larger boards, since I have not played enough games recently to compute a ranking reliably. Obviously, no one should be given an erroneous rank (better than that would be to let people specify their own provisional rank).

I would think any intelligent go player would agree with me on all these points.