Rank Deflation

DVbS78rkR7NVe · January 31, 2021, 4:53am

Riddle me this. New players who come to OGS first play with a high deviation (“?”). Ranked players who play against a “?” see almost no effect on their own rating. The “?” players quickly converge on their rank in this population. And the ranks aren’t anchored to anything. If the whole population improves by 1 stone, the ranks stay the same while getting deflated/harder. And active players improve several stones during their life cycle: come to server -> get stable rank -> improve -> leave to never be seen again. Wouldn’t that deflate ranks? Why doesn’t it make sense?

Because the general tendency to improve at the game is a source of deflation, however small. Where is the source of inflation stabilizing it?
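To make the mechanism concrete, here is a minimal sketch of the "improve, then leave" cycle. It assumes plain zero-sum Elo with a made-up K of 32 and a made-up pool of ten regulars, which is not what OGS runs (OGS uses Glicko-2), so treat it only as an illustration of where the points go.

```python
# Sketch only: plain zero-sum Elo is an assumption, not the OGS system.
K = 32  # hypothetical step size


def expected(r_a, r_b):
    """Expected score of A against B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def play(ratings, winner, loser):
    """Move points from loser to winner; the pool total is unchanged."""
    delta = K * (1.0 - expected(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def simulate_improver(n_regulars=10, wins_each=3):
    # Everyone, including the newcomer, starts at 1500 (hypothetical).
    ratings = {f"regular{i}": 1500.0 for i in range(n_regulars)}
    ratings["improver"] = 1500.0
    # The improver gets stronger and beats each regular a few times.
    for _ in range(wins_each):
        for i in range(n_regulars):
            play(ratings, "improver", f"regular{i}")
    # The improver leaves for good and takes the gained points along.
    gained = ratings.pop("improver") - 1500.0
    mean_left = sum(ratings.values()) / len(ratings)
    return gained, mean_left


gained, mean_left = simulate_improver()
print(f"improver took {gained:.1f} points; regulars now average {mean_left:.1f}")
```

The regulars did not get any weaker, yet their average rating ends up below 1500 by exactly the points the improver carried away.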

Vsotvep · January 31, 2021, 4:57am

We just did that, by revamping the ranking system

But yeah, usually with Elo-type systems, Elo is not stable over time and tends to shift when generations come and go. That’s why you should only look at the difference in rating between contemporary players.

meili_yinhua · January 31, 2021, 5:06am

That is a fair question, and was a concern with the first Elo formulation, but I haven’t seen a Glicko-2 formulation with a singular entry point (1500) that was deflated (OK, OGS, but that’s because of the rating-rank conversion). Lichess, the primary example, is considered a very inflated server in chess, where 1500 is actually quite easy to reach, especially in comparison to over-the-board ratings, where there is a history of complaining about inflationary mechanics (such as never being able to fall below a certain floor once you’ve achieved a certain rating). But keeping an eye on that and readjusting what the ranks are is definitely worth something.

DVbS78rkR7NVe · January 31, 2021, 7:30am

Considering players can track their progress on OGS over many years, I wonder if ranks deflate in any significant way.

shinuito · January 31, 2021, 12:15pm

Won’t any online server have essentially inflated ranks for some large fraction of active users compared to over-the-board ratings? E.g., if I want to change my over-the-board EGF rating I would normally have to pay a fee of maybe 5-10€, play about 5 games, possibly cover travel and overnight expenses, etc. To update my rating on an online server I just click to play a new game. So if I’ve improved, I’ll definitely get to a higher rating online before I get it over the board.

shinuito · January 31, 2021, 12:21pm

I mean, two successive rating updates (almost) erased the fact that I started playing Go on OGS and was at 25kyu initially. After one update, it looked like I more or less started and stayed at 15kyu for a decent amount of my games. Now it looks like I started and stayed at like 9kyu forever.

Of course the chat still remembers.

Maybe I should try and grab my rank data from the chats and see if I can reconstruct my rating/rank over time and plot the discontinuity from the rating system updates.

yebellz · January 31, 2021, 12:53pm

Another factor to reduce these types of effects is that Elo-like systems are not purely zero-sum. After a game between a provisional player and a player with a well-established rating, typically the provisional player gets a much larger adjustment while the established player gets a smaller than normal adjustment. This helps to figure out the new player’s rating quicker (by taking larger adjustment steps at first) while not impacting the ratings of established players as much.
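As a rough illustration of that asymmetry, here is a one-game update written with the published Glicko-1 formulas. This is a sketch under assumptions: OGS uses Glicko-2, and the ratings and deviations below (1500 with RD 350 for the provisional player, 1500 with RD 50 for the established one) are made-up example values.

```python
import math

Q = math.log(10) / 400.0  # Glicko-1 scaling constant


def g(rd):
    """Discount factor for an opponent with rating deviation rd."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko1_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; returns (new_rating, new_rd)."""
    g_opp = g(opp_rd)
    e = 1.0 / (1.0 + 10 ** (-g_opp * (r - opp_r) / 400.0))
    d2_inv = Q**2 * g_opp**2 * e * (1.0 - e)
    denom = 1.0 / rd**2 + d2_inv
    new_r = r + (Q / denom) * g_opp * (score - e)
    return new_r, math.sqrt(1.0 / denom)


# Hypothetical game: the provisional player beats the established player.
new_r, _ = glicko1_update(1500, 350, 1500, 50, 1.0)
old_r, _ = glicko1_update(1500, 50, 1500, 350, 0.0)
print(f"provisional gains {new_r - 1500:.1f}, established loses {1500 - old_r:.1f}")
```

The winner gains far more points than the loser gives up, so the game adds points to the pool rather than just moving them around.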

gennan · January 31, 2021, 1:37pm

I quickly checked some of the games in your game history. I’ll estimate your level in real life club ranks as I am familiar with in the Netherlands. These may be a bit softer than current EGF ranks and a bit tougher than AGA ranks.

From your very first games here, I would say you were not a complete novice when you started on OGS. Did you not have any practice or tuition before you started on OGS?

Around April 2017, I would estimate that your level was about 15k. Your (updated) OGS rating shows 9k at that time. Indeed that feels a bit too optimistic to me. But your winrate was quite high throughout 2017 (about 75%), so that might explain why the OGS rating system gave you an inflated rating.

By April 2019, I think you were about 10k. Your (updated) OGS rating shows 7k at that time. So perhaps still a bit optimistic, but AGA 7k may be quite a good estimate.

By late 2019, I think you were about 6-5k. Your updated OGS rating shows 4k at the time. Still slightly optimistic perhaps, but 4k AGA may be quite a good estimate.

I also notice that you win about 10% of your ranked games by your opponent timing out, while you seem to timeout much less. It makes me wonder how much OGS ratings are affected by time outs. Timing out is quite rare in real life tournaments, so it’s not much of an issue there.

_KoBa · January 31, 2021, 3:02pm

Yeah, it’s not a zero-sum game. Especially since OGS is probably the most beginner-friendly server, it is quite typical for new users to gain strength and “steal” rating points from the dinosaurs who stopped improving years ago. You know the saying “If you don’t move forward, sooner or later you begin to move backward”?

The slow rank deflation is kinda inevitable, so I guess that’s why OGS has had rank adjustments and is going to have more of them every few years.

I feel you, it’s a real shame that the old rank data is gone :<
I also learned to play go on OGS and was stuck at 30k for months, but according to the graph I have never been lower than 13.6k ^___^

𝓙𝓪𝓬𝓴𝓩𝓱𝓪𝓸 · January 31, 2021, 3:40pm

IKR, sooooooooooooooo unfair

thouis · January 31, 2021, 3:50pm

I wonder if one could apply some of the recent research in AI evaluation of human games to consistently estimate player strength against a fixed ruler. One could bend some of the recent research (like this paper) to the problem, pretty easily.

Even with a very noisy estimation, evaluating a few hundred games of active players would give a very tight estimate of how much rank is shifting over time, and allow an easy adjustment to be calculated to make them equivalent. Maybe I’ll hack something together this week.
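A back-of-the-envelope sketch of why averaging helps, assuming the per-game estimation errors are independent and share one standard deviation. The noise level of 5 stones and the function names are hypothetical; they are not taken from the paper or from any existing tool.

```python
import math


def standard_error(per_game_sd, n_games):
    """Standard error of the mean of n_games independent noisy estimates."""
    return per_game_sd / math.sqrt(n_games)


def games_needed(per_game_sd, target_se):
    """Games to average so the standard error is at most target_se."""
    return math.ceil((per_game_sd / target_se) ** 2)


# Hypothetical noise level: each single-game estimate is off by ~5 stones.
print(standard_error(5.0, 400))
print(games_needed(5.0, 0.25))
```

If the errors are correlated (for example, a systematic bias against older opening styles), averaging more games will not remove that part.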

Vsotvep · January 31, 2021, 3:54pm

I’m not sure that would work well: it measures how well players would match against the AI, but since over time there have been a lot of new strategies and new joseki discovered, players from different time periods may end up being mismatched not because of lack of skill, but because of unfamiliarity with each other’s style.

For example, any player inadequately handling the 3-3 invasion would be deemed weaker, while it was simply the case that we didn’t know the adequate way of handling the 3-3 invasion before AlphaGo came along (we still might not, in fact). It doesn’t have to do with strength, per se, but rather with a lack of knowledge.

DVbS78rkR7NVe · January 31, 2021, 4:10pm

That’s the core of my question. Of course, current ranks can be fixed with an update, but when we recalculate them into the past, how can I trust past ranks? OGS is lucky to have players who have played on it for 5 years, for 10 years. When I see their graph over many, many years, how much can I trust that it makes any sense?

If I see a player who’s plateaued for years, have they really plateaued, or are they improving together with deflation? If a player is getting weaker over the years, are they getting weaker due to age or something, or is their rank deflating?

It’s a worthwhile question to consider when go is a game where improvement can take years.

thouis · January 31, 2021, 4:11pm

Maybe. I think much of the evaluation of player strength would be based on the less novel moves. AI has introduced a few new key moves, but much of the game still consists of choices outside of those new plays.

DVbS78rkR7NVe · January 31, 2021, 4:34pm

And the question came up in the first place from looking at my much older games, where I’m supposed to be almost the same rank. I can clearly see that I’ve learned a lot since then, but I’m at almost the same rank. So I wonder, was this really all for nothing? Did all my learning and studying not change anything?

So yeah, the question is whether it’s something that can be answered, controlled for.

DVbS78rkR7NVe · January 31, 2021, 7:43pm

Come to think of it, maybe an even bigger factor is the changing population size. In recent years the OGS population has become much bigger. The number of games played is two or three times what we had before. That might also have an effect on how ratings are spread out.

shinuito · February 2, 2021, 3:06pm

I think also just the availability of resources now should make it easier to get stronger.

I know not everyone likes watching go videos and lectures, reading go books, reviewing with ai, but there’s a lot more options out there now and very easily available, and not too expensive in some instances.

One could just rely on playing enough games, and possibly reviews against stronger players of course, but there’s nothing to say that the strong players or opponents won’t have used the newly available material and want to try it out, or use it to explain things in reviews.

That could be a small factor in slowing progress in some cases – we know we’re improving, but maybe so are the people who were also around our level.

DVbS78rkR7NVe · August 14, 2021, 1:45am

I found a comment of mine from like 2017. I said there that I’m 6k and with recalculation it’s 1k.

Uberdude · August 14, 2021, 6:50am

I’d say there’s rank deflation because there’s 10 kyus and even SDKs who don’t know how to score a finished game. Back in my day that’s something you learnt around 30-25 kyu.

gennan · August 14, 2021, 10:13am

Wouldn’t that be rank inflation? The rank corresponds to a lower intrinsic value (=skill) than usual.

I feel that after the last rating system update, OGS low dan ranks are somewhere between AGA and EGF low dan ranks, which was one of the goals of the update.

But I think OGS kyu ranks weaker than about 5k got inflated, and more so for weaker ranks. 10k OGS may be about 14k EGF. 20k OGS may be about 30k EGF.