Rank Deflation

Riddle me this. New players who come to OGS start with a high deviation, shown as “?”. Ranked players who play against a “?” see almost no effect on their own rating, and the “?” players quickly converge on their rank within this population. The ranks aren’t anchored to anything: if the whole population improves by 1 stone, the ranks stay the same while getting deflated/harder. And active players improve several stones during their life cycle: come to the server -> get a stable rank -> improve -> leave, never to be seen again. Wouldn’t that deflate ranks? Why doesn’t it make sense?

Because the general tendency to improve at the game is a source of deflation, however small. Where is the source of inflation that stabilizes it?

We just did that, by revamping the ranking system :wink:

But yeah, usually with Elo-type systems, Elo is not stable over time and tends to shift when generations come and go. That’s why you should only look at the difference in rating between contemporary players.
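To illustrate why only the difference matters, here is a minimal sketch using the textbook Elo expected-score formula. It is not OGS's actual Glicko-2 code, and the ratings and the 200-point drift are made-up numbers.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical pair of contemporaries, 100 points apart.
today = expected_score(1600, 1500)

# Same pair after the whole pool has drifted down by 200 points (assumed drift).
after_drift = expected_score(1600 - 200, 1500 - 200)

# Both values are identical: the prediction only sees the 100-point gap,
# so the absolute numbers are free to shift over time.
print(today, after_drift)
```

Nothing in the formula pins the scale itself, which is why comparing numbers across generations is shaky.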

2 Likes

That is a fair question, and it was a concern with the first Elo formulation, but I haven’t seen a Glicko-2 formulation with a single entry point (1500) that was deflated (OK, OGS, but that’s because of the rating-to-rank conversion). Lichess, the primary example, is considered a very inflated server in chess, where 1500 is actually quite easy to reach, especially in comparison to over-the-board ratings, where there is a history of complaining about inflationary mechanics (such as never being able to fall below a certain floor once you’ve achieved a certain rating). But keeping an eye on that and readjusting what the ranks are is definitely worth something.

Considering players can track their progress on OGS over many years, I wonder if ranks deflate in any significant way.

1 Like

Won’t any online server have essentially inflated ranks for some large fraction of active users compared to over-the-board ratings? E.g., if I want to change my over-the-board EGF rating I would normally have to pay a fee of maybe 5-10€, play about 5 games, possibly cover travel and overnight expenses, etc. To update my rating on an online server I just click to play a new game. So if I’ve improved, I’ll definitely get to a higher rating online before I get it over the board.

2 Likes

I mean, two successive rating updates (almost) erased the fact that I first started playing Go on OGS and was at 25kyu initially. After one update it looked like I more or less started and stayed at 15kyu for a decent number of my games. Now it looks like I started and stayed at about 9kyu forever.

Of course the chat still remembers.

Maybe I should try and grab my rank data from the chats and see if I can reconstruct my rating/rank over time and plot the discontinuity from the rating system updates.

Another factor to reduce these types of effects is that Elo-like systems are not purely zero-sum. After a game between a provisional player and a player with a well-established rating, typically the provisional player gets a much larger adjustment while the established player gets a smaller than normal adjustment. This helps to figure out the new player’s rating quicker (by taking larger adjustment steps at first) while not impacting the ratings of established players as much.
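A rough sketch of that asymmetry, using plain Elo with two different K-factors as a stand-in. Glicko-2 gets a similar effect through each player's rating deviation rather than a fixed K, and the K values here are assumptions chosen for illustration, not anything OGS uses.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Textbook Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a: float, r_b: float, score_a: float, k_a: float, k_b: float):
    """Update both ratings after one game; each player has their own step size K."""
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k_a * (score_a - e_a)
    new_b = r_b + k_b * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


K_PROVISIONAL = 40.0  # assumed: large steps while the rating is uncertain
K_ESTABLISHED = 10.0  # assumed: small steps for a settled rating

# A provisional player beats an established player of equal rating.
newcomer, veteran = update(1500.0, 1500.0, 1.0, K_PROVISIONAL, K_ESTABLISHED)

# The newcomer gains more than the veteran loses, so the pool's total changes.
pool_change = (newcomer + veteran) - 3000.0
```

With these numbers the newcomer goes to 1520 and the veteran to 1495, so 15 points enter the pool. If the established player had won instead, 15 points would have left it, which is the sense in which the system is not zero-sum.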

2 Likes

I quickly checked some of the games in your game history. I’ll estimate your level in real life club ranks as I am familiar with in the Netherlands. These may be a bit softer than current EGF ranks and a bit tougher than AGA ranks.

From your very first games here, I would say you were not a complete novice when you started on OGS. Did you not have any practice or tuition before you started on OGS?

Around April 2017, I would estimate that your level was about 15k. Your (updated) OGS rating shows 9k at that time. Indeed, that feels a bit too optimistic to me. But your win rate was quite high throughout 2017 (about 75%), so that might explain why the OGS rating system gave you an inflated rating.

By April 2019, I think you were about 10k. Your (updated) OGS rating shows 7k at that time. So perhaps still a bit optimistic, but AGA 7k may be quite a good estimate.

By late 2019, I think you were about 6-5k. Your updated OGS rating shows 4k at the time. Still slightly optimistic perhaps, but 4k AGA may be quite a good estimate.

I also notice that you win about 10% of your ranked games by your opponent timing out, while you seem to time out much less. It makes me wonder how much OGS ratings are affected by timeouts. Timing out is quite rare in real-life tournaments, so it’s not much of an issue there.

2 Likes

Yeah, it’s not a zero-sum game. Especially since OGS is probably the most beginner-friendly server, it is quite typical for new users to gain strength and “steal” rating points from the dinosaurs who stopped improving years ago. You know the saying “If you don’t move forward, sooner or later you begin to move backward”?

The slow rank deflation is kinda inevitable, so I guess that’s why OGS has had rank adjustments and is going to have more of them in the future too, every few years.

I feel you, it’s a real shame that the old rank data is gone :<
I also learned to play Go on OGS and was stuck at 30k for months, but according to the graph I have never been lower than 13.6k ^___^

2 Likes

I think by November 2017 I was 15kyu, because I used that to sign up to a Go tournament. I’m not sure I had been to a club yet at that point, because I signed up to the EGF with a non-existent club, which has been attached to me ever since. In April the chat is putting me down as about 19kyu.

I get the feeling, though, that when the retroactive calculations are applied, they probably presume that you start at whatever the current default rating is, ~1150 or whichever. Now if I was really 25kyu at that time and played against other 25kyus in the old system, it probably just makes both players fluctuate around ~10kyu or so. If my win rate was a bit higher, it pushes me beyond that to 9kyu, etc.

I don’t think the retroactive calculations made much sense, especially going all the way back. Realistically we should’ve retroactively calculated back only to a point, like before a bug was introduced, or only as far back as a different ratings update. Of course that wasn’t an option this time.

Anyway though

These are pretty good estimates though.

I haven’t actually reached beyond 7 kyu yet in EGF :stuck_out_tongue: I’m hoping to be about 6kyu if I register to some more rated tournaments, maybe this upcoming BGA one.

In recent games I’ve just been cleaning out the 19x19 ladder of inactive people. I’m in the middle and I just challenge pretty much every person ahead of me. Those ones are just getting annulled.

I probably timeout less in correspondence, but much more in Blitz :slight_smile:

2 Likes

IKR, sooooooooooooooo unfair :frowning_face:

I wonder if one could apply some of the recent research in AI evaluation of human games to consistently estimate player strength against a fixed ruler. One could bend some of the recent research (like this paper) to the problem, pretty easily.

Even with a very noisy estimation, evaluating a few hundred games of active players would give a very tight estimate of how much rank is shifting over time, and allow an easy adjustment to be calculated to make them equivalent. Maybe I’ll hack something together this week.
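A back-of-the-envelope sketch of why a few hundred games would be enough, assuming the per-game estimation errors are independent and unbiased. The noise level of 3 ranks per game is a made-up figure, not a measured one.

```python
import math


def standard_error(per_game_sd: float, n_games: int) -> float:
    """Standard error of the mean of n independent noisy per-game estimates."""
    return per_game_sd / math.sqrt(n_games)


# Hypothetical: each single-game strength estimate is off by about 3 ranks (1 sd).
PER_GAME_SD = 3.0

for n in (1, 25, 100, 400):
    print(n, "games -> +/-", standard_error(PER_GAME_SD, n), "ranks")
```

Under those assumptions, 400 games bring the uncertainty of the average down to about 0.15 ranks. Any systematic bias in the AI-based estimate would not average out this way, so it would need to be at least constant across years for the comparison to hold.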

2 Likes

I’m not sure that would work well: it measures how well players would match against the AI, but since over time there have been a lot of new strategies and new joseki discovered, players from different time periods may end up being mismatched not because of lack of skill, but because of unfamiliarity with each other’s style.

For example, any player inadequately handling the 3-3 invasion would be deemed weaker, while it was simply the case that we didn’t know the adequate way of handling the 3-3 invasion before AlphaGo came along (we still might not, in fact). It doesn’t have to do with strength, per se, but rather with a lack of knowledge.

2 Likes

That’s the core of my question. Of course, current ranks can be fixed with an update, but if we recalculate them into the past, how can I trust past ranks? OGS is lucky to have players who have played here for 5 years, for 10 years. When I see their graph over many, many years, how much can I trust that it makes any sense?

If I see a player who’s plateaued for years, have they really plateaued, or are they improving in step with deflation? If a player is getting weaker over the years, are they getting weaker due to age or something, or is their rank deflating?

It’s a worthwhile question to consider when go is a game where improvement can take years.

3 Likes

I played a couple of games against a friend who showed me the game, at about the same time as the AlphaGo games with Lee Sedol. Then I signed up to OGS, which apparently was in about April of 2016. (I have a record of telling someone I went through the Interactive Way to Go and then signed up to OGS and played a few 9x9 games.)

It’s possible I didn’t use an email when I signed up and then got a new account when I did sign up with email. I don’t quite remember what exactly happened, though.

Maybe. I think much of the evaluation of player strength would be based on the less novel moves. AI has introduced a few new key moves, but much of the game still consists of choices outside of those new plays.

And in the first place, the question came up from looking at my much older games, where I’m supposed to be almost the same rank. I can clearly see that I’ve learned a lot since then, but I’m almost at the same rank. So I wonder, was this really all for nothing? Did all my learning and studying not change anything?

So yeah, the question is whether it’s something that can be answered, controlled for.

4 Likes

Come to think of it, maybe an even bigger factor is the changing population size. In recent years the OGS population has become much bigger. The number of games played is two or three times what we had before. That might also have an effect on how ratings are spread out.

I think also just the availability of resources now should make it easier to get stronger.

I know not everyone likes watching Go videos and lectures, reading Go books, or reviewing with AI, but there are a lot more options out there now, very easily available, and not too expensive in some instances.

One could just rely on playing enough games, and possibly reviews against stronger players of course, but there’s nothing to say that the strong players or opponents won’t have used the newly available material and want to try it out, or use it to explain things in reviews :slight_smile:

That could be a small factor in slowing progress in some cases – we know we’re improving but maybe so are the people that were also around our level :slight_smile: