Re: Rank Deflation - Bot Ranks

I took the most respectable OGS bot, GnuGo, and plotted its rank over time. Look how it goes down. Coincidence? I don’t think so.


I thought gnugo was turned off for having too many bugs?

The new rank system’s recalculation of the entire history makes the graph useless.
People who were DDK many years ago and are SDK now look like they were always SDK.


There should be some sort of coefficient, so that a bot that always plays at the same strength has a horizontal graph, not a decreasing one.
And people who improve over the years should have an increasing graph, not a horizontal one.


But it’s the other way around: retroactive recalculation is exactly what makes the graph useful. The recalculation wasn’t done by some magical separate method; it was done the same way we calculate ranks now. So by examining the past we can see the future, tu fui ego eris.

Bots are very convenient. Barring updates and changes, they should play the same level of go, and they play a whole lot of games, so their “rank distribution” should be quite accurate. Why is the line decreasing, then? Because ranks are deflating, and by looking at bots we could find the necessary adjustment for rank deflation. That’s the point of the orange line there.

In the original GnuGo graph the coefficient (the slope of the line) is -0.117 rating points per day. That is about -42.7 points per year, or roughly -213 points per 5 years.
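For anyone who wants to redo the arithmetic, here is a minimal Python sketch of fitting a straight line to a rating history and converting the slope from points per day to points per year. The data series is made up to match the slope quoted above, not real GnuGo history, and `fit_slope` and `per_year` are hypothetical helper names. It assumes a 365-day year.

```python
def fit_slope(days, ratings):
    """Ordinary least-squares slope of rating versus day (points per day)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ratings))
    return sxy / sxx


def per_year(slope_per_day, days_per_year=365):
    """Convert a per-day slope to a per-year slope (assumes 365 days)."""
    return slope_per_day * days_per_year


# Made-up, perfectly linear series (NOT real GnuGo data): one sample
# every 50 days, losing 0.117 points per day from an arbitrary start.
days = list(range(0, 1000, 50))
ratings = [1500.0 - 0.117 * d for d in days]

slope = fit_slope(days, ratings)   # points per day
yearly = per_year(slope)           # points per year
five_years = 5 * yearly            # points per 5 years

print(round(slope, 3), round(yearly, 1), round(five_years, 1))
```

On real data the points will scatter around the line, so the fitted slope depends on which period you include.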

Amybot beginner is more stable. If you throw away the two weird months at the beginning and in the middle, the slope is only -0.035.



OGS uses a system that works well for comparing two people who have both played recently.
But you can’t compare your rank now with your rank 4 years ago, just as you can’t compare ranks between separate servers.

The purpose of the graph is to compare present and past, which the current system is unable to do for anyone who improves slowly enough.


Nice graph!
This (most likely) shows the major weakness of the Glicko rating system (compared to Elo): It drifts over time.
If you use Glicko, my recommendation would be to actually implement a way to counteract the drifting. Players with constant strength (like bots) could be used for that.
I’m not aware of any game server that is actually implementing any such technology though. Unfortunately.
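To make the idea concrete, here is a rough Python sketch of the simplest possible anchor correction: measure how far the anchor bot has drifted from a fixed reference rating and shift everyone by that amount. This is only an illustration under strong assumptions. It is not how OGS or any other server works, it assumes the anchor's true strength really is constant, and it applies one uniform shift to all ratings. All names and numbers are invented.

```python
def anchor_offset(anchor_observed, anchor_reference):
    """Points to add to every rating so the anchor returns to its reference."""
    return anchor_reference - anchor_observed


def correct_ratings(ratings, anchor_name, anchor_reference):
    """Shift all ratings uniformly by the anchor's measured drift.

    ratings: dict mapping player name -> current rating.
    anchor_name: key of the constant-strength player (hypothetical bot).
    anchor_reference: the rating the anchor is defined to have.
    """
    offset = anchor_offset(ratings[anchor_name], anchor_reference)
    return {name: rating + offset for name, rating in ratings.items()}


# Invented example: the anchor bot is defined as 1400 but has drifted to 1350.
ratings = {"anchor_bot": 1350.0, "alice": 1620.0, "bob": 1180.0}
corrected = correct_ratings(ratings, "anchor_bot", 1400.0)

for name, rating in corrected.items():
    print(name, rating)
```

A single offset treats drift as the same at every level. With several anchors of different strengths you could instead fit a correction that varies with rating.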

Bots would also be a great means to “calibrate” ranks/ratings between different Go servers. Bots of different strengths playing on different servers would provide great data to do such calibration. Wouldn’t it be fantastic if one day DDK, SDK, 1 dan, … would mean the same thing (i.e. strength) all over the world?!


KGS had (has?) an anchor system.
All over the world, or all over the internet?


Interesting. Didn’t know that.
KGS anchors Nice!!!
I would make this high priority for OGS right away.

I would be aiming for: All over the galaxy! :slight_smile:
I really mean it.
I really dislike that Go ranks mean such different things.
If “one meter” meant different distances depending on who I’m talking to, it would be equally frustrating for me.


The problem is that we may not be talking about the same thing. Players may perform differently on the internet than IRL.
Besides, there is a reliability problem with internet games, which makes federations reluctant to integrate these ratings.
Note: this is a bit OT, and there are already topics discussing it.


Amybot adjusts its strength to keep its OGS rank stable, therefore we don’t see deflation there.


This means Amybot gets stronger over time. Cool. And weird somehow.
Would it be an option to keep the bot’s strength constant and adapt the ratings (of all players) instead? That would also keep Amybot’s rank stable.

I see, that makes sense. The other Amybot is also stable.


Even with bots, their strength isn’t necessarily constant. So anchors are hard to choose.

Maybe we could train an AI that would determine ranks. And just run it through history.

Surely GnuGo strength is constant. No AI, only programmed play that never changes, never learns, never improves, etc. Seems like the ideal anchor…

…that is, if you believe that 6k (GnuGo rank) today should be the same strength as 6k many years ago and 6k many years from now; and that rank deflation is not a natural and expected part of life.

For example, 100s of years ago, few people could read; now, most people can read. That is “rank deflation” of reading ability. “I can read” was once a sign of being well educated - not so any more. So, as the general standard of Go gradually improves over time as the best become better, maybe a 30k (absolute beginner) then is the same as a 30k now but a 9p (top pro) now is better than a 9p then. So there is some stretching out of the ranks (like some sort of cosmic expansion :laughing:) such that you have to be better now to be 6k than you had to be then. Hence GnuGo rank goes down over time.


I am not sure. Most of the progress is in the opening; old masters were tough in yose because they had time to work it out fully. With today’s time constraints, simplification is required, even in the middle game.

Interesting… :face_with_monocle: :smiley:

But that still supports the point that a particular rank is not necessarily the same strength as the same rank from disparate time periods.


The % of pro moves that are identical to neural bot moves was always increasing, even before AlphaGo.
Not only in fuseki, but in midgame and endgame too.
So the average strength of pros is increasing with time.


Even a bot that plays exactly the same may not have a constant real strength. The general Go playing population can improve. But, if the population just changes their playing style to something that GnuGo handles slightly more poorly, you could see a decrease in its rank even if the average playing strength of the population has not changed. Maybe this is caused by the change in playing style brought on by AI. It’s also reasonable to assume that people have gotten slightly better at Go on average.


Humanity started to play Go differently after AlphaGo.
The % of strong and % of weak players on OGS is different now than it was 4 years ago.

BUT

The graph is still useless. It doesn’t show an increase for people who improved over these 4 years. It’s possible to explain everything, but it’s better to actually change something.

As I said earlier in another topic, Fuego is weird. Look at that bump in strength.


And it’s not a one-time thing: it took several months to go up and then come back down.