Is the OGS ranking system getting harsher?

graphite_he · November 16, 2024, 10:59am

I vaguely remember that in 2022, doge_bot_1 used to vary between 2D and 4D, kata_noob was about 1k/1d, and doge_bot_2 and noob_bot were 2k. Although my skill has improved, beating a 1D seems way easier. I suspect that the ranking system is getting harsher, but I’m not quite sure. Does anybody feel the same?

Groin · November 16, 2024, 11:53am

Are these bots ranked lower today? Did they improve in real strength?

graphite_he · November 16, 2024, 12:05pm

I just checked the ranks.
doge_bot_1 1d
doge_bot_2 4k
kata_noob 2k
noob_bot 2k
I don’t feel any difference in their strength.

shinuito · November 16, 2024, 12:17pm

There are a number of reasons why their rank might drop while their strength stays the same.

One effect is that new players often play bots a lot. For example, a new player who is around 1d or so and wants to rank up quickly might beat up the bots to jump up the ranks.

Every so often there are people who cheat against the bots. Using an engine is one thing. Another is finding an exploit: if a bot always plays the same way in an opening and can’t read a ladder, then it’ll lose, or you’ll gain a significant advantage, the same way every game.

Both kinds of things can heavily deflate the ranks of bots over time. They can end up a few ranks lower than they should be, for example.

graphite_he · November 16, 2024, 2:16pm

That makes some sense. Your point about exploits inspired me. I’m starting to understand the whole picture.

One way to exploit bots is to play handicapped games against them (if that’s allowed). Choose a bot that accepts handicapped challenges, for example Echinops or doge_bot_1. If you examine its rating curve, you’ll find that almost all the massive dips are caused by losing several handicapped games in a row.

But that’s not all. When a handicapped game is played under certain conditions, the winner’s gain is less than the loser’s loss, regardless of who wins. This finding is based on my experiment, and I’m planning to write an article about it.

Then consider all the players as a whole. Each time such a handicapped game is played, the sum of all players’ ratings decreases. The local deflation, which mainly affects the bots and the players who mostly play against bots, will spread through human games, eventually causing global deflation.
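To make the "winner gains less than the loser loses" idea concrete, here is a toy sketch in Python. It is not OGS's rating code. It assumes a simple Elo-style update with a fixed step `K` (OGS uses Glicko-2), a hypothetical exponential rank-to-rating curve with illustrative constants `A` and `C`, and that each player is rated against an opponent whose rank has been shifted by the handicap. Ranks are plain numbers where higher means stronger. Under those assumptions the two expected scores no longer sum to 1 when the handicap is smaller than the rank gap, so the pair loses rating points in total whoever wins.

```python
import math

# Illustrative constants for a hypothetical exponential rank-to-rating curve.
A, C = 525.0, 23.15
K = 32.0  # fixed Elo-style step size (an assumption; Glicko-2 works differently)


def rank_to_rating(rank: float) -> float:
    """Map a numeric rank (higher = stronger) to a rating on the assumed curve."""
    return A * math.exp(rank / C)


def expected(rating: float, opponent_rating: float) -> float:
    """Standard logistic (Elo) expected score."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rating_changes(black_rank, white_rank, handicap, black_wins):
    """Return (black_change, white_change) for one game in the toy model."""
    r_b = rank_to_rating(black_rank)
    r_w = rank_to_rating(white_rank)
    # Assumption: each side is rated against a handicap-shifted opponent.
    e_b = expected(r_b, rank_to_rating(white_rank - handicap))
    e_w = expected(r_w, rank_to_rating(black_rank + handicap))
    s_b = 1.0 if black_wins else 0.0
    return K * (s_b - e_b), K * ((1.0 - s_b) - e_w)


if __name__ == "__main__":
    # Five ranks apart, but only three handicap stones.
    for black_wins in (True, False):
        d_b, d_w = rating_changes(25, 30, 3, black_wins)
        print(f"black wins: {black_wins}  black {d_b:+.2f}  white {d_w:+.2f}  net {d_b + d_w:+.2f}")
```

In this toy model the net change is zero for even games and for a handicap that exactly matches the rank gap, because the two shifted comparisons then mirror each other. Whether the real system behaves this way depends on details this sketch does not model.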

Groin · November 16, 2024, 2:46pm

Well, I got curious and checked Echinops falling to 2k (its lowest dip) around 3000 games ago, and it had nothing to do with handicap. Furthermore, while going through the history I didn’t notice an abundance of handicap games, and the ones I saw were not mostly losses for the bot. One interesting thing is the shape of the curve, with “shark teeth” (a sawtooth pattern). Anyway, all this should first be backed by some statistics, not simply a feeling.

Jon_Ko · November 16, 2024, 2:55pm

What would those circumstances be?

shinuito · November 16, 2024, 3:14pm

@jlt mentioned something about rank deflation before from handicap games

I think in that example the person playing with the handicap might lose ranking in the long run if the win probabilities were 50-50.

This was also the OP’s previous thread.

graphite_he · November 16, 2024, 4:10pm

Most circumstances.

Jon_Ko · November 16, 2024, 6:07pm

But I think they aren’t, OGS is not using Proper Handicap.

@graphite_he, did you assume a 50-50 probability?

shinuito · November 16, 2024, 6:09pm

Yeah, but I think @jlt was arguing in that example that Black’s win rate would have to be like 80% or something for it to perfectly balance out the gains/losses over time.

Edit:

I didn’t sit down to find out where the problem lies. Was it in Glicko, in the implementation of Glicko, or in my specific translation of the Python code into JavaScript for the calculator (not the rating code, just predicting the changes)?

It definitely fell out of sync with the GitHub repo though, since I think there were some tweaks to play around with the uncertainty over time, the probability-of-winning functions, and things.

We were discussing some bit of rating math there. For example what the probability of winning should be with glicko.
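For reference, a minimal sketch of the expected-outcome formula from the original Glicko system, which is one candidate for "the probability of winning". It is the textbook formula, not necessarily what OGS (which uses Glicko-2 with its own tweaks) or the calculator computes. The function names are my own.

```python
import math

Q = math.log(10) / 400.0  # Glicko scaling constant


def g(rd: float) -> float:
    """Glicko attenuation factor: shrinks the rating gap as uncertainty grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)


def win_probability(r1: float, rd1: float, r2: float, rd2: float) -> float:
    """Expected score of player 1 against player 2 in the original Glicko system.

    Uses the combined deviation sqrt(rd1^2 + rd2^2), so that more uncertainty
    on either side pulls the prediction toward 50%.
    """
    combined_rd = math.sqrt(rd1 ** 2 + rd2 ** 2)
    return 1.0 / (1.0 + 10 ** (-g(combined_rd) * (r1 - r2) / 400.0))


if __name__ == "__main__":
    print(win_probability(1700, 60, 1500, 60))    # confident ratings
    print(win_probability(1700, 300, 1500, 300))  # same gap, much less certain
```

With small deviations this reduces to the usual Elo curve. With large deviations the same 200-point gap gives a prediction closer to even.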

graphite_he · November 16, 2024, 6:45pm

No. I don’t know whether the following picture makes it clear enough. How you assume the probability doesn’t matter.

Jon_Ko · November 16, 2024, 7:22pm

I guess you should use < instead of <=, but I get the point.

jlt · November 16, 2024, 8:10pm

Anyway, the vast majority of games on OGS are without handicap, so I don’t think handicap games create a lot of drift, if any. Many other factors may influence the rating system, like

Entry rank chosen by new players
Sandbagging and other forms of cheating

The rating system of OGS was designed to be intermediate between EGF and AGA. A survey was made last February; unfortunately it didn’t receive many answers, but it showed no evidence that the ranking system had become harsher.

Eugene · November 16, 2024, 10:48pm

Note that any attempts - whether “just experimenting” or not - to exploit bots in ranked games are viewed Very Dimly, and are a likely fast track to account suspension, for the obvious reasons.

_KoBa · November 20, 2024, 9:55am

I’ve also felt like ratings have been getting tougher recently T__T

Maybe it’s one of those “if you’re not moving forward, you’re moving backward” type of things?

I haven’t done tsumegos or studied new openings for quite some time, but because some other people do study and are actively getting stronger, my rating is slowly drifting down as the overall level of skill of the player pool around me improves.

Or I don’t know, maybe the modern super-strong bots are draining the Glicko points from the rest of us?

Even if I don’t play against those bots myself, I still play with people who do play and lose against some version of KataGo, or who are also affected by the general drainage. Whenever someone loses rating points to a bot, their next opponent will then either “lose more” or “gain less” points than they should, depending on the outcome. Eventually that has to trickle down and affect the rest of the player pool too, right?

Might also be a combination of those things, and probably some other reasons too for the ranks feeling harsher. Or maybe we’re just imagining everything, and I’m actually the one who is getting weaker due to forgetting stuff I had once learned… Who knows xD

Jon_Ko · November 20, 2024, 11:59am

Rank of bots over time is one thing to look at, another thing would be to look at mean point loss per move of players of a certain rank over time (players that were 10k a year ago lost x points per move, players that are 10k now lose y points per move). Of course that’s just an indicator for a drift happening, not a potential proof (it could also mean the players’ strength didn’t change, just how their style is perceived by AI).

I think OGS has some data to do that analysis, but I don’t know how to get it.

Of course we could pick some random games and analyze them with KataGo, but that feels like a waste of resources, given that OGS already does a quick analysis of every game.
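A rough sketch of what that comparison could look like in Python, assuming you somehow had per-move score estimates for each game. The input format here is hypothetical (a list of estimated score leads for Black, one entry before the first move and one after every move, with Black moving first and no handicap); I don’t know what OGS actually stores or exposes.

```python
from statistics import mean


def mean_point_loss(score_leads, player="black"):
    """Mean estimated points lost per move by one player in a single game.

    score_leads[i] is the estimated score lead for Black after i moves
    (hypothetical format). Black is assumed to play moves 1, 3, 5, ...
    """
    losses = []
    for i in range(len(score_leads) - 1):
        mover = "black" if i % 2 == 0 else "white"
        if mover != player:
            continue
        change = score_leads[i + 1] - score_leads[i]  # from Black's point of view
        loss = -change if player == "black" else change
        # Design choice: treat apparent "gains" as zero loss, since they are
        # usually analysis noise rather than better-than-perfect moves.
        losses.append(max(0.0, loss))
    return mean(losses) if losses else 0.0


def cohort_point_loss(games, player="black"):
    """Average of the per-game means over a group of games."""
    return mean(mean_point_loss(g, player) for g in games)


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the comparison.
    games_last_year = [[0.0, -1.0, 1.0, 0.5, 0.5]]
    games_this_year = [[0.0, -0.5, 0.0, -0.5, 0.0]]
    print(cohort_point_loss(games_last_year), cohort_point_loss(games_this_year))
```

If the figure for, say, 10k players drops from one year to the next, that would hint that 10k has become harder to hold, with the caveats about style mentioned above.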

graphite_he · November 22, 2024, 1:33am

My rank dropped from 2D to 1D on foxy in May 2023. Meanwhile, I was 1.1k on OGS. Now I’m 4D on foxy but 0.9k on OGS. I’ve done about 1000 tsumegos this year.

shinuito · November 22, 2024, 2:36am

It looks like you’re mostly playing bots. If the bots are under-ranked for the various reasons mentioned, then it makes sense that the rank you get is lower too, because they are lower than they should be.

If you want your OGS rank to go up a bit, play some more ranked games against humans.

I think people have said in various places that a 1d on OGS, EGF or AGA could be like 3-4 dan on Fox.

Feijoa · November 22, 2024, 4:45am

One of the reasons I started the nixbot family of bots like gnugo-nixbot and fuego-nixbot was to be able to detect drift in the OGS ranking system over the years. These bots use precisely-defined versions of engine software so their “real” strength - if that means anything - should be mostly fixed over time.

For example, even if I stop running them, you can easily copy them and make your own bots that run the exact same (or different, if you want) software.

Does anyone have a way to graph ranks beyond the 5,000 game history limit?