I vaguely remember that in 2022, doge_bot_1 used to vary between 2D and 4D, kata_noob was about 1k/1d, and doge_bot_2 and noob_bot were 2k. Even allowing for my own improvement, beating a 1D seems way easier now. I suspect that the ranking system is getting harsher, but I’m not quite sure. Does anybody feel the same?
Are these bots ranked lower today? Did they improve in real strength?
I just checked the ranks.
doge_bot_1 1d
doge_bot_2 4k
kata_noob 2k
noob_bot 2k
I don’t feel any difference in their strength.
There are a number of reasons why their rank might drop while their strength stays the same.
One effect is that new players often play bots a lot. For example, a new player who is around 1d or so and wants to rank up quickly might beat up the bots to jump up the ranks.
Every so often there are people who cheat against the bots. Using an engine is one thing. Another is just finding an exploit: if a bot always plays the same way in an opening and can’t read a ladder, then it’ll lose, or you’ll gain a significant advantage, the same way every game.
Both kinds of things can heavily deflate the ranks of bots over time. They can end up a few ranks lower than they should be, for example.
That makes some sense. Your point about exploits inspired me. I’m starting to understand the whole picture.
One way to exploit bots is to play handicap games against them (if that’s allowed). Choose a bot that accepts handicap challenges, for example Echinops or doge_bot_1. If you examine its rating curve, you’ll find that almost all the massive dips are caused by losing several handicap games in a row.
But that’s not all. When a handicap game is played under certain conditions, the winner’s gain is less than the loser’s loss, regardless of who wins. This finding is based on my experiment, and I’m planning to write an article about it.
Then consider all the players as a whole. Each time such a handicap game is played, the sum of all players’ ratings decreases. The local deflation, which mainly affects the bots and the players who mostly play against bots, will spread through human games, eventually causing global deflation.
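To make the "sum decreases" idea concrete, here is a toy sketch of one way such an asymmetry could arise. It is not OGS's rating code. It swaps in a plain Elo-style update for Glicko, and it assumes that each player's opponent is shifted by the handicap in rank space before the expected score is computed. The rank/rating constants (525 and 23.15), `k=32`, and all function names are assumptions for illustration only.

```python
import math

# ASSUMPTION: an exponential rank <-> rating mapping with these constants.
def rank_to_rating(rank: float) -> float:
    return 525.0 * math.exp(rank / 23.15)

def rating_to_rank(rating: float) -> float:
    return math.log(rating / 525.0) * 23.15

def expected(own: float, opponent: float) -> float:
    """Elo-style expected score (a stand-in for the Glicko formula)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - own) / 400.0))

def handicap_game_deltas(black, white, handicap, black_wins, k=32.0):
    """Rating changes (black, white) for one handicap game in the toy model.

    ASSUMPTION: Black sees White lowered by `handicap` ranks, and White
    sees Black raised by `handicap` ranks. Because the mapping is
    exponential, the two shifts differ in size when measured in rating points.
    """
    white_seen_by_black = rank_to_rating(rating_to_rank(white) - handicap)
    black_seen_by_white = rank_to_rating(rating_to_rank(black) + handicap)
    e_black = expected(black, white_seen_by_black)
    e_white = expected(white, black_seen_by_white)
    s_black = 1.0 if black_wins else 0.0
    delta_black = k * (s_black - e_black)
    delta_white = k * ((1.0 - s_black) - e_white)
    return delta_black, delta_white

if __name__ == "__main__":
    for black_wins in (True, False):
        d_black, d_white = handicap_game_deltas(1500.0, 1900.0, 3, black_wins)
        print(black_wins, round(d_black, 2), round(d_white, 2),
              "net:", round(d_black + d_white, 2))
```

Under these assumptions, when the handicap is smaller than the rank gap, the two expected scores add up to slightly more than 1, so the net change is negative whoever wins. With handicap 0 the net change is zero. Whether the real system behaves this way would have to be checked against its actual code.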
Well, I got curious and checked Echinops falling to 2k (its lowest dip) about 3000 games ago, and it had nothing to do with handicap. Furthermore, while going through the history I didn’t notice an abundance of handicap games, and the ones I saw were not predominantly losses for the bot. One interesting thing is the shape of the curve, with “shark teeth”. Anyway, all this should first be backed by some statistics, not simply a feeling.
What would those conditions be?
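One way to put numbers on this instead of eyeballing the curve is to split a bot's rating changes into even games and handicap games. The sketch below assumes a hypothetical `GameRecord` with the handicap and the rating before and after each game. How you would actually export that history from OGS is not covered here.

```python
from dataclasses import dataclass

@dataclass
class GameRecord:
    # Hypothetical record layout, not an OGS API type.
    handicap: int
    rating_before: float
    rating_after: float

def rating_change_by_handicap(games):
    """Return (totals, counts) of rating change, split into even vs handicap games."""
    totals = {"even": 0.0, "handicap": 0.0}
    counts = {"even": 0, "handicap": 0}
    for game in games:
        key = "handicap" if game.handicap > 0 else "even"
        totals[key] += game.rating_after - game.rating_before
        counts[key] += 1
    return totals, counts
```

If handicap games really drive the dips, the `"handicap"` total should be strongly negative relative to how few of those games there are.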
@jlt mentioned something about rank deflation from handicap games before.
I think in that example the person playing with the handicap might lose ranking in the long run if the win probabilities were 50-50.
This was also the OP’s previous thread.
But I think they aren’t, OGS is not using Proper Handicap.
@graphite_he, did you assume a 50-50 probability?
Yeah, but I think @jlt was arguing in that example that Black’s win rate would have to be something like 80% for the gains and losses to balance out perfectly over time.
Edit:
I didn’t sit down to find out where the problem lay. Was it in Glicko, the implementation of Glicko, or my specific translation of the Python code into JavaScript for the calculator (not the rating code, just predicting the changes)?
It definitely fell out of sync with the GitHub repo though, since I think there were some tweaks to play around with the uncertainty over time, the probability-of-winning functions, and things.
We were discussing a bit of rating math there, for example what the probability of winning should be with Glicko.
No. I don’t know whether I’ve made it clear enough with the following picture. How you assume the probability doesn’t matter.
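For reference, here is a small sketch of the two pieces of math that keep coming up: the expected score from the original Glicko paper, and the win rate at which asymmetric gains and losses would cancel out. It is only a sketch. OGS's own code may use a different variant, and the function names are made up for this example.

```python
import math

Q = math.log(10) / 400.0  # Glicko scaling constant

def g(rd: float) -> float:
    """Glicko attenuation factor for an opponent with rating deviation `rd`."""
    return 1.0 / math.sqrt(1.0 + 3.0 * (Q * rd) ** 2 / math.pi ** 2)

def expected_score(rating: float, opp_rating: float, opp_rd: float) -> float:
    """Expected score against an opponent, per the original Glicko formula."""
    exponent = -g(opp_rd) * (rating - opp_rating) / 400.0
    return 1.0 / (1.0 + 10.0 ** exponent)

def break_even_win_rate(gain_on_win: float, loss_on_loss: float) -> float:
    """Win rate p at which p * gain equals (1 - p) * loss."""
    return loss_on_loss / (gain_on_win + loss_on_loss)
```

As a purely illustrative case, a player who gains 5 points per win but drops 20 per loss would need `break_even_win_rate(5, 20)`, i.e. an 80% win rate, just to stay level. Those numbers are hypothetical, not taken from the earlier thread.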
I guess you should use < instead of <=, but I get the point.
Anyway, the vast majority of games on OGS are played without handicap, so I don’t think handicap games create a lot of drift, if any. Many other factors may influence the rating system, like
- Entry rank chosen by new players
- Sandbagging and other forms of cheating.
The rating system of OGS was designed to be intermediate between EGF and AGA. A survey was made last February; unfortunately it didn’t receive many answers, but it showed no evidence that the ranking system had become harsher.
Note that any attempts - whether “just experimenting” or not - to exploit bots in ranked games are viewed Very Dimly, and are a likely fast track to account suspension, for the obvious reasons.
I’ve also felt like ratings have been getting tougher recently T__T
Maybe it’s one of those “if you’re not moving forward, you’re moving backward” type of things?
I haven’t done tsumegos or studied new openings for quite some time, but because some other people do study and are actively getting stronger, my rating is slowly drifting down as the overall skill level of the player pool around me improves.
Or I don’t know, maybe the modern super-strong bots are draining the Glicko points from the rest of us?
Even if I don’t play against those bots myself, I still play with people who do play and lose against some version of KataGo, or who are themselves affected by the general drainage. Whenever someone loses rating points to a bot, their next opponent will then either “lose more” or “gain less” points than they should, depending on the outcome. Eventually that has to trickle down and affect the rest of the player pool too, right?
It might also be a combination of those things, and probably some other reasons too, that makes the ranks feel harsher. Or maybe we’re just imagining everything, and I’m actually the one who is getting weaker from forgetting stuff I had once learned… Who knows xD
Rank of bots over time is one thing to look at, another thing would be to look at mean point loss per move of players of a certain rank over time (players that were 10k a year ago lost x points per move, players that are 10k now lose y points per move). Of course that’s just an indicator for a drift happening, not a potential proof (it could also mean the players’ strength didn’t change, just how their style is perceived by AI).
I think OGS has some data to do that analysis, but I don’t know how to get it.
Of course we could pick some random games and analyze them with KataGo, but that feels like a waste of resources, given that OGS already does a quick analysis of every game.
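If that per-game analysis data were available, the comparison could look roughly like the sketch below. The input format is an assumption: each record is a hypothetical `(year, rank_label, points_lost_per_move)` tuple, one per player per game. Nothing here reflects a real OGS export.

```python
from collections import defaultdict
from statistics import mean

def mean_loss_by_rank_and_year(records):
    """Average points lost per move, keyed by (rank_label, year).

    `records` is an iterable of (year, rank_label, points_lost_per_move)
    tuples; this layout is hypothetical.
    """
    buckets = defaultdict(list)
    for year, rank_label, loss_per_move in records:
        buckets[(rank_label, year)].append(loss_per_move)
    return {key: mean(values) for key, values in buckets.items()}

def drift_indicator(summary, rank_label, year_a, year_b):
    """Change in mean loss per move at one rank between two years."""
    return summary[(rank_label, year_b)] - summary[(rank_label, year_a)]
```

A negative `drift_indicator` would mean players at that rank lose fewer points per move in the later year, which hints that the rank got harder to hold. As said above, it is only an indicator, since a change in playing style could produce the same shift.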
My rank dropped from 2D to 1D on foxy in May 2023. Meanwhile, I was OGS 1.1k. Now I’m 4D on foxy but 0.9k on OGS. I’ve done about 1000 tsumegos this year.
It looks like you’re mostly playing bots. If the bots are under-ranked for the various reasons mentioned, then it makes sense that the rank you get is lower too, because your opponents are rated lower than they should be.
If you want your OGS rank to go up a bit, play some more ranked games against humans.
I think people have said in various places that a 1d on OGS, EGF, or AGA could be something like 3-4 dan on Fox.
One of the reasons I started the nixbot family of bots like gnugo-nixbot and fuego-nixbot was to be able to detect drift in the OGS ranking system over the years. These bots use precisely-defined versions of engine software so their “real” strength - if that means anything - should be mostly fixed over time.
For example, even if I stop running them, you can easily copy them and make your own bots that run the exact same (or different, if you want) software.
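A rough way to turn a fixed-strength bot's rating history into a drift estimate is to fit a straight line through it. The sketch below is an ordinary least-squares slope over a list of ratings in game order. The function name and the input format are assumptions, and the nixbot data themselves are not included here.

```python
def rating_trend_per_game(ratings):
    """Least-squares slope of rating against game index (rating points per game)."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate a trend")
    mean_x = (n - 1) / 2.0
    mean_y = sum(ratings) / n
    covariance = sum((i - mean_x) * (r - mean_y) for i, r in enumerate(ratings))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance
```

If the engine version never changes, a slope that stays clearly below zero over thousands of games would point to the opponents or the system shifting rather than the bot. Exploits and shifts in who plays the bot could produce the same signal, so it would still need a closer look.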
Does anyone have a way to graph ranks beyond the 5,000 game history limit?