2021 Rating and rank adjustments

The graph (reposted below) shows that the Elo width per rank of adjusted EGF and adjusted OGS ranks is very similar in the 20k-5k range (about 50 Elo per rank). So the Elo distance between 20k and 5k will be similar in EGF and OGS (resp. ~700 Elo and ~600 Elo).
Above 5k, the adjusted EGF ranks are a bit wider, so the Elo distance between adjusted EGF 5k and EGF 1d is bigger than the Elo distance between OGS 5k and OGS 1d. But again, the difference is not that big (resp. ~350 Elo and ~300 Elo).

image

Since the OGS adjustment, I have seen examples where it seems to have resulted in kyu ratings that are much higher than what will be the result of the EGF rating adjustment. But this is only anecdotal. I don’t know if the OGS rating adjustment really inflated OGS kyu ranks compared to EGF/KGS/AGA kyu ranks.

If the OGS adjustment really caused a kyu rank inflation, I’m guessing that the handicap analysis used for the adjustment might play a role. I did note that there is a difference in how OGS and EGF handle handicaps in rating calculations. OGS doesn’t correct for the ½ stone handicap deficit in traditional handicaps. See specifically my post 370 in this thread, where I gave an example for how this could affect the calculation of Elo widths of ranks (and thus how ranks are extrapolated downwards from the 1d anchor):

But I don’t know if this has anything to do with anything. I’m mostly guessing here. And the resulting Elo gap between adjusted OGS 1d and OGS 20k (~900 Elo) seems quite close to the same gap in the adjusted EGF system (~1050 Elo), so I cannot really explain how the adjusted OGS kyu ranks can be inflated by more than about 3 ranks at 20k, compared to adjusted EGF kyu ranks.
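To make the arithmetic behind that "about 3 ranks" explicit, here is a rough sketch using only the approximate figures quoted above. The function name and the flat 50 Elo per rank are simplifying assumptions for illustration, not how either system actually computes anything.

```python
# Approximate Elo gaps quoted in this post (all values are rough estimates).
ELO_PER_RANK = 50  # assumed flat width per rank, as quoted for the 20k-5k range

egf_gap_1d_to_20k = 700 + 350  # EGF: 20k-5k plus 5k-1d, about 1050 Elo
ogs_gap_1d_to_20k = 600 + 300  # OGS: 20k-5k plus 5k-1d, about 900 Elo


def rank_offset_at_20k(egf_gap, ogs_gap, elo_per_rank=ELO_PER_RANK):
    """Hypothetical helper: convert the difference in total Elo gap into ranks.

    Both systems are anchored at 1d, so any mismatch accumulates
    toward the bottom of the scale (20k).
    """
    return (egf_gap - ogs_gap) / elo_per_rank


print(rank_offset_at_20k(egf_gap_1d_to_20k, ogs_gap_1d_to_20k))  # 3.0
```

So with these numbers, an OGS 20k would sit at most about 3 ranks away from an EGF 20k, which is why a much larger inflation is hard to explain from the Elo widths alone.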

2 Likes

Why are you constantly assuming things that are not happening?

First, it is impossible for erroneous data to improve a computed measurement. Second, weren’t the split ranks Blitz/Normal/Correspondence? That is very different from using 9x9 statistics to build a 19x19 ranking.

Again, could you show us that the data is erroneous?

This whole discussion doesn’t make sense.

It feels to me like you’re begging the question: you’re unsatisfied with your rank, therefore the ranking system must be incorrect. Instead try looking at it from the other side: do you have any evidence other than your own dissatisfaction that you’re incorrectly ranked? Are you losing / winning more than you are supposed to against opponents of the same rank (assuming you play similar games as with the old system)?
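One way to check this yourself is to compare your actual results with what the ratings predict. The sketch below uses the classic Elo expected-score formula purely as an illustration; OGS uses its own rating system, so the formula, the function names, and the sample games are all assumptions, not the site's actual calculation.

```python
def expected_score(rating_a, rating_b):
    """Classic Elo expected score for player A against player B (illustrative only)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def compare_results(my_rating, games):
    """Hypothetical helper: games is a list of (opponent_rating, won) pairs.

    Returns (actual_wins, expected_wins). If the two numbers are far apart
    over many games, that is evidence the rating is off.
    """
    actual = sum(1 for _, won in games if won)
    expected = sum(expected_score(my_rating, opp) for opp, _ in games)
    return actual, expected


# Made-up example: four games against equally rated opponents.
sample_games = [(1500, True), (1500, False), (1500, True), (1500, False)]
print(compare_results(1500, sample_games))  # (2, 2.0)
```

A handful of games proves nothing either way; the comparison only becomes meaningful over a reasonably large sample.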

5 Likes

Using all board sizes slightly improves the accuracy (number of correctly predicted outcomes). At least it has no negative effect. This was tested with 12M ranked games (most of OGS game history).
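For readers wondering what "accuracy" means here, a minimal sketch: count how often the player with the higher rating before the game actually won. The data layout and function name below are hypothetical, and the real test may well differ in its details (for example in how handicap games are treated).

```python
def prediction_accuracy(games):
    """Fraction of games won by the player who was rated higher beforehand.

    games: list of (black_rating, white_rating, black_won) tuples.
    Games between equally rated players make no prediction and are skipped.
    """
    correct = 0
    counted = 0
    for black_rating, white_rating, black_won in games:
        if black_rating == white_rating:
            continue  # no favourite, so nothing to score
        predicted_black_win = black_rating > white_rating
        counted += 1
        if predicted_black_win == black_won:
            correct += 1
    return correct / counted if counted else 0.0


# Made-up sample, not real OGS data.
sample = [
    (1500, 1400, True),   # favourite (black) won: correct
    (1500, 1600, True),   # favourite (white) lost: wrong
    (1300, 1200, False),  # favourite (black) lost: wrong
    (1700, 1650, True),   # favourite (black) won: correct
]
print(prediction_accuracy(sample))  # 0.5
```

Running the same count once with ratings built from 19x19 games only and once with ratings built from all board sizes gives two numbers that can be compared directly.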

5 Likes

How can this be possible, if everyone’s rank is incorrect in the same way? The test of ranking is not what you suggested, it is to compare the ranks of OGS users with external rankings, where possible. I suggested this above, and it’s been completely ignored. The practicality of comparing ranks is questionable in my case, since I play only on 9x9 boards, and I’ve never been properly ranked, either for 9x9 or 19x19. But this isn’t a problem, since I’m only one person. Surely there exist a few mid-kyu players among the thousands here who have been properly ranked in some external environment on 9x9 boards! Find those players and compare their ranks with the new OGS ranks. Such a scientific measurement is the best way to end this endless debate. Until someone actually does this, I feel certain in my belief that 9x9 statistics are no good for generating 19x19 ratings.

But, that’s what happened: our 1d aligns with AGA & EGF 1d.

And for the rest, we apply the several ages old way of seeing difference in rank as number of handicap stones in 19x19 games.


9x9 statistics are not used to compute 19x19 ratings: 3 handicap in 9x9 does not mean 3 stones, but rather something like 1 stone with some reverse komi. These numbers (which is what OGS considers the “3 handicap stones” equivalent for 9x9) are used in the computation, as far as I understand what happened.

It’s even explicitly mentioned in the last section of the top post that the separate rank pools for 9x9 and 13x13 and the different timing settings are taken into consideration, and that effort has been put into making them align as best as possible with the overall rank.

And the goal is not to be the same as every other system: that’s impossible, every system is different. Should we align with AGA? or with EGF? or with Tygem? Or with KGS? There’s no good choice there.

The goal of the ranking system is to be a good predictor for match ups and additionally to be a helpful tool to establish how much handicap is needed for a fair game.

The current ranking system is designed to have the above two points as goal. As long as it succeeds in that, it is a successful ranking system.

But this “scientific measurement” has been done: not after the ranking system was changed, but before it was changed. The system has been changed exactly to align with these scientific measurements.

The one who doesn’t see this, and keeps this debate from ending, is you.

And for the last time, if you’re convinced that such scientific measurement is going to show that our ranks are incorrect, then do such measurement and show us the results yourself before you start claiming nonsense, please.

11 Likes

External rankings are not mutually consistent either. There is no such thing as an absolute 1 dan, 9 kyu, or 20 kyu player. If you took 1 dan players as ranked by China, Japan, AGA, EGF, KGS, OGS, etc., you’d find differences in strength across these groups.

See for example: Rank - worldwide comparison at Sensei's Library

The recent rating system update is meant to move OGS ranks to be closer to external rankings, however, it is still impossible to be fully consistent with all of them, since all of the various external ranking systems are not consistent with each other.

The best that any system can do is to be self-consistent, so a ranking like 9 kyu should not mean too much in an isolated absolute sense, but rather just signify that a player should be roughly the same strength as other players that are called “9 kyu” in that system.

7 Likes

Sure, you can say that rankings don’t matter at all, that “9 kyu” doesn’t have to mean anything at all with respect to playing games of go. But how does that help a ranking system actually do its job better? If I’m 9k at OGS, I should be about equal with 9k players everywhere, otherwise ranking is meaningless.

No, none of the ranking systems have this property, since what you’re describing is just not reality: the ranks at different places cannot be compared to one another directly.

You can only compare rank within one ranking system. An OGS 9k means something different from an AGA 9k, or a WBaduk 9k. The same was true with our old ranks. For example, according to the 2018 poll on ranks in different servers / associations, an (old-system) OGS 14k was equivalent to a 4k in Japan, or a 16k on WBaduk and a wide variety of things in between.
Not even the relative strength is consistent between servers / associations: the difference between an (old system) OGS 12k and an OGS 2k, is as much as 16 ranks on Tygem, yet as little as 5 ranks in Japan.

But a rank being “about equal everywhere”, that’s not what a ranking system is for, it’s not to create a global indicator of your strength, for every server / association.
That does not mean that it is meaningless, however. Instead, like I said, its job is to be a good predictor for match ups (within its scope, in this case OGS) and additionally to be a helpful tool to establish how much handicap is needed for a fair game.

That’s all a ranking system is supposed to do. And our ranking system is doing that job fine, as far as I know.

9 Likes

I think you’ve missed my point. Ranking is meaningless. At least, it does not have the absolute meaning that you expect.

In fact, let’s pretend that OGS does not exist, and just look at the rest of the world. Players that are 9 kyu on KGS are not the same strength as 9 kyus on IGS/Tygem/Foxy, and those are not the same strength as 9 kyus from the AGA, EGF, Japan, China, etc. Each ranking system is largely independent. Maybe some of them make occasional tweaks to be more consistent with another, but it’s impossible to get them all to agree. There’s no universal reference to say exactly how strong 9 kyu should be.

Instead, what ranking systems try to do is only measure the relative skill between players within a given population. If you have a group of players that are all roughly the same strength, it does not matter much if you call all of them 9 kyu or 9 dan or 29 kyu or 37 fnord, as long as the ranking system is putting all of them at roughly the same rank. To extract some absolute meaning from what a particular rank might mean within a system, that system would need to be calibrated according to some external reference. The recent changes in the OGS ranking system intend to (and probably do) move the OGS ranks closer to the AGA or EGF ranks.

7 Likes

I’m not sure I understand what you mean about this waiting? You only just learned about this change a couple of days ago?

2 Likes

It was both. Prior to the implementation of “overall rating” we rated each combination of blitz/live/corr and 9/13/19 boards, just as you see now on your profile page (only then those ratings were functional, and now they are only cosmetic).

Obviously has access to a Hyperbolic Time Chamber

2 Likes

Couple of days is generous, that post was published 25 hours ago…

Couldn’t some of you around 7k~11k rank play some 19x19 games with david265 using proper handicap with the current rank system and see how they fare? Even higher ranks like 4k or higher could play some high handicap games and test if they are reasonable (which I have some doubt about for higher handicaps above 5 or 6).

2 Likes

Well David seems to have no desire to actually play 19x19…

In that case, maybe just 9x9 and see if its rank is fairly adjusted. Although I am not sure about what automatic 9x9 handicap is currently set at.

Well, I’m 8k, happy to volunteer for a game or two even though I have no idea what the point would be. But even better, how about we have a tournament? Who’s up for it?

5 Likes

That’s what I really don’t understand about this whole discussion – when someone has no desire to play in that category anyway, why on earth are they complaining that their rank in that category might be incorrect? When it’s all just hypothetical?

5 Likes