2021 Rating and rank adjustments

What do you think we have been doing???

You really don’t need to read the whole thread. But you should try to understand ranking systems in general before making the types of claims you have been making. Read up on Elo and the dan system (you don’t even need to read about Glicko specifically). Figure out how you would anchor a system like either of those. Then ask yourself if any of the things you are asking make sense.

In this case, you are asking us to tie 9x9 to the AGA and EGF systems, both of which have very few ranked 9x9 games to go off of. Then you ask to tie it to what 9x9 players think they are.

But what is this exactly? What they think they are on the 19x19? Or on the 9x9? If someone only plays 9x9 while on OGS, can we assume they are worse at 19x19?

And for the love of God, we have this!!!

5 Likes

I believe there exist some players who inherently play randomly enough (even when completely sober) that their natural performance could be described as having a random number of beers as drawn by a die roll.

Perhaps the vast numbers of correspondence players diving into more live games for the WDC provides ample evidence to support this hypothesis.

4 Likes

I take personal offense with that comment!!!
:joy::joy::joy::joy:

[but trying hard to learn!!]

3 Likes

Holy shit!! I just visited your profile!!!
WTF?!! How did you do that?!

3 Likes
4 Likes

Thank you for being patient with me. I understand what you are saying. I was, indeed, hoping for a universal approximate agreement on what a given ranking “means” in terms of universal playing strength. I wish such a thing existed, but I understand that it does not.

By the way, the word is “monotonic”, not “monotonous”, unless you were making a nice pun.

1 Like

Thank you, I think I get it now. I was hoping that I could tell anyone interested my ranking, and they would have an idea of how strong a player I was.

Now I understand that ranking is only good for matching players at one venue.

7 Likes

Thank you, it is hard to make sense when one’s beliefs are wrong. I have always believed (possibly for more years than you have been alive) that one’s Go ranking is an indication of playing strength anywhere, at least on 19x19 boards. I now understand it’s all relative, that experts consider such a “strength” ranking as absolutely impossible to attain, for some reason that I may never understand.

3 Likes

Sorry, you are right!! That was a false friend with my native language :sweat_smile:

2 Likes

They made sense to me based on my background just playing Go games. Thanks to everyone’s patience I think I understand my mistakes.

5 Likes

If the OGS’s devs don’t mind I’ll try to make a layman’s summary of how rankings work for my own edification.
Please correct me where I’m wrong (which I’m sure I am).

First, we don’t care about kyu/dan. What we want is to give each player a number so that she/he can be matched with players with a similar number. I’ll call this the OGS number.

At registration every user is given the same number: 1150. Then players play among themselves. Their OGS number goes up/down if they win/lose against stronger/weaker players.

Reasonably:
If you win against a stronger player, your number goes up
If you win against a weaker player, your number goes up but not so much
If you lose against a weaker player, your number goes down
If you lose against a stronger player, your number goes down but not so much

This setup will naturally sort every player after a number of games, resulting in an “empirical ranking” for the players.
Notice that this “algorithm” simply sorts the players according to win/lose.
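
A minimal sketch of the four cases above, assuming a plain Elo-style update as a stand-in. The thread mentions Glicko as what OGS runs, and Glicko also tracks a rating deviation that is left out here. The function names, the step size `k=32` and the example numbers are illustrative assumptions, not OGS values.

```python
def expected_score(rating, opponent):
    """Elo-style probability that `rating` beats `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating, opponent, won, k=32.0):
    """Return the new rating after one game (k is an assumed step size)."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))


me = 1500.0
stronger, weaker = 1700.0, 1300.0

# Change in my number for each of the four cases in the list.
gain_vs_stronger = update(me, stronger, won=True) - me
gain_vs_weaker = update(me, weaker, won=True) - me
loss_vs_weaker = update(me, weaker, won=False) - me
loss_vs_stronger = update(me, stronger, won=False) - me

print(f"win vs stronger:  {gain_vs_stronger:+.1f}")
print(f"win vs weaker:    {gain_vs_weaker:+.1f}")
print(f"loss vs weaker:   {loss_vs_weaker:+.1f}")
print(f"loss vs stronger: {loss_vs_stronger:+.1f}")
```

The size of each step depends on how surprising the result was, which is why an upset win moves the number more than an expected one.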

Now, since there are different board sizes and time settings, each player actually has 9 OGS numbers instead of just one (one number for each combination of board size and time setting), so when you win/lose a 9x9 blitz game only one of the numbers goes up/down. When you win/lose a 19x19 correspondence game another number goes up/down, and so on.

One can make an overall number by taking the average of these 9 numbers (maybe weighted by the number of games in each type?)


Now, most players want to use the kyu/dan system because it’s cool and that’s what tradition wants us to use. So we need a way to map our distribution of OGS numbers to the 30k-9d scale. There are many many ways of doing this. For instance I can think of a couple of possible mappings:

  1. The strongest OGS player, with OGS number=X is 9d. The worst player, with OGS number=Y is 30k. If you want to know your ranking, take your number, compare it with the best(X) and worst(Y) numbers, and see where you are in the scale (this would be a linear model).
    Notice that we are arbitrarily assigning the best player the 9d ranking (what if the best OGS player is considered a 20k by the rest of the Go world??!!)
  2. Some players also play in some external tournament X, so go look up their ranking in that tournament and use those relations to classify the rest of the players. This would be similar to the previous strategy, but instead of fixing only the best and worst OGS numbers, you would have some other fixed numbers in between.
  3. Any other function that makes anyone happy.
  4. OP’s initial detailed explanation which statistically matches the low dan rankings with those of AGA/EGF and sets the mapping such that handicap stones in 19x19 are meaningful.

Notice that the mapping is somewhat arbitrary, and the only “measure of your strength” is the OGS number which compares you to other players in the server.
The mapping simply takes an OGS number and returns a kyu/dan number, regardless of the meaning of the OGS number inputted (9x9, 19x19,…).
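
To make the "mapping is arbitrary" point concrete, here is a rough sketch of option 1 only (the linear model). It is not what OGS does; the function name `linear_rank` and the example best/worst numbers are made up for illustration.

```python
def linear_rank(rating, worst, best):
    """Map a number linearly onto 30k..9d (hypothetical option-1 mapping).

    The scale has 39 steps: 30k..1k are positions 0..29, 1d..9d are 30..38.
    """
    fraction = (rating - worst) / (best - worst)
    position = round(fraction * 38)
    position = max(0, min(38, position))  # clamp to the ends of the scale
    if position < 30:
        return f"{30 - position}k"
    return f"{position - 29}d"


# Assumed extremes of the player pool, purely for illustration.
WORST, BEST = 100.0, 3000.0

for number in (100.0, 1550.0, 3000.0):
    print(number, "->", linear_rank(number, WORST, BEST))
```

Changing `WORST` or `BEST` shifts every label while the underlying numbers, and therefore the matchmaking, stay exactly the same.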

Finally, as in any other server, the kyu/dan number provided by OGS is just your OGS ranking… play in other servers and compare… the differences among servers are not negligible!!

5 Likes

Good, detailed explanation. One nitpick- I don’t think this is entirely correct:

I believe the overall rating is calculated separately from the 9 sub-groups, not as an average (assuming you mean a weighted mean).

2 Likes

weighted mean is correct (in my language “average” and “mean” mean exactly the same so I mix them up all the time :sweat_smile: )
I don’t fully understand your objection. I don’t disagree… I just don’t understand what you mean.
How is the number next to the user name, or the number in the upper left of the table, calculated?

1 Like

Old thread, but my understanding is that you just feed all games into the glicko algorithm to get the overall rank. The result may be different than a weighted average of the sub-ranks. It may have changed, but I haven’t heard about that.

In pseudocode, the difference is this (only small and large boards for simplicity):

// in this example, [a,b,c] are game results from
// a "small" board, and [d, e, f] are game results 
// from a "large" board

average = (glicko([a, b, c]) + glicko([d, e, f])) / 2

overall = glicko([a, b, c, d, e, f])
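
For anyone who wants to run it, here is a rough runnable version of the same comparison. It swaps in a simple Elo-style sequential update for `glicko` (an assumption made to keep it short; real Glicko also tracks a rating deviation), and the function `rate`, the starting value and the game lists are all made up.

```python
def expected_score(rating, opponent):
    """Elo-style probability that `rating` beats `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def rate(games, start=1500.0, k=32.0):
    """Stand-in for glicko(): process (opponent_rating, score) pairs in order."""
    rating = start
    for opponent, score in games:
        rating += k * (score - expected_score(rating, opponent))
    return rating


# (opponent rating, 1 = win / 0 = loss)
small = [(1400.0, 1), (1450.0, 1), (1500.0, 1)]  # "small" board results
large = [(1600.0, 0), (1550.0, 0), (1500.0, 0)]  # "large" board results

average = (rate(small) + rate(large)) / 2
overall = rate(small + large)

print(f"average of sub-ratings: {average:.1f}")
print(f"all games pooled:       {overall:.1f}")
```

The two numbers come out different because in the pooled run the later games are rated starting from wherever the earlier ones left the rating, not from the starting value.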

6 Likes

oooooh I see!! great!! good to know!! :+1:
Thanks!!

2 Likes

Are you sure?
Glicko (and Elo) even allow you to calculate the probability of winning from the rating difference.
So for my understanding these ratings clearly do measure your strength.

Or am I missing something here??

3 Likes

I would like to add to that:
From the difference of your OGS number to another player’s OGS number you can calculate the probability of winning. You can call that probability the “expectation” of the game.
If two players with equally stable OGS numbers play many games against each other and their game results converge toward that expectation, their ratings won’t change.
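
A small sketch of that idea, assuming the plain Elo formula for the win probability (Glicko's version additionally scales the difference by the opponent's rating deviation, which is ignored here). The helper `net_change` and the numbers are illustrative only.

```python
def expected_score(rating, opponent):
    """Elo-style probability that `rating` beats `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def net_change(rating, opponent, wins, games, k=32.0):
    """Total change over `games` games if both ratings were held fixed."""
    return k * (wins - games * expected_score(rating, opponent))


p = expected_score(1600.0, 1500.0)  # the "expectation" of a single game
games = 100

# Winning exactly the expected share leaves the rating where it was.
as_expected = net_change(1600.0, 1500.0, wins=games * p, games=games)
# Winning more than expected pushes it up.
over_performing = net_change(1600.0, 1500.0, wins=80, games=games)

print(f"P(win) with a 100 point lead: {p:.3f}")
print(f"net change when results match: {as_expected:+.1f}")
print(f"net change with 80 wins:       {over_performing:+.1f}")
```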

2 Likes

I think what @n0w3l meant to say is that rating systems do not measure strength on some sort of absolute scale, since it is entirely arbitrary what something like “9 kyu” means in isolation. However, while rating systems do aim to measure relative strength within a system (comparing two players rated under the same system), it is difficult to compare two players rated under separate rating systems.

5 Likes

Sure, there is no absolute scale. And @n0w3l also lists that fact separately as his first point.
If you have different rating systems (that both tell you the probability of winning based on the rating difference), it’s very easy to align them if you have at least one player who would play in both systems more often and who would win and lose in both systems often enough.

If aliens who also play Go landed tomorrow, all it would take is a number of games with some of them (with enough wins and losses) to align their rating system with ours.
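
As a rough sketch of what "aligning" could look like: if we assume both systems use the same win-probability scale and differ only by a constant shift, then players with a stable rating in both systems give an estimate of that shift. The function names and all ratings below are invented for illustration.

```python
from statistics import mean


def estimate_offset(shared_players):
    """Average (rating in A - rating in B) over players rated in both systems."""
    return mean(a - b for a, b in shared_players)


def b_to_a(rating_b, offset):
    """Express a system-B rating on system A's scale (constant-shift assumption)."""
    return rating_b + offset


# (rating in system A, rating in system B) for the same three players.
shared = [(1500.0, 1320.0), (1800.0, 1610.0), (1200.0, 1030.0)]

offset = estimate_offset(shared)
print(f"estimated offset: {offset:+.1f}")
print(f"1400 in B is roughly {b_to_a(1400.0, offset):.0f} in A")
```

If the two systems also stretch the scale differently, a single offset is not enough and you would need shared players spread across the whole range.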

2 Likes

Assuming there is a reasonable amount of overlap. If they are all 60k, we will never know because a loss is a loss :pensive:

2 Likes