Overall rank calculation

milesc · February 4, 2021, 6:19am

How is the overall rank calculated? Why should it be lower or higher than the average of each board size rank?

Eugene · February 4, 2021, 6:54am

Overall rank is calculated by doing the Glicko calculation and updating your rank from every game you play.

If you beat someone better than you, you get lots of ranking points. If you beat someone weaker than you you don’t get many.

If you lose to someone stronger than you, you don’t lose many ranking points. If you lose to someone weaker than you, you lose more ranking points.

For overall rank, this calculation is applied regardless of game type or size.

The other ranks are calculated by updating only with games of those size or speed.

Thus those ranks each come from different pools of results, and therefore end up different - if you lose every blitz game you play, your blitz ranks will be low, but your overall rank could still be high if you rarely play blitz and win all your correspondence games, for example.
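
To make the separate pools concrete, here is a minimal sketch. It swaps Glicko for a plain Elo-style update (no rating deviation, fixed K), and the names `K`, `START`, `games` and the game log itself are made up for illustration; it is not the site's implementation.

```python
# Simplified Elo-style update used as a stand-in for Glicko
# (assumption: no rating deviation or volatility, fixed K factor).
K = 32
START = 1500.0

def expected(r_self, r_opp):
    """Expected score of a player rated r_self against one rated r_opp."""
    return 1.0 / (1.0 + 10 ** ((r_opp - r_self) / 400.0))

def update(r_self, r_opp, score):
    """Return the new rating after one game (score: 1 = win, 0 = loss)."""
    return r_self + K * (score - expected(r_self, r_opp))

# Hypothetical game log: (speed, opponent rating, score).
games = [("correspondence", 1500, 1)] * 10 + [("blitz", 1500, 0)] * 3

ratings = {"overall": START, "blitz": START, "correspondence": START}
for speed, opp, score in games:
    ratings["overall"] = update(ratings["overall"], opp, score)  # every game
    ratings[speed] = update(ratings[speed], opp, score)          # own pool only

for name, value in ratings.items():
    print(f"{name:15s} {value:7.1f}")
```

Each rating only ever sees the games in its own pool, so the three numbers drift apart even though they all start from the same value.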

milesc · February 4, 2021, 5:47pm

Huh. I guess it’s still hard for me to visualize how Glicko on all the games together could produce a totally different value than Glicko on all the games split into three groups.

fmansa · February 4, 2021, 9:07pm

It’s just like taking averages of averages of subgroups. The result does not necessarily equal the average of all the numbers involved.

(3+5+7)/3 != ((3+5)/2 + 7)/2
5 != 5.5
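
The same arithmetic as a quick check in Python. The variable names are only illustrative, and the last line is an added observation: weighting each subgroup by its size recovers the flat average.

```python
from statistics import mean

values = [3, 5, 7]

flat = mean(values)                   # (3 + 5 + 7) / 3
nested = mean([mean([3, 5]), 7])      # ((3 + 5) / 2 + 7) / 2

# Weighting each subgroup average by its size gives the flat average back.
weighted = (2 * mean([3, 5]) + 1 * 7) / 3

print(flat, nested, weighted)
```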

dragon-devourer · February 4, 2021, 9:18pm

Is there any weighting? For example, does 19x19 contribute more to your overall rank than smaller board sizes? If that is not the case, maybe it should be…

shinuito · February 4, 2021, 9:30pm

Here’s a scenario where they end up different. One player signs up and only plays live games for a while, their overall rating is the same as their 19x19 rating let’s say. Now they decide to play a bunch of blitz games, but their blitz rating is back at its initial value of 1150±350.

Now if they play someone else in blitz, their two ratings can gain different numbers of points, since the rating gain or loss is usually a function of the difference between the two players’ ratings. This gap could be small for the overall rating (maybe they were automatched), but the blitz gap could be huge if the other player has played a good bit of blitz before.
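
A rough numerical sketch of this scenario, using the published Glicko-1 single-game formulas as a stand-in. The thread does not say which Glicko variant or parameters the site uses, and the 1700±70 figures are hypothetical; only the 1150±350 starting value comes from the post above.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant

def g(rd):
    """Dampening factor for an opponent with rating deviation rd."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def glicko1_update(r, rd, r_opp, rd_opp, score):
    """One-game Glicko-1 update; returns (new_rating, new_deviation)."""
    g_opp = g(rd_opp)
    e = 1 / (1 + 10 ** (-g_opp * (r - r_opp) / 400))  # expected score
    d2_inv = Q**2 * g_opp**2 * e * (1 - e)            # 1 / d^2
    denom = 1 / rd**2 + d2_inv
    new_r = r + (Q / denom) * g_opp * (score - e)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd

# Hypothetical opponent with an established rating in both categories.
opp_r, opp_rd = 1700, 70

# The same win, seen by an established overall rating and by a fresh blitz rating.
overall_r, _ = glicko1_update(1700, 70, opp_r, opp_rd, 1)
blitz_r, _ = glicko1_update(1150, 350, opp_r, opp_rd, 1)

overall_gain = overall_r - 1700
blitz_gain = blitz_r - 1150
print(f"overall gain: {overall_gain:.1f}, blitz gain: {blitz_gain:.1f}")
```

The single win moves the fresh blitz rating far more than the established overall rating, both because the gap to the opponent is larger and because the starting deviation is larger.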

milesc · February 4, 2021, 10:39pm

Sorry if I’m being dense, but your answers explain why it is not a straight average, but I still don’t see how the overall (average of averages) can be higher than every single component average.

Eugene · February 4, 2021, 11:42pm

Your wording indicates that you think the overall ranking is being computed from the individual ones.

It is not.

It’s not an average of the other individually calculated results.

It is an independently calculated result, from a different pool of games: the pool of all games played by that player.

I do agree that the mechanism by which the overall ranking can be higher than any of the others is not intuitive, but shinuito’s scenario goes some way to explaining that. Maybe if you reconsider such scenarios, freed from the idea that the overall is an average of the others, the light can come on?

I am pondering how to come up with a clearer example myself

milesc · February 5, 2021, 12:12am

no no, it was made clear in the first response that it is not calculated from the individual ones. I get that. But, a score that comes from all games must bear some relationship to the scores which come from three partitions of all games, and I’m still surprised that it can be outside the range of the scores of those three partitions.

Let me re-apply myself to shinuito’s example.

shinuito · February 5, 2021, 12:15am

I was thinking the same, one way might be just to simplify the system at hand.

Let’s suppose we have a ranking system whereby your default rank is 10kyu. Now there are two settings for simplicity, B and L (blitz and live, why not), and an overall rank O, all computed separately from each other. Only one board size is allowed for simplicity, like 19x19.

Here’s the simplified rules:

You can only play even games (eg 6.5 komi japanese, or your favorite ruleset)
If you win a game against an opponent who has the same rank, or a 1 rank difference, your rank increases by 1; if you lose, it decreases by 1.
If you win a game against a player 2 or more ranks stronger, your rank goes up by 2, and if you lose against a player 2 or more ranks weaker, it goes down by 2.
If you win against a player 2 or more ranks weaker, or lose to a player 2 or more ranks stronger, your rank does not change.
To update the ranks of B, L and O you only compare like ranks (your B against the opponent’s B, and so on).
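
The rules above can be written as one small function. This is a sketch under a few assumptions: ranks are stored as kyu numbers (smaller means stronger), the name `new_rank` is made up, and nothing stops a rank from going past 1kyu.

```python
def new_rank(own_kyu, opp_kyu, won):
    """Apply the toy rules to one rank. Kyu numbers: smaller means stronger."""
    diff = own_kyu - opp_kyu  # > 0: opponent is stronger, < 0: opponent is weaker
    if abs(diff) <= 1:
        # Same rank or 1 rank apart: move one rank up or down.
        return own_kyu - 1 if won else own_kyu + 1
    if diff >= 2:
        # Opponent 2+ ranks stronger: a win gains 2, a loss changes nothing.
        return own_kyu - 2 if won else own_kyu
    # Opponent 2+ ranks weaker: a win changes nothing, a loss costs 2.
    return own_kyu if won else own_kyu + 2

print(new_rank(10, 10, True))   # even game, win
print(new_rank(7, 5, True))     # upset win against a stronger player
print(new_rank(5, 7, False))    # upset loss against a weaker player
```

The function is called once per category (B, L or O), each time with the two players' ranks in that same category.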

So you start off playing lots of L games. Let’s say you win 5 even games and go up to 5kyu, then you lose two even games and drop to 7kyu, then win against a 5kyu, so you are back up to 5kyu. All these games used your O rank when pairing players (imagine an automatch).

Where did/can your L rating end up?

If everyone you played had matching L and O ratings, then your L and O rating should match at the end at 5kyu in this funny system.
If, however, one of your opponents’ L and O ratings differed, e.g. in the last game (you O-7kyu L-7kyu, opponent O-5kyu L-6kyu), now what? Well, the O calc is the same, you get to 5kyu, but your L calc bumps you only to 6kyu.

This shows that any imbalance on one player can propagate to other players

In the second case we can already see that the average doesn’t really match: (B+L)/2 = (10+6)/2 = 8kyu, whereas O=5kyu and L=6kyu. Maybe there is some funky average that makes it work, but at the moment B is static while you play L games, and O and L can change separately, so it seems unlikely that the weights would be fixed to always make the average work out (maybe it does?).

Now let’s mix Blitz into it. You now decide to play 5 Blitz games to get a Blitz rating, all against even opponents. You win 4 and lose 1; how do your ratings look? (There are loads of ways this can go.)

Imagine automatch and it pairs you by O ratings: you win 4 and you’re bumped from O=5 to O=1, and then you lose and go back to O=2; L=6 still.
If your opponents had matching B and O then you go from B=10,8,6,4,2 as your O=5,4,3,2,1.

many other variations

Actually even at this point (before the loss) you can see that O=1, B=2 and L=6, so O is stronger than both B and L (O>B,L).
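
The four Blitz wins can be replayed in code. This is a sketch of one reading of the example: each opponent's O and B both equal your current O (automatch by O, matching B and O), ranks are kyu numbers where smaller means stronger, and `new_rank` is a made-up helper encoding the toy rules.

```python
def new_rank(own_kyu, opp_kyu, won):
    """Toy rules for one rank. Kyu numbers: smaller means stronger."""
    diff = own_kyu - opp_kyu  # > 0: opponent is stronger
    if abs(diff) <= 1:
        return own_kyu - 1 if won else own_kyu + 1
    if diff >= 2:
        return own_kyu - 2 if won else own_kyu
    return own_kyu if won else own_kyu + 2

# State after the L games in the example: O=5kyu, L=6kyu, B still at 10kyu.
o, l, b = 5, 6, 10
history = [(o, b, l)]

for _ in range(4):                # four Blitz wins
    opp = o                       # opponent's O and B both equal your current O
    o, b = new_rank(o, opp, True), new_rank(b, opp, True)
    history.append((o, b, l))     # L is untouched by Blitz games

for step, (o_k, b_k, l_k) in enumerate(history):
    print(f"after {step} wins: O={o_k}k B={b_k}k L={l_k}k")
```

In this run the overall rank ends up stronger than both category ranks, which is the effect being asked about.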

It looks like O>B,L in simple examples

I’m happy with it just being possible in a simple system, so I’m cutting off my analysis for now.

I’m trying to see if I can make a simpler version, but something not too dissimilar from Elo or Glicko.

The idea that you gain more when beating someone much higher rated, lose more when losing to someone much lower rated.
The idea that losing to someone much stronger doesn’t really matter, and beating someone much weaker doesn’t really matter.

I was imagining some kind of cutoff/steps in the rating (in this case ranking), rather than the continuous Elo/Glicko ratings.

Feel free to pick apart the above examples, I’m just kind of throwing it out there on the fly

FritzS · November 29, 2021, 11:24pm

I thought about this recently and I would be grateful if someone could confirm my hypothesis:

Let’s assume you play a 19x19 game with “Normal” time controls.
After finishing that game, the following four (entirely) different ratings (out of the 16 you have) are updated for you and your opponent:

The 19x19 rating for time control “Normal” (obviously)
The overall rating for 19x19
The overall rating for “Normal” time control
The “total” overall rating (for all board sizes and all time controls)

The calculation for each of the four rating categories is done in exactly the same way and the calculation for any of the four categories is based solely on

the current rating-parameters of your opponent and of yourself of that very category and
the result of the game

…and is in no way based on data of any of the other three categories.

Is this correct?
Thanks!
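
The hypothesis can be sketched as a lookup over a grid of categories. The 4 x 4 layout (three board sizes and three speeds, each with an "overall" slot) is an assumption that happens to give 16 ratings; the labels, including "live" for the "Normal" time control, and the helper name `touched` are hypothetical.

```python
from itertools import product

# Assumed category grid: 4 sizes x 4 speeds = 16 ratings per player.
SIZES = ["overall", "9x9", "13x13", "19x19"]
SPEEDS = ["overall", "blitz", "live", "correspondence"]
ALL_CATEGORIES = list(product(SIZES, SPEEDS))

def touched(size, speed):
    """Return the four (size, speed) categories updated by one game."""
    return [(s, t) for s in ("overall", size) for t in ("overall", speed)]

# A 19x19 game with "Normal" (here: "live") time control.
for category in touched("19x19", "live"):
    print(category)
```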

BHydden · November 30, 2021, 12:02am

This is correct, as are the 4 categories you identified.

Based on the above comment though,

all 4 categories listed above are compared with your opponent’s overall rating (and likewise their 4 categories are compared with your overall)
the result of the game

This is how it now works as far as I understand it, meaning they are not as distinct and incomparable as they once were.

Groin · November 30, 2021, 3:24am

There is no weighting. Read that It’s no big change and still consistent.
In terms of time invested, I would think that weighting would be fairer.
Now, thanks to this, you get a quicker introduction to the rating system by playing 9x9.

gennan · November 30, 2021, 7:47am

Interesting, so the category ratings are not that relevant anymore. The overall rating is actually used for everything and the most important, while the category ratings for board size and time settings are mostly just for reference, if you happen to be interested in breaking down your overall performance into those categories.

BHydden · November 30, 2021, 8:25am

Yep exactly.

_KoBa · November 30, 2021, 2:15pm

Well, not only your performance (tho getting stats about your own games is always interesting!) but the breakdown is there so you can quickly assess other ppl’s preferences. There are ppl who play (almost) fully on small boards, some people play mainly quick blitz games while others are correspondence players… A quick peep at your profile tells me that you only play 19x19 games with live/corr time settings ^___^

FritzS · December 1, 2021, 8:23pm

Thanks for all the responses and explanations!

I doubt I like the method though.

This means that in my example

the 19x19 rating for time control “Normal”,
the overall rating for 19x19 and
the overall rating for “Normal” time control

are all updated based on the total-overall rating of the opponent. In each of the three calculations the corresponding changes to the total-overall rating of the opponent are not applied.
The same is true when those three ratings of your opponent are updated based on your total-overall rating.
Only when your total-overall rating is calculated against the total-overall rating of your opponent are both total-overall ratings adjusted.
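
A sketch of the update as it is described in this thread: each of a player's four ratings is compared with the opponent's pre-game total-overall rating. An Elo-style step stands in for Glicko, and all names (`K`, `KEYS`, `update_player`) and numbers are hypothetical.

```python
K = 32  # assumed fixed step size (Elo-style stand-in for Glicko)

def expected(r_self, r_opp):
    """Expected score of a player rated r_self against one rated r_opp."""
    return 1.0 / (1.0 + 10 ** ((r_opp - r_self) / 400.0))

def update_player(own, opp_overall, keys, score):
    """Return a copy of `own` with each rating in `keys` updated
    against the opponent's total-overall rating."""
    new = dict(own)
    for key in keys:
        new[key] = own[key] + K * (score - expected(own[key], opp_overall))
    return new

KEYS = ["19x19/live", "19x19", "live", "overall"]
me = {"19x19/live": 1400.0, "19x19": 1450.0, "live": 1500.0, "overall": 1600.0}
opp = {"19x19/live": 1700.0, "19x19": 1650.0, "live": 1620.0, "overall": 1600.0}

# Both sides read the other's pre-game total-overall rating.
me_new = update_player(me, opp["overall"], KEYS, 1)    # I win
opp_new = update_player(opp, me["overall"], KEYS, 0)   # opponent loses

for key in KEYS:
    print(f"{key:11s} me {me[key]:.0f} -> {me_new[key]:.1f}   "
          f"opp {opp[key]:.0f} -> {opp_new[key]:.1f}")
```

In this sketch the opponent's category ratings (1700, 1650, 1620) never enter my calculation at all; only their total-overall rating does.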

I would be very surprised if that approach didn’t have lots of unexpected (and unwanted) implications for the ratings.