Basic rank maths questions

Can someone tell me:

  • What is the rank impact on a mid-range 1d at OGS if they lose to a totally new player (who is 6k with high deviation)?

  • How many such games are required to demote the 1d to 1k?

Thanks!


I can’t answer those questions for OGS, but I can answer them for the EGF rating system (for comparison, even though this is like comparing apples and oranges).

The EGF rating system uses declared ranks for players new to the system. It does not have provisional ranks. So I take just 6k for the opponent. The impact is almost the same if the opponent is much lower rated than 6k.

  • The 1d would lose about 1/6 of a rank if the time control is at least 60 minutes + 20 seconds byoyomi. Faster games would have progressively less impact.
    I calculated this with the EGD rating calculation tool (GoR calculator, E.G.D. - European Go Database), using 2100 GoR for 1d, 1500 GoR for 6k and tournament class A.

  • After 3 such games, the rating of a nominal 1d would drop below the 1d threshold. After 3 more such games he would have a nominal 1k rating.


Thanks … but I think this doesn’t take into account rank-deviation (aka uncertainty)?

So 3 games for demotion is fine if they are losing to a definite 6k.

But what about a very uncertain 6k?

This is the key to the question: does it damage a dan player to play a new OGS member who is in fact 7d but starts at 6k +/- 9k (or whatever our numbers are)?

My feeling is that the answer is “no it doesn’t, because someone with an uncertainty of +/-9k will not make a dent in anyone if they defeat them”.


A player with a rating of 1960.38 (I think this is 1.5d) and an RD of 60 will fall to 1948.25 after losing to a new player (1500, 350). After losing to an established player (1500, 60), the rating instead drops to 1941.00.

The threshold for 1d is 1918.49. Rating and RD after each successive loss to a new player:

  0. 1960.38, 60
  1. 1948.25, 60.69
  2. 1935.94, 61.36
  3. 1923.46, 62
  4. 1910.83, 62.62

=> 4 losses each against a new player

If they are on the border to 2d, it would be:

  0. 2003, 60
  1. 1990.77, 60.72
  2. 1978.15, 61.41
  3. 1965.35, 62.07
  4. 1952.35, 62.72
  5. 1939.21, 63.34
  6. 1925.91, 63.94
  7. 1912.49, 64.51
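
For anyone who wants to replay the loss sequences above, here is a minimal Python sketch of a one-game Glicko-2 update. It assumes the volatility stays fixed at 0.06 (the full volatility iteration from the Glicko-2 paper is skipped) and that each loss is against a fresh (1500, 350) opponent. The function names `update_after_game` and `losses_until_below` are made up for illustration and are not taken from the OGS code. Under those assumptions it should land very close to the figures quoted above.

```python
import math

GLICKO2_SCALE = 173.7178  # rating <-> internal scale factor from the Glicko-2 paper


def g(phi):
    """Weight that shrinks the impact of an opponent with a large deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def update_after_game(rating, rd, opp_rating, opp_rd, score, volatility=0.06):
    """One-game Glicko-2 update. Assumption: volatility is held constant."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    phi = rd / GLICKO2_SCALE
    mu_j = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_j = opp_rd / GLICKO2_SCALE

    g_j = g(phi_j)
    expected = 1.0 / (1.0 + math.exp(-g_j * (mu - mu_j)))
    v = 1.0 / (g_j ** 2 * expected * (1.0 - expected))

    # Inflate the deviation by the (fixed) volatility, then shrink it
    # using the information gained from this one game.
    phi_star_sq = phi ** 2 + volatility ** 2
    new_phi = 1.0 / math.sqrt(1.0 / phi_star_sq + 1.0 / v)
    new_mu = mu + new_phi ** 2 * g_j * (score - expected)
    return 1500.0 + GLICKO2_SCALE * new_mu, GLICKO2_SCALE * new_phi


def losses_until_below(rating, rd, threshold, opp_rating=1500.0, opp_rd=350.0):
    """Count consecutive losses (score 0) until the rating drops below threshold."""
    losses = 0
    while rating >= threshold:
        rating, rd = update_after_game(rating, rd, opp_rating, opp_rd, score=0.0)
        losses += 1
        print(f"{losses}. {rating:.2f}, {rd:.2f}")
    return losses


losses_until_below(1960.38, 60.0, 1918.49)
```

Calling `update_after_game(1960.38, 60, 1500, 60, 0.0)` gives the comparison against an established 1500 player.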

Not sure if this is what you are looking for and if it is still up to date, but Documentation & FAQ gives some info and links to more info about rank maths:

http://www.glicko.net/glicko/glicko2.pdf


Perfect - thanks so much for doing that work for me … I had a feeling there would be someone who could just nail it like that :slight_smile:

The effect of the new-player RD is surprisingly low.

I thought that given we were saying “we have no idea what the rank of this person is”, it would minimise the rank impact of the loss much more significantly than that.

Just goes to show you can’t guess these things :open_mouth:


IIRC there is a special rule so that nobody loses rating if they lose to a ? player… @anoek may need to confirm that :thinking:

new feature:
https://online-go.com/rating-calculator


I think it makes sense to add “predicted win probability”


Cool, but what does it mean?
Some statistical explanation is needed.

It’s inside the Glicko2 paper you linked above, under step 3: E(μ,μj,φj) is the expectancy that you win with rating μ against a player with rating μj and deviation φj.

Expectancy works like an average, so if the result is, say 0.7, then with the current knowledge the best estimate is that you’d win 7 out of 10 games against the opponent.
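
As a rough illustration of that step-3 formula, here is a small Python sketch. It follows the definitions of g and E in the Glicko-2 paper; the function name `expected_score` and the 1500 / 173.7178 rating conversion wrapped around it are my own framing, and whether the OGS calculator computes it exactly this way is an assumption.

```python
import math

GLICKO2_SCALE = 173.7178  # scale factor from the Glicko-2 paper


def expected_score(rating, opp_rating, opp_rd):
    """E(mu, mu_j, phi_j): expected score against one opponent."""
    mu = (rating - 1500.0) / GLICKO2_SCALE
    mu_j = (opp_rating - 1500.0) / GLICKO2_SCALE
    phi_j = opp_rd / GLICKO2_SCALE
    # g pulls the expectation towards 0.5 when the opponent's rating is uncertain
    g = 1.0 / math.sqrt(1.0 + 3.0 * phi_j ** 2 / math.pi ** 2)
    return 1.0 / (1.0 + math.exp(-g * (mu - mu_j)))


# The 1.5d from earlier in the thread against a new and an established 1500 player
print(round(expected_score(1960.38, 1500, 350), 2))
print(round(expected_score(1960.38, 1500, 60), 2))
```

Note that only the opponent's deviation enters the formula, so a very uncertain opponent moves the expectation closer to 0.5 than an established one with the same rating.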


I know. But wouldn’t it be nice for all users who consult this new feature to know what it is all about?
Is it a good idea to add a new paragraph to the Documentation & FAQ about it (preferably in English that is also understandable for the not statistically trained)?

I don’t mind writing it.

It’s very hard to explain how or why Glicko works for the not-statistically trained, and I’m not sure if an explanation of the details would be insightful. But if it needs an explanation, then probably with a lot of examples.

I guess the main important points are:

  • There’s the Glicko rating (1500±80) and the kyu/dan rank (3.5k±0.8), but they mean the same thing on OGS, like how 250 liters and 66 gallons mean the same thing. The Glicko numbers are easier to compute with, but many people use kyu/dan because they’re more used to it.
  • The higher the first number, denoting the mean (e.g. 1500), the stronger the player. The higher the second number, denoting the deviation (e.g. ±80), the less certain the system is about the player’s strength. That is, it is harder to predict whether someone with high deviation wins a game than it is to predict whether someone with low deviation wins.
  • Winning games will increase the mean, and losing games will decrease the mean. But if the win (loss) is in line with expectations, the increase (decrease) will be smaller than if the win (loss) is surprising. So, a 1500 player winning against a 2000 player is surprising, thus the mean of the 1500 player will increase a lot. The same 1500 player winning against a 1000 player will result in only a marginal increase.
  • Winning a lot of games against players with a higher mean and losing a lot of games against players with a lower mean is surprising, and thus such outcomes will increase the deviation. Winning a lot of games against players with a lower mean and losing a lot of games against players with a higher mean are expected, and thus such outcomes will decrease the deviation.
  • For those familiar with Elo rating, Glicko is basically the same thing but with uncertainty margins.
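
To put numbers on the "surprising win versus expected win" point in the list above, here is a hedged Python sketch using the Glicko-2 formulas. The assumptions are mine: every player has a deviation of 80, the volatility is held fixed at 0.06, and `mean_change` is a hypothetical helper name, not something from the OGS code.

```python
import math

SCALE = 173.7178  # Glicko-2 scale constant from the paper


def mean_change(rating, rd, opp_rating, opp_rd, score, volatility=0.06):
    """Change in the mean after one game. Assumption: volatility kept fixed."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
    g = 1.0 / math.sqrt(1.0 + 3.0 * phi_j ** 2 / math.pi ** 2)
    expected = 1.0 / (1.0 + math.exp(-g * (mu - mu_j)))
    v = 1.0 / (g ** 2 * expected * (1.0 - expected))
    new_phi_sq = 1.0 / (1.0 / (phi ** 2 + volatility ** 2) + 1.0 / v)
    # The update is proportional to (actual result - expected result)
    return SCALE * new_phi_sq * g * (score - expected)


surprising = mean_change(1500, 80, 2000, 80, score=1.0)
expected_win = mean_change(1500, 80, 1000, 80, score=1.0)
print(f"1500 beats 2000: {surprising:+.1f}")
print(f"1500 beats 1000: {expected_win:+.1f}")
```

The size of the change is driven by the gap between the actual result and the expected result, which is why the upset win moves the mean many times further than the routine one.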

Indeed. I think the new player uncertainty should be MUCH higher, as in 2000 not 350, so that it extends a good way to 30k which is what many of these new players are (and not so many 7ds). I would want a real 1d to need to lose something like 20+ games against a high uncertainty new player to demote a rank, I’ll leave the Maths to someone else.

If the uncertainty is supposed to be the standard deviation, that’s saying the rating system is happy to make a ~68% chance bet the new player is within 1500 +/- 350. I would bet against the rating system and make a massive profit. It’s wrong.


The maths says that the 95% confidence interval for a player’s rating lies within 2 deviations of the mean, so with an RD of 350 and a starting mean of 1150, 95% of starting players should have a “true” rating between 450 and 1850.
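
A quick way to check that interval, assuming the rating really is modelled as a normal distribution with mean 1150 and standard deviation 350 (the numbers used in this post, not confirmed OGS settings), is Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

# Assumed starting distribution: mean 1150, RD 350 (figures from this post)
start = NormalDist(mu=1150, sigma=350)

# "2 deviations either side" rule of thumb
low = start.mean - 2 * start.stdev
high = start.mean + 2 * start.stdev
coverage = start.cdf(high) - start.cdf(low)

# Central interval that holds exactly 95% of the probability mass
exact_low = start.inv_cdf(0.025)
exact_high = start.inv_cdf(0.975)

print(f"2-deviation interval: {low:.0f} to {high:.0f}, coverage {coverage:.4f}")
print(f"exact 95% interval:   {exact_low:.0f} to {exact_high:.0f}")
```

The 2-deviation rule covers slightly more than 95%, so the exact 95% interval is a little narrower than 450 to 1850.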

Judging by this:

the actual 95% interval lies between 26k and 2d, which is somewhere about 550 to 2000. I guess this is different from the interval above, because the rank distribution is not a bell curve, but is skewed towards stronger players. I’m not actually sure, and I don’t actually like statistics enough to figure it out.

In any case, if anything, we should keep the RD the same and move the mean of starting players up a bit to get the actual 95% interval.


I’ll give it my best shot.

And by posting what I did, I will probably get some feedback.

As the Dutch saying goes, you never know how a cow might catch a hare :rofl:


Here is my best shot. :innocent:

My intent is to supply info on a tool (the rank calculator), so that it is understandable for everyone (and not only the nerds).

I also share @Vsotvep’s concern and worries whether it is insightful and necessary.
But nonetheless …

Suggestions, feedback, etc?


You’re only considering players who stick around for a while and play a bunch of games, right? Surely we have a lot more entering with almost no knowledge of the game and either improving rapidly or leaving before they get a rank.


I’d say it’s a bit confusing, but also not statistically accurate, because again, statistics is quite complicated.

Deviation & volatility

In general, deviation refers to departing from an accepted set of behaviour. In statistics, deviation is a measure of difference between the observed value of a variable and some other value, often that variable’s mean. Volatility refers to liability to change rapidly and unpredictably.

While this is one interpretation of the word “deviation”, it is not the correct one here. Here it stands for the standard deviation of a normal distribution. The system is trying to guess what the rating of a player is, but it cannot be sure, so the deviation gives a measure of how certain it is: out of 100 players with the same rating and deviation, the system thinks that about 68 of them have an actual rating within 1 deviation of the given rating, and 95 are within 2 deviations of the given rating.

To be really correct, what is given, is not actually the deviation, but the expected value of the deviation (like the variance, which is the expected value of the square of the standard deviation). We don’t know the actual deviation.

The higher the rating, the stronger the player. Sofiam’s rating is 1655 and Atorrante’s is 1423. Sofiam is a better player than Atorrante (which is illustrated by their rank: 3.5 kyu versus 7.0 kyu).

Sofiam’s deviation (249) is almost 4 times as high as Atorrante’s (63). Sofiam is more unpredictable in her playing results than Atorrante.

That means that this isn’t correct. One can compute the chance that Sofiam is better than Atorrante given these numbers. The way one does that is by subtracting the means, adding up the variances, and taking a look at the resulting normal distribution.

Computation

So, to do the computation, the difference in rating is 1655 - 1423 = 232, Sofiam’s variance is 249² = 62001 and Atorrante’s variance is 63² = 3969, hence the total variance is 65970. Reverting back to standard deviation by taking the square root, we have a deviation of roughly 257.

Then we compute how large the chance is that Sofiam is stronger than Atorrante, which is the chance that the difference, given by Sofiam’s rating - Atorrante’s rating, is positive. There are calculators out there that can do it.

So, what the rating tells us is not that Sofiam is stronger than Atorrante, but that there is an 81% chance that Sofiam is stronger than Atorrante, and a 19% chance that it is the other way around.
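
The same computation can be done without an online calculator. Below is a minimal Python sketch, assuming both ratings are treated as independent normal distributions as described above; `prob_stronger` is just an illustrative name.

```python
import math
from statistics import NormalDist


def prob_stronger(rating_a, rd_a, rating_b, rd_b):
    """P(true rating of A > true rating of B) for independent normal estimates."""
    # Difference of two independent normals: means subtract, variances add
    diff = NormalDist(mu=rating_a - rating_b,
                      sigma=math.sqrt(rd_a ** 2 + rd_b ** 2))
    return 1.0 - diff.cdf(0.0)


p = prob_stronger(1655, 249, 1423, 63)
print(f"P(Sofiam stronger than Atorrante) = {p:.3f}")
```

This comes out at a little under 0.82, in line with the roughly 81% quoted above. Note that it is the chance that one player's true rating is higher, not the chance of winning a single game.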

Of course this chance is measured by the rating system taking all of the games of both players into account, and under the assumption that player’s strength can actually be assigned a number. To compute the chance whether Sofiam is stronger than Atorrante, this seems like a very dodgy way to do it, instead of just looking at the many games played between the two and comparing those results.


All in all, the rough idea is that the mean gives how strong a player is, and the deviation gives how accurate this estimate is. People who are new or do not play consistently at one strength have high deviation. A higher rating means that the player is likely stronger, and this becomes likelier the larger the gap is and/or the smaller the deviations get.


https://forums.online-go.com/t/am-i-alone-in-feeling-like-ogs-is-the-hardest-place-to-find-an-even-game/51326/9
Someone is again trying to talk about win probability between different ranks, but can’t, because the rating calculator still doesn’t give that information.


I agree this would be useful; this isn’t the first time I’ve wanted it

— signed said someone