Show Kyu/Dan instead of Glicko Rating on Player Profile

Maharani · August 20, 2018, 3:15am

Out of curiosity, what information do YOU think the ratings table is meant to provide?

Ptro · August 20, 2018, 3:18am

As I mentioned before, from a UX point of view, tables should be used to compare data that can meaningfully be compared with the other elements in the same table.

MystWalker · August 20, 2018, 4:15am

Hi @Sarah_Lisa, it would be helpful if you joined the discussion on Kialo. We’ve already made a lot of progress breaking down the arguments, and we could use your input!

Maharani · August 20, 2018, 4:25am

I’m Maharani over there

MystWalker · August 20, 2018, 4:26am

Oops! Sorry.

Musash1 · August 20, 2018, 8:36am

I find this Kialo thing just a pain in a** – because I do not use twitter, fakebook or google I cannot vote.

Why not start a vote here on our OGS forum?? Anyone reading and participating here can vote without getting involved in other disturbing and unsafe “social media sites” or giving further marketing data to google.

Eugene · August 20, 2018, 8:43am

While that’s probably a rhetorical question, it does have an actual answer. Kialo is not just a “take a vote” platform, it is a structured debate platform.

You can’t do that here. And really… the vast majority of people do actually have one of those accounts to log in with…

GaJ

Maharani · August 20, 2018, 10:53am

Plus you don’t need one. You can just make an account with your email address, as I did.

Farraway · August 20, 2018, 4:43pm

Isn’t there a fairly simple solution to all of this?

Problem: different rating pools cannot be compared.
Solution: use a metric that can be compared.

Therefore, keep the overall rank, but for the breakdown table, show a percentile rather than a rating.

If you use percentiles, I might see that I am better than 50% of players at 19x19 and infer that I am an average 19x19 player. I might also see that I am better than 70% of 9x9 players and infer that I am a stronger-than-average 9x9 player.

Rating pools cannot be compared. But percentiles can always be compared.
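
A minimal sketch of the percentile idea, assuming a percentile means "the share of players in the same pool with a strictly lower rating". The function name and the rating lists are made up for illustration and are not OGS data or OGS code.

```python
from bisect import bisect_left


def percentile_in_pool(my_rating, pool_ratings):
    """Return the percentage of players in the pool rated strictly below my_rating."""
    if not pool_ratings:
        raise ValueError("pool_ratings must not be empty")
    ordered = sorted(pool_ratings)
    # bisect_left gives the count of entries strictly less than my_rating.
    below = bisect_left(ordered, my_rating)
    return 100.0 * below / len(ordered)


# Hypothetical rating pools, one per board size (illustrative numbers only).
pool_19x19 = [1100, 1250, 1400, 1500, 1550, 1600, 1700, 1800, 1900, 2100]
pool_9x9 = [900, 1000, 1100, 1200, 1300, 1350, 1450, 1500, 1650, 1800]

# The same player has a different rating in each pool.
p19 = percentile_in_pool(1600, pool_19x19)
p9 = percentile_in_pool(1500, pool_9x9)
print(f"19x19: stronger than {p19:.0f}% of the pool")
print(f"9x9: stronger than {p9:.0f}% of the pool")
```

With these made-up pools the lower 9x9 rating still maps to the higher percentile, which is the kind of comparison the raw ratings do not show.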

Farraway · August 20, 2018, 4:53pm

I’ve submitted this suggestion on the Kialo page.

MystWalker · August 20, 2018, 5:25pm

I’m confused by this statement. While the percentiles may be easy to understand, you can’t make any inferences between pools. A 90% rating in one area has no bearing on your performance in another area. The example you give here would work the same way as the Glicko and Kyu/Dan methods. If I am a 20kyu 19x19 player and a 9kyu 9x9 player, I know that I am a better 9x9 player.

I must be confused, can you explain why this is a better option?

MystWalker · August 20, 2018, 5:27pm

Is there a way to make a simple vote on the forum? That might be a good idea once we have a solid idea of our options.

opuss · August 20, 2018, 5:36pm

Yes. In a reply, click on the “options” (cog) icon and select the “Build Poll” option.

Farraway · August 20, 2018, 9:41pm

TL;DR

Comparing ranks and ratings across pools gives no useful information. You don’t know whether you are comparing the rating pool or your own playing strength. Comparing percentiles allows you to compare your position within the pool. Whilst that doesn’t entirely match your playing strength, most pools are sufficiently large that you can infer your playing strength from your position within the pool.

The most extreme example is to compare performance across different games:

On most Chess websites I am stronger than 95% of players.
On most Go websites I am stronger than 50% of players.
Therefore, I am stronger than a higher proportion of Chess players than Go players.

Comparing ranks and ratings

Sure thing! But first, let’s just clear this up:

If I am a 20kyu 19x19 player and a 9kyu 9x9 player, I know that I am a better 9x9 player.

Actually, that’s not quite correct. How do you know you’re not equally strong at 19x19 and 9x9? The difference in rank might be because of the difference in rating pools rather than a difference in playing strength. We need more information before we can infer how much playing strength was a factor.

Let’s restate this by introducing different Go servers to demonstrate where it goes wrong:

If I am a 15k on KGS and a 10k on OGS, then I know that I am a better OGS player.

That’s not the conclusion most people would reach. Instead, they’d probably suggest that 15k on KGS is about equal to 10k on OGS. You’re the same player! It’s the rating pools that are different.

But of course with different board sizes we introduce an extra factor:

If I am a 15k at 19x19 on KGS and a 10k at 9x9 on OGS, then I know that I am a better 9x9 player on OGS.

Comparing KGS ranks to OGS ranks was difficult enough already. Really, this should be:

If I am a 15k at 19x19 on KGS and a 10k at 9x9 on OGS, then a 19x19 15k on KGS equates to a 9x9 10k on OGS.

We have learned nothing new from our comparison. Now we can make sense of the original quote:

If I am a 20kyu 19x19 player and a 9kyu 9x9 player, then 20kyu at 19x19 equates to 9kyu at 9x9

Therefore there is no benefit to showing multiple ranks or ratings across different board sizes or game speeds.

Percentiles are different

When I join a new Go server I cannot predict in advance what my rank will be. But I can predict what my percentile will be:

I am stronger than 50% of OGS players.
KGS players are as diverse as OGS players.
I am therefore stronger than ~50% of KGS players.

This is not perfect, as it relies on KGS players being as diverse as OGS players. They probably are, I’m not sure, but the difference is not going to be significant when enough players are in the pool.

The result is a slightly different conclusion to just being stronger:

I am stronger than 50% of 19x19 players.
I am stronger than 80% of 9x9 players.
Therefore, I am stronger than a larger proportion of 9x9 players than 19x19 players.

This is not the most amazing conclusion in the world. But it is at least a conclusion - which is more than we got comparing ratings and ranks!

Conclusion

Comparing ranks or ratings gives us no information across pools:

If I am a 20kyu 19x19 player and a 9kyu 9x9 player, then 20kyu players at 19x19 equate to 9kyu players at 9x9.

Comparing percentiles gives us a lot of information across pools:

If I am stronger than 50% of 19x19 players and stronger than 75% of 9x9 players, then I am stronger than a larger proportion of 9x9 players than 19x19 players.

BHydden · August 21, 2018, 9:30pm

I skipped a lot of the walls of text in the last third of this thread. But my 2c are that the best options are either removing the table, because the information doesn’t outweigh the confusion, or, as @Farraway is suggesting, switching it to a percentile display, so that the table provides more information and less confusion.

opuss · August 21, 2018, 10:03pm

After some consideration, I believe that the developers probably made a good decision to display ratings instead of ranks in the table. The average ability of players on OGS is different for different board sizes as shown in the rank histogram posted by @S_Alexander. Using the same formula used for the overall rank would be incorrect.

However, the ratings can be useful. They can be used to check that a game is even on a given board size.

If the volatility were also displayed, it would be possible to calculate the expectation value for the result of a match. The volatility is currently available via the API. In the future, perhaps we could specify that we would like a game within a given range of expectation values?

Go_Board · August 22, 2018, 3:10pm

I feel the same way as Ptro about the rating system. I think that the traditional dan/kyu system is easier and more important than the Glicko rating. I think that the Glicko was a good idea but I feel that it is overall unnecessary because of how precise the system tries to be and how difficult it is to predict the precise rating even in the kyu/dan system. Of course I would also like to point out that a rating is just a number and is never perfect!!

Thanks for bringing this topic up and I would like to thank OGS for having such a good system anyway:)

Eugene · August 23, 2018, 5:00am

It’s not either Glicko OR Dan/Kyu.

We’ve always had Dan/Kyu ranking in the same way we do now - based on an underlying rating system.

Before it was ELO. And we had an ELO number just like our Glicko number, and we had a graph showing it.

The only thing that wasn’t shown to us before was our ELO number for all the different game types!

The only difference now is that we’re being shown some numbers we weren’t shown before, and finding them confusing.

GaJ

flovo · August 23, 2018, 12:37pm

You need only the rating (and deviation) to calculate the probability of winning the game.

I wanted to link S_Alexander’s post as well.

opuss · August 23, 2018, 1:18pm

Quite right. Only the two ratings and the opponent’s deviation are needed.
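
A rough sketch of that calculation, assuming the standard Glicko-1 expected-outcome formula. Whether OGS computes it exactly this way is an assumption, and the function names are made up for illustration.

```python
import math

# Glicko-1 scaling constant: ln(10) / 400.
Q = math.log(10) / 400


def g(rd):
    """Glicko-1 attenuation factor; it shrinks as the opponent's deviation grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected result (between 0 and 1) for a player against one opponent.

    Uses only the two ratings and the opponent's deviation; volatility
    does not appear in this formula.
    """
    exponent = -g(opp_rd) * (rating - opp_rating) / 400.0
    return 1.0 / (1.0 + 10.0**exponent)


# Illustrative numbers only.
even_game = expected_score(1500, 1500, 100)
favourite = expected_score(1500, 1400, 30)
print(f"equal ratings: {even_game:.3f}")
print(f"100 points stronger: {favourite:.3f}")
```

An expected score near 0.5 would indicate an even game on that board size. A larger opponent deviation pulls the expected score back towards 0.5, because the opponent's rating is less certain.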