Out of curiosity, what information do YOU think the ratings table is meant to provide?
As I mentioned before, from a UX point of view, a table should be used to compare data, so that each element can be compared with the others in the same table.
Hi @Sarah_Lisa, it would be helpful if you joined the discussion on Kialo. We've already made a lot of progress breaking down the arguments and we could use your input!
I'm Maharani over there
Oops! Sorry.
I find this Kialo thing just a pain in a** - because I do not use twitter, fakebook or google I cannot vote.
Why not start a vote here on our OGS forum?? Anyone reading and participating here can vote without getting involved in other disturbing and unsafe "social media sites" or giving further marketing data to google.
While that's probably a rhetorical question, it does have an actual answer. Kialo is not just a "take a vote" platform, it is a structured debate platform.
You can't do that here. And really... the vast majority of people do actually have one of those accounts to log in with...
GaJ
Plus you don't need one. You can just make an account with your email address, as I did.
Isn't there a fairly simple solution to all of this?
- Problem: different rating pools cannot be compared.
- Solution: use a metric that can be compared.
Therefore, keep the overall rank, but for the breakdown table, show a percentile rather than a rating.
If you use percentiles, I might see that I am better than 50% of players at 19x19 and infer that I am an average 19x19 player. I might also see that I am better than 70% of 9x9 players and infer that I am a stronger-than-average 9x9 player.
Rating pools cannot be compared. But percentiles can always be compared.
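To make the suggestion concrete, here is a minimal Python sketch of how a percentile could be derived from a pool of ratings. The function name `percentile_in_pool` and both rating pools are invented for illustration; nothing here reflects how OGS actually stores or computes ratings.

```python
from bisect import bisect_left

def percentile_in_pool(rating, pool):
    """Return the percentage of players in `pool` rated strictly below `rating`."""
    ordered = sorted(pool)
    below = bisect_left(ordered, rating)  # number of ratings strictly lower
    return 100.0 * below / len(ordered)

# Hypothetical rating pools for two board sizes (made-up numbers).
pool_19x19 = [1100, 1250, 1400, 1500, 1600, 1750, 1800, 1900, 2000, 2100]
pool_9x9 = [900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800]

print(percentile_in_pool(1750, pool_19x19))  # 50.0: stronger than half the pool
print(percentile_in_pool(1600, pool_9x9))    # 70.0: stronger than 70% of the pool
```

The two raw ratings (1750 and 1600) say little on their own, but the two percentiles can be read side by side.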
I've submitted this suggestion on the kialo page.
I'm confused by this statement. While the percentiles may be easy to understand, you can't make any inferences between pools. A 90% rating in one area has no bearing on your performance in another area. The example you give here would work the same way as the Glicko and Kyu/Dan methods. If I am a 20kyu 19x19 player and a 9kyu 9x9 player, I know that I am a better 9x9 player.
I must be confused, can you explain why this is a better option?
Is there a way to make a simple vote on the forum? That might be a good idea once we have a solid idea of our options.
Yes. In a reply, click on the "options" (cog) icon and select the "Build Poll" option.
TL;DR
Comparing ranks and ratings across pools gives no useful information. You don't know whether you are comparing the rating pool or your own playing strength. Comparing percentiles allows you to compare your position within the pool. Whilst that doesn't entirely match your playing strength, most pools are sufficiently large that you can infer your playing strength from your position within the pool.
The most extreme example is to compare performance across different games:
- On most Chess websites I am stronger than 95% of players.
- On most Go websites I am stronger than 50% of players.
- Therefore, I am stronger than a higher proportion of Chess players than Go players.
Comparing ranks and ratings
Sure thing! But first, let's just clear this up:
If I am a 20kyu 19x19 player and a 9kyu 9x9 player, I know that I am a better 9x9 player.
Actually, that's not quite correct. How do you know you're not equally strong at 19x19 and 9x9? The difference in rank might be because of the difference in rating pools rather than a difference in playing strength. We need more information before we can infer how much playing strength was a factor.
Let's restate this by introducing different Go servers to demonstrate where it goes wrong:
If I am a 15k on KGS and a 10k on OGS, then I know that I am a better OGS player.
That's not the conclusion most people would reach. Instead, they'd probably suggest that 15k on KGS is about equal to 10k on OGS. You're the same player! It's the rating pools that are different.
But of course with different board sizes we introduce an extra factor:
If I am a 15k at 19x19 on KGS and a 10k at 9x9 on OGS, then I know that I am a better 9x9 player on OGS.
Comparing KGS ranks to OGS ranks was difficult enough already. Really, this should be:
If I am a 15k at 19x19 on KGS and a 10k at 9x9 on OGS, then a 19x19 15k on KGS equates to a 9x9 10k on OGS.
We have learned nothing new from our comparison. Now we can make sense of the original quote:
If I am a 20kyu 19x19 player and a 9kyu 9x9 player, then 20kyu at 19x19 equates to 9kyu at 9x9
Therefore there is no benefit to showing multiple ranks or ratings across different board sizes or game speeds.
Percentiles are different
When I join a new Go server I cannot predict in advance what my rank will be. But I can predict what my percentile will be:
- I am stronger than 50% of OGS players.
- KGS players are as diverse as OGS players.
- I am therefore stronger than ~50% of KGS players.
This is not perfect, as it relies on KGS players being as diverse as OGS players. They probably are, I'm not sure, but the difference is not going to be significant when enough players are in the pool.
The result is a slightly different conclusion to just being stronger:
- I am stronger than 50% of 19x19 players.
- I am stronger than 80% of 9x9 players.
- Therefore, I am stronger than a larger proportion of 9x9 players than 19x19 players.
This is not the most amazing conclusion in the world. But it is at least a conclusion - which is more than we got comparing ratings and ranks!
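A toy Python sketch of why the rating changes between pools while the percentile does not. It assumes the simplest possible case: the second pool is the same set of players on a scale shifted by a fixed offset. The pools, the offset and the helper name `percentile` are all made up for the example.

```python
def percentile(rating, pool):
    """Percentage of players in `pool` rated strictly below `rating`."""
    return 100.0 * sum(r < rating for r in pool) / len(pool)

# Hypothetical pool on server A, with me rated 1500 there.
pool_a = [1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900]
my_rating_a = 1500

# Assumption: server B has the same players, but its scale sits 300 points lower.
OFFSET = -300
pool_b = [r + OFFSET for r in pool_a]
my_rating_b = my_rating_a + OFFSET

# The raw ratings differ even though nobody's playing strength changed...
print(my_rating_a, my_rating_b)
# ...while my position within each pool is identical.
print(percentile(my_rating_a, pool_a), percentile(my_rating_b, pool_b))
```

Real pools will not be exact shifted copies of each other, so the percentiles would only match approximately, which is the caveat about pool diversity mentioned above.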
Conclusion
Comparing ranks or ratings gives us no information across pools:
If I am a 20kyu 19x19 player and a 9kyu 9x9 player, then 20kyu players at 19x19 equate to 9 kyu players at 9x9.
Comparing percentiles gives us a lot of information across pools:
If I am stronger than 50% of 19x19 players and stronger than 75% of 9x9 players, then I am stronger than a larger proportion of 9x9 players than 19x19 players.
I skipped a lot of the walls of text in the last third of this thread. But my 2c are that the best options are either removing the table because the information doesn't outweigh the confusion, or as @Farraway is suggesting, switch them to percentile display, so that the table provides more information and less confusion.
After some consideration, I believe that the developers probably made a good decision to display ratings instead of ranks in the table. The average ability of players on OGS is different for different board sizes as shown in the rank histogram posted by @S_Alexander. Using the same formula used for the overall rank would be incorrect.
However, the ratings can be useful. They can be used to check that a game is even on a given board size.
If the volatility were also displayed, it would be possible to calculate the expectation value for the result of a match. The volatility is currently available via the API. In the future, perhaps we could specify that we would like a game within a given range of expectation values?
I feel the same way as Ptro about the rating system. I think that the traditional dan/kyu system is easier and more important than the Glicko rating. I think that the Glicko was a good idea but I feel that it is overall unnecessary because of how precise the system tries to be and how difficult it is to predict the precise rating even in the kyu/dan system. Of course I would also like to point out that a rating is just a number and is never perfect!!
Thanks for bringing this topic up and I would like to thank OGS for having such a good system anyway:)
It's not either Glicko OR Dan/Kyu.
We've always had Dan/Kyu ranking in the same way we do now - based on an underlying rating system.
Before it was ELO. And we had an ELO number just like our Glicko number, and we had a graph showing it.
The only thing that wasn't shown to us before was our ELO number for all the different game types!
The only difference now is that we're being shown some numbers we weren't shown before, and finding them confusing.
GaJ
You need only the rating (and deviation) to calculate the probability to win the game.
I wanted to link S_Alexander's post as well.
Quite right. Only the two ratings and the opponent's deviation are needed.
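For anyone who wants to try this, here is a rough Python sketch of the expected score as defined in Glickman's description of the Glicko system (the Glicko-1 form, on the familiar rating scale). Whether OGS applies exactly this formula is an assumption on my part, and the function names are just for illustration.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant

def g(rd):
    """Attenuation factor: shrinks the rating gap when the opponent's rating is uncertain."""
    return 1.0 / math.sqrt(1.0 + 3.0 * (Q * rd) ** 2 / math.pi ** 2)

def expected_score(rating, opp_rating, opp_rd):
    """Expected score (roughly, the chance of winning) against a single opponent.

    Uses only the two ratings and the opponent's deviation; volatility is not needed.
    """
    return 1.0 / (1.0 + 10 ** (-g(opp_rd) * (rating - opp_rating) / 400.0))

# A 1500 player against a 1400 opponent with deviation 30 (made-up numbers).
print(round(expected_score(1500, 1400, 30), 3))  # about 0.639
```

A value close to 0.5 would indicate a roughly even game on that board size.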