Show Kyu/Dan instead of Glicko Rating on Player Profile

DVbS78rkR7NVe · August 25, 2018, 12:18pm

Check lichess out.

https://lichess.org/stat/rating/distribution/classical

Ptro · August 26, 2018, 11:04pm

I finally had enough time to participate in the discussion again, so allow me to pick up where I left off.

First, thank you all for keeping the discussion going. I was afraid that it would lose momentum in the meantime, and I’m really glad to see that I was wrong.

Before getting into specifics, let me address what I think, in general, about this topic:

@Farraway’s suggestion is actually a pretty good one. It does, in fact, solve the problem at hand in an alternative way. It also has some problems of its own, but if Farraway’s alternative is implemented and those problems are solved, then I would very much welcome it instead of the present table. Of course, I still personally think that integrating the table with the graph would be the more interesting option, but both of them have their pros and cons.

Now getting into the specifics,

I would be extremely against it. When you show a player the expected outcome of a match, you are setting yourself up for trouble. The reasons:

If a match-up isn’t exactly 50/50, players will always blame the matchmaking system for creating such “unfair matches” (see the forum of any multiplayer game that displays matchmaking information, for instance).
Players would generally avoid matches unless they are clearly favored to win, which would lead to a situation where everyone always wants to “just win” and never lose, and that would be bad for the community in general.
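
For readers wondering what an “expected outcome” would look like in numbers, here is a minimal sketch. It assumes the Glicko-1 expected-score formula, which takes both ratings and both rating deviations into account; it is not necessarily what OGS computes, and the function names and sample values are only illustrative.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd: float) -> float:
    """Glicko-1 attenuation factor: shrinks toward 0 as the deviation grows."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r_a: float, rd_a: float, r_b: float, rd_b: float) -> float:
    """Expected score of player A against player B (0.5 means an even game)."""
    combined_rd = math.hypot(rd_a, rd_b)  # sqrt(rd_a^2 + rd_b^2)
    return 1 / (1 + 10 ** (-g(combined_rd) * (r_a - r_b) / 400))


# Illustrative values only: an even pairing and a lopsided one.
even = expected_score(1500, 100, 1500, 100)
lopsided = expected_score(1800, 100, 1500, 100)
```

Anything other than an even pairing produces a number away from 0.5, which is exactly the figure players would see and react to if it were displayed.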

(For context, Farraway was talking here about how kyu/dan can’t be used as an accurate measurement between different rating pools.)

Yes, you are actually right. For me this is, without a doubt, the highlight of your proposal versus a kyu/dan one. Now, we also have to consider that kyu/dan was used before in that same situation (and please note that right now I am not objecting to what you said). It leaves me very curious to understand how the previous implementation approached this type of situation, and what the devs’ point of view on this was before changing to Glicko.

This is a problem with the percentile proposal as it stands. It must make very clear that it is using site-wide information and not only the overall rank with some deviation. Or, if it is doing the opposite, it should make that very clear too. It may even turn out to be easily fixable, but it shows how using percentiles could lead users to wrong interpretations if not implemented correctly.

Flovo’s interpretation of how a percentile-based display of subratings should work is also a much more useful one (even if it is a far more complex, and maybe not even possible, implementation). From what I understood, it would be something like this:
[Image: Screenshot_2018-08-26 OASIS - Result(1)]
(Explaining the image for those unfamiliar with Overwatch: this is an unofficial website that uses data provided by the official Overwatch API to analyze a specific player’s performance [this isn’t my data, by the way], using a deep network to interpret it and show where you should improve. The points on the curve refer to aspects of both special attacks and gameplay in general.)

and then each point would refer to a different board/time sub-rating.

yebellz:

I agree that percentiles can objectively determine that a player is “stronger than a larger proportion of 9x9 players than 19x19 players” (with respect to the corresponding subsets at OGS), however, I think that this objective statement itself can be a bit misleading, since it is ambiguous as to the root cause, which may be:

The player in question is “stronger” at 9x9 than 19x19 (relative to the global population of go players for each board size).

The population of 19x19 players at OGS is “stronger” than the population of 9x9 players at OGS.

It is quite possible that it is due to a mixture of both causes, and it’s impossible to rule out one over the other without more information about the populations of players. However, I believe that many people, when viewing the percentile comparison, might erroneously interpret it as only the first possibility listed above.

That is my main concern with @Farraway’s proposal. I also agree that this situation is very likely to occur and would be misleading to users. Writing a warning to explain this situation to an end user, even inside an infobox in OGS, would not be very simple either, since its wording may make even more questions pop up in the user’s head than having no warning at all, which isn’t exactly desirable from either a UX or a UI design point of view.
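
A toy example may make the ambiguity easier to see. This is only a sketch with made-up numbers: the two pools below are invented, and the strengths are assumed to sit on one hypothetical common scale, which is precisely what we don’t have in practice.

```python
from bisect import bisect_left


def percentile(pool: list[float], strength: float) -> float:
    """Percentage of players in the pool who are strictly weaker than `strength`."""
    ordered = sorted(pool)
    return 100 * bisect_left(ordered, strength) / len(ordered)


# Invented pools on a hypothetical common strength scale.
pool_9x9 = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900]
pool_19x19 = [s + 300 for s in pool_9x9]  # same shape, but a stronger population

# The same player, equally strong on both board sizes.
player_strength = 1650

pct_9x9 = percentile(pool_9x9, player_strength)      # 70.0
pct_19x19 = percentile(pool_19x19, player_strength)  # 40.0
```

Here the player is exactly as strong at 9x9 as at 19x19, yet the percentiles differ by 30 points purely because the 19x19 population is stronger. From the two percentiles alone, a user cannot tell this case apart from one where the player really is better at 9x9.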

Not necessarily. See Master Overwatch’s approach, for instance. It also uses percentiles to display numerous stats about a player’s performance with a specific character, without using a single graph. Besides, we already have a line graph in the player profile; adding one more graph to display a different type of information may add even more confusion.

Also, a reminder for anyone just joining in, or who missed the link, that we are also using this kialo link for the discussion. Everyone is welcome to join in, and if anyone has a problem using it, you can post here and/or in the kialo chat, which is also available at the link above, so that we can help you with that. Of course, just posting in this thread is also much appreciated.

opuss · August 26, 2018, 11:35pm

Using the current matchmaking system to get an even 19x19 game can lead to a disappointing match, as I pointed out in an earlier post. I can be matched with an opponent with the same overall rank as me, but they may not be as good at 19x19 as they are at 9x9. This is especially true at the beginner level. In one case I was matched with an opponent who had never played 19x19 before.

Does everyone really want to “just win”? I often like to play games with an opponent who is slightly better than me, where I am more likely to lose than win.

Ptro · August 26, 2018, 11:45pm

In that specific case, I’m pretty sure that there isn’t a single matchmaking system that could take this situation into consideration. As you yourself said, your opponent hadn’t played any 19x19 games, so it would not have sufficient data anyway.

You just ignored what I was referring to. I said (as even your quote shows) that if you show the expected outcome to a player, then you are encouraging the player to engage only in matches where their win is expected by a good margin.

DVbS78rkR7NVe · August 26, 2018, 11:52pm

How is it different from the current system? All players more or less know whether they’re expected to win or not. We all know that if you take on a player 5 ranks weaker, you’re expected to win; if you play someone 3 ranks weaker, you still have a good chance to win, etc. Yet a lot of people like to play, and do play, even games or games against stronger opponents. On KGS you can even get marked for playing only stronger opponents (while no one cares if you whoop some volunteering weaklings).

Edit: KGS mark explanation for OGS-only people: KGS: The Tilde

Ptro · August 26, 2018, 11:55pm

Knowing more or less is very different from explicitly showing the expected outcome of a match.

EDIT: Whoops, misunderstood your KGS example, removing it

DVbS78rkR7NVe · August 27, 2018, 12:11am

Here’s an idea. The problem at the moment is that we can’t compare different pools of players. How about making a big table comparing Glicko ratings for each category (like https://senseis.xmp.net/?RankWorldwideComparison)? Say we know that, for example, 2000 at 19x19 is on average equal to 1500 at 9x9 and to 2000 (~3k) overall. So we slap a 3k label on all of them. Now when some players have 1500 in 9x9, we show 3k, and it means that these players play 9x9 the way we would expect a 3k overall to play. Their overall rank may be 7k, in which case they’re playing better than expected at 9x9.

Does that make sense to anyone?
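
To make the table idea concrete, here is a rough sketch. Only the 3k row uses the example numbers above; the 7k and 1d rows, the table layout, and the `label_for` helper are all invented for illustration, and a real table would have to be fitted from actual game data.

```python
# Rating that each rank label corresponds to, per category.
# Only the 3k row comes from the example in this post; the other rows are made up.
RANK_TABLE = {
    "7k": {"overall": 1700, "19x19": 1700, "9x9": 1250},
    "3k": {"overall": 2000, "19x19": 2000, "9x9": 1500},
    "1d": {"overall": 2300, "19x19": 2300, "9x9": 1750},
}


def label_for(category: str, rating: float) -> str:
    """Return the rank label whose table rating in `category` is closest to `rating`."""
    return min(RANK_TABLE, key=lambda rank: abs(RANK_TABLE[rank][category] - rating))


# A player rated 1500 at 9x9 but 1700 overall (hypothetical numbers).
label_9x9 = label_for("9x9", 1500)          # "3k"
label_overall = label_for("overall", 1700)  # "7k"
```

With this lookup the player is shown as 3k at 9x9 and 7k overall, so the two labels become directly comparable even though the underlying ratings come from different pools.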

opuss · August 27, 2018, 12:13am

Correct. The point I was making was that 19x19 and 9x9 ranks may differ, especially at the beginner level. In these cases using the 19x19 rank would lead to a better match than using the overall rank.

HHG · April 25, 2020, 1:23pm

Um, side note: your quote regarding Ptro didn’t show up as a quote; you used [quote="Ptro’], interchanging between the " and the ’ symbols.