Show Kyu/Dan instead of Glicko Rating on Player Profile

I was pretty sure that someone would get confused by this definition, but I didn’t want to make the text any longer. Your example is fundamentally wrong about what counts as a task that depends on the rating table. The ratings table was created, at least from what it seems right now, to complete the following task: “I want to see and compare my ranking on different board sizes and at different speeds”. When talking about the rating table, this is the task being considered, since it’s the only one that could be answered by using it.

What is or isn’t considered a task in UX is a somewhat complex subject and, in my opinion, beyond the scope of this forum. But simplifying a lot, think of it this way: tasks are questions that users ask in their minds when using a site/app. A site almost never serves only one task. The player profile in OGS, for instance, serves many tasks, for example: “I want to see my history, where do I see it?” “I want to see my reviews, where do I see it?” “I want to see my ranking development over time, where do I see it?” If this is still unclear to you, please say so, so I can clear up any misunderstanding.

Well, I don’t think you have read my point entirely, just the bold question. Because if you had continued for a few more lines, you would have noticed that this is precisely what I am arguing.

  1. The mere fact that we are discussing which ranking is used in different situations already shows me, crystal clear, that a big UX problem really exists in the ratings table. The user shouldn’t even have to consider a possibility like this. It should always be clear to him which of his rankings is being used, since this is fundamental information when playing Go online. I’ll put this question up for discussion again: what purpose is the ratings table supposed to be serving right now? Is it really succeeding at it?

  2. Well, this simply doesn’t make any sense. First, this information is already being shown, so the problems that could be avoided by hiding it already exist (see point 1 and also reread your mention of Sarah_Lisa). Second, the information is being shown in a way which few people can understand (please look up what I am considering as “understand” before discussing the use of this word), which makes it unusable for most people. And yes, I do know how this may seem contradictory at first glance, but please consider that it’s always necessary to take into account the multitude of backgrounds that OGS users come from, and that different users have different assumptions even when using the same interface.

I would also like to point out that I am not only discussing the player ratings table, since the graph (in some cases) also privileges Glicko over kyu/dan.

2 Likes

In that case, I still don’t see how I wouldn’t be able to compare my rating to yours: if my number is higher, I’m stronger, if yours is higher you’re stronger. If it’s higher by a lot, it’s a lot stronger.

Inherently kyu/dan ranks also don’t mean a whole lot. Someone is 5k on this site, 10k on that site, 1d in his local club, it’s pretty arbitrary. I just discovered that even on this site alone we’re using 10 (or maybe 16?) different pools that don’t correspond with each other. So I don’t see a huge problem with comparing Glicko instead of kyu/dan.

I can understand the desire for consistency across the site though, so I agree on that point that it’s better to show the same kind of rating everywhere. I think that’s the main issue on this question, rather than that there is something that’s difficult to understand about the glicko rating.

2 Likes

Indeed, if a simple one-to-one comparison weren’t possible, then it would be a “Catastrophic Problem”, not a “Serious Problem”. The issue with “Serious Problems” is that you can finish what you want, but you must work around the problem. The user is led to expect a direct path to the result, but there is none available.

Again, let me use another example: imagine if, to log in to OGS, you first had to go to the About page. You never expect the login for any website to be on the About page. It does work perfectly: if you input the correct information, you are successfully logged in. But do you really think that is expected behavior from a user’s standpoint?

The problem with using Glicko instead of kyu/dan to show information in the user profile is that a Go player isn’t expected to understand Glicko (again, not just read it, but make full use of it), while a user is very much expected to understand (same definition as above) kyu/dan. Because of that, the user can’t use the information in the way that is expected of him. Again, how can this not be a “Serious Problem”?

It never was about the complexity of Glicko, but rather about how it doesn’t allow for such intuitive use as kyu/dan does. If OGS at least provided the user with enough information about the Glicko system so that he could use it the same way he currently uses kyu/dan (such as an info box, for instance), then there would be no reason for this topic to even exist in the first place.

I will use another example to illustrate my point, in the hope that this time it can be fully understood: the metric and imperial systems. Can you use both of them, even if you are versed in just one? Sure. Can you easily understand that 20 feet is bigger than 10 feet, or that 100 meters is bigger than 50 meters? Sure. Can you convert measurements between the two? With some help, but sure.

Now, here is the catch. If you were born and grew up using only the metric system (my case, for instance), can you have a correct sense of what 80 feet represents? No, I don’t have a clue. Is it bigger than a football stadium or smaller than a car? I don’t know. Could I search for the information and then figure it out? Sure. But then, what if it isn’t 80 feet anymore, but 135 feet instead? Do I truly have a sense of what 135 feet means? No. I could guess based on the 80 feet information, but even then it would not be very accurate. Now, if you come to me and say “135 feet is equal to 41.148 meters”, then I would, without a doubt, have an extremely good sense of what you are talking about.
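
For anyone who wants to check the arithmetic in the example above, here is a minimal Python sketch. The function name is made up for illustration; the only fact it relies on is that one international foot is defined as exactly 0.3048 meters.

```python
# One international foot is exactly 0.3048 meters by definition.
FEET_TO_METERS = 0.3048


def feet_to_meters(feet: float) -> float:
    """Convert a length in feet to meters (illustrative helper)."""
    return feet * FEET_TO_METERS


# The two lengths used in the example above.
for feet in (80, 135):
    print(f"{feet} ft = {feet_to_meters(feet):.3f} m")
```

The point of the analogy stays the same: the conversion is easy once someone hands you the factor, but the profile page currently hands the user no such factor for Glicko.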

3 Likes

You know what, I think I have come to agree with this statement:

“The table of rating numbers for different game types should be displayed as rank.”

Can anyone succinctly summarise why this should not be the case?

I vaguely recall in the past an argument why it isn’t that way, but I can’t remember what that argument is.

4 Likes

From what I can recall quickly, someone said that the devs think it’s misleading, since it’s using different scales. That, for me, doesn’t make any sense, since they are already showing the same information, just in a less friendly manner. Someone versed in Glicko should be able to reach the same conclusions that they are trying to keep users from reaching in the first place.

1 Like

The official statements about that.

5 Likes

Well, I’m glad that finally someone has cleared that up for me. However, I would like to use the same reasons I listed within this topic to again discuss changes in the player rating table.

Just to recapitulate (all points have been discussed in more depth throughout this topic):

  1. The current player rating table, as it is, doesn’t allow the user to use it as expected, since the user can’t correctly interpret the information that it is supposed to convey (especially considering that this information can’t be correctly mapped to kyu/dan)

  2. It may mislead the user into thinking that a specific ranking is used for a specific type of match (examples have been shown throughout this topic)

  3. Its only current functional use is to act as a filter for the graph (which again is shown only in Glicko, which is unreadable for the end user, but let’s leave that aside for now)

That said, my suggestion for now would be to remove the player ratings table and integrate it with the graph. That way we don’t remove any working functionality, and the only information being removed is information that already serves only to confuse users.

5 Likes

:+1:

My reasons for agreeing have already been covered in depth here: Rating anomaly? Is this a bug?

3 Likes

Thank you very much, @flovo, for providing this information. As the first quote from @flovo indicates:

there is nothing sacred about keeping the breakouts! :latin_cross:

What I find strange is the latter part of that quote: “… at this point they are strictly informational.” Informational? Quite the contrary! They are confusing and pretty much useless given that “they are on different scales” which no one understands or can convert into an understandable form – as is made clear in the second quote:

Isn’t it time to remove these breakouts? I regularly read the forum postings and there are regularly questions, often not directly but in essence, concerning the use and interpretation of these breakouts. And no one can provide a clear and succinct reply/explanation. :thinking:

The experimental phase has run long enough, and the results are quite clear:

Conclusion: the breakouts are worthless. :x:

4 Likes

Let me try.

Each breakout measures your performance against the other people and games in this pool.

You can use the raw number you see there to:

  • Compare your performance in this pool to that of others, for the same pool.
      • Bigger numbers mean you performed better
  • Track your performance over time in this pool
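
As a rough illustration of those two uses, here is a small Python sketch. None of this is OGS code: the `PoolRating` type, the function names, and the numbers are all hypothetical. It only encodes the rule described above, that ratings are compared within one pool and never across pools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolRating:
    pool: str      # hypothetical pool label, e.g. "19x19 live"
    rating: float  # raw rating number shown in the table


def performed_better(a: PoolRating, b: PoolRating) -> bool:
    """True if a's rating is higher than b's, within the same pool only."""
    if a.pool != b.pool:
        raise ValueError("ratings from different pools are not comparable")
    return a.rating > b.rating


def progress(history: list[PoolRating]) -> float:
    """Change between the first and the latest rating of a single pool."""
    if len({entry.pool for entry in history}) != 1:
        raise ValueError("history must be non-empty and from one pool")
    return history[-1].rating - history[0].rating


mine = PoolRating("19x19 live", 1540.0)    # made-up numbers
yours = PoolRating("19x19 live", 1495.0)
print(performed_better(mine, yours))
print(progress([PoolRating("19x19 live", 1450.0), mine]))
```

Anything beyond these two questions, such as turning one of these numbers into a kyu/dan rank, is outside what the sketch (or the table) can answer.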

It’s really not that hard to understand.

It only gets hard if you start making it hard by asking “So, am I a 10k 9x9 correspondence player?”.

The answer to this question is

“That question can’t be answered by this table, nor by any mechanism we have. We only rank overall performance.”

GaJ

4 Likes

Not quite though I agree with the sentiment.

This would appear to be completely true as best I can tell, but I don’t feel that this limited use for the data justifies its pride of place on the Profile page.

Also true for anyone who has read a few forum posts, but it is not easy to understand from the OGS site itself.

Upon seeing such a table at the top of my Profile page, I would instinctively like to be able to use it to judge which board sizes I am performing better on and by how much, and also which time settings I’m okay with and where I might be weaker. None of this is possible with the data provided, whereas the possibilities for being misled by the data are enormous, as various posts attest.

I would love it if the numbers on the Ratings Table could be mathematically normalised to be comparable but I judge this to be extremely unlikely in the near future so I return to my earlier assertion:

7 Likes

I started using the 19x19 live ratings when I participated in smurph’s DDK experiment.

They are useful if you want to find an opponent of roughly equal ability.

First convert the 19x19 rating into a rank. Use this rank to create a custom game. Unfortunately you can only specify a rank and not a rating in the custom games dialog.

Then when someone accepts a game, check their profile to see if their 19x19 rating is within a suitable range. If not, just cancel the game before your first move and apologise in chat. This avoids playing people who are good at 9x9 but haven’t played much 19x19. The deviation can also be useful in such cases.

This method works but is slightly cumbersome. It also eats into your time if you are white, which is a minor disadvantage in live games.
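
For what it’s worth, the manual check I do on the opponent’s profile could be sketched like this in Python. This is purely hypothetical: OGS does not expose anything like this in the custom games dialog, and the `Profile` fields, the function name, and both thresholds are my own assumptions, not site values.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    username: str
    live_19x19_rating: float     # sub-rating read from the profile table
    live_19x19_deviation: float  # deviation shown next to that rating


def is_suitable_opponent(
    me: Profile,
    other: Profile,
    max_rating_gap: float = 100.0,  # assumed tolerance, not an OGS value
    max_deviation: float = 120.0,   # assumed cut-off for "too uncertain"
) -> bool:
    """Mimic the manual check: similar 19x19 live rating, settled deviation."""
    gap = abs(me.live_19x19_rating - other.live_19x19_rating)
    return gap <= max_rating_gap and other.live_19x19_deviation <= max_deviation


me = Profile("me", 1450.0, 70.0)
challenger = Profile("challenger", 1700.0, 65.0)  # made-up numbers
if not is_suitable_opponent(me, challenger):
    print("Cancel before the first move and apologise in chat.")
```

The deviation check is there to catch the case mentioned above: someone who hasn’t played much 19x19 will typically still have a large deviation on that sub-rating.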

1 Like

Everything you said could, and would, be more easily interpreted by the user if the table were integrated into the graph. Right now I can’t see a use which justifies keeping it separate from the graph.

Well, to me what you just said seems like a reason to remove the table, not the opposite. It’s impossible for anyone, at least I hope so, to dictate what someone can or cannot think. If an interface allows the user to be misled, then this interface has problems which must be solved.

Totally agree. As I said before, the way it is right now leads the user into error, as has been proven many times. The use GreenAsJade sees for it is already covered by the graph; keeping the filter functionality would not only keep the useful functionalities in place but also remove the ones that only add confusion for the user.

Well, sorry to say this, but what you just did is mathematically wrong, as has been shown by an anoek quote. It’s impossible to correctly map ratings onto rankings, which is, again, another error that the current rating table leads the user into. Also, needless to say, the usage described in your post isn’t the one expected from an OGS user. I’m all for different ways of using the tools we have at our disposal, but since this one is grounded on a false idea of skill, I can’t justify its continued use.

3 Likes

If GreenAsJade is correct, and the live 19x19 rating just uses a different pool of games, then playing an opponent with a similar live 19x19 rating to mine should lead to an even live 19x19 game.

It should also provide a better indication of progress in live 19x19 games than my overall rating. This is why smurph chose to use it as a measure in his experiment.

The problem is that I can’t easily get a game with a player with similar live 19x19 rating to mine. I have to create a custom game with an overall rank restriction.

3 Likes

This (alone) is not enough. This only gets you opponents whose overall ranking matches your live ranking.

So then you have to look in their profile when they accept, and cancel the game if they don’t match your live ranking.

Unless flovo was mistaken somehow, and actually our sub-ranking is used for game matching.

It really would be good if @anoek or @matburt could clarify this point.

GaJ

I don’t think this is true. From what I understood from the anoek quote, it seems that it isn’t a matter of the number of games played; it’s a far more complex issue than that. Your 1500 19x19 live rating isn’t necessarily equal to another player’s 1500 19x19 live rating. The only thing that really matters, and that can truly be used as a means of comparison, is the overall rating.

I think you replied to @opuss instead of me, since he was the one who said that. Anyway, I agree with you: the sub-ratings aren’t really used (and actually can’t be used) for matchmaking. Again, some changes are really needed in that area; this type of confusion shouldn’t even be happening in the first place.

2 Likes

Exactly! I also have to look at the 19x19 rating.

This is where we really need better documentation.

As I understand it, the 19x19 live rating can be used as a comparison between players. What you can’t do is make an accurate comparison between different ratings.

The overall rating can give a bad indication of a player’s 19x19 ability.

3 Likes

What gives you this idea? I don’t think that’s true. If it were true, then your overall 1500 rating would not be equal to another player’s overall 1500 rating.

The only difference between the sub-ratings is the pool of games that they come from.

The reason we don’t convert this to rank is because we don’t have a calibration for that.

The calibration from rating to rank is designed to give us ranks that match our old ranks best, and also to match other sites as best as possible.

That calibration is done for the overall pool. It would be different for sub-types, even if we had data to calibrate sub-types, which we don’t.

That is the reason why sub-type rating is not converted to rank: because it can’t be calibrated.
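
To make the calibration point concrete, here is a toy Python sketch. To be clear, this is not the OGS conversion: the logarithmic shape of the curve is an assumption on my part, and every constant below is a made-up placeholder. It only shows that the same rating number turns into different ranks when the calibration differs.

```python
import math


def rating_to_rank(rating: float, a: float, c: float) -> float:
    """Map a rating to a numeric rank on an ASSUMED logarithmic curve.

    `a` and `c` are calibration constants fitted to one pool of games.
    """
    return c * math.log(rating / a)


# Placeholder calibrations, invented for illustration only.
OVERALL_CALIBRATION = {"a": 500.0, "c": 20.0}
SUB_POOL_CALIBRATION = {"a": 600.0, "c": 20.0}

rating = 1500.0
overall_rank = rating_to_rank(rating, **OVERALL_CALIBRATION)
sub_pool_rank = rating_to_rank(rating, **SUB_POOL_CALIBRATION)

# Same rating number, different rank, because the calibration differs.
print(f"overall calibration:  {overall_rank:.1f}")
print(f"sub-pool calibration: {sub_pool_rank:.1f}")
```

With no data to fit a sub-pool curve, there is nothing principled to put in the second set of constants, which is the whole problem.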

Personally I would argue “just use the overall rating calibration - it’d be close enough”.

What we know is that right now anoek and matburt don’t agree with that.

Correct (as I understand it :wink: )

By the way I reached out to anoek, and he said:

“What is used in player matching for auto matches”

“[1:39] anoek: Overall rank”

GaJ.

4 Likes

No, I’m not talking about the overall rating, just about the sub-ratings. Also, that’s what I could understand from these quotes:

To me, they say that since the sub-ratings come from different scales and different rating pools, they are only related to the overall rating and not to a global scale (19x19 live from one user to 19x19 live from another user), and since the data from which each user obtains his ratings is different, the true meaning of each user’s sub-ratings is different.

Simplifying, when reading the information above I understood that since each user has a different history, what each sub-rating means is different (think relative to one player’s data instead of absolute across all players’ 19x19 live data).

=============================================================================

Oh, by the way @GreenAsJade, I wrote that before you edited. I am just going to post it anyway to show my thinking at the time. Right now, I think I agree with your interpretation of that; I would just really like some sort of final clarification about this from the devs.

2 Likes