Show Kyu/Dan instead of Glicko Rating on Player Profile

opuss · August 15, 2018, 11:03pm

Yes. For someone who has only played 9x9 you might expect the overall rating to be the same as the 9x9 rating. However, I would guess that the overall rating is calculated using your opponents’ overall ratings, which will include 13x13 and 19x19 games. This would account for the difference.

It would be useful if a developer could confirm this, or even better let us see the relevant back-end source code.

Vsotvep · August 16, 2018, 4:00am

Whether you use glicko or kyu/dan doesn’t really matter a whole lot to me. Both are easy to understand on a fundamental level: the higher the number (regarding kyu as negative), the better you are.

Supposedly one handicap stone equals one kyu/dan rank, which is roughly 60 Glicko points: a little less for weaker players, a little more for stronger players, since the relation between the two is logarithmic. However, in the end both scores are just numbers. You can’t express someone’s strength purely with a single number, so it does appear to be a “cosmetic problem” rather than a “serious problem” to me. In no way does it keep me from finishing my task (playing go against other people and having a rough estimate of which of us is the better player based on some numbers: both perfectly undisturbed), but it does seem to annoy some people.
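
To illustrate why one rank is worth fewer points for weaker players and more for stronger ones, here is a minimal Python sketch. It assumes a logarithmic mapping of the form rank = ln(rating / A) / C. The constants A = 850 and C = 0.032 and all function names are assumptions for illustration, not confirmed here as the values the site uses.

```python
import math

# Assumed constants for illustration only; not confirmed as the site's values.
A = 850.0
C = 0.032

def rating_to_rank(rating):
    """Map a Glicko-style rating to a continuous rank number (higher is stronger)."""
    return math.log(rating / A) / C

def rank_to_rating(rank):
    """Inverse of rating_to_rank."""
    return A * math.exp(C * rank)

def points_per_rank(rating):
    """Rating points needed to gain one full rank starting from `rating`."""
    return rank_to_rating(rating_to_rank(rating) + 1.0) - rating

# The gap per rank grows with the rating itself.
for r in (1000, 1500, 2000):
    print(r, round(points_per_rank(r), 1))
```

Under this assumed mapping the number of points per rank is proportional to the rating, so a fixed figure such as 60 can only be right near one particular strength.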

What I have a “serious problem” with, is that this is my rating table:

It does not make sense to me that my ‘overall rating’ (which one would generally assume to be the average of all nine categories) is higher than every single individual rating. It therefore clearly is not the average, so what is it then? I play almost exclusively correspondence and 19x19, so I can accept that the other ratings are not up to date, but then why is my 19x19 correspondence rating 150 points (or 2.5 kyu) less than my overall performance?

This is what makes the table unusable, not whichever arbitrary ranking system it’s using.

Eugene · August 16, 2018, 9:33am

Wrong assumption.

As I understand it, each rating is calculated from the pool of games in that rating, separately.

If you click on the rating itself, you can see the progress of your rating in that category, as the separate thing.

Since a different set of games goes into each rating (including your overall one, which includes all games) and there is a different set of opponents associated with each, they do not roll up. They are separate, related but independent calculations telling you how you are going in each pool.

As I understand it.
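
A minimal Python sketch of the separate-pools idea, assuming three board sizes and three speeds as in the profile table. The category names, the pool layout and the function names are illustrative assumptions, not the site's back-end code. Each game is filed under every pool it belongs to; a rating would then be computed per pool from that pool's games only, which is why the overall number need not be an average of the others.

```python
from collections import defaultdict
from itertools import product

SIZES = ("9x9", "13x13", "19x19")             # assumed board-size categories
SPEEDS = ("blitz", "live", "correspondence")  # assumed speed categories

def pools_for(size, speed):
    """Every pool a single game counts toward; 'all' means 'any'."""
    return [("all", "all"), (size, "all"), ("all", speed), (size, speed)]

def bucket_games(games):
    """Group (size, speed, opponent) records by pool."""
    pools = defaultdict(list)
    for size, speed, opponent in games:
        for key in pools_for(size, speed):
            pools[key].append(opponent)
    return pools

games = [
    ("19x19", "correspondence", "alice"),
    ("19x19", "correspondence", "bob"),
    ("9x9", "live", "carol"),
]
pools = bucket_games(games)
print(len(pools[("all", "all")]))               # overall pool sees all 3 games
print(len(pools[("19x19", "correspondence")]))  # this pool sees only 2 of them

# Distinct pools under these assumptions: 1 overall + 3 sizes + 3 speeds + 9 combinations
all_pools = {k for s, v in product(SIZES, SPEEDS) for k in pools_for(s, v)}
print(len(all_pools))
```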

GaJ

Eugene · August 16, 2018, 9:34am

That is wrong.

We have one rating system, and it is Glicko. Our performance is tracked and rated using this system.

From it we calculate a rank, so that we have something that is easy to understand and to compare with other sites’ ranks.

Eugene · August 16, 2018, 9:41am

I think that toomasr has pointed out the one real problem with the current table and system.

He wants to use his 9x9 rank to find games with other 9x9 playing people using their rank.

It has been asserted (IIRC by @anoek) that our overall rank is what is used in finding games.

If that were true, then toomasr’s method would not work.

However, others (IIRC @Sarah_Lisa ?) have pointed out evidence that our game matchups in fact are using the sub ranks. toomasr’s method backs this up. I presume he gets a different rank for himself when he creates a game depending on the game type that he creates[1]

And if that is true, then it is indeed a big gap that we don’t get shown what our sub-ranks are. We would need that information to properly find the right players to play with.

The argument would be this:

1 Game matchups are based on my sub-type of game ranking
2 But nothing tells me what my sub-type rank is, I have to calculate or workaround to find out.
3 That sucks.

This only holds up if ‘1’ is true.

If ‘1’ is not true, then there is no practical purpose to know the equivalent rank of your sub-type rating. The trend of the Glicko rating is all you need to know: going up or down etc.

GaJ

1: I haven’t tried this myself, maybe I should. Maybe I will.

flovo · August 16, 2018, 10:10am

I had a quick look at the custom games tables. The rank shown there is always the overall rank, not a rank derived from any of the breakdown ratings.

I tried for myself with my very different corresponding ratings. All 3 games showed the overall rank.

Eugene · August 16, 2018, 10:29am

In that case,

@toomasr’s elaborate method would seem to be for nothing.
There is no point in showing the rank equivalent of the glicko rating for the sub-types. That rank is not used for anything.

GaJ

Ptro · August 16, 2018, 1:42pm

I was pretty sure that someone would get confused by this definition, but I didn’t want to extend the text any further. Your example is fundamentally wrong about what you are considering as a task that depends on the rating table. The ratings table was created, at least from what it seems right now, to complete the following task: “I want to see and compare my ranking on different boards and at different speeds”. When talking about the rating table, this is the task being considered, since it’s the only one that could be answered by using it.

What is or isn’t considered a task in UX is a somewhat complex subject and, in my opinion, beyond the scope of this forum. But simplifying a lot, think of it this way: tasks are questions that users ask in their minds when using a site/app. A site rarely serves only one task. The player profile in OGS, for instance, serves many, many tasks, for example: “I want to see my history, where do I see it?” “I want to see my reviews, where do I see them?” “I want to see my ranking development over time, where do I see it?” If this is still unclear to you, please say so, so I can clear up any misunderstanding.

Well, I don’t think you have read my point entirely, just the bold question. If you had continued for a few more lines, you would have noticed that this is precisely what I am arguing about.

Just the fact that we are having a discussion about which ranking is being used in different situations already shows me crystal clear that there really is a big UX problem in the ratings table. The user shouldn’t even have to consider a possibility like this. It should always be clear to users which of their rankings is being used, since this is fundamental information when playing Go online. I’ll put this question up for discussion again: what purpose is the ratings table supposed to be serving right now? Is it really succeeding at it?

Well, this simply doesn’t make any sense. First, this information is already being shown, so the problems that could be avoided by hiding it already exist (see point 1 and also reread your mention of Sarah_Lisa). Second, the information is being shown in a way that few people can understand (please look at what I am considering as “understand” before discussing the use of this word), which makes it unusable for most people. And yes, I do know how this may seem contradictory at first glance, but please consider that it’s always necessary to take into account the multitude of backgrounds that OGS users come from, and that different users have different assumptions even when using the same interface.

I would also like to point out that I am not only discussing the player ratings table, since the graph (in some cases) also favours Glicko over kyu/dan.

Vsotvep · August 16, 2018, 2:05pm

In that case, I still don’t see how I wouldn’t be able to compare my rating to yours: if my number is higher, I’m stronger, if yours is higher you’re stronger. If it’s higher by a lot, it’s a lot stronger.

Inherently kyu/dan ranks also don’t mean a whole lot. Someone is 5k on this site, 10k on that site, 1d in his local club, it’s pretty arbitrary. I just discovered that even on this site alone we’re using 10 (or maybe 16?) different pools that don’t correspond with each other. So I don’t see a huge problem with comparing Glicko instead of kyu/dan.

I can understand the desire for consistency across the site though, so I agree on that point that it’s better to show the same kind of rating everywhere. I think that’s the main issue on this question, rather than that there is something that’s difficult to understand about the glicko rating.

Ptro · August 17, 2018, 12:46am

Indeed, if a simple 2 by 2 comparison wasn’t possible then it would be a “Catastrophic Problem”, not a “Serious Problem”. The issue with “Serious Problems” is that you can finish what you want, but you must work around the problem. The user is led to expect a direct path to the result, but there is none available.

Again, let me use another example: imagine if, to log in to OGS, you first had to go to the about page. You never expect the login of any website to be on the about page. It does work perfectly: if you input the correct information you are successfully logged in. But do you really think that is expected behavior from a user standpoint?

The problem with using Glicko instead of kyu/dan to show information in the user profile is that a Go player isn’t expected to understand Glicko (again, not just read it, but fully make use of it), while a user is very much expected to understand (same definition as above) kyu/dan. Because of that, users can’t utilize the information in the way that is expected of them. Again, how can this not be a “Serious Problem”?

It never was about the complexity of Glicko, but rather about how it doesn’t allow such an intuitive use as kyu/dan does. If OGS at least provided users with enough information about the Glicko system that they could utilize it the same way they currently utilize kyu/dan (an info box, for instance), then there would be no reason for this topic to even exist in the first place.

I will use another example to illustrate my point, in hopes that this time it can be fully comprehended. The metric/imperial system. Can you utilize both of them, even if you are just versed in one? Sure. Can you easily understand that 20 feet is bigger than 10 feet, or that 100 meters is bigger than 50 meters? Sure. Can you translate measurements between the two? With some help, but sure.

Now, here is the catch. If you were born and grew up using only the metric system (my case, for instance), do you have a correct sense of what 80 feet represents? No, I don’t have a clue. Is it bigger than a football stadium or smaller than a car? I don’t know. Could I search for the information and then figure it out? Sure. But then, what if it isn’t 80 feet anymore, but 135 feet instead? Do I truly have a sense of what 135 feet means? No. I could guess based on the 80 feet information, but even then it would not be very accurate. Now, if you come to me and say “135 feet is equal to 41.148 meters”, then I would, without a doubt, have an extremely good sense of what you are talking about.

Eugene · August 17, 2018, 2:09am

You know what, I think I have come to agree with this statement:

“The table of rating numbers for different game types should be displayed as rank”.

Can anyone succinctly summarise why this should not be the case?

I vaguely recall in the past an argument why it isn’t that way, but I can’t remember what that argument is.

Ptro · August 17, 2018, 2:19am

From what I can recall quickly, someone said that the devs think it’s misleading, since the ratings are on different scales. Which, to me, doesn’t make any sense, since they are already showing the same information, just in a less friendly manner. Someone versed in Glicko would be able to reach the very conclusions that they are trying to keep users from reaching in the first place.

flovo · August 17, 2018, 2:28am

The official statements about that.

Ptro · August 17, 2018, 3:23am

Well, I’m glad that finally someone has cleared that up for me. However, I would like to use the same reasons I listed within this topic to again discuss changes in the player rating table.

Just to recapitulate (all points have been discussed in more depth throughout this topic):

The current player rating table, as it is, doesn’t allow the user to use it as expected, since the user can’t correctly interpret the information that it is supposed to convey (especially considering that this information can’t be correctly mapped to kyu/dan).
It may mislead the user into thinking that a specific ranking is used for a specific type of match (examples have been shown throughout this topic).
Its only current functional use is to act as a filter for the graph (which again only shows Glicko, which is unreadable for the end user, but let’s leave that aside for now).

That said, my suggestion for now would be to remove the player ratings table and integrate it with the graph. That way we don’t remove any working functionality, and the only information being removed is information that already only serves to confuse users.

Kosh · August 17, 2018, 4:18am

My reasons for agreeing have already been covered in depth here: Rating anomaly? Is this a bug?

Musash1 · August 17, 2018, 6:52am

Thank you very much, @flovo, for providing this information. As the first quote from @flovo indicates:

there is nothing sacred about keeping the breakouts!

What I find strange is the latter part of that quote: “… at this point they are strictly informational.” Informational? Quite the contrary! They are confusing and pretty much useless given that “they are on different scales” which no one understands or can convert into an understandable form – as is made clear in the second quote:

Isn’t it time to remove these breakouts? I regularly read the forum postings and there are regularly questions, often not directly but in essence, concerning the use and interpretation of these breakouts. And no one can provide a clear and succinct reply/explanation.

The experimental phase has run long enough, and the results are quite clear:

Conclusion: the breakouts are worthless.

Eugene · August 17, 2018, 7:26am

Let me try.

Each breakout measures your performance against the other people and games in this pool.

You can use the raw number you see there to:

Compare your performance in this pool to that of others, for the same pool. Bigger numbers mean you performed better.
Track your performance over time in this pool.

It’s really not that hard to understand.

It only gets hard if you start making it hard by asking “So, am I a 10k 9x9 correspondence player?”.

The answer to this question is

“That question can’t be answered by this table, nor by any mechanism we have. We only rank overall performance.”

GaJ

Kosh · August 17, 2018, 8:05am

Not quite though I agree with the sentiment.

This would appear to be completely true as best I can tell, but I don’t feel that this limited use for the data justifies its pride of place on the Profile page.

Also true for anyone that has read a few forum posts but it is not easy to understand from the OGS site itself.

Upon seeing such a table at the top of my Profile page, I would instinctively like to be able to use it to judge which board sizes I am performing better on and by how much. Also, which time settings I’m okay with and where I might be weaker. None of this is possible with the data provided, whereas the possibilities for being misled by the data are enormous, as various posts attest.

I would love it if the numbers on the Ratings Table could be mathematically normalised to be comparable but I judge this to be extremely unlikely in the near future so I return to my earlier assertion:

opuss · August 17, 2018, 1:30pm

I started using the 19x19 live ratings when I participated in smurph’s DDK experiment.

They are useful if you want to find an opponent of roughly equal ability.

First convert the 19x19 rating into a rank. Use this rank to create a custom game. Unfortunately you can only specify a rank and not a rating in the custom games dialog.

Then when someone accepts a game, check their profile to see if their 19x19 rating is within a suitable range. If not, just cancel the game before your first move and apologise in chat. This avoids playing people who are good at 9x9 but haven’t played much 19x19. The deviation can also be useful in such cases.
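
The screening step could be sketched in Python roughly as follows. The function name and the thresholds `max_gap` and `max_deviation` are hypothetical, chosen only for illustration; the check compares the 19x19 ratings directly rather than converting them to ranks.

```python
def acceptable_opponent(my_rating, their_rating, their_deviation,
                        max_gap=100.0, max_deviation=150.0):
    """Return True if the opponent's 19x19 rating looks close and settled enough.

    max_gap and max_deviation are hypothetical thresholds, not site settings.
    """
    close_enough = abs(my_rating - their_rating) <= max_gap
    settled = their_deviation <= max_deviation  # large deviation = uncertain rating
    return close_enough and settled

print(acceptable_opponent(1450, 1500, 80))   # True: close rating, low deviation
print(acceptable_opponent(1450, 1500, 300))  # False: rating still too uncertain
print(acceptable_opponent(1450, 1800, 80))   # False: gap too large
```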

This method works but is slightly cumbersome. It also eats into your time if you are white, which is a minor disadvantage in live games.

Ptro · August 17, 2018, 3:29pm

Everything you said could, and would, be more easily interpreted by the user if the table were integrated into the graph. Right now I can’t see a use that justifies keeping it separate from the graph.

Well, to me what you just said seems like a reason to remove the table, not the opposite. It’s impossible for anyone, at least I hope so, to dictate what someone can or cannot think. If an interface allows users to be misled, then this interface has problems that must be solved.

Totally agree. As I said before, the way it is right now leads the user into error, as has been shown many times. The use GreenAsJade sees for it is already covered by the graph. Keeping the filter functionality would not only keep the useful functionality in place but also remove the parts that only confuse the user.

Well, sorry to say this, but what you just did is mathematically wrong, as has been shown by an anoek quote. It’s impossible to correctly map these ratings into rankings, which is, again, another error that the current rating table leads the user into. Also, needless to say, the usage described in your post isn’t the one expected from an OGS user. I’m all for different ways of using the tools we have at our disposal, but since this one is grounded on a false idea of skill, I can’t justify its continued use.