Bug: wrong ranking

Oh, I was referring to the equivalent in go of the ‘title’ in chess (or lichess) :wink:

A player definitely can (and this is exactly what I tried to say: the rankings differ not only by game speed but also by board size), but the question is, what rank should we display next to the player’s name?

In the real world, go rank refers to 19x19 games, played within a certain range of time appropriate for sitting together.

When a go system offers a variety of timings and board sizes, but a particular player prefers to play 9x9 games (as I do), the rank must be an approximation of what the actual rank might be.

That is why I called for researching the literature on automatic rank determination, and gave the Wikipedia article on the subject.

This is a most disorderly and inefficient thread, with great ideas coming up and then being ignored in favor of obvious or incorrect ideas.

Can’t we just list the bugs and ask that they be fixed, please?

The OGS developer IS using the ELO rating given in that article: ELO rating VS rank on OGS
It is just applied to each category separately.

I think most of us believe that the current ranking system has some problems (otherwise there would not be so many complaints). What we are not sure of is how we should fix these problems.

Solution 1: Use only one rating for all categories, at the cost of some inaccuracy or ignoring the rankings on the variations.
Most other websites do this, but as temifar pointed out, most sites have a strong bias towards either live or correspondence games. From my personal experience:

(1) LittleGolem counts all board sizes equally with the same weight, and as a result, the 9*9 bot Valkyria9 is over 10 dan: http://www.littlegolem.net/jsp/info/player_list.jsp?gtvar=go19_DEFAULT

–Edit Begin–
//(2) DGS only counts traditional 19*19 games as rated games, and all other variations are unrated.
Thanks to @temifar for pointing out that the statement above is not accurate. Instead,
(2) DGS tournaments tend to be unrated for non-standard games, but normal single games can be rated for any supported board size: http://www.dragongoserver.net/faq.php?read=t&cat=105#Entry121
–Edit End–

Personally, I think the factor of the board size is more important than the factor of time here.

Solution 2: Keep splitting the categories, but improve the way it works.
Specifically, I think the provisional status should be enabled in each specific category if
(1) the player has not played enough games in that category, or
(2) the player has not played in that category for a long time and their rank in it differs a lot from their ranks in the other categories.
As SanDiego pointed out, a warning should be added in such cases.
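
To make the two conditions above concrete, here is a minimal sketch in Python. This is not how OGS actually does it: the `CategoryRecord` type, the numeric rank encoding, and all three thresholds (`MIN_GAMES`, `MAX_IDLE_DAYS`, `MAX_RANK_GAP`) are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CategoryRecord:
    """Hypothetical per-category record (e.g. 9x9 blitz) for one player."""
    rank: float                # kyu rank as a number, e.g. 18.0 for 18k
    games_played: int          # rated games played in this category
    days_since_last_game: int  # idle time in this category


# Assumed thresholds; real values would need tuning against actual data.
MIN_GAMES = 10
MAX_IDLE_DAYS = 180
MAX_RANK_GAP = 3.0


def is_provisional(category: CategoryRecord, others: List[CategoryRecord]) -> bool:
    """Return True if the rank in this category should be shown as provisional."""
    # Condition (1): not enough games in this category.
    if category.games_played < MIN_GAMES:
        return True
    # Condition (2): long idle AND the rank disagrees with the other categories.
    if category.days_since_last_game > MAX_IDLE_DAYS and others:
        mean_other = sum(o.rank for o in others) / len(others)
        return abs(category.rank - mean_other) > MAX_RANK_GAP
    return False
```

With these assumed thresholds, a player with a 23k blitz rank who has been idle in blitz for over a year while sitting around 15k elsewhere would be flagged, and the warning could be shown next to the rank.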

Solution 1 is too extreme. There is no need to use one rating for all time categories and board sizes. This solution is throwing the baby out with the bath water, as the old expression goes.

Solution 2 addresses some of the bugs.

But I hope more research is done. People here are too quick to jump to conclusions.

I have given several examples above of problematic rank calculations. There are thousands more that can be found by anyone with easy access to the player profile database. Problem cases should be examined by computer to see what the problems actually are, statistically.

In the first example I gave in this thread, SUNDJER, the raw ranks are as follows: 15k, 18k, 16k, 19k. For a 9x9 blitz game, his rank should have been presented as 18k.

There are two problems here: first, 18k is way too high, as his claimed rank is 14k. Second, the presented rank for purposes of challenges and handicap calculation was 23k! In case I have to point it out, 23 is not 18, and certainly is not 14. The difference between 14k and 23k is enormous.

The bug seems to be located in two places, first in the blitz calculation. This can be fixed by adjusting the constant used in the blitz calculation (as compared with 19x19 normal timing). The second part of the bug seems to be in the calculation used for the “overall rank”. Either this calculation should be fixed, or the “overall rank” should be eliminated entirely.

Hope this helps.

That’s not entirely accurate. DGS tournaments tend to be unrated for non-standard games, that’s true. But tournaments are still somewhat experimental there. Normal single games can be rated for any supported board size: http://www.dragongoserver.net/faq.php?read=t&cat=105#Entry121

I like the points you make here. Since there are four different ranks, it does not make sense that playing in only one time mode removes the provisional status from all the ranks. I believe that at KGS, if you go too long without playing ranked games, you get a “?” rank. So I think it would make sense to keep a rank provisional if someone doesn’t play in a certain mode often enough, and to set that provisional rank to the same value as their overall proven rank.

If my memory is correct, games played against a provisional opponent have no effect on your rank. So at least this way, if there is an imbalance, you’re helping the provisional player “fix” their ranking without harming your own rank.

I actually like the idea of provisional ratings for different game speeds. If I’m not mistaken, OGS’ ELO system sets the K value according to rank, the lower the rank, the higher the K value. We could instead set the K value based on the number of recent games in a certain speed. So if someone has never played in a given speed, that person’s rank should be adjusted faster. And even if a person has played a lot of games in a certain speed, but stops, and then comes back after a while, that rank would be less trustworthy, and should receive a higher K.
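
A rough sketch of what I mean, in Python. The expected-score formula is the standard Elo one; everything else (the function names, the K range of 16 to 64, and the 20-game cutoff) is an assumption for illustration, not what OGS uses. `recent_games` is meant to be the number of rated games played at that speed within some recent window, so it drops again when a player stops playing that speed.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expected score for the player with `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def k_factor(recent_games: int,
             k_min: float = 16.0,
             k_max: float = 64.0,
             full_confidence_games: int = 20) -> float:
    """Assumed rule: K shrinks linearly from k_max to k_min as recent games accumulate."""
    confidence = min(recent_games, full_confidence_games) / full_confidence_games
    return k_max - (k_max - k_min) * confidence


def update_rating(rating: float, opponent_rating: float,
                  score: float, recent_games: int) -> float:
    """One Elo update; `score` is 1.0 for a win, 0.0 for a loss."""
    k = k_factor(recent_games)
    return rating + k * (score - expected_score(rating, opponent_rating))
```

So a player with no recent games at a given speed moves by up to 64 points per game, while a regular at that speed moves by at most 16.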

Though, if the devs would go all the way to try to change this, I think it might be worth it to try to address the multi-dimensional ranks issue, while they’re at it. I haven’t thought much about that though.

If I make a new account and set it to 25k even though I’m 8k, is this a bug?

The rating system uses game results and the ranks of the players to calculate rank adjustment. Your opponent hadn’t played enough blitz games for the system to calculate a realistic rank for this game type. Not a bug, but a lack of data needed to calculate it correctly.

There is only 1 ranking in Go, really? Do you always play at your rank every game you play? Rank is just an average of results over time.

If a system works the way you want it to (no computing errors), but still does not produce the outcome you wish for (the best possible prediction of a player’s skill level in any situation), it can still be considered a bug, I think… but what we name it is only a formality after all.

For the benefit of recent posters in this thread who may not have read the entire thread, this is a set of closely-related bug reports, the central one being that a stronger player (based on having played a sufficient number of games in total) can receive handicap stones against a weaker player.

I’m really not interested in a lengthy or in-depth discussion of various systems of determining ranking. I’m interested in the developers of OGS volunteering to fix the bugs that have been reported in this thread. If there is a different forum where these bugs should be reported, please let me know. I have also reported these bugs at the suggestions forum, where they are also apparently being ignored by the developers. I don’t know what else to do.

I do not enable the “automatic handicap” feature of my challenges to work around these bugs, but it is inconvenient, perhaps unfair, not to allow weaker players their proper automatic handicap stones.

How the fuck do you come out of “a 40 year career in software engineering” unable to distinguish a programmer’s mistake (“bug”) from “any situation that does not sit well with me”?

I can forgive the masses of random non-techies who cry bug at every corner. They are throwing around a catchy word, ignorant and unaware that it is an insinuation of incompetence on the part of the developer.

If you have any experience in software development, you know that a bug is a deviation in the behavior of the system implementation from the specification of the system. You are complaining about a design flaw, not a bug. Even if you call it a bug a thousand times and “a few people have admitted it is a bug” (lmao) that does not make it so. OGS ranks work as designed.

If you had done your due diligence on this topic before you spammed this thread to 60 posts, you might have found the thread where the idea originated as well as a more recent discussion thread, and others like it, in which the negative effects of the separate ranks have been discussed at length.

Let me take this opportunity to repeat my statement from that discussion, because it is still current:

Being uncivil in a public discussion violates the guidelines for this and any intelligent forum.

[quote=“Animiral, post:61, topic:9018”]
You are complaining about a design flaw, not a bug.
[/quote]

If the requirements and design phases of a software development project are done perfectly, and perfect implementation is done, the resulting product cannot have any bugs.

A bug is an error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. Source

It is important to identify bugs objectively and specifically so they will be fixed. That is why any good company or organization that produces software maintains a bug reporting system of some sort.

IMHO, this is what happened:

The game between you and SUNDJER happened at 6:23 AM PST, September 3. At that time, his Blitz rank was 23k. Then, probably after you reported it to the mods, at 8:55 AM PST, September 3, one of the mods, mlopezviedma, adjusted his Blitz rank to 14k (Please refer to his rating graph).

Therefore, at the time the game happened, your Blitz rating was 16k and SUNDJER’s Blitz rating was 23k. If you’re playing on a 19*19 board, the handicap would be about 6 stones. However, on a 9*9 board, the handicap is reduced to 2 stones (I think I read somewhere that the handicap stones for different sizes of boards are 3:2:1 for 19*19, 13*13 and 9*9, but perhaps somebody could help us find the documentation).
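
To illustrate the arithmetic, here is a small Python sketch of the 3:2:1 scaling as I understand it. The function name and the weights table are my own assumptions, not taken from the OGS code or documentation. The sketch also treats the raw rank difference as the 19x19 handicap, which may be off by a stone from what the site actually computes.

```python
# Assumed relative handicap weights per board size (the 3:2:1 ratio I recalled).
BOARD_WEIGHT = {19: 3, 13: 2, 9: 1}


def scaled_handicap(rank_difference: int, board_size: int) -> int:
    """Scale a rank difference (in whole ranks) down to handicap stones for a board size."""
    weight = BOARD_WEIGHT[board_size]
    # With integer inputs the fractional part is 0, 1/3 or 2/3, so round() never hits a tie.
    return round(rank_difference * weight / 3)


# 23k versus 16k is a difference of 7 ranks.
print(scaled_handicap(7, 19), scaled_handicap(7, 13), scaled_handicap(7, 9))
```

Under these assumptions a 7-rank gap gives 2 stones on the small board, which matches what the game showed.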

In this sense, the system works as intended.

I agree with your analysis, which shows the buggy nature of the current rank calculations fairly clearly.

If the OGS ranking system is really intended to work this way, that is, to allow two automatic handicap stones to a stronger player, making him even stronger and more likely to win, then the whole system that is intended to work this way is not only buggy, but the intention of the developer is that it be buggy.

I don’t believe this for a moment. I’ve seen evidence of the present developers fixing bugs in a very responsible way.

I’m only confused by their reluctance to recognize the bug reports in this thread, and to fix them.

Go is a subtle game of balance, and the handicap system is supposed to restore some balance to make the game sensible when a stronger player is paired with a weaker player. The current “intended system”, which acts in the opposite direction to create imbalance, must be deeply flawed, no matter how many people step forward to present defenses of it. I simply cannot understand all these rationales to avoid calling a bug a bug. Nor do I wish to understand them. Just, please, get the bug fixed so I can use the “automatic handicap” checkbox again and give only my weaker opponents the boost they need to make the game interesting for both players.

I’ve decided that this thread is wasting too much of my time. If no one wants to agree that I have reported a bug that should be fixed, then my repeated statements of the problem will have no beneficial result.

I will let my existing posts in this thread stand and hope that someone “in power” here understands them. As I have said, this bug is fairly severe. But I have had my say, and I will not post further in this thread.

As a software developer you must know that a bug is a piece of software that under certain circumstances does not act the way it was specified or intended. You can disagree with the intention, but this is in fact the intended behavior, and therefore not a bug. Also, it should be really easy for you to win in any game with a time setting that you rarely use, maybe that helps.

Can I have a left hand HCP too?

You can call this a feature and not a bug, but as far as HCP calculations go, everyone has one HCP, no matter whether it is fast, slow or correspondence Go with postcards and stamps.

Look at the bigger picture. A correctly implemented (not throwing errors) but flawed feature is still a bug: the software does something it’s not supposed to do (calculate a different HCP depending on which hand I use to move the mouse).

  1. The fact that you don’t like the intended result and don’t want multiple ranks for different kinds of games doesn’t make it a bug. Repeating this wrong claim again and again without engaging in reasoned discussion shows that you are unable to distinguish between what you want and what other people might want.

  2. You are very uncivil, but at the same time complain about others being uncivil to you. I accept different standards of politeness, but double standards are never acceptable.

  3. You should have reported the problem, of course, but also asked which problems the separate ranks fix. They were introduced because a lot of people were hesitant to play a game in their non-favored mode because it would trash their rank. Myself, I have a huge difference between blitz rank and non-blitz ranks and I actually still hesitate to play the blitz rank to more than three stones rank difference because people like you might become uncivil because of the difference. If I play a lot of blitz games, my “overall rank” will be five stones stronger than if I play a lot of live games. This remains true if the different ranks are removed, so you will regard any of these ranks as “true” depending on how many games I played that day. I suspect that your answer will be that blitz games should simply not count. So, in effect, you probably want to remove a blitz rank because you were playing a ranked blitz game and didn’t like the ranking system which means that you actually do not have any alternative for the described problem.

  4. I have also had the described problem, of course. I regard it as impolite behaviour by your opponent who knows very well that they have not played a game of this speed ever and the rank cannot represent their skill. They should ask a moderator to fix their rank, just like they would do if they came back after a year’s absence and knew that they had improved by eight stones in the meantime. I don’t think that it is a good idea to fix sandbagging behaviour by making things work worse for everyone else.

  5. In my opinion, 10 minutes for 9x9 should not be a blitz game, independently of ranking issues, but the borders between different game types will always be borderline, by definition.

This thread has been outdated since July 2017, when the OGS rating system changed to Glicko-2. The rating system changed in many respects, so all information in this thread about Elo on OGS is obsolete.