Bug: wrong ranking

temifar · September 5, 2016, 1:53pm

Personally, I do not see why weak amateurs like me should bother. Even against the same familiar opponent I could be winning by 30 points one day and losing by 30 points the next. With my “strength” fluctuating so wildly, I can’t see how it can be measured accurately by any ranking system.
Still, a go site should strive to provide reasonable handicaps…

david265 · September 5, 2016, 2:05pm

I respect that this is your experience, but it is not true for me. My playing on 9x9 blitz (which as I said is not blitz at 10 minutes) is very consistent with the ranks for blitz. The problem is that OGS calculates a special rank value that is buggy. Game restrictions, players’ desires, and handicap stones are all handled incorrectly. Probably a simple change to the ranking system would fix these bugs, but no one is saying this clearly enough for the developers to take action.

sTan · September 5, 2016, 3:36pm

Hm. Maybe there are several problems at once:

a real software bug like the one david described (I don’t know about that; it should be checked), and the splitting of the ranking system.

I have noticed that this is not the first conversation about the ranking system.

For me it wasn’t the first time either that something like this happened. @S_Alexander I have played over 4000 games here on OGS, and normally I deal with it quietly and try to see it as a chance to learn. But it is a systematic flaw that has happened and will keep happening. It is a little bit frustrating every time, and my post above was an expression of a kind of helplessness and the mental overload of dealing with it again and again and again (not every time, but with no light at the end of the tunnel).

That is why I don’t want to be quiet about it (like often before). If the developers decide there will be no change, then I will continue to deal with it. But please don’t imply that people should rather deal with a flawed system than point out the flaws (in their opinion).

sTan…

PS: @S_Alexander: I notice that it makes me a little angry that you imply that people who have noticed a flaw should look away. I think that is not how change can happen, socially speaking.

temifar · September 5, 2016, 4:17pm

Well, if we see handicaps in what was created as even games, that’s clearly a bug, regardless of whether the handicap value itself is correct. If that is indeed the case, it might be better to split the bug report into a separate thread, as it looks like a completely different problem.

david265 · September 5, 2016, 6:01pm

The bug I reported is worse: players whose reported ranks are several stones stronger than their opponents are given handicaps, particularly in 9x9 10 minute games.

The handicap bug IS this thread (see the OP). And, yes, there are several other bugs. OGS needs a real bug tracking system. It also needs at some point to fix the bugs, not just discuss them to death. Just my opinion.

temifar · September 5, 2016, 6:16pm

On these two points I can’t but agree with you. )

gamesorry · September 5, 2016, 6:52pm

Personally, I believe that ranks are multi-dimensional: they are affected not only by the game speed, but also by the board size.

I think the idea of splitting ranks came from lichess (OGS developer is a big fan of lichess:) ):

https://en.lichess.org/@/FMBressac

However, you can see from the link above that, when a player hasn’t played many games in a certain category, the player is marked as provisional in that particular category. I think doing the same here would solve some problems.

Also, I think showing the category-related ranking in the games/open challenges would be better.

(Another key difference is that lichess doesn’t have an overall rating. They only show names and titles. Therefore, they don’t have this problem of confusion. However, in the go world, the titles are directly defined by the rankings, which is the reason why we still need an overall ranking even if the rankings are split, and this causes the confusion.)

temifar · September 5, 2016, 7:15pm

Hmm, title tournaments aside, do we have any titles here? Why can’t a player be the 5th dan at 7x7, while remaining 23k provisional at 19x19? I think I’ve seen some such players on DGS. (I mean there were DDK players there who studied 7x7 and were extremely good at it, while remaining DDK on big boards)

gamesorry · September 5, 2016, 7:37pm

Oh, I was referring to whatever in go corresponds to the ‘title’ in chess (or lichess).

Definitely a player can (and this is exactly what I tried to say: the rankings differ not only by game speed, but also by board size), but the question is, what rank should we display next to the player’s name?

david265 · September 5, 2016, 8:14pm

In the real world, go rank refers to 19x19 games, played within a certain range of time appropriate for sitting together.

When a go system offers a variety of timings and board sizes, but a particular player prefers to play 9x9 games (as I do), the rank must be an approximation of what the actual rank might be.

That is why I called for researching the literature on automatic rank determination, and gave the Wikipedia article on the subject.

This is a most disorderly and inefficient thread, with great ideas coming up and then being ignored in favor of obvious or incorrect ideas.

Can’t we just list the bugs and ask that they be fixed, please?

gamesorry · September 5, 2016, 9:09pm

The OGS developer IS using ELO rating given in that article: ELO rating VS rank on OGS - #3 by matburt
They’re just applied in different categories separately.

I think most of us believe that the current ranking system has some problems (otherwise there would not be so many complaints). What we are not sure about is how we should fix these problems.

Solution 1: Use only one rating for all categories, at the cost of some inaccuracy or ignoring the rankings on the variations.
Most other websites do this, but as temifar pointed out, most sites have a strong bias towards either live or correspondence games. From my personal experience:

(1) LittleGolem counts all board sizes equally with the same weight, and as a result, the 9*9 bot Valkyria9 is over 10 dan: Little Golem

–Edit Begin–
//(2) DGS only counts traditional 19*19 games as rated games, and all other variations are unrated.
Thanks to @temifar for pointing out that the statement above is not accurate. Instead,
(2) DGS tournaments tend to be unrated for non-standard games, but normal single games can be rated for any supported board size: DGS - FAQ
–Edit End–

Personally, I think the factor of the board size is more important than the factor of time here.

Solution 2: Keep splitting the categories, but improve the way it works.
Specifically, I think the provisional status should be enabled in each specific category if
(1) the player has not played enough games in that category, or
(2) the player has not played in that category for a long time and the player’s ranks in the other categories differ a lot from the rank in this category.
As SanDiego pointed out, a warning should be added in such cases.
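
To make the two conditions concrete, here is a minimal sketch in Python. It is only an illustration of the rule proposed above: the thresholds (`MIN_GAMES`, `MAX_IDLE_DAYS`, `MAX_RANK_GAP`), the `CategoryRecord` type, and the numeric rank scale are all made up for this example and are not taken from OGS.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical thresholds, chosen only for illustration (not OGS values).
MIN_GAMES = 10        # "not enough games" means fewer than this
MAX_IDLE_DAYS = 180   # "a long time" without a rated game in the category
MAX_RANK_GAP = 3.0    # "a lot" of difference, measured in stones


@dataclass
class CategoryRecord:
    games_played: int     # rated games played in this category
    days_since_last: int  # days since the last rated game in this category
    rank: float           # rank on a numeric scale where larger is stronger


def is_provisional(record: CategoryRecord, other_ranks: Iterable[float]) -> bool:
    """Return True if the rank in this category should be shown as provisional."""
    # Condition (1): too few games in this category.
    if record.games_played < MIN_GAMES:
        return True
    # Condition (2): idle for a long time AND far from the other categories.
    idle = record.days_since_last > MAX_IDLE_DAYS
    far_off = any(abs(r - record.rank) > MAX_RANK_GAP for r in other_ranks)
    return idle and far_off
```

With these made-up numbers, a player with 50 blitz games who last played blitz 400 days ago at 23k, but who is around 15k elsewhere, would be flagged as provisional in blitz and could be shown a warning.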

david265 · September 5, 2016, 10:22pm

Solution 1 is too extreme. There is no need to use one rating for all time categories and board sizes. This solution is throwing the baby out with the bath water, as the old expression goes.

Solution 2 addresses some of the bugs.

But I hope more research is done. People here are too quick to jump to conclusions.

I have given several examples above of problematic rank calculations. There are thousands more that can be found by anyone with easy access to the player profile database. Problem cases should be examined by computer to see what the problems actually are, statistically.

In the first example I gave in this thread, SUNDJER, the raw ranks are as follows: 15k, 18k, 16k, 19k. For a 9x9 blitz game, his rank should have been presented as 18k.

There are two problems here: first, 18k is way too high, as his claimed rank is 14k. Second, the presented rank for purposes of challenges and handicap calculation was 23k! In case I have to point it out, 23 is not 18, and certainly is not 14. The difference between 14k and 23k is enormous.
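
The arithmetic can be checked with a small Python sketch. The helper names `rank_to_number` and `stone_gap` are mine, and the numeric scale (kyu ranks negative, 1d mapped to 0) is just one convenient convention, not how OGS stores ranks.

```python
def rank_to_number(rank: str) -> int:
    """Map a rank string to an integer where larger means stronger.

    '23k' -> -23, '1k' -> -1, '1d' -> 0, '2d' -> 1 (an assumed convention).
    """
    value, kind = int(rank[:-1]), rank[-1].lower()
    if kind == "k":
        return -value
    if kind == "d":
        return value - 1
    raise ValueError(f"unrecognized rank: {rank!r}")


def stone_gap(rank_a: str, rank_b: str) -> int:
    """Absolute difference between two ranks, in stones."""
    return abs(rank_to_number(rank_a) - rank_to_number(rank_b))


raw_ranks = ["15k", "18k", "16k", "19k"]  # the four raw ranks quoted above
presented = "23k"                         # the rank actually presented

# The presented rank is weaker than every one of the raw ranks.
outside_range = all(
    rank_to_number(presented) < rank_to_number(r) for r in raw_ranks
)
```

Here `stone_gap("14k", "23k")` is 9 and `stone_gap("18k", "23k")` is 5, and `outside_range` is true: the presented rank is not just far from the claimed rank, it lies below all four raw ranks.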

The bug seems to be located in two places, first in the blitz calculation. This can be fixed by adjusting the constant used in the blitz calculation (as compared with 19x19 normal timing). The second part of the bug seems to be in the calculation used for the “overall rank”. Either this calculation should be fixed, or the “overall rank” should be eliminated entirely.

Hope this helps.

temifar · September 6, 2016, 4:24am

That’s not entirely accurate. DGS tournaments tend to be unrated for non-standard games, that’s true. But tournaments are still somewhat experimental there. Normal single games can be rated for any supported board size: DGS - FAQ

ajventi · September 6, 2016, 1:22pm

I like the points you make here. Since there are four different ranks, it does not make sense that playing in only one time mode will remove the provisional status from all the ranks. I believe at KGS if you go too long without playing ranked games you get a “?” rank. So I think it would make sense to keep provisional ranks if someone doesn’t play in a certain mode often enough, and have their provisional rank be the same as their overall proven rank.

If my memory is correct, games played against a provisional opponent have no effect on your rank. So at least this way if there is an imbalance, you’re helping the provisional player “fix” their ranking without harming your own rank.

BozoDel · September 7, 2016, 4:40pm

I actually like the idea of provisional ratings for different game speeds. If I’m not mistaken, OGS’ ELO system sets the K value according to rank, the lower the rank, the higher the K value. We could instead set the K value based on the number of recent games in a certain speed. So if someone has never played in a given speed, that person’s rank should be adjusted faster. And even if a person has played a lot of games in a certain speed, but stops, and then comes back after a while, that rank would be less trustworthy, and should receive a higher K.
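
A rough sketch of what I mean, in Python. The standard Elo expected-score formula is used as is; everything else (the function names, the K range of 16 to 64, the linear decay, and the 20-game cutoff) is an assumption for illustration and not what OGS actually does.

```python
def k_factor(recent_games: int, k_min: float = 16.0, k_max: float = 64.0,
             settle_after: int = 20) -> float:
    """K value based on how many recent games a player has at this speed.

    `recent_games` is assumed to count only games inside some time window,
    so both new players and returning players get a high K.
    """
    if recent_games >= settle_after:
        return k_min
    fraction = recent_games / settle_after
    return k_max - (k_max - k_min) * fraction  # linear decay from k_max to k_min


def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def updated_rating(rating: float, opponent: float, score: float,
                   recent_games: int) -> float:
    """One Elo update; `score` is 1.0 for a win, 0.0 for a loss."""
    k = k_factor(recent_games)
    return rating + k * (score - expected_score(rating, opponent))
```

So a win between two 1500-rated players would move someone with no recent games at that speed by 32 points, but someone with 50 recent games by only 8.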

Though, if the devs would go all the way to try to change this, I think it might be worth it to try to address the multi-dimensional ranks issue, while they’re at it. I haven’t thought much about that though.

Jadeite · September 8, 2016, 10:07pm

If I make a new account and set it to 25k, even though I’m 8k, is this a bug?

The rating system uses game results and the ranks of the players to calculate rank adjustment. Your opponent hadn’t played enough blitz games for the system to calculate a realistic rank for this game type. Not a bug, but a lack of data needed to calculate it correctly.

There is only 1 ranking in Go, really? Do you always play at your rank every game you play? Rank is just an average of results over time.

kickaha · September 8, 2016, 10:17pm

if a system works the way you want it to (no computing errors), but still does not produce the outcome you wish for (the best possible prediction of a player’s skill level in any situation), it can still be considered a bug i think… but what we name it is only a formality after all.

david265 · September 8, 2016, 10:54pm

For the benefit of recent posters in this thread who may not have read the entire thread, this is a set of closely-related bug reports, the central one being that a stronger player (based on having played a sufficient number of games in total) can receive handicap stones against a weaker player.
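
The expected behavior can be stated as a small Python sketch. This is not OGS code: the function name `automatic_handicap`, the integer rank scale (larger means stronger), and the 9-stone cap are assumptions for illustration, and any board-size scaling of handicaps is ignored.

```python
from typing import Optional, Tuple


def automatic_handicap(rank_a: int, rank_b: int,
                       max_stones: int = 9) -> Tuple[Optional[str], int]:
    """Return (receiver, stones) for a game between players 'a' and 'b'.

    Ranks are integers where larger means stronger (e.g. 14k -> -14).
    The receiver is always the weaker player, or None for an even game.
    """
    diff = rank_a - rank_b
    if diff == 0:
        return None, 0
    stones = min(abs(diff), max_stones)   # cap at an assumed maximum
    receiver = "b" if diff > 0 else "a"   # the weaker side takes the stones
    return receiver, stones
```

Whatever rank value is used for display should be the same one fed into a calculation like this; the reported behavior is what you would see if a different, weaker value were used for the stronger player.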

I’m really not interested in a lengthy or in-depth discussion of various systems of determining ranking. I’m interested in the developers of OGS volunteering to fix the bugs that have been reported in this thread. If there is a different forum where these bugs should be reported, please let me know. I have also reported these bugs at the suggestions forum, where they are also apparently being ignored by the developers. I don’t know what else to do.

I do not enable the “automatic handicap” feature of my challenges to work around these bugs, but it is inconvenient, perhaps unfair, not to allow weaker players their proper automatic handicap stones.

Animiral · September 10, 2016, 8:15am

How the fuck do you come out of “a 40 year career in software engineering” unable to distinguish a programmer’s mistake (“bug”) from “any situation that does not sit well with me”?

I can forgive the masses of random non-techies who cry bug at every corner. They are throwing around a catchy word, ignorant and unaware that it is an insinuation of incompetence on the part of the developer.

If you have any experience in software development, you know that a bug is a deviation in the behavior of the system implementation from the specification of the system. You are complaining about a design flaw, not a bug. Even if you call it a bug a thousand times and “a few people have admitted it is a bug” (lmao) that does not make it so. OGS ranks work as designed.

If you had done your due diligence on this topic before you spammed this thread to 60 posts, you might have found the thread where the idea originated as well as a more recent discussion thread, and others like it, in which the negative effects of the separate ranks have been discussed at length.

Let me take this opportunity to repeat my statement from that discussion, because it is still current:

david265 · September 10, 2016, 12:21pm

Being uncivil in a public discussion violates the guidelines for this and any intelligent forum.

[quote=“Animiral, post:61, topic:9018”]

You are complaining about a design flaw, not a bug.
[/quote]

If the requirements and design phases of a software development project are done perfectly, and perfect implementation is done, the resulting product cannot have any bugs.

A bug is an error, flaw, failure or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. Source

It is important to identify bugs objectively and specifically so they will be fixed. That is why any good company or organization that produces software maintains a bug reporting system of some sort.