Rank Instability on OGS

DVbS78rkR7NVe · April 18, 2018, 1:57pm

How much does our system differ from lichess? On lichess an even game is worth about 10 rating points. Here it’s maybe 20-25? And a one-stone difference is about 50?

And remember - rank gives you a sense of accomplishment. In chess, people often aim to reach the next hundred rating points, and with a game worth about 10 points that’s a challenge. In go our goals are kyu/dan levels, and it looks like you can gain a rank from two lucky games.

hiryuu · April 18, 2018, 5:44pm

I’ve said it before and I’ll say it again. The rating drops/increases for different board sizes are not adjusted enough to account for the much higher chance of a stronger player losing to a weaker one on small boards. Losing to a weaker player on 9x9 drops your rank pretty much as far as losing on 19x19, even though the probability of losing to a weaker player on 9x9 is a lot higher, so the rank drop/gain shouldn’t be as great as it currently is.

I currently actively avoid playing weaker players because I have absolutely zero incentive to do so, and I take an extra penalty on the off chance that I lose. Sitewide smaller board tournaments are also a no-no now for the same reason, unless I’m feeling particularly risk-seeking, since there are usually multiple opponents ranked lower than you. I could make a new account just for different board sizes, but that would be working around the problem and not fixing it.

@S_Alexander Equal games at my rank are +/- 10 rating points per match. I need about 100 to reach the next rank. That is 10 more wins than losses against equally ranked opponents. Not simple at all. Even with a 70% win rate I would have to play about 25 matches to promote once, since you only net +40 points per 10 games. You seem unsure of your figures; how large would you say the margin of error on your estimates is? I suggest rechecking them, unless Glicko makes rank changes a lot bigger at lower levels. I also suspect that the issue, apart from your unreliable guesswork, is that you are playing people who are a few levels below or above you, which is not a good basis for your judgements.
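
The arithmetic in this post can be checked with a minimal sketch. It assumes, as the post does, that every game moves the rating by a fixed +/- 10 points and that a rank is 100 points wide; neither assumption is claimed to match how OGS actually updates ratings, and `games_to_promote` is a hypothetical helper name.

```python
import math


def games_to_promote(points_needed, points_per_game, win_rate):
    """Expected number of games to gain `points_needed`, assuming every
    game moves the rating by a fixed +/- `points_per_game` (a simplification
    taken from the post, not from the actual rating system)."""
    expected_gain_per_game = points_per_game * (2 * win_rate - 1)
    if expected_gain_per_game <= 0:
        return math.inf  # no net progress at a 50% win rate or worse
    return points_needed / expected_gain_per_game


# 70% win rate: about 25 games on average under these assumptions
print(round(games_to_promote(100, 10, 0.7), 1))
# winning every game: 10 games
print(round(games_to_promote(100, 10, 1.0), 1))
```

Under a fixed-step model the count depends only on the ratio of rank width to step size, so the disagreement in this thread comes down to how many points a game and a rank are really worth.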

DVbS78rkR7NVe · April 18, 2018, 10:23pm

@hiryuu nah, I don’t believe you. Is this your account? So that’d be around 1k.

Let’s apply this formula. You can check on your rating graph that these numbers are correct.
3k = 27 rank = 2017 rating points;
2k = 28 rank = 2082 (65 points difference from 3k);
1k = 29 rank = 2150 (68 points);
1d = 30 rank = 2220 (70 points);
A 50-point difference between ranks occurs at a rating of around 1550, which is 12k-11k, where there are a lot of players. The 100-point difference you mentioned is reached only for ratings of 3000+, where there is basically no one.
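
These figures can be reproduced with a short sketch. It assumes the conversion rating = 850 * exp(0.032 * rank), with rank 0 = 30k, which is the formula attributed to the devs later in this thread; the function names are made up for illustration.

```python
import math


def rank_to_rating(rank):
    """Rating at the start of a rank, assuming 850 * exp(0.032 * rank)."""
    return 850 * math.exp(0.032 * rank)


def rating_where_gap_is(points):
    """Rating at which the next rank is `points` away.
    Under the assumed formula the gap is rating * (exp(0.032) - 1)."""
    return points / (math.exp(0.032) - 1)


names = {27: "3k", 28: "2k", 29: "1k", 30: "1d"}
for rank, name in names.items():
    gap = rank_to_rating(rank + 1) - rank_to_rating(rank)
    print(f"{name}: rating {rank_to_rating(rank):.0f}, next rank {gap:.1f} points away")

print(f"50-point gap near rating {rating_where_gap_is(50):.0f}")
print(f"100-point gap near rating {rating_where_gap_is(100):.0f}")
```

Because the formula is exponential, the width of a rank in rating points grows in proportion to the rating itself, so no single "points per rank" figure applies across all levels.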

Now let’s look at your rating graph. Apr 13 - 1 game against a “weaker” opponent (in reality, an equal opponent): +20 points.

Now here’s a lower-ranked example. Mar 21 - 3 losses against stronger opponents sent them from 10.2k to 11.7k (~70 points).

Maybe not two, but three equal games are enough to change your rank. Though Glicko takes a series of games into account, of course. If the rating change for each game were shown somewhere, it would be easier to see what happens on average. But I don’t believe your “10 games to gain a rank”.

hiryuu · April 18, 2018, 10:53pm

Obviously it’s not an exact number since I’m not doing any math, so I’d estimate 7-13 wins to get a rank promotion with zero losses from the start of one rank. You don’t have to believe me. Ranks are sorted by multiples of 100, which correspond to the kyu/dan ranks on the left and right sides of the graph in equal amounts. 13 per side. Coincidence? I think not.

I have no idea why you think looking at my account proves anything, considering that the numbers are not even round hundreds and my rank-ups at those points happened because I actually passed the sitewide pegged rating points of 2100, 2200, etc. My individual points during a rank-up mean nothing. I looked at your games from Feb to April 1st and you do seem to drop 100 in 2 games, so that’s weird. I don’t remember my equal games having that big a difference in rating.

Either way, your experience seems true and I know mine is. Well, it needs some fixing at the very least, together with what I raised in my original post.

meili_yinhua · April 18, 2018, 11:19pm

I think you missed the Glicko update… We’re not doing that anymore…

If you’ll look, the ranks listed are not usually at even intervals.

Also, the devs explicitly state that the rank to Elo/Glicko conversion is 850 * exp(rank * 0.032), where 30k means rank = 0…
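
As a rough sketch of the conversion quoted above, here it is in both directions. The inverse is simply the formula solved for rank, and the mapping from a fractional display such as 10.2k to rank 30 - 10.2 is an assumption made for illustration, not something confirmed in this thread.

```python
import math


def rank_to_rating(rank):
    """Conversion as quoted in the thread: 850 * exp(rank * 0.032), 30k = rank 0."""
    return 850 * math.exp(rank * 0.032)


def rating_to_rank(rating):
    """Inverse of the quoted formula (solved for rank)."""
    return math.log(rating / 850) / 0.032


def kyu_to_rank(kyu):
    """Assumed mapping for fractional kyu displays, e.g. 10.2k -> rank 19.8."""
    return 30 - kyu


# A rating of 1500 lands between 13k (rank 17) and 12k (rank 18).
print(f"rating 1500 is rank {rating_to_rank(1500):.2f}")

# The 10.2k -> 11.7k drop mentioned earlier in the thread, in rating points.
drop = rank_to_rating(kyu_to_rank(10.2)) - rank_to_rating(kyu_to_rank(11.7))
print(f"10.2k to 11.7k is a drop of about {drop:.0f} points")
```

This makes it possible to translate between the "points per game" and "fraction of a stone per game" figures that different posters are using.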

meili_yinhua · April 18, 2018, 11:27pm

Though I will argue that we may need another look at tweaking the constants to fit recent activity (given the overall backlash about instability, tau may be too low or ranks too close)

And we may also want to consider rethinking the 1500 start point, although that may cause the rank change requests to reappear…

hiryuu · April 19, 2018, 12:40am

Ah, is that so? My bad then. Well, then I give up trying to explain why the system works the way it does and will just air my grievances.

DVbS78rkR7NVe · April 20, 2018, 10:43pm

White can’t be bothered to finish it so 13k? gets to 4k?

Eugene · April 20, 2018, 11:13pm

This one is entirely logical. The “13k” that you are talking about is +/- 6k, i.e. it’s not really 13k at all, it is “we simply don’t know”.

If “we simply don’t know” beats “6k” what rank do you think they deserve? 4k ? could be a reasonable guess, based on a sample of one game.

The problem here is not the ranking system’s maths, the problem is that a “don’t know” who we can clearly see is actually TPK started off playing a 6k in the first place. That is the “other” problem - I guess we all know which thread to read about that one by now

If we move to the “humble rank” system, black would have been showing at most 19k (I will argue that the initial uncertainty needs to be higher) and less likely to play a 6k, though we don’t really know why a “13k” was playing a 6k either.

And, if that person did play that match under “humble rank”, their rank would now be 9k, which is hard to argue against given that as far as the ranking system knows, they beat a 6k. Presumably the next game they play the high ranked opponent will actually win, and down they will go again equally quickly.

In summary, I think we do need data to calibrate the “amount we move for any given game”, but I don’t think “13k?” games for new people are useful input to that problem.

[Edit: further, I just noticed that the 4k is also 4k?, i.e. we still don’t really know: even more reasonable.]
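
The effect described in this post, where a large uncertainty produces a large jump, can be illustrated with a sketch of a single-game update using the published Glicko-1 formulas. This is not OGS's code: the site's actual algorithm, constants and batching are not given in this thread, and the ratings and deviations below (1500 with deviation 350 for the new player, 1830 with deviation 60 for the 6k) are illustrative assumptions.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Discount factor for an opponent's rating deviation."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected score against an opponent with rating r_opp, deviation rd_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def update_one_game(r, rd, r_opp, rd_opp, score):
    """Glicko-1 update for one game; score is 1 for a win, 0 for a loss."""
    e = expected_score(r, r_opp, rd_opp)
    d2 = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd**2 + 1 / d2
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd


# Same win over the same opponent; only the winner's own uncertainty differs.
uncertain_r, uncertain_rd = update_one_game(1500, 350, 1830, 60, 1)
settled_r, settled_rd = update_one_game(1500, 60, 1830, 60, 1)
print(f"uncertain player: {uncertain_r - 1500:+.0f} points, deviation now {uncertain_rd:.0f}")
print(f"settled player:   {settled_r - 1500:+.0f} points, deviation now {settled_rd:.0f}")
```

With these assumed numbers the uncertain player gains hundreds of points from the one win while the settled player gains only tens, which fits the description of the provisional rank as "we simply don’t know".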

springyboard · April 22, 2018, 3:07am

If the rating system has only one game to base its data on, and that one game has an inaccurate result (an SDK resigning when 50 points ahead), then the problem is simply:

Garbage in, garbage out.

springyboard · May 8, 2018, 2:42am

Another example where a new, obviously TPK, player gets an undeserved win against an SDK who resigned a winning position out of frustration:

The “winner” ended up getting a match against me.

Lys · May 8, 2018, 12:37pm

Unfortunately, this can happen anyway and for many different reasons.

I have two correspondence games in progress that are extremely slow. I don’t blame anyone for this (except me, maybe, for accepting these challenges, but how could I have known beforehand?).
I’m really tempted to resign. If I did, one of these players would get an undeserved win.
How could we avoid this? “Hey, sir, I’m bored by this game, could we please annul?”. The server itself can’t handle such a situation, AFAIK.

And what about timeout or connection issues?

Undeserved wins can happen.

Lys · May 8, 2018, 12:40pm

Yesterday I played a 9x9 handicap game against a new user (TPK). I gained 0.2k for that win.
I think that’s too much.
Maybe it will be reduced once that player has a confirmed rank, I don’t know. But 1/5 of a stone for an easy win on 9x9 seems like a very big reward.

smurph · May 8, 2018, 2:53pm

I understand why people are annoyed. Even if (quite possibly) the higher rank fluctuation is a more accurate model of what actually happens (a lot of fluctuation, that’s what), it is annoying. It’s annoying that I can’t say “I’m x dan” when that rank can drop 2 steps due to a single sdk botter who manages to snag a game - or from playing games when I’m tired and/or tilted.

My experience from Dota 2 is fairly similar - many people lose hundreds of elo points “just to make up for that one loss that should have been a win”. Luckily on OGS it only takes those few wins to make up for the huge deficits that come from equally few losses.

Other rating systems have other problems - ranking up on IGS takes forever (about 15-20 straight wins depending on your rank), similar numbers for the Tygem/wBaduk/Fox cluster, on KGS your rank will slowly crystallize into a never-changing number, but you can also increase your (uncrystallized) rank by losing to a much stronger player in an even game.

We’re all complaining about rating points we lost, but in the end, we have a rating system that allows us to rank up quickly as soon as we stop being as terrible at the game as we have been until now, sit down and actually work for it. I think that’s an advantage.

Lys · June 7, 2018, 3:02pm

Today I lost a game against a 5d and my rank… increased by 0.1k!

meili_yinhua · June 7, 2018, 3:51pm

It happens. I think it has to do with the fact that OGS updates in sets of 15 games.

Conrad_Melville · June 7, 2018, 8:49pm

Maybe you got points for trying!

bugcat · June 7, 2018, 9:51pm

I haven’t found OGS ranks to be unstable. If you’re finding them to be so, I’d recommend that you ask yourself: am I playing…

Ranked games with bots?
Ranked blitz games?
Ranked 9x9 and 13x13 games?

Lys · June 8, 2018, 8:07am

None of them.

I don’t play against bots
I’ve played just a handful of live games (not blitz) in the last few months
I used to play 9x9 and 13x13 but I haven’t since January 2018.

Since then my rank went from 8k down to 12k and then up again to 8k.

It could obviously be my fault, not necessarily a bug.

Also @mekriff could be right: it could just be a coincidence that the rating system was recalculating the batch of 15 games precisely while I was losing that single game. It could happen.

But because of the already mentioned lack of information about these calculations, we are all just making suppositions. I still really don’t understand why our beloved devs are so shy about this topic.

Let me quote myself:

Please, forgive me, but when I hear something like “I think it has to do with…” or “A lot of factors go into how each number changes”, it sounds to me like this:

jmstone · June 8, 2018, 8:45am

I must say, my rank has been much more unstable since the new rating scale was implemented - swinging wildly between 5k and 10k. I am now at around 8k, which is probably about right, but I don’t expect it to stay there for long. I think getting points for beating lower ranked players is a bit weird, as is losing points for losing against stronger players, but I realise this is due to uncertainty and instability in the other players’ ranks. I get the feeling this system is in some kind of wild oscillation at the moment, and doesn’t have any damping features. Perhaps it is just because I am used to playing on other servers where rank doesn’t move so quickly. There are definitely also problems with combining results from all board sizes and all time limits. I wonder if making the number of games over which the rank is defined larger would help to stabilize things (like 30 games rather than 15)?