Rank Instability on OGS

I just deleted a topic with the same name, due to several flags and community feedback, but I would not like to dismiss a potential issue because of seeming rudeness and differences.

There is always bound to be some fluctuation in ranks due to personal strengths/weaknesses, unfamiliar time settings or mood, but if you think there is some bigger issue with players of the same rank having vastly different skills, feel free to post your experiences or ideas about what could be causing it here (with due politeness of course :wink: )

OGS uses the Glicko rating system (https://en.wikipedia.org/wiki/Glicko_rating_system),
and more (possibly relevant) information can be found by looking through this thread: I think the 13k default rank is doing harm

3 Likes

Perhaps one cause of the problem, if there is one, is the ranks of bots.

One example, if I remember rightly, was an acquaintance of mine who went from his usual rank of 11k all the way up to 7k solely from repeatedly beating Mantis in ranked games. His view was that it was over-ranked by about five stones. Here are three related facts:

  1. Some players play a lot of games with bots, DDKs especially.
  2. Some bots are significantly over-ranked.
  3. Games are ranked by default in OGS challenges.

When taken together, one source of over-ranking becomes clear.

4 Likes

Perhaps the timeout rules are another source of ratings distortion.

A correspondence player can artificially inflate their rank by escaping from multiple losses through exploiting the timeout annulment rule. There has been significant discussion about this issue in other threads, such as:

3 Likes

Indeed!!! We really should do something about the bots probably :smiley:

Some of the bots are downright cheating to rank up :smiley:

2 Likes

The first one will continue to be abused unless we switch to something like Whole History Rating…

The idea that some bots are over-ranked is odd… Bots should follow the same ranking system as the rest of us (unless I am misinformed and they are fixed-rank), so I don't see how a bot could stay “overranked”.

2 Likes

Thanks for reposting this. (I think?) I was the OP.

I asked the question because I had just played a game where I felt my opponent was around DDK level (ranked 5k like me). When they continued to play strangely and got into an awkward position, I poked fun a bit, and then they timed out.

The following game was against someone, also 5k, who felt like a dan level player. I’m a member of the University of Toronto Go Club and we have many dan level players; I know the feeling of playing against them, and I felt a similar strength from my opponent.

This isn’t an isolated incident. I often feel this way playing on OGS, and it makes the experience extremely inconsistent. I’m not salty that I won one and lost one; the second game was really fun even though I lost.

I’m not sure exactly what causes these rank instabilities, but my best guesses are smurfs, different timezones having a sort of relative strength to their time, and the way a new account has to rank up from some arbitrary DDK level. The latter, I’m sure, deters higher ranking players from joining the server.


MOD EDIT: Please keep this thread on topic and do not distract from it with other issues.

1 Like

The opponent that you described as “around DDK” has 600 games, all at the SDK level. His play style was odd, but that doesn’t mean that he is “not appropriately ranked”. It’s which games you win or lose that determines your rank, not your play style. Maybe he had one too many drinks that night before playing you…

GaJ

5 Likes

That’s the only way to play after a hard day in the real world
:grin:

2 Likes

I’m a 5k right now, and I definitely have days when I play at a DDK skill level.

3 Likes

Perhaps another reason might be that some people play a lot of correspondence games, which I think is ‘easier’ to be good at. I could imagine those players getting slaughtered if they try to play a fast game against some blitzer of similar level.

As someone who plays correspondence most of the time, that’s definitely my experience. During live games I suffer from tunnel vision and often forget to evaluate the whole position, while with correspondence you have already forgotten what your plan was, so you have to inspect the whole situation with every move. And you can take a break as soon as you lose concentration, or spend a whole day pondering whether your corner died or not, etc.

1 Like

The correspondence vs. live game dichotomy can work both ways. Someone may have a much stronger rank because of a lot of correspondence play, and a live opponent will think he is over-ranked. Or if he plays more live games, giving him a weaker rank, a correspondence opponent may think he is a sandbagger. (Yes, I know there are ranks for the individual formats, but most people look mainly at the overall rank, I suspect.) No doubt this is stating the obvious to old hands here, but it is one factor that may cause less experienced players to think the system is out of kilter.

1 Like

Oh, that’s a great point and very true.

If we had separate ranks for Live and Correspondence, I wouldn’t have to have two accounts here!

3 Likes

A good solution. Maybe I’ll do that if I ever start playing much live.

1 Like

New players are automatically “ranked” 13 kyu and the worst possible rank is 25 kyu.

As a result, a TPK who joins this site cannot create or join open challenges against fellow TPKs and must first suffer repeated thrashings by established DDKs, only to realise that TPK ranks are just as distorted (some 25 kyus are actually 30 kyu strength and their opponents get more “ranking credit” than deserved for beating them).

Overall rank is a mess. :slight_smile:

You have in it:

  • all board sizes (9x9, 13x13, 19x19 and so on)
  • all time settings (live, blitz, correspondence)
  • all undeserved wins by timeout (yeah, I know many complain about the opposite - say, not gaining rank from serial timeouts - but my experience is that I became SDK undeservedly just because many stronger opponents timed out)
  • strange things happening when the rating system changed (that’s another time I became SDK undeservedly)
  • what else?

I’m 12k now and I think it’s more or less correct overall. But it doesn’t fit all circumstances.

I started out playing only 9x9, then added 13x13, then 19x19. My “real” rank on the 19x19 board was way lower than my official rank at that time, because I had gained it on smaller boards.
I also started out playing only correspondence games. Now I play some live games, and at the moment I feel very “overranked” at live time settings. Meanwhile, in correspondence games I can challenge 10k-8k players with some chance of winning.

My very humble opinion is that the overall rank doesn’t fit all circumstances, so it would be better to have separate ranks for each pair of “board size + time setting”. It can be confusing, I know. So we can keep just one overall rank, but we must be aware that it’s not so reliable in different contexts.
Some players use separate accounts for live and correspondence.
Other players use separate accounts for different board sizes.
Me, I just stopped playing 9x9 and 13x13 because I was curious to know what my “real” rank on 19x19 was (so candid! :slight_smile: )

So, rank isn’t unstable, it’s a hotchpotch! :smiley:

7 Likes

I just gained 0.5k for winning against a weaker opponent (12k vs 13k). Isn’t that too much?

11.5k before. 11.0k after.

1 Like

Why is that too much? At 1 kyu of difference, you should win something like 60% of games (don’t remember exact number). So each time you win against someone 1kyu lower, it indicates that your rank should be at least 1 stone higher, otherwise those wins will add up to more than 60%.

Winning against someone 1 kyu lower than you is a substantial achievement. It’s not “oh, he was lower, therefore I should not rank up”.

It’s not till the gap is something like 4-5kyu that you would expect the effect to be small.
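
For a rough feel of the numbers, here is a minimal sketch of the logistic win-probability curve that Elo-style systems (Glicko included) are built on. The 100 rating points per kyu is purely my assumption for illustration, not OGS's actual rank conversion, and `win_probability` is a made-up helper name. It also ignores rating deviation, which in Glicko pulls the expectation a little closer to 50%.

```python
def win_probability(kyu_gap, points_per_kyu=100.0):
    """Expected score of the stronger player on a plain logistic curve.

    points_per_kyu is a hypothetical conversion, NOT OGS's real mapping.
    """
    rating_gap = kyu_gap * points_per_kyu
    return 1 / (1 + 10 ** (-rating_gap / 400))

# Print the expected win rate for gaps of 0 to 5 kyu.
for gap in range(6):
    print(gap, "kyu gap:", round(win_probability(gap), 2))
```

Under that assumption a 1 kyu gap lands in the low 60s percent, and the curve only flattens out towards "almost certain" once the gap reaches several kyu.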

In the memes thread a poster said they ranked up on a streak against “much lower” opponents, but they didn’t say how long the streak was or how much lower, so it’s just hearsay.

2 Likes

I would expect winning against an opponent 4-5 kyu lower to have a minimal effect. The same for losing against someone 4-5 kyu stronger.

I would also expect a win against someone 1k lower to have only a small effect, simply because I’m expected to win about half the time.

It seems to me that 0.5k is a very big step to gain in just one game.
I would expect that for winning against someone 4-5 kyu stronger (especially because it actually happens).

From a statistical point of view, if I gain 0.5k 100 times and lose 0.5k 100 times, my rank will be the same.
But if I play against a 13k as described above and win 4 games in a row and then lose 4 games in a row, my rank will fluctuate by 2-3k up and down.
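
To make that concrete, here is a small sketch of the standard Glicko-1 update applied one game at a time, run over 4 wins followed by 4 losses against the same opponent. The ratings (1500 vs 1400), the opponent's deviation of 80, the starting deviations, and the per-game updating are all my assumptions for illustration; I don't know how OGS actually batches games or which deviations it uses. `update` and `streak_swing` are made-up helper names.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant

def g(rd):
    """Glicko-1 factor that discounts results against uncertain opponents."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def update(r, rd, r_opp, rd_opp, score):
    """One-game Glicko-1 update. score is 1.0 for a win, 0.0 for a loss."""
    e = 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))  # expected score
    d2 = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    precision = 1 / rd**2 + 1 / d2
    new_r = r + (Q / precision) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / precision)
    return new_r, new_rd

def streak_swing(start_rd, results, r=1500.0, r_opp=1400.0, rd_opp=80.0):
    """Peak-to-trough rating swing over a sequence of results.

    All default ratings and deviations are hypothetical values.
    """
    rd = start_rd
    history = [r]
    for score in results:
        r, rd = update(r, rd, r_opp, rd_opp, score)
        history.append(r)
    return max(history) - min(history)

results = [1.0] * 4 + [0.0] * 4  # 4 wins in a row, then 4 losses in a row
for start_rd in (50, 100, 200):
    print("start RD", start_rd, "swing", round(streak_swing(start_rd, results), 1))
```

The point of the sketch is that the size of the swing depends heavily on the player's own rating deviation: the less certain the system is about you, the further each streak moves you.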

This thread discusses rank instability, and this seems to be exactly such a case.
I really do feel that my rank jumps up and down very quickly.
Look at the chart: 4 stones up in about 1 month.

Then 4 stones down:

As we can see, wins and losses aren’t even in those periods. But 4 stones up and down looks like a rollercoaster to me. :slight_smile:
Instability. That’s it!

1 Like

Let me add this one: when I was 9k I accepted a handicap game against a 15k.
Six stones handicap!
I felt bad looking at those stones on the goban.
I felt even worse when, a month later, I was back to 12k and all those handicap stones were no longer appropriate! :slight_smile:

(well, they weren’t from the beginning, but it was my fault for accepting that challenge)

In the “up” period you won 20 games to 6 losses.

In the “down” period you lost 20 games to 10 wins.

I thought to myself “a 4 stone adjustment after these streaks seems reasonable”.

However, it is not so simple. In the winning streak, you were mostly winning against weaker players!

And in the losing streak you were mostly losing against stronger players.

This has made me think: hmm - this might not be ideal. Surely a winning streak against weaker players should not deliver such a rise, nor a losing streak against stronger players such a fall.

I think this is some evidence that “someone” could do with a deeper look at this.

1 Like