9 rank difference limit to rated games - necessary?

jmstone · May 30, 2014, 8:56pm

As noted in my other post, I had an issue with not being able to join a rated game at the same rank as mine (due to a bug in rank setting on the beta server)… It got me thinking though - what purpose does stopping games being rated if opponents are more than 9 ranks different actually serve? Surely if people want to limit the rank of players that can challenge them it is easy enough to do in the other game set-up settings… This disallowing of largely different ranks in rated games seems a little unnecessary to me - and maybe it is making ranks artificially less fluid… but perhaps there is something in the maths that would break if it was allowed?

KillerDucky · May 31, 2014, 3:24am

My opinion is that those games are too susceptible to noise – games that are decided not by Go skill but by things like timeouts, or, in live games, something coming up so that a player just resigns while winning, etc. If 2% of all games end this way, it’s not a big deal for games where each player has close to a 50% chance of winning on Go skill. But if one player has a 1% or less chance to win on Go skill, a majority of his wins will be due to noise, not Go skill.
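
To put rough numbers on that argument, here is a minimal sketch. It assumes (this is not stated in the post) that a noise-decided game is a coin flip between the two players; `noise_share_of_wins` and its parameters are hypothetical names chosen for illustration.

```python
def noise_share_of_wins(skill_win_prob, noise_rate=0.02, noise_win_prob=0.5):
    """Fraction of a player's wins that come from noise rather than Go skill.

    Assumptions (illustrative only): a fraction `noise_rate` of all games is
    decided by noise (timeouts, early resignations), and such a game is won
    by this player with probability `noise_win_prob`.
    """
    wins_from_noise = noise_rate * noise_win_prob
    wins_from_skill = (1 - noise_rate) * skill_win_prob
    return wins_from_noise / (wins_from_noise + wins_from_skill)


even_game = noise_share_of_wins(0.5)   # both players near 50% on skill
mismatch = noise_share_of_wins(0.01)   # weaker player wins 1% on skill
```

Under these assumptions, noise accounts for about 2% of wins in an even game, but for slightly more than half of the wins of a player with only a 1% chance on skill.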

jmstone · May 31, 2014, 7:36am

That’s a reasonable point. I don’t think this problem affects DGS though. As far as I am aware, on that server games can count towards rating whatever the mismatch, and the ratings there seem reasonable and fluid.

After all, widely mismatched games are going to only be a very small minority of games played. Flatly disallowing them even if both participants want it seems a little unfriendly.

J

testerq · May 31, 2014, 12:58pm

I would also like to point out another possibility, another extreme if you like:

Why not disallow setting your own rank altogether?
Everyone starts at the same level of 30k; they can challenge and be challenged by anyone unless otherwise specified, and the system decides their rank as they keep playing games. Provisional players should be made more distinct, and their impact on players with an already solid rank should be minimal to nonexistent. All provisional player ranks would be quite volatile at first as the system tries its best to determine where they fit best on the ranking scale.

I.e., the system could take the rated game results of provisional players and do a binary search through the ratings of fixed-rank players, lowering the error margin every time. After around 8 rated games the rank should be quite solid. If that is not so, just up the number of rated games needed.

What do you think?

jmstone · May 31, 2014, 2:05pm

I like this idea a lot too… Basically anything that allows the system to run unimpeded. I have the feeling DGS may use a similar system to this…

My feeling is if it isn’t possible for a very strong player to jump 10+ ranks in one go while provisional, we get this very uneven 10-30k ratings where people at a given rank may be a lot stronger than reported. This preventing rated games between people more than 9 ranks apart is adding to this problem IMHO. Sure, some people manually change their rank after a run of time-out losses but others prefer to grind back up - which makes the rating system rather unpredictable on this site (more so than other places I think…)

Will be interesting to see what solution the devs come up with.

wurfmaul · May 31, 2014, 3:35pm

I don’t know how it is on the main site, but in the beta, provisional players don’t impact other players in rated games. What do you think about what I wrote in http://forums.online-go.com/t/more-bugs-suggestions/291/4 at 9.3?
I don’t understand how to apply a binary search.

KillerDucky · May 31, 2014, 4:11pm

If a player’s rank is off by 10+ ranks, especially if it is still provisional, the player should change his rank to a more appropriate one. The rating system here assumes that players pick reasonable initial ranks.

jmstone · May 31, 2014, 4:46pm

Yep - sure. But what if they don’t - that’s the problem. Some people feel it’s cheating (or something) to manually intervene in their rating (I don’t have an issue with it I should point out).

So should people automatically adjust their ranks after a raft of timeout losses for example? I am not sure this always happens. How can we make sure people do?

Alternatively, if rank is made provisional and very mobile after that kind of situation, their rating should be back up where it should be pretty quickly… esp if they are allowed to still play rated games vs. their old opponents.

testerq · May 31, 2014, 11:06pm

Hi wurfmaul. Yes, I imagine my binary search idea is pretty similar to one you proposed in point 9.3 of your topic.

For all interested, I will go on for a bit about binary search:

Binary search is a simple yet very effective algorithm for finding (or placing) elements in sorted arrays. It works by first splitting the array in two and declaring the middle point the pivot. It then compares the pivot’s value with the value being searched for. If, for example, the value being searched for is greater than the pivot, the algorithm now looks only at the upper half of the array and repeats the process until there is nothing more to split.
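
As a minimal sketch of the procedure just described (the function name and the sample list are only illustrative):

```python
def binary_search(sorted_values, target):
    """Return the index of `target` in `sorted_values`, or -1 if absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        pivot = (low + high) // 2          # middle point of the current range
        if sorted_values[pivot] == target:
            return pivot
        if target > sorted_values[pivot]:
            low = pivot + 1                # keep only the upper half
        else:
            high = pivot - 1               # keep only the lower half
    return -1                              # nothing left to split


index = binary_search([1, 3, 5, 7, 9, 11], 7)
```

Each comparison halves the remaining range, so a list of n elements needs at most about log2(n) + 1 comparisons.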

Here you can see a visualisation of algorithm at work. Pls ignore animals : )

Thus I hope you can already see how binary search fits nicely as a tool for finding the right rank for provisional players. In our environment, solidly ranked players form a nicely sorted array in which we intend to find a place and squeeze in new provisional players.

I will present the most basic example: finding a proper starting rank for a 9k player. Let us look at an ideal case first, where the player does not get to choose the challenges he makes; instead the system serves him challenges he must complete. The player always wins against lower ranks and always loses to stronger ones. I will also restrict placement to the range of 30k to 1k:

In that case the player’s first challenge will be with a 15k and he will win. Then the system will match him up with a 7k and he will lose. The next challenge will be with an 11k and he will win. The next will be with a 9k and he will either win or lose. Note that by this time we have limited his placement to ±2 of his real rank. Whether he wins or loses will put him even closer, namely within ±1 rank : )
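
The ideal case can be traced with a short sketch. It assumes plain integer halving of the remaining kyu range and, as a simplification, stops once the served opponent has the player's own rank; `placement_opponents` is a hypothetical name, not an existing OGS function.

```python
def placement_opponents(true_rank, strongest=1, weakest=30):
    """List the kyu ranks the system would serve, in order, in the ideal case.

    Ranks are kyu numbers, so a smaller number means a stronger player.
    The player is assumed to beat every weaker opponent and lose to every
    stronger one.
    """
    low, high = strongest, weakest
    opponents = []
    while low <= high:
        pivot = (low + high) // 2
        opponents.append(pivot)
        if pivot == true_rank:
            break                  # opponent of equal rank: placement found
        if pivot > true_rank:
            high = pivot - 1       # opponent is weaker, player wins
        else:
            low = pivot + 1        # opponent is stronger, player loses
    return opponents


print(placement_opponents(9))      # [15, 7, 11, 9]
```

With these assumptions the sequence of opponents matches the one in the example: 15k, 7k, 11k, 9k.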

Now, the world is not an ideal place. So here is a more complex, more lifelike example of how it could work. Again we will be placing a 9k player among solidly ranked players ranked from 30k to 1k. This time, though, he will have the freedom to challenge and be challenged by whoever he wants. Also, as the player gets more freedom, we get more data to consider: we will be taking game scores into account.

Let’s start. Note that now the games chosen by the player act as the pivot point. Let’s say that he starts a little beneath his rank, challenging an 11k player. The initial array will be the maximum possible one spanning the same amount to both sides, covering a range from 21k to 1k. He beats his opponent by 30 points, thus his new expected pivot would be the middle point of 11k to 1k, and as we are now taking score into account we will move him down by one rank for every 10 points of difference, making his new pivot (11-1) / 2 + 3, at around 8k. As we now have some idea about his strength, games against ranks far from the expected pivot will have shifted move rates. Namely, should he win a no-handicap game against a low-level kyu, it will move him up by the smallest of amounts, if any, but should he lose, it would drop him substantially, up to the midpoint between his expected rank and the kyu level he just lost against. The same goes for high-level kyu games, and so on.

In a real-life implementation of binary search we would of course have to consider more factors and apply empirical weights that work best in most cases for determining rank. For example:

Most weight (impact or meaning if you prefer) would go to live games of about 45m each, then correspondence, then blitz
Most weight would go to games near the expected rank, then handicap games, then no-handicap games
Score difference would translate linearly to weight, resignation would have a fixed and high negative impact, timing out would have a negative impact depending on game speed (little for blitz, huge for correspondence games)
Each player could have a rank solidity weight, and so on…

I hope you all enjoyed my little lecture and pls do comment and share your opinion.
Cheers!

Further reading:
Here you can find a simple yet more in-depth article about binary search:

And for a bigger and more detailed lecture, the wiki is a good place to start:

anoek · June 1, 2014, 3:49pm

It doesn’t hurt to allow rank differences higher than 9; it just doesn’t help either. I feel people looking for a ranked game tend to want a game that’s going to affect their rank. A 1d winning against a 15k, for instance, is going to have a negligible effect, so it’s not worth playing a serious ranked game; better to play an unranked teaching game.

For now yes, we are going to be implementing some stuff to better handle that scenario though.

wurfmaul · June 4, 2014, 12:59am

Hi testerq.

First, about using score differences to evaluate games:
Let’s say (edit: I forgot to write something here, and I hesitated to write something because I didn’t find really good reasons against it)

About the application of the binary search:
After the 9k wins against the 11k, do I understand correctly that the new bounds for his final rank will be 1k (because that’s the current maximum bound) and 15k (= 8+(8-1))?

I think such bounds are not desired.

Let’s say the provisional player’s “true rank” is 23k, and the start bounds are 6d (-5k) and 30k. He plays a no-handicap game against a 19k player, and wins, let’s say by a small margin. That’s unlikely, but it can happen. Then his expected rank is (-5+19)/2 = 7k, and his new bounds will be -5k and 19k.

This is already bad: there is no way he can end up as a 23k by the end of the provisional phase. I point that out here because I’m even less sure whether my continuation fits the rules you intended.

Now, he plays a game against a 7k (since I don’t know what exactly will happen when his opponent is far from his own expected rank), and wins again - even more unlikely, but possible. Maybe the 7k had to go, and resigned the game as an act of courtesy. Or maybe he thought “it’s a provisional game, it doesn’t change my rank anyway, why should I care?” and resigned (maybe, for this reason, games against provisional players should still affect the rating of solidly rated players considerably?). He shouldn’t do any of this, since it adds noise to the ranking system, no matter what the ranking system is, but a good ranking system has to be able to deal with some amount of noise. Now, if I understand right, the new expected rank of the provisional player will be (-5+7)/2 = 1k, and the new bounds will be 6d (-5k) and 7k. So even if he loses all games from now on, which is likely, he will be a solid 7k.
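
The two fluke wins can be traced with a short sketch. It assumes the reading of the rule used in this post (a win moves the weak bound to the opponent’s rank, and the expected rank is the midpoint of the bounds); `beat_opponent` is a hypothetical helper, not anything the site implements.

```python
def beat_opponent(bounds, opponent_rank):
    """Update (strong_bound, weak_bound) after the provisional player wins.

    Ranks are on the kyu scale, with dan ranks continuing downward
    (6d = -5). Under the assumed rule, a win rules out every rank weaker
    than the opponent, and the expected rank is the midpoint of the bounds.
    """
    strong_bound, weak_bound = bounds
    weak_bound = min(weak_bound, opponent_rank)
    expected = (strong_bound + weak_bound) / 2
    return (strong_bound, weak_bound), expected


true_rank = 23
bounds = (-5, 30)                                # 6d to 30k
bounds, expected_1 = beat_opponent(bounds, 19)   # fluke win against a 19k
bounds, expected_2 = beat_opponent(bounds, 7)    # fluke win against a 7k

# Can the search still reach the true rank?
reachable = bounds[0] <= true_rank <= bounds[1]
```

After the two wins the bounds are (-5, 7) and the expected ranks are 7.0 and 1.0, so `reachable` is `False`: under hard bounds, no later run of losses can bring the player back to 23k.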

So, I think there should be no bounds. Maybe I understood you wrong and you didn’t want these bounds to begin with, but otherwise the system approaches elo-style systems, and I wouldn’t call it a binary search anymore.
We might get something like the Glicko rating system.
The KGS-style system (I don’t know the appropriate name), described in the KGS rating math page (for some reason, the German version of the page is more detailed) that I mentioned in the other thread, should converge even quicker.

Still, having two separate rating systems, where the provisional rating system is more complex than the main rating system, seems awkward to me.

sefo · August 27, 2014, 12:57pm

I’m reviving this topic.

I just lost to a 15k because life was taking over during the game.
I lost 55 rating points in the process. In the game just before that against a 9k, I won 4 rating points.

So to get back to my original rank I will have to win 10 games in a row against an 8k or 50 games in a row against a ddk.

After battling to get a couple of points for each win it’s very demotivating.

This never happened on other servers. I never lost 1k because of an unlucky loss.

I have improved recently: I’m around 4-5k on KGS and 4k on Tygem, but soon to become 9k on OGS.

Something’s wrong.

I won’t be able to play in tournaments where ddks are present anymore, because shit happens and I don’t want to lose 1 k for that.

DoubleU · April 23, 2016, 5:24pm

If you ask me, perhaps large rank differences should not be disallowed in rated games, but merely have a lesser impact on the ratings. After all, some beginning players might need it.