OGS Ratings Revisited


#1

It’s been over 2 years since Wulfenia’s thread and 3 years since mlopezviedma’s article on OGS Ratings and Ranks. As we still get questions in chat about ratings and ranks, I thought this might be a good time to help newcomers by revisiting and clarifying the OGS rating system.

  1. The OGS rating system is quite similar to the EGF rating system in its use of Elo, except in how it handles 9x9 and 13x13 games. While OGS accommodates other board sizes, only 19x19, 13x13 and 9x9 games can be given ‘ranked’ status. For the calculation of rating points given/taken, a 13x13 result is weighted 1/2 and a 9x9 result 1/4 of a 19x19 result.

  2. There is no difference in the amount of rating points given for a win by 100.5 points, by 0.5 points, by resignation, by moderator decision, or by timeout.

  3. Rating points are calculated only once, at the end of a game, and the players’ ratings/ranks used in the calculation are determined at that same moment. As a player’s rating can change during a game (especially in long correspondence games), we cannot determine the expected rating point gain/loss at the time we accept a challenge; the calculation is accurate only at the end of the game. It is not uncommon for a player’s rank to jump up or down significantly by moderator decision, especially for new players.

  4. Overall rating/rank is not the average of the other ranks. Instead, Overall is calculated from every ranked game we play, regardless of time setting, using the opponent’s Overall rating.
    When we win 1 Blitz, 1 Live and 1 Correspondence game, Overall gains 3 games’ worth of Elo points, while each time-class rating gains 1 game’s worth.

  5. When we are moving up in the ranks, we are on average winning more games than we lose, and this causes Overall to move up quicker than the individual time-class ratings. If we are moving down in the ranks, Overall moves down quicker. As the majority of players are getting stronger as time goes by (right?), our opponents’ Overall ratings tend on average to be higher than their other ratings, which further helps Overall move up quicker than the others.

  6. As explained in #3 above, the OGS rating calculation is not done periodically. So how well our opponents do after the game has no effect on our ratings, and our own periods of inactivity have no effect on our rating/rank either.
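
To make point 4 concrete, here is a tiny bookkeeping sketch. It only counts how many rating updates each rating receives; the function name `record_ranked_game` and the rating labels are invented for illustration and are not taken from the OGS code.

```python
from collections import Counter


def record_ranked_game(update_counts, time_class):
    """Count one ranked game: Overall and the game's own time class both update."""
    update_counts["overall"] += 1
    update_counts[time_class] += 1


# One ranked game in each time class, as in the example in point 4.
counts = Counter()
for time_class in ("blitz", "live", "correspondence"):
    record_ranked_game(counts, time_class)

print(dict(counts))
```

Overall ends up with three updates and each time-class rating with one, which is why Overall tends to move faster than the others.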

If you have questions, we can discuss them here, or there is the Help chat channel as well. Have a happy Go life.


#2

#3

I used to think that the different ratings for blitz/live/correspondence were a good thing, but now I really hate it.

I occasionally start a handicapped Go tournament. Inevitably there are disappointed participants giving huge handicaps to people whose correspondence rating lags their actual skill. Most recently, I asked a mod to fix some problems before the tournament started.

I’m worse at live games than correspondence and worse at blitz games than live. But most people are. If a poll were started today, I’d vote for just one rating.


#4

Personally, I agree: I don’t like ratings being separated by game speed. As you get stronger in live games, you’d need to keep asking a mod to sync up your other speeds so you don’t unintentionally sandbag people when you do play those other speeds or enter handicap games/tournaments.

I wouldn’t mind different ratings based on board size, so you could do rated 9x9 events and not damage your 19x19 rank, for instance. (Or inflate your 19x19 rank higher than it should be just because you can win a lot at 9x9.)


#5

Well, maybe it would make more sense to change the weighting for correspondence and blitz games (as it is with 13x13 and 9x9 games) instead of re-unifying rank?

Online I play correspondence only, and while I have by now scratched at 9k, I don’t really feel comfortable saying that I’m an SDK (also because I use analysis extensively).


#6

OGS used to have only one rating, which carried over to the present day as Overall, and there has been no change to how it is calculated afaik. The time-class based ratings have been described as an ‘experiment’ by the developers.

The fact 13x13 and 9x9 games have always been a part of our rating might imply that the importance placed on these sizes was a part of the plan to build the initial OGS character/differentiator, which may have a lot to do with the reputation that OGS is the Go server for beginners.

Now that OGS has been successful to the point it is the largest Go server outside of China/Korea/Japan, it may be time to reconsider this in order to attract stronger players at the expense of possibly alienating some 9x9 only and 13x13 only players.

The single/multi rating issue might be less important than this critical decision for OGS to grow further.


#7

I don’t think 9x9 or 13x13 being rated or not makes any difference to strong players, or really anyone. If you don’t want 9x9 or 13x13 to affect your rating just don’t play rated 9x9 or 13x13.

I am in favor of getting rid of the different ratings for different time controls. First, it’s just simpler that way and removes confusion. Second, the idea that it’s more accurate because someone is better at correspondence than blitz is totally overshadowed by the opposite case where the ratings are far less accurate because people don’t play enough games to keep 3 separate ratings up to date.


#8

I have floated this suggestion multiple times in the past. It has so far failed to gain traction with the developers in charge. Maybe this time is different :slight_smile:

You can help the cause along by contributing your vote to this feature request on UserVoice.


#9

That’s why I wrote it might be a good idea to have Blitz and Correspondence influence the overall rank less than “normal” live games do … but I see how it may be hard to find the appropriate factor … but then again it will also be hard to compare exclusive correspondence players’ ranks to exclusive live players’ ranks … maybe just assume that the live rank of an exclusive corr. player is two kyu weaker?


#10

Voting closed. Issues moved to GitHub.


#11

Hello Tokumoto,

You wrote:

Is the amount of given rating points to the winner always equal to the amount of withdrawn rating points from the loser?

It would look very strange to me if it were not equal, but I prefer to ask :slight_smile:

Thanks

Arnaud


#12

OGS uses a slightly modified Elo system with the injection of additional rating points at lower ranks that tapers off as you increase in rank. This is because Elo is designed to be zero-sum, so adding new players causes rating depression without such a modification. So, the answer to your question is ‘yes and no’ depending on your current rank.


#13

Arnaud,

Let’s assume Player A, with a Live rating of 767, wins a ranked Live 19x19 game against Player B, with a Live rating of 1000.

New Rating = Old Rating + kFactor x (Result - ExpectedResult)
where Result is 1.0 for a win, 0.5 for Jigo, 0.0 for a loss
ExpectedResult is the win probability from 0.0 to 1.0 as defined in the EGF rating calculation.
(Player A’s win probability against someone 1000 - 767 = 233 rating points stronger is about 20.0% = 0.20; Player B’s win probability against someone 233 rating points weaker is about 0.818. This asymmetry reflects the fact that stronger players tend to have a more stable performance, with fewer “upset” victories or defeats, as stated on the page.)

OGS kFactor = 122 - (6 x R1) + (R1 x R1 / 15)
where R1 is (rating / 100)
(kFactor is 1/2 of the above 19x19 formula for 13x13, 1/4 for 9x9)

Player A live rating is 767, R1 = 7.67
so kFactor (Player A) is 122 - (6 x 7.67) + (7.67 x 7.67 / 15) = 79.90
kFactor (Player B) is 122 - (6 x 10.0) + (10.0 x 10.0 / 15) = 55.33

New Rating (Player A) = 767 + 79.9 x (1.0 - 0.2) = 830.92 (net +63.92)
New Rating (Player B) = 1000 + 55.33 x (0.0 - 0.818) = 945.49 (net -54.51)

I hope I didn’t make mistakes in the above calculations.


#14

Hi, I am a C programmer. I translated “the calculation” given by @Tokumoto into C (I know Python as well), and I see that it is always correct for the winner but always wrong for the loser. I tested the calculation only on flash games, so a change in a player’s rating during the game is highly improbable.
Are you sure that the calculation is correct?


#15

dreffuz,

Thanks for pointing that out. I’m not sure how OGS production server ver.5 implements it. This would be best answered by @matburt or @anoek, but I’d hate to have them spend their time on something that does not improve the way OGS works.

I checked several actual examples and some came up correct for the loser, but not for the winner, which is the opposite of what you described. And here is an unproven theory:

When the rating increase for the winner is +5.5 and the decrease for the loser is -3.1, add half of the difference in absolute values to the smaller side (smaller in absolute value). In this case 5.5 - 3.1 = 2.4, 2.4 / 2 = 1.2, 3.1 + 1.2 = 4.3, so the winner gets +5.5 and the loser gets a -4.3 deduction.

In the case of +2.9 and -4.8, 4.8 - 2.9 = 1.9, 1.9 / 2 = 0.95, 2.9 + 0.95 = 3.85
so the winner gets +3.85 and loser gets -4.8

This hypothesis seems to fit my spotty examples. How does adding this adjustment to your program work?
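
For anyone who wants to try it, the hypothesis above can be written as a small function. This is only a sketch of the unproven theory in this post, not the actual OGS server code; the function name `half_difference_adjust` and its arguments are made up for illustration.

```python
def half_difference_adjust(winner_gain, loser_loss):
    """Apply the unproven 'half the difference' theory from this post.

    Both arguments are absolute values of the raw Elo changes.
    The smaller one is raised by half of the gap; the larger is unchanged.
    Returns (adjusted_winner_gain, adjusted_loser_loss).
    """
    half_gap = abs(winner_gain - loser_loss) / 2
    if winner_gain < loser_loss:
        return winner_gain + half_gap, loser_loss
    return winner_gain, loser_loss + half_gap


# The two examples from this post (hypothesis only, not confirmed OGS behaviour).
print(half_difference_adjust(5.5, 3.1))  # the loser's side is raised
print(half_difference_adjust(2.9, 4.8))  # the winner's side is raised
```

The first call should come out at about (5.5, 4.3) and the second at about (3.85, 4.8), up to floating-point rounding.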


#16

Hello,

Thanks for the details you gave, @tokumoto :slight_smile: Whatever the debate that’s following now :slight_smile:
I keep on reading :slight_smile:

Arnaud


#17

@Tokumoto, it seems to me that there is an error in your calculation of the kFactor for player B:
122 - (6 x 10.0) + (10.0 x 10.0 / 15) = 68.66, not 55.33
so the new rating for B is 943.83, not 945.49


#18

Thank you. You’re right.

kFactor (Player B) is 122 - (6 x 10.0) + (10.0 x 10.0 / 15) = 68.66
New Rating (Player B) = 1000 + 68.66 x (0.0 - 0.818) = 943.83 (net -56.17)

Using the adjustment, this will result in
Player A addition 63.92
Player A New Rating = 830.92
63.92 - 56.17 = 7.75, 7.75 / 2 = 3.875, 56.17 + 3.875 = 60.045
Player B deduction -60.045
Player B New Rating 1000 - 60.045 = 939.955
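
The whole worked example can be reproduced with a short script. This is a sketch under the assumptions made in this thread: the kFactor formula and board-size weights are the ones quoted above, the expected results (0.20 and 0.818) are simply taken from the earlier post rather than computed from the EGF formula, and the half-difference adjustment is still only a hypothesis. The names `k_factor`, `rating_change` and `BOARD_WEIGHT` are invented for illustration.

```python
# Board-size weights as described in this thread (13x13 = 1/2, 9x9 = 1/4).
BOARD_WEIGHT = {19: 1.0, 13: 0.5, 9: 0.25}


def k_factor(rating, board_size=19):
    """kFactor = 122 - 6*R1 + R1*R1/15, with R1 = rating / 100."""
    r1 = rating / 100
    return (122 - 6 * r1 + r1 * r1 / 15) * BOARD_WEIGHT[board_size]


def rating_change(rating, result, expected, board_size=19):
    """kFactor x (Result - ExpectedResult); result is 1.0, 0.5 or 0.0."""
    return k_factor(rating, board_size) * (result - expected)


# Expected results are copied from the earlier post, not computed here.
gain_a = rating_change(767, 1.0, 0.20)      # Player A wins
loss_b = -rating_change(1000, 0.0, 0.818)   # Player B loses (kept positive)

# Unproven adjustment: raise the smaller side by half of the difference.
half_gap = abs(gain_a - loss_b) / 2
if gain_a < loss_b:
    gain_a += half_gap
else:
    loss_b += half_gap

print(f"Player A new rating: {767 + gain_a:.2f}")
print(f"Player B new rating: {1000 - loss_b:.2f}")
```

Run as is, this should give about 830.92 for Player A and about 939.95 for Player B, matching the hand calculation up to rounding of the intermediate values.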