Upcoming new OGS rating system

BHydden · July 14, 2017, 12:51pm

I was just wondering whether a player’s new Glicko-2 rating will be set from their existing Elo rating, or if their game history will be fully re-analysed under the new system to give them a brand new rating completely distinct from their Elo rating?
Or, perhaps some combination of the two?

anoek · July 14, 2017, 12:59pm

The game history is being fully re-analyzed.

BHydden · July 14, 2017, 1:10pm

*impressed whistle* Your servers must be looking forward to that! Hope it all goes smoothly for you.

BHydden · July 14, 2017, 1:18pm

Almost half a million users have played almost 10 million games… guessing that might take a little while to /go/ through?

meili_yinhua · July 14, 2017, 6:12pm

Sounds like it’ll take a few weeks to update, but it doesn’t sound like they’ll need to take down the site.

Good on ya’ for doing Glicko-2, though. I quite enjoy the system myself.

KillerDucky · July 14, 2017, 6:50pm

Mekriff, do you use a site that uses Glicko-2? Which site?

meili_yinhua · July 14, 2017, 8:17pm

It’s more that I’ve toyed around with it myself (largely by hand, but I have written programs for it) and have been pleased with the results.

BHydden · July 15, 2017, 2:03am

Will the new rating system be keeping the timeout setting, such that multiple consecutive correspondence timeouts do not cripple your rank?

Toadofsky · July 15, 2017, 10:30am

@anoek What is the rating period OGS will use? Mark Glickman recommends:

The Glicko-2 system works best when the number of games in a rating period is moderate to large, say an average of at least 10-15 games per player in a rating period. The length of time for a rating period is at the discretion of the administrator

zendo · July 15, 2017, 3:47pm

I believe chess.com uses Glicko-2.

aulavik · July 15, 2017, 11:39pm

I’m trying to modify a Chess Rating Management System for personal Go Club use. The web app currently uses USCF formulas to calculate players’ ratings. I want to tweak the web app to use glicko2-php.

I was wondering if the developers could share some insight into how you will set up the Glicko-2 system on OGS?

What will be the initial ratings, r, and rating deviation, rd, for new unrated players?
What will the system constant, tau, be? Could you shed some light on how you decided on this?
How will you correlate the rating with the kyu / dan levels?

The reason I want to do this is that some Go club members play Go IRL but do not play online. This would at least give them a way to track their progress against a baseline such as OGS.

I could use the Excel-based Glicko-2 Calculator, but I would still have to manually keep a record of members’ information.

Anyways, I hope the developers could shed some light… And also tell me if what I’m doing is overkill.

meili_yinhua · July 16, 2017, 4:04am

According to Glickman’s paper, r and RD should be preset to 1500 and 350 by default. However, I’m sure the seeding for rating will remain the same (this rank = this rating).

Given this is a game of perfect information, tau can be reasonably large, but Glickman recommends choosing tau by testing which value gives the best fit on the sample data (whatever will be used for that).

The kyu/dan level rating will likely remain the same, with each rank corresponding to ~100 Elo, with 0 Elo being the boundary between 21k and 22k.
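
To make that mapping concrete, here is a minimal Python sketch. It assumes exactly 100 points per rank with 0 as the 21k/22k boundary, as described above; the function name `rating_to_rank` is made up for illustration and this is not necessarily how OGS computes it.

```python
import math

def rating_to_rank(rating: float) -> str:
    """Map a rating to a kyu/dan label.

    Assumes 100 points per rank and that 0 is the boundary between
    21k and 22k, so ratings in [0, 100) are 21k.
    """
    kyu = 21 - math.floor(rating / 100)
    if kyu >= 1:
        return f"{kyu}k"
    # kyu == 0 corresponds to 1d, kyu == -1 to 2d, and so on.
    return f"{1 - kyu}d"

print(rating_to_rank(1500))  # 6k
print(rating_to_rank(800))   # 13k
print(rating_to_rank(2200))  # 2d
print(rating_to_rank(-900))  # 30k
```

Under this assumption the 1d boundary falls at 2100.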

I know I’m not a developer, but I hope this is useful anyway.

mlopezviedma · July 16, 2017, 4:52am

aulavik · July 16, 2017, 5:58am

Yes, I read about the preset.

So, according to kyu/dan level rating explained here, if r and RD are preset to 1500 and 350, respectively, does that mean the default corresponds to 6k?

Also, if a club member has never played Go but has learned the rules, can we assume that she can be given rank 30k, r = -900, RD = 350? And, since we don’t want the player to go lower than 30k, does it make sense to reset the player’s r to -900 if it goes below -900 (e.g. -1000)?

mlopezviedma · July 16, 2017, 2:17pm

Giving r = 1500 and SD = 350 means “There’s a 95% chance that this user’s real rating is between 800 (13k) and 2200 (2d)”, which is a reasonable supposition when you have no idea about that user’s rating.

But, if you can evaluate a user’s rating approximately, I would say it is wise to set his rating manually with a lower SD, say, 150. That way you are saying “There’s a 95% chance that this user’s real rating is between 3 stones stronger and 3 stones weaker than the rating you assigned”.
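
The arithmetic behind both statements is just the rating plus or minus two deviations. A small Python sketch, assuming the usual "two standard deviations is roughly 95%" approximation and about 100 points per stone (the helper name `interval_95` is made up for illustration):

```python
def interval_95(rating, rd):
    """Approximate 95% interval: rating plus or minus two rating deviations."""
    return rating - 2 * rd, rating + 2 * rd

print(interval_95(1500, 350))  # (800, 2200)
print(interval_95(1500, 150))  # (1200, 1800)
```

With a deviation of 150 the interval is 300 points either side, which is about 3 stones at roughly 100 points per rank.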

aulavik · July 17, 2017, 10:11am

According to the paper:

To apply the rating algorithm, we treat a collection of games within a “rating period” to have occurred simultaneously. A rating period could be as long as several months, or could be as short as one minute.

The Glicko system works best when the number of games in a rating period is moderate, say an average of 5-10 games per player in a rating period. The length of time for a rating period is at the discretion of the administrator.

So my question is: why is it better to process the games as a collection instead of individually? I understand that if a player is in a tournament, the player’s games should be processed as a collection. However, aside from tournaments, what is the benefit?

(I have a feeling I’m hijacking the thread)

mlopezviedma · July 17, 2017, 10:16am

It’s not better or worse, it’s just another way of modelling how reality works. In terms of strength, you can regard a recent set of games as having been played simultaneously, and the formulas are designed for exactly that.

In fact we can say that Glicko allows both models to live together within the same system: You can regard each game played as a period where the two players involved are the only ones existing in the system (so treating games individually); and you can also treat a set of tournament games as a single period.
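
To illustrate both models, here is a rough Python sketch of a single-player update following the steps in Glickman's Glicko-2 paper. The function name `update`, the default `tau=0.5`, and the starting volatility of 0.06 are assumptions borrowed from the paper's worked example, not values OGS has announced.

```python
import math

SCALE = 173.7178  # conversion factor between Glicko and Glicko-2 scales

def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)

def update(r, rd, sigma, games, tau=0.5, eps=1e-6):
    """Update one player over one rating period.

    games: list of (opp_rating, opp_rd, score), score being 1, 0.5 or 0.
    Returns (new_rating, new_rd, new_volatility).
    """
    mu, phi = (r - 1500.0) / SCALE, rd / SCALE
    if not games:
        # No games this period: only the deviation grows.
        return r, SCALE * math.sqrt(phi ** 2 + sigma ** 2), sigma
    v_inv, total = 0.0, 0.0
    for opp_r, opp_rd, score in games:
        mu_j, phi_j = (opp_r - 1500.0) / SCALE, opp_rd / SCALE
        e = 1.0 / (1.0 + math.exp(-g(phi_j) * (mu - mu_j)))
        v_inv += g(phi_j) ** 2 * e * (1.0 - e)
        total += g(phi_j) * (score - e)
    v = 1.0 / v_inv
    delta = v * total

    # Solve for the new volatility with the paper's iterative procedure.
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        return (ex * (delta ** 2 - phi ** 2 - v - ex)
                / (2.0 * (phi ** 2 + v + ex) ** 2)) - (x - a) / tau ** 2

    A = a
    if delta ** 2 > phi ** 2 + v:
        B = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    new_sigma = math.exp(A / 2.0)

    phi_star = math.sqrt(phi ** 2 + new_sigma ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    new_mu = mu + new_phi ** 2 * total
    return SCALE * new_mu + 1500.0, SCALE * new_phi, new_sigma

games = [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]

# Model 1: all three games form one rating period.
print(update(1500, 200, 0.06, games))

# Model 2: each game is its own rating period.
state = (1500, 200, 0.06)
for game in games:
    state = update(*state, [game])
print(state)
```

The batch call uses the inputs from the paper's worked example, so it should land near the 1464.06 and 151.52 reported there. The game-by-game loop gives a different result because the deviation shrinks after every game. It also keeps the opponents' ratings fixed, which is a simplification.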

KevF · July 17, 2017, 1:19pm

Sorry if this is slightly off topic.

Maybe I am just being egotistical but I like being able to go to my rank graph to see what effect a particular game had on my rank. I like being able to see that a particular win/loss gave a particular change in rank/Elo. There is a clear cause/effect in my mind.

However, when games are processed in batches, then this isn’t so clear cut. I can’t see what effect a particular win/loss will have. I know this is the same in KGS but I really dislike that there as well.

BHydden · July 17, 2017, 1:31pm

I completely agree. I’m the same all the way.

aulavik · July 17, 2017, 1:35pm

@KevF My questions to the OGS Team have nothing to do with how they are implementing Glicko 2 on OGS. It’s more for my own understanding, since I’m tweaking some code to use Glicko 2 for my own use.

What @mlopezviedma explained about processing the games as a collection or as individual games is just a feature of Glicko 2. It can do both.

Also, based on what I read, processing the games as a collection is usually done for tournaments, and it usually has something to do with matching players of comparable strength.

Anyways, I’m assuming that if and when they do make any changes, you will still see your progress on some graph.