2021 Rating and rank adjustments

BHydden · March 21, 2021, 12:19am

but then we have to waste server time re-calculating every single abandoned account’s new uncertainty every year… not worth the trade IMO

GreenAsJade · March 21, 2021, 12:21am

Whether it is or it isn’t, I’d certainly prioritise fixing “rating decreases after a win” ahead of this thing…

… I’m not really clear why this isn’t a burning issue to resolve?

square_fuseki · March 21, 2021, 12:23am

I already told

BHydden · March 21, 2021, 12:24am

I don’t know how it’s still a thing at all, fixing that was the primary focus of the last rating change

BHydden · March 21, 2021, 12:25am

Oh I must have overread those parts, sorry.

Adding some kind of generic mark next to old ratings should certainly be much more achievable than recalculating all ranks every x time periods.

GreenAsJade · March 21, 2021, 12:46am

The bug is still open… not sure why we would think it is not a thing? (That bug was raised after the recent change)

BHydden · March 21, 2021, 1:03am

I’m not doubting that it is, just confused as to how: seeing as we’re now on a single-game window, it should be impossible…

GreenAsJade · March 21, 2021, 4:27am

What I don’t understand is why it doesn’t have more attention, since it would seem to say that we can’t trust the rating code yet.

But maybe everyone is just all tuckered out from rating changes…

BHydden · March 21, 2021, 4:34am

Maybe it’s still the same kind of problem but on a much smaller scale? If it’s 1 in 1,000 instead of 1 in 10 (Numbers made up) then you would naturally expect less noise

GreenAsJade · March 21, 2021, 4:37am

Actually, there was a reasonable amount of noise initially about the problems in that ticket, and also the apparent humble rank problem(s).

I think everyone just got tired out chasing these and/or accustomed to their effects (for example, high-ranking beginners on the site).

meili_yinhua · March 21, 2021, 5:25am

This comes from the assumption that glicko-2 is a well-behaved model: that adding a win to a ratings period would increase the rating relative to the same period without that win. I don’t remember exactly if there was any proof of that, but I could get on it.

meili_yinhua · March 21, 2021, 5:39am

I realize this wording is not clear. What I mean is: given a set of games G that have already occurred in the ratings period, a starting rating m, a starting RD p, and a starting volatility v, there is a glicko-2 update f(G,m,p,v). The assumption here is that for a set G’, which is just like G but with another win (i.e. a game with score 1, which will always be higher than the expected score), f(G’,m,p,v) >= f(G,m,p,v), where m, p, and v are the same in both cases. Under the sliding window, this didn’t matter because you didn’t have fixed m, p, and v: they could shift between games, causing strange behavior. However, under the current system, m, p, and v will be the same for every game within a ratings period, and as such the inequality of the assumption matters.

The reason I think a proof might matter is that glicko-2 as we know it is a computationally cheaper approximation of the outputs of the mathematical model behind glicko-2, so if the math checks out for the model, this could simply be a weakness in the approximation.
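For anyone who wants to hunt for a counterexample numerically, here is a minimal sketch of f(G,m,p,v) following the steps in Glickman's published Glicko-2 description. It is not the OGS code; the function names, the default tau = 0.5, and the tolerance are assumptions chosen for illustration. Note that `var` below is Glickman's estimated variance (he calls it v), which is a different quantity from the volatility v in the post above (here called `vol`).

```python
import math

SCALE = 173.7178  # Glicko-2 conversion factor between rating points and the internal scale


def _g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def _expected(mu, mu_j, phi_j):
    return 1.0 / (1.0 + math.exp(-_g(phi_j) * (mu - mu_j)))


def glicko2_update(games, rating, rd, vol, tau=0.5, eps=1e-6):
    """games: list of (opp_rating, opp_rd, score) for one ratings period.
    Returns (new_rating, new_rd, new_vol). tau and eps are assumed defaults."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    if not games:
        # No games in the period: only the deviation grows.
        return rating, SCALE * math.sqrt(phi * phi + vol * vol), vol
    opp = [((r - 1500.0) / SCALE, d / SCALE, s) for r, d, s in games]
    exp_scores = [_expected(mu, mj, pj) for mj, pj, _ in opp]
    var = 1.0 / sum(_g(pj) ** 2 * e * (1.0 - e)
                    for (_, pj, _), e in zip(opp, exp_scores))
    surprise = sum(_g(pj) * (s - e) for (_, pj, s), e in zip(opp, exp_scores))
    delta = var * surprise

    # Volatility update: solve f(x) = 0 with the Illinois variant of regula falsi.
    a = math.log(vol * vol)

    def f(x):
        ex = math.exp(x)
        return (ex * (delta ** 2 - phi ** 2 - var - ex)
                / (2.0 * (phi ** 2 + var + ex) ** 2)
                - (x - a) / tau ** 2)

    A = a
    if delta ** 2 > phi ** 2 + var:
        B = math.log(delta ** 2 - phi ** 2 - var)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    new_vol = math.exp(A / 2.0)

    phi_star = math.sqrt(phi ** 2 + new_vol ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / var)
    new_mu = mu + new_phi ** 2 * surprise
    return SCALE * new_mu + 1500.0, SCALE * new_phi, new_vol


def extra_win_gain(games, rating, rd, vol, win_opp=(1500.0, 100.0)):
    """Rating from G' (G plus one win) minus rating from G, same m, p, v.
    The opponent in win_opp is a hypothetical example value."""
    base = glicko2_update(games, rating, rd, vol)[0]
    more = glicko2_update(games + [(win_opp[0], win_opp[1], 1.0)], rating, rd, vol)[0]
    return more - base
```

A negative return value from `extra_win_gain` for some G, m, p, v would be exactly the kind of counterexample discussed here: f(G’,m,p,v) < f(G,m,p,v) with identical starting parameters.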

GreenAsJade · March 21, 2021, 6:18am

That would be interesting - up till now I’d just assumed it was a code bug

meili_yinhua · March 21, 2021, 6:30am

It might still be, but I can’t remember a proof that the approximation has this property, so there’s that. If someone gets a calculator on the scene and produces a counterexample, that certainly settles the question of proving it, as a proof is then clearly impossible.

UPDATE: I have worked out that a decrease after a win can happen only if the rating was decreasing over the ratings period and the output volatility has gone up relative to the calculation for the previous game (which is unlikely, but I don’t know if there is 0 chance this happens). Note that a decrease does not happen every time these conditions hold (working with the stronger version is more complicated). Now to try to use calculus tricks on the sigma updating formula… (EDIT: wait, this happens quite frequently, so I may need to use a stronger version to try to prove it)

flovo · March 21, 2021, 7:48am

It’s part of glicko, but not OGS’s implementation of it. Deviation is not adjusted after long periods of inactivity.

BHydden · March 21, 2021, 7:53am

Oh weird! Why is that? Did anoek say?

meili_yinhua · March 21, 2021, 7:57am

It’s part of the code: it only does recalculations when a game happens, and there is a bit of code to handle the deviation recalculation prior to the rest. Technically glicko itself says to do this update every ratings period, but the infrastructure does not seem to lend itself to that (also, the current implementation avoids random spikes in processing for every unused account each week, which is the current length of a ratings period). Now, this is not (strictly) what the user is asking for: he just wants a sort of tag indicating that enough ratings periods have passed that the RD is no longer accurate. But it’s what I mentioned in my response, because that’s what I thought he was asking.
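To make the "recalculate the deviation only when a game happens" idea concrete, here is a rough sketch of lazy RD inflation using the rule from Glickman's Glicko-2 description (for each ratings period without games, phi becomes sqrt(phi^2 + sigma^2)). The function name, the `idle_periods` parameter, and the 350 cap are assumptions for illustration; this is not taken from the OGS code.

```python
import math

SCALE = 173.7178  # Glicko-2 conversion factor between RD points and the internal scale


def inflate_rd(rd, vol, idle_periods, max_rd=350.0):
    """Grow the deviation for `idle_periods` ratings periods without games.

    Applying phi <- sqrt(phi^2 + vol^2) once per idle period with an
    unchanged volatility is the same as adding idle_periods * vol^2 in one
    step, so this can be done in a single call right before the next game
    instead of touching every idle account each period.
    max_rd is an assumed cap (350 is the usual RD of an unrated player).
    """
    phi = rd / SCALE
    phi = math.sqrt(phi * phi + idle_periods * vol * vol)
    return min(SCALE * phi, max_rd)
```

Something like `inflate_rd(rd, vol, weeks_since_last_game)` could also drive the "stale rating" tag: compute it only when a profile is displayed and show the mark if the result is well above the stored RD.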

flovo · March 21, 2021, 9:17am

The algorithm in the repo does not adjust the deviation between games: goratings/analysis at master · online-go/goratings · GitHub

benjito · March 21, 2021, 2:37pm

I mean technically, we don’t even have a ratings period! In the past it’s just been a fixed number of games (15?), but even that has been done away with.

I believe the intended use of the rating period in Glicko was to have everyone on the same rating cycle (e.g. 1 month or 1 tournament), and ratings calculated in bulk at the end of each period. Obviously that doesn’t work so well for this use case since everyone is on a different timeline, but I just thought I’d point that out.

meili_yinhua · March 21, 2021, 8:24pm

In the most recent update of OGS glicko (the one of this thread), we got a 1 week long ratings period. While it may not seem like that, the effect is the same: it takes the rating parameters from the beginning of the period and calculates all the games since the start of it. Each time you play a game, it is added to the list for “this ratings period”, and your “current rating” is just a calculation of what your rating would be at the end of the period if you added no more games to it.
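As a sketch of that bookkeeping (not the actual OGS implementation; all names here are made up for illustration), the period keeps the starting parameters fixed and recomputes the "current rating" from the full list of games each time. The `toy_update` function is a deliberately simple stand-in so the example runs on its own; a real glicko-2 update with the same shape would be plugged in instead.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Params = Tuple[float, float, float]  # (rating, RD, volatility)
Game = Tuple[float, float, float]    # (opponent rating, opponent RD, score)
UpdateFn = Callable[[List[Game], Params], Params]


@dataclass
class RatingPeriod:
    base: Params                                   # parameters at the start of the period
    games: List[Game] = field(default_factory=list)

    def add_game(self, game: Game) -> None:
        self.games.append(game)

    def current(self, update: UpdateFn) -> Params:
        # Always recomputed from the fixed base and every game so far.
        return update(self.games, self.base)

    def close(self, update: UpdateFn) -> "RatingPeriod":
        # The end-of-period result becomes the base of the next period.
        return RatingPeriod(base=self.current(update))


def toy_update(games: List[Game], base: Params) -> Params:
    """Stand-in for a real rating update: +5 per win, -5 per loss."""
    rating, rd, vol = base
    return (rating + 10.0 * sum(score - 0.5 for _, _, score in games), rd, vol)
```

The point of the design is that `base` never changes inside a period, so every game in the period is evaluated against the same starting rating, RD, and volatility.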