2020 Rating and rank tweaks and analysis

Is this correct?

New Rating = Rating 15 games ago + Magic(last 15 games, Rating and Deviation and Volatility 15 games ago)
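To make the shape of that formula concrete, here is a rough sketch. It is not the OGS code: `magic` is a hypothetical name, and for "Magic" it substitutes a plain Glicko-1 rating-period update, which ignores the volatility term that Glicko-2 adds. The base rating, RD and opponents are made-up numbers.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Discount for an opponent whose own rating is uncertain (large RD)."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected result (between 0 and 1) against one opponent."""
    return 1 / (1 + 10 ** (-g(opp_rd) * (rating - opp_rating) / 400))


def glicko1_period(rating, rd, games):
    """One Glicko-1 rating-period update.

    games: list of (opp_rating, opp_rd, score), score = 1, 0.5 or 0.
    Returns (new_rating, new_rd).
    """
    if not games:
        return rating, rd
    info = 0.0      # sum of g^2 * E * (1 - E)
    surprise = 0.0  # sum of g * (score - E)
    for opp_rating, opp_rd, score in games:
        e = expected_score(rating, opp_rating, opp_rd)
        info += g(opp_rd) ** 2 * e * (1 - e)
        surprise += g(opp_rd) * (score - e)
    denom = 1 / rd**2 + Q**2 * info
    return rating + (Q / denom) * surprise, math.sqrt(1 / denom)


def magic(base_rating, base_rd, last_games):
    """Hypothetical stand-in for 'Magic': rating change over the window."""
    new_rating, _ = glicko1_period(base_rating, base_rd, last_games)
    return new_rating - base_rating


# Assumed example: base state 15 games ago, then 10 wins and 5 losses
# against even opponents.
base_rating, base_rd = 1500.0, 90.0
last_15 = [(1500.0, 80.0, 1)] * 10 + [(1500.0, 80.0, 0)] * 5
new_rating = base_rating + magic(base_rating, base_rd, last_15)
print(round(new_rating, 1))
```

The point is only that the whole window is re-evaluated from the base state each time, rather than the latest game being added on top of the previous displayed rating.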

8 Likes

This was all making sense

Now I’m completely confused again!

Is it right to think of each game as contributing an amount of points (or taking them away if you lost), so that what matters for your change in rank after any given game is the number of points you earned (or lost) from your most recent game compared to the points earned (or lost) from the game that fell out of your window?

So if you won against someone a bit weaker and earned say 3 points but the game that fell out of the window was a glorious triumph against a much stronger opponent that earned you 10 points, you’ll be 7 points down due to your win.
If it’s something else I’ll have to give up trying to understand


1 Like

I treat it more like a vector, but even as that, it still depends on your rating, RD, and volatility at the beginning of a ratings period exactly how much effect it has

yep

4 Likes

Yeah, a game contributes an amount of points given your rating (dev, vol) 15 games ago. Since this previous rating changes from game to game, your expected performance for the game changes too, so it’ll give a different amount each time.
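A small sketch of why the same game is worth a different amount from a different base. It uses the Glicko-style expected score and the per-game term g * (score - expected) that gets summed over the window. `surprise` is a hypothetical helper name, the ratings are invented, and the final point value in a real system also depends on RD and volatility.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def g(rd):
    """Discount for an opponent with an uncertain rating."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(base_rating, opp_rating, opp_rd):
    """Expected result as seen from the base rating."""
    return 1 / (1 + 10 ** (-g(opp_rd) * (base_rating - opp_rating) / 400))


def surprise(base_rating, opp_rating, opp_rd, score):
    """Per-game term summed over the window: g * (actual - expected)."""
    return g(opp_rd) * (score - expected_score(base_rating, opp_rating, opp_rd))


# The same win over the same opponent (1500, RD 60), seen from two bases.
from_low_base = surprise(1400, 1500, 60, 1)
from_high_base = surprise(1600, 1500, 60, 1)
print(from_low_base, from_high_base)
```

Seen from the lower base the win is more of an upset, so its term is larger. Once the base moves forward by a game, every game still in the window gets re-scored like this.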

3 Likes

The last game is not a defeat - Last 15 games are wins, including the last one. The 16th game (from last) is a defeat, in my scenario. So, the period was 1 defeat, then 14 wins. After the last win, the period becomes 15 wins in a row.

L-W-W-W-…-W (all against hypothetical even opponents within the last 90 days).

From what I understand, when you play the game that pushes the defeat out of the window and the last 15 games all become wins, the rank change will be (almost) neutral, because each of the last 15 games will have a x1 effect on the rank change from the base rating (which doesn’t include the previous x1 effects those games had), and the dropped game will have a x14 effect when it becomes part of the base rating.

1 Like

Just to chime in since it seems like OGS transitioned to this system to try to make the rating system more understandable and still has many confused users wondering why winning a game appeared to make their rating go down or vice versa:

If you want to stick with an “average 15 game”-ish lookback for reasons of balancing stability with allowing the rating to remain responsive to new data, how about using a 100-game lookback window but weighting the games in the window exponentially, such that the most recent game has weight 1, the next game has weight (14/15), the next game has weight (14/15)^2, and so on. The total weight will be about 15, so each new game, at the time it gets added, has exactly as much relative weight as it does now in terms of updating the rating: one fifteenth of the total weight.

But an exponentially weighted moving window has the property that if you append a data point to the front that has a higher value than the average value within the window so far, it is guaranteed that the new average will not decrease, and similarly it will not increase if you append a data point lower than the current average. Intuitively, this is because you are simply downweighting everything you had before proportionally a little before you add in the new value; there is no sharp “cliff” that games can fall off of and suddenly stop counting at some arbitrary number of games back. At least, this is true for arithmetic sums. I haven’t checked if it holds if you push the values through Glicko math or whatever, but it probably shouldn’t be hard to get such a guarantee.

Actually, technically, for this to work you’d need an infinite lookback, rather than 100, but 100 should be good enough since (14/15)^100 ~= 0.001, so the “cliff” at the end is negligible, and would let you use the same windowing code as now, at the cost of just a little more compute (setting it to 100 instead of 15), and having your ratings math support weighted data if it doesn’t already.
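For anyone who wants to check those numbers, a quick sketch on plain arithmetic averages (not Glicko). `DECAY`, `weights` and `weighted_average` are names made up for this example, and the sample values are arbitrary.

```python
DECAY = 14 / 15  # each older game counts 14/15 as much as the next newer one


def weights(n):
    """Weights for n games, index 0 = most recent game."""
    return [DECAY**k for k in range(n)]


def weighted_average(values_newest_first, lookback=100):
    """Exponentially weighted mean over at most `lookback` games."""
    window = values_newest_first[:lookback]
    w = weights(len(window))
    return sum(wi * v for wi, v in zip(w, window)) / sum(w)


total_weight = sum(weights(100))  # close to 15
tail_weight = DECAY**100          # weight of the game about to fall off

# Arbitrary per-game values, newest first.
history = [3.0, -2.0, 5.0, 1.0, -4.0, 2.0, 0.0, 6.0, -1.0, 2.5]
old_avg = weighted_average(history)

# Prepend a game better than the current average: the average cannot drop.
after_good = weighted_average([old_avg + 4.0] + history)
# Prepend a game worse than the current average: the average cannot rise.
after_bad = weighted_average([old_avg - 4.0] + history)

print(total_weight, tail_weight, old_avg, after_good, after_bad)
```

As long as the history is shorter than the lookback nothing is dropped, so the new average is just a blend of the new value and the old average. Past 100 games the dropped game only carries the tiny tail weight.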

Arguably this is more “complex” again, but I presume users don’t actually care about numerically understanding the magic ratings math. They just care that it behaves intuitively sensibly, such as going up when you just won, and going down when you just lost, and having it never feel entirely stuck. So one way or another, just architect the system to have properties like that?

5 Likes

no, the probable scenario there (assuming the loss wasn’t your first game ever) is that your rating goes up then, although there are possible scenarios where it does go down, and the math is a bit complicated to lay out exactly when

what? Elo (our old system) is far more understandable than any Glicko implementation. The goal was accuracy.

I mean, how ratings periods already work is that in essence new games matter more than old ones (assuming similar RD). There is no sharp cliff, games that leave the ratings period don’t disappear and suddenly have no effect, they’ve already made a push, and the Expected Score function does the magic of making new ones more important.

2 Likes

FWIW I think Glicko or Glicko2 but just updating one game at a time (no window) would be a good trade off for accuracy and understanding. I reckon no matter what system is used people will complain though. :slight_smile:

5 Likes

I would agree, with one exception: you’d have to do something about the RD/volatility updating, but that might not be that much harder if you make use of the record for time between games
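One way this could look, as a sketch: Glicko-1 already has a step that grows RD with time since the last rated activity, RD = min(sqrt(RD^2 + c^2 * t), 350). The constant `c` and the unit of `t` below are assumptions picked for illustration, and Glicko-2 handles this step through volatility instead.

```python
import math

MAX_RD = 350.0  # Glicko-1 cap: RD of an unrated player


def inflate_rd(rd, time_elapsed, c):
    """Grow RD for the time passed since the last game (Glicko-1 style).

    time_elapsed: time since the last game, in whatever unit c was tuned for.
    c: assumed constant controlling how fast uncertainty grows.
    """
    return min(math.sqrt(rd**2 + c**2 * time_elapsed), MAX_RD)


# Assumed tuning: an RD of 50 should drift back to 350 after 100 time units.
c = math.sqrt((350.0**2 - 50.0**2) / 100)

rd_next_day = inflate_rd(50.0, 1, c)       # played again almost immediately
rd_long_break = inflate_rd(50.0, 100, c)   # came back after a long break
print(rd_next_day, rd_long_break)
```

With per-game updates you would run this before each game using the recorded gap, then apply the single-game rating update with the inflated RD.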

1 Like

You can check out lichess Glicko

https://github.com/ornicar/lila/tree/d25ef1ff638fdfc45f52ff1b56a2dda4ba518616/modules/rating/src/main (wth is that embedding)

If you can figure out what any of this means

4 Likes

Hi Fam!

How does OGS ranking compare to other sites? Are we seen as generous or conservative or are all sites pretty even when it comes to ranking?

P.S. LOVE the site! thank you!!

3 Likes

I think this thread has the closest thing to an answer to your question.

This shows (and I’ve heard from other people) that OGS has a bit harsher rankings than other popular sites.

3 Likes

Ok thanks! Harsher is better than going the other way in my opinion! :slight_smile:

6 Likes

Would it be difficult to implement a ‘highest ranked’ badge or something similar that would display on the profile page? It might help people accept the rank volatility if they had a badge showing the highest rank they had achieved. Just a thought.

6 Likes

Yay, more incentives to helium balloon your account.

4 Likes

as if people don’t already act as if Go rank is directly linked to both penis size and bank balance

11 Likes

That’s a great idea. I often try to view my peak rank as “Achievement Unlocked” to reduce ladder anxiety.

4 Likes

Wait, playing against bots isn’t cheating

I believe they meant playing bot moves on their own (human) account.

2 Likes

Good idea. I often look at the rating graphs of my opponents for this information, because it gives me a better idea of their potential strength than their current rating.

2 Likes