tl;dr - Ratings and ranks have been tweaked a little, so there should be a bit less volatility in ranks now. The changes have been applied retroactively.
Main changes:
- We were using fixed windows to update ranks, considering up to 15 games at a time. When we got to the 16th game, we’d “commit” your rank and start a new block. We now use a sliding window: we look at your most recent 15 games (going back up to 90 days) and use those to update your rank. This notably reduces the rating deviation and helps smooth things out.
- With the 15 game block system, we were updating the ratings of your opponents in your block, so if you played a new 13k and lost, but it later turned out they were actually a 5d player, that loss wouldn’t count for very much. The next game you played, your new rating would take that into account. This had some good aspects, but it also made it possible for your rating to go down even though you won a game (for instance if you had lost against a 13k who was later updated to a 25k). We no longer do this, so you should no longer go down in rating when you win a game (see note* below), but you will also no longer have the game adjusted if you lost to a 5d who just started a new 13k account.
- Because of how we implemented things, we were not able to annul games that were not in your current block of games, and we weren’t able to un-annul games. We can now do both: annul any game, and un-annul it.
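As a rough illustration of the difference between the old fixed blocks and the new sliding window, here is a minimal Python sketch. The `Game` record and both function names are hypothetical and are not taken from the OGS code; it also assumes that "up to 90 days" is measured back from the time of the update. Only the 15 game and 90 day limits come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW_GAMES = 15                 # from the post: most recent 15 games
WINDOW_AGE = timedelta(days=90)   # from the post: up to 90 days


@dataclass
class Game:
    """Hypothetical minimal game record for this sketch."""
    ended: datetime   # when the game finished
    won: bool         # result from the player's point of view


def sliding_window(games, now):
    """New behaviour: the games that feed the next rank update, oldest first."""
    # Assumption: the 90 day limit is measured back from `now`.
    recent = [g for g in games if now - g.ended <= WINDOW_AGE]
    recent.sort(key=lambda g: g.ended)
    return recent[-WINDOW_GAMES:]


def fixed_blocks(games):
    """Old behaviour, for comparison: a new block starts every 15 games."""
    ordered = sorted(games, key=lambda g: g.ended)
    return [ordered[i:i + WINDOW_GAMES]
            for i in range(0, len(ordered), WINDOW_GAMES)]
```

With the sliding window every new game shifts the window by one, whereas with fixed blocks the 16th game starts an almost empty block, which is one way to see why the old scheme was more volatile.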
Before:
After:
* Ranks can still go down with a win; this is due to the sliding window. We compute your new Glicko rating by looking at what your rating was 15 games ago and accounting for the results of the games since then. Each new game moves that starting point forward by one game, so if your rating at the old starting point (16 games ago) was higher than at the new one, the update starts from a lower base, and it’s still possible for you to go down in rating after a win.
Analysis of our rating system
In August of 2017 OGS switched over to a Glicko-2 rating system and we began using a non-linear rating to rank function, log(rating / 850) * 31.25. An analysis of how this system is functioning has been on my to-do list for a while, and since I broke a good number of games a couple of months back and had to do a bit of work within the rating code to repair the damage, I figured I should finally take the time to check in on things and see if there were any other adjustments we should be making along with the repair.
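For reference, the rating to rank function quoted above can be written out as a short Python sketch, together with its inverse. Two things here are assumptions and not stated in the post: that log is the natural logarithm, and that rank 0 is displayed as 30k and rank 30 as 1d. The function names are made up for this example.

```python
import math


def rating_to_rank(rating):
    """Rank as a continuous number; assumes `log` means natural log."""
    return math.log(rating / 850.0) * 31.25


def rank_to_rating(rank):
    """Inverse of rating_to_rank."""
    return 850.0 * math.exp(rank / 31.25)


def rank_label(rank):
    """Assumed display convention: rank 0 is 30k, rank 30 is 1d."""
    if rank < 30:
        return f"{math.ceil(30 - rank)}k"
    return f"{math.floor(rank - 29)}d"
```

Because the mapping is logarithmic, each rank corresponds to a fixed ratio of ratings, not a fixed difference, so the rating gap between adjacent ranks widens as you go up.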
The three main questions I explored were:
- Is our rank conversion producing good results, and is there a better rating to rank function?
- Does it make sense to continue counting 9x9, 13x13, and 19x19 games toward the same overall rank like we do?
- Does it make sense to use ranks below 25k?
To do this we looked at all of our games and picked the games where:
- Both players’ ranks were reasonably established (deviation < 120)
- The game ended in either resignation or went to the score phase (no disconnects or timeouts)
- The game was flagged as being ranked
- White’s rank was appropriate given the handicap for the game.
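The selection criteria above could be expressed roughly as the following Python filter. The `GameRecord` fields and the outcome strings are hypothetical, and the post does not say exactly how "appropriate given the handicap" was decided; this sketch assumes it means white's rank lead is within one rank of the number of stones given.

```python
from dataclasses import dataclass

MAX_DEVIATION = 120  # from the post: deviation < 120


@dataclass
class GameRecord:
    """Hypothetical game record; field names are made up for this sketch."""
    black_rank: float
    white_rank: float
    black_deviation: float
    white_deviation: float
    outcome: str     # e.g. "resignation", "score", "timeout", "disconnect"
    ranked: bool
    handicap: int


def handicap_is_appropriate(game, tolerance=1.0):
    # Assumption: "appropriate" means white's rank lead is within
    # `tolerance` ranks of the number of handicap stones given.
    lead = game.white_rank - game.black_rank
    return abs(lead - game.handicap) <= tolerance


def qualifies(game):
    """True if the game meets all four criteria listed above."""
    return (
        game.black_deviation < MAX_DEVIATION
        and game.white_deviation < MAX_DEVIATION
        and game.outcome in ("resignation", "score")
        and game.ranked
        and handicap_is_appropriate(game)
    )
```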
There were 9.08M games that met all of these requirements. As an interesting side note, it takes an average of 6.43 games for a player to get out of provisional status, and 12.02 games to achieve a pretty stable rank (defined as having an RD < 120). Overall I consider that very good and am quite pleased with how fast Glicko-2 can properly establish the strength of a player.
1. Is our rank conversion producing good results, and is there a better rating to rank function?
First and foremost, how are we doing at matching players up and picking a good handicap for them? To evaluate this we looked at all of our games that met the above criteria and looked at the win rates for black. For comparison I’ve included similarly computed win rates from the European Go Database (EGD).
Overall, not too bad! We expect to see a white bias in handicap games because white is always the stronger player, and black only gets one stone per full rank difference, so we’re not shooting for 50% here, but ideally we will see something fairly flat as we increase the rank difference and number of stones given, which we do for the most part. If there is a different trend, that indicates that the spacing between ranks isn’t quite ideal.
That being said, I did take some time to explore alternative mappings to try to flatten that line a bit more, get rid of the dip at 9 handicap stones, and in general smooth out the per-rank-band handicap results (which are a little noisier than the smooth line above). While I came up with a few other comparable functions, I found none that were notably better than what we already have. So overall I’m pretty satisfied with our rating system and rank mapping, and I think we can consider this to be the rating and ranking system we’ll be using for the foreseeable future, until someone smarter comes along with a better one.
2. Does it make sense to continue counting 9x9, 13x13, and 19x19 games toward the same overall rank like we do?
Whether we combine ranks into an “overall” rank or use separate ranks has been a debated topic since we started this project. To answer this, I computed ratings and ranks of players considering only 9x9, only 13x13, only 19x19, or all games, restricted to games that met the criteria above (RD < 120, etc.). Then we looked at how each rating system did at predicting games, comparing the actual win rates for black with the expected win rates for black. The closer those two numbers are, the better.
This is a set of tables for handicaps 0 through 3. Each row shows a rank band from the rank listed up to the next rank (so 30k-26k for the first row in each table). Each cell has two numbers separated by a colon (:): the first is the actual win rate (%) for black, and the second is the expected win rate (%) for black (that is, what the rating system thought the win rate should be). n is the number of games in that cell.
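To make the "actual : expected [n]" cells in the tables below concrete, here is a minimal Python sketch of how such a cell could be computed. It uses the standard Glicko expected-score formula; everything else is an assumption made for illustration: the tuple layout, banding by black's rank in steps of 5, and that the ratings passed in have already been adjusted for handicap. The post does not describe how the real analysis handled any of these.

```python
import math
from collections import defaultdict

Q = math.log(10) / 400.0  # Glicko scaling constant


def g(rd):
    """Glicko attenuation factor for an opponent's rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * (Q * rd) ** 2 / math.pi ** 2)


def expected_score(rating, opp_rating, opp_rd):
    """Glicko expected score of `rating` against the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-g(opp_rd) * (rating - opp_rating) / 400.0))


def band_start(rank, width=5):
    """Lower edge of the band; with rank 0 = 30k, band 0 is 30k-26k."""
    return int(rank // width) * width


def win_rate_table(games):
    """games: iterable of tuples
    (black_rank, black_rating, white_rating, white_rd, handicap, black_won).
    Returns {(band, handicap): (actual %, expected %, n)}.
    Assumption: ratings are already adjusted for the handicap stones.
    """
    cells = defaultdict(lambda: [0, 0.0, 0])  # wins, expected sum, count
    for black_rank, b, w, w_rd, handicap, black_won in games:
        cell = cells[(band_start(black_rank), handicap)]
        cell[0] += 1 if black_won else 0
        cell[1] += expected_score(b, w, w_rd)
        cell[2] += 1
    return {key: (100.0 * wins / n, 100.0 * exp / n, n)
            for key, (wins, exp, n) in cells.items()}
```

A well calibrated rating system would show the first two numbers of each cell close to each other once n is large enough for the actual win rate to be meaningful.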
9x9 ranks predicting 9x9 games
Handicap 0 Handicap 1 Handicap 2 Handicap 3
30k 47.8 : 49.6 [n=2579] 33.3 : 49.8 [n=39] 28.0 : 48.5 [n=25] 40.7 : 48.9 [n=27]
25k 48.4 : 49.7 [n=20436] 44.7 : 48.7 [n=4417] 54.7 : 49.6 [n=311] 65.4 : 50.1 [n=104]
20k 47.7 : 49.6 [n=60398] 45.0 : 48.9 [n=32404] 51.5 : 49.5 [n=555] 63.6 : 49.6 [n=176]
15k 47.4 : 49.4 [n=207911] 44.7 : 48.8 [n=70964] 54.0 : 49.3 [n=2123] 63.6 : 49.7 [n=558]
10k 48.9 : 49.4 [n=333649] 45.6 : 49.1 [n=44339] 55.5 : 49.6 [n=1802] 71.4 : 50.3 [n=245]
5k 49.3 : 50.2 [n=83326] 44.2 : 49.9 [n=2287] 61.4 : 50.0 [n=451] 75.0 : 51.4 [n=96]
1d 52.8 : 50.9 [n=4699] 58.3 : 50.1 [n=24] 100.0 : 52.2 [n=3] --
6d 46.7 : 54.1 [n=15] -- -- --
ALL 48.4 : 49.5 [n=713013] 45.0 : 48.9 [n=154474] 54.8 : 49.5 [n=5270] 65.8 : 50.0 [n=1206]
13x13 ranks predicting 13x13 games
30k 44.6 : 49.6 [n=121] 47.5 : 49.9 [n=219] 100.0 : 48.3 [n=1] 50.0 : 47.1 [n=2]
25k 48.0 : 49.5 [n=2773] 45.5 : 49.7 [n=1756] 52.9 : 49.5 [n=104] 32.6 : 48.9 [n=43]
20k 44.0 : 49.3 [n=14530] 45.4 : 49.8 [n=11125] 44.4 : 49.4 [n=941] 42.7 : 49.4 [n=293]
15k 45.8 : 49.3 [n=60647] 46.5 : 50.0 [n=16939] 49.7 : 48.8 [n=2864] 39.7 : 49.1 [n=648]
10k 47.3 : 49.4 [n=85200] 47.8 : 50.3 [n=12933] 59.7 : 49.4 [n=2635] 48.0 : 49.9 [n=306]
5k 49.3 : 50.3 [n=17200] 45.4 : 50.0 [n=866] 58.3 : 49.8 [n=357] 57.6 : 49.9 [n=33]
1d 46.2 : 51.0 [n=249] -- -- --
6d 66.7 : 60.7 [n=3] -- -- --
ALL 46.7 : 49.4 [n=180723] 46.5 : 50.0 [n=43838] 53.3 : 49.2 [n=6902] 42.5 : 49.3 [n=1325]
19x19 ranks predicting 19x19 games
30k 48.4 : 49.6 [n=1504] 42.9 : 49.1 [n=212] 39.2 : 49.4 [n=227] 34.3 : 49.7 [n=67]
25k 45.9 : 49.5 [n=18333] 40.9 : 49.6 [n=4517] 40.0 : 49.7 [n=4083] 41.2 : 49.8 [n=857]
20k 47.3 : 49.5 [n=98224] 42.4 : 49.2 [n=20335] 41.3 : 49.2 [n=14927] 42.2 : 49.3 [n=4369]
15k 49.0 : 49.4 [n=235420] 43.5 : 48.7 [n=40980] 43.3 : 48.6 [n=26238] 45.5 : 48.9 [n=10026]
10k 49.6 : 49.3 [n=344796] 45.0 : 48.6 [n=45871] 46.3 : 48.5 [n=23862] 47.3 : 48.9 [n=9851]
5k 49.7 : 49.5 [n=245545] 44.8 : 48.7 [n=17596] 44.7 : 48.4 [n=8277] 41.8 : 48.4 [n=3881]
1d 49.2 : 49.8 [n=33502] 42.6 : 49.5 [n=2086] 47.5 : 50.0 [n=1100] 47.5 : 50.0 [n=589]
6d 48.9 : 50.6 [n=1585] 54.1 : 49.4 [n=181] 70.0 : 50.3 [n=50] 76.9 : 53.3 [n=26]
ALL 49.2 : 49.4 [n=978909] 43.9 : 48.8 [n=131778] 43.9 : 48.8 [n=78764] 45.0 : 48.9 [n=29666]
Combined rating system
Predicting 9x9 games
30k 49.2 : 49.8 [n=17573] 47.8 : 48.6 [n=2060] 49.2 : 49.9 [n=240] 68.0 : 49.9 [n=125]
25k 48.6 : 49.7 [n=43982] 45.5 : 48.7 [n=22370] 56.2 : 49.8 [n=299] 63.0 : 50.5 [n=100]
20k 48.1 : 49.5 [n=117350] 45.1 : 48.4 [n=56551] 53.7 : 49.3 [n=779] 70.5 : 49.6 [n=325]
15k 49.0 : 49.5 [n=262529] 45.4 : 48.1 [n=53642] 57.7 : 49.5 [n=1413] 72.6 : 49.8 [n=332]
10k 50.6 : 49.7 [n=206142] 47.9 : 48.6 [n=14727] 57.1 : 49.7 [n=1120] 71.7 : 50.8 [n=237]
5k 49.9 : 50.2 [n=49161] 43.4 : 49.0 [n=648] 60.5 : 50.6 [n=124] 75.0 : 49.2 [n=12]
1d 54.9 : 50.6 [n=4802] 52.1 : 47.7 [n=73] 100.0 : 51.5 [n=9] 87.5 : 53.4 [n=8]
6d 59.6 : 51.8 [n=171] -- -- --
ALL 49.4 : 49.6 [n=701710] 45.6 : 48.4 [n=150071] 56.3 : 49.6 [n=3984] 70.6 : 50.1 [n=1139]
Predicting 13x13 games
30k 49.3 : 49.6 [n=2094] 49.5 : 49.8 [n=2393] 65.6 : 49.0 [n=32] 48.6 : 49.9 [n=37]
25k 47.5 : 49.5 [n=8051] 46.9 : 49.5 [n=8310] 41.6 : 48.8 [n=308] 33.3 : 49.1 [n=33]
20k 46.5 : 49.2 [n=24573] 46.3 : 49.6 [n=20826] 49.8 : 48.6 [n=1222] 46.2 : 49.6 [n=279]
15k 47.6 : 49.2 [n=78278] 47.3 : 49.5 [n=24001] 50.5 : 48.2 [n=2349] 41.1 : 49.5 [n=445]
10k 49.0 : 49.4 [n=86805] 47.9 : 50.0 [n=13165] 57.8 : 48.3 [n=1500] 48.4 : 50.0 [n=182]
5k 50.5 : 50.2 [n=19668] 45.6 : 49.9 [n=1041] 56.2 : 48.9 [n=169] 46.7 : 47.2 [n=15]
1d 54.5 : 50.7 [n=985] 60.7 : 51.2 [n=28] 100.0 : 45.8 [n=4] --
6d 40.5 : 51.8 [n=42] -- -- --
ALL 48.3 : 49.4 [n=220496] 47.1 : 49.6 [n=69764] 52.1 : 48.4 [n=5584] 44.0 : 49.6 [n=991]
Predicting 19x19 games
30k 50.5 : 49.6 [n=2299] 44.3 : 48.9 [n=221] 36.9 : 48.5 [n=222] 31.7 : 49.5 [n=63]
25k 48.0 : 49.5 [n=10385] 42.3 : 48.9 [n=3479] 40.6 : 48.9 [n=3215] 38.3 : 49.2 [n=664]
20k 47.7 : 49.4 [n=61028] 41.1 : 48.7 [n=18301] 39.3 : 48.9 [n=15631] 39.9 : 49.0 [n=3557]
15k 48.7 : 49.3 [n=224916] 42.1 : 48.4 [n=48650] 41.1 : 48.6 [n=34843] 41.8 : 48.9 [n=12427]
10k 49.2 : 49.3 [n=410840] 43.1 : 48.3 [n=68206] 43.2 : 48.5 [n=40606] 44.0 : 48.9 [n=15422]
5k 49.6 : 49.4 [n=381050] 44.3 : 48.4 [n=32244] 44.5 : 48.3 [n=15511] 43.1 : 48.4 [n=6854]
1d 49.4 : 49.9 [n=60938] 41.3 : 49.0 [n=3580] 46.5 : 49.2 [n=2010] 44.3 : 49.0 [n=1061]
6d 49.0 : 50.6 [n=2519] 53.2 : 50.0 [n=220] 66.1 : 50.4 [n=59] 84.8 : 52.3 [n=33]
ALL 49.2 : 49.4 [n=1153975] 42.8 : 48.4 [n=174901] 42.2 : 48.6 [n=112097] 42.7 : 48.8 [n=40081]
Combined prediction of all games
30k 49.3 : 49.8 [n=21966] 48.5 : 49.2 [n=4674] 44.7 : 49.2 [n=494] 54.7 : 49.8 [n=225]
25k 48.3 : 49.6 [n=62418] 45.5 : 48.9 [n=34159] 41.9 : 48.9 [n=3822] 41.2 : 49.4 [n=797]
20k 47.8 : 49.4 [n=202951] 44.6 : 48.7 [n=95678] 40.7 : 48.9 [n=17632] 42.7 : 49.1 [n=4161]
15k 48.7 : 49.4 [n=565723] 44.5 : 48.5 [n=126293] 42.2 : 48.6 [n=38605] 42.6 : 48.9 [n=13204]
10k 49.6 : 49.4 [n=703787] 44.5 : 48.6 [n=96098] 44.1 : 48.5 [n=43226] 44.5 : 48.9 [n=15841]
5k 49.7 : 49.5 [n=449879] 44.3 : 48.5 [n=33933] 44.7 : 48.3 [n=15804] 43.1 : 48.4 [n=6881]
1d 49.8 : 49.9 [n=66725] 41.6 : 49.0 [n=3681] 46.8 : 49.2 [n=2023] 44.6 : 49.1 [n=1069]
6d 49.6 : 50.7 [n=2732] 53.2 : 50.0 [n=220] 66.1 : 50.4 [n=59] 84.8 : 52.3 [n=33]
ALL 49.2 : 49.5 [n=2076181] 44.6 : 48.6 [n=394736] 43.1 : 48.6 [n=121665] 43.5 : 48.9 [n=42211]
From this data there are two things to note:
- Using a combined rating works quite well, certainly comparable to or better than looking at per-size strengths by themselves. It seems to me like it makes sense to keep using it.
- Using overall ratings to predict 9x9 games works pretty well at HC 0 and at HC 1, indicating to me that the strength bands are pretty compatible with 19x19 or just “go ranks” in general. However, going beyond HC 1, predictions start to get bad pretty quickly. I believe this is an indication that the “Old Japanese Recommendation” is not so great for us, and that we should strongly consider figuring out what the best 9x9 (and probably 13x13) handicap setup should be.
EDIT: The question arose about considering blitz vs live vs correspondence ranks. Here’s the data from that, which I believe is still very supportive of using an overall rank for picking handicap.
Considering only 19x19 games:
19x19 blitz ranks predicting blitz
Handicap 0 Handicap 1 Handicap 2 Handicap 3
30k -- -- -- --
25k 60.0 : 49.1 [n=30] -- -- --
20k 40.3 : 49.1 [n=5422] 43.4 : 49.2 [n=387] 42.8 : 49.5 [n=458] 41.5 : 49.7 [n=398]
15k 44.9 : 49.5 [n=25773] 42.8 : 49.7 [n=3537] 44.1 : 49.7 [n=2679] 44.2 : 49.8 [n=2049]
10k 47.6 : 49.5 [n=35936] 44.5 : 49.8 [n=4946] 44.6 : 49.6 [n=2652] 44.1 : 49.7 [n=1563]
5k 48.2 : 49.9 [n=10003] 42.6 : 50.2 [n=1029] 45.0 : 49.7 [n=460] 38.2 : 49.2 [n=293]
1d 45.9 : 49.3 [n=379] 57.6 : 48.9 [n=99] 61.4 : 50.5 [n=44] 81.0 : 51.3 [n=21]
6d 33.3 : 41.2 [n=6] -- -- --
ALL 46.2 : 49.5 [n=77549] 43.8 : 49.7 [n=9998] 44.4 : 49.7 [n=6296] 43.7 : 49.7 [n=4326]
19x19 live ranks predicting live
30k 47.3 : 49.6 [n=886] 38.4 : 49.3 [n=146] 42.3 : 49.5 [n=130] 42.6 : 49.9 [n=47]
25k 45.7 : 49.4 [n=11854] 41.6 : 49.5 [n=3299] 40.0 : 49.7 [n=3125] 40.1 : 49.6 [n=604]
20k 46.9 : 49.4 [n=71949] 42.0 : 49.3 [n=17356] 40.5 : 49.4 [n=12812] 41.9 : 49.3 [n=3404]
15k 48.5 : 49.4 [n=184092] 43.7 : 49.0 [n=34942] 43.3 : 48.8 [n=21729] 46.2 : 49.0 [n=7439]
10k 49.3 : 49.2 [n=275964] 45.1 : 48.8 [n=37168] 46.2 : 48.8 [n=18604] 47.0 : 49.2 [n=6431]
5k 49.4 : 49.3 [n=187928] 45.4 : 49.2 [n=11581] 44.6 : 48.9 [n=4973] 41.5 : 48.5 [n=2124]
1d 48.4 : 49.6 [n=23931] 42.6 : 49.5 [n=1496] 47.1 : 50.0 [n=751] 46.2 : 50.2 [n=463]
6d 47.9 : 50.6 [n=823] 36.0 : 49.8 [n=50] 57.1 : 52.3 [n=7] 50.0 : 58.6 [n=2]
ALL 48.8 : 49.3 [n=757427] 44.0 : 49.0 [n=106038] 43.6 : 49.0 [n=62131] 45.1 : 49.1 [n=20514]
19x19 corr ranks predicting corr
30k 62.2 : 48.9 [n=37] 0.0 : 47.9 [n=4] 0.0 : 47.5 [n=2] 100.0 : 53.4 [n=1]
25k 47.2 : 49.4 [n=619] 48.8 : 49.0 [n=41] 51.0 : 49.7 [n=51] 61.9 : 49.3 [n=21]
20k 48.5 : 49.7 [n=5363] 45.5 : 49.8 [n=277] 44.0 : 49.5 [n=282] 45.1 : 49.9 [n=182]
15k 48.9 : 49.7 [n=18366] 44.6 : 49.8 [n=1029] 42.7 : 49.5 [n=975] 47.1 : 49.7 [n=690]
10k 50.1 : 49.8 [n=28380] 44.7 : 50.0 [n=1346] 49.4 : 50.0 [n=1147] 52.3 : 50.1 [n=710]
5k 52.1 : 50.3 [n=15582] 39.5 : 50.2 [n=511] 48.7 : 50.3 [n=417] 42.6 : 49.6 [n=263]
1d 51.7 : 50.6 [n=2695] 42.1 : 50.4 [n=57] 47.8 : 50.7 [n=23] 41.2 : 51.4 [n=17]
6d 55.2 : 51.1 [n=134] -- -- --
ALL 50.1 : 49.9 [n=71176] 43.9 : 49.9 [n=3265] 46.5 : 49.8 [n=2897] 48.4 : 49.9 [n=1884]
19x19 overall ranks predicting blitz
Handicap 0 Handicap 1 Handicap 2 Handicap 3
30k 50.4 : 49.6 [n=125] 23.1 : 48.7 [n=13] 44.4 : 50.4 [n=9] 83.3 : 49.7 [n=6]
25k 43.3 : 49.4 [n=1189] 38.9 : 50.0 [n=90] 35.3 : 50.4 [n=85] 35.5 : 49.7 [n=93]
20k 45.8 : 49.3 [n=8790] 42.0 : 49.1 [n=659] 45.0 : 49.2 [n=644] 42.0 : 49.7 [n=395]
15k 48.9 : 49.4 [n=21681] 42.1 : 48.6 [n=2786] 44.0 : 48.6 [n=2259] 47.5 : 48.7 [n=1688]
10k 49.5 : 49.4 [n=33029] 44.9 : 49.1 [n=5802] 46.9 : 48.8 [n=3376] 46.3 : 48.7 [n=2404]
5k 50.3 : 49.8 [n=35639] 44.6 : 48.4 [n=4863] 45.4 : 48.2 [n=2526] 43.6 : 48.3 [n=1225]
1d 49.1 : 50.0 [n=4395] 44.7 : 49.6 [n=385] 47.6 : 49.8 [n=212] 48.7 : 49.7 [n=113]
6d 48.5 : 49.4 [n=268] 64.8 : 49.7 [n=105] 75.6 : 50.4 [n=41] 88.9 : 53.1 [n=18]
ALL 49.2 : 49.6 [n=105116] 44.2 : 48.8 [n=14703] 45.7 : 48.7 [n=9152] 45.9 : 48.7 [n=5942]
19x19 overall ranks predicting live
30k 50.1 : 49.6 [n=997] 45.9 : 49.0 [n=170] 35.7 : 49.4 [n=185] 27.5 : 49.7 [n=51]
25k 45.7 : 49.5 [n=14269] 40.8 : 49.6 [n=4189] 40.0 : 49.7 [n=3781] 41.1 : 49.7 [n=683]
20k 47.1 : 49.4 [n=77569] 42.3 : 49.1 [n=18887] 40.9 : 49.2 [n=13695] 41.7 : 49.2 [n=3635]
15k 48.8 : 49.4 [n=191964] 43.4 : 48.7 [n=36924] 43.0 : 48.6 [n=22897] 45.0 : 48.8 [n=7702]
10k 49.6 : 49.3 [n=285823] 44.9 : 48.5 [n=38809] 46.1 : 48.4 [n=19351] 46.9 : 48.9 [n=6762]
5k 49.4 : 49.4 [n=194764] 44.9 : 48.8 [n=12065] 44.1 : 48.4 [n=5250] 41.0 : 48.3 [n=2348]
1d 48.8 : 49.7 [n=25554] 42.7 : 49.6 [n=1582] 47.3 : 50.0 [n=822] 48.0 : 49.9 [n=450]
6d 49.0 : 50.7 [n=966] 39.5 : 49.1 [n=76] 44.4 : 49.4 [n=9] 50.0 : 53.8 [n=8]
ALL 49.0 : 49.4 [n=791906] 43.8 : 48.8 [n=112702] 43.4 : 48.7 [n=65990] 44.5 : 48.9 [n=21639]
19x19 overall ranks predicting corr
30k 43.5 : 49.5 [n=382] 34.5 : 49.4 [n=29] 57.6 : 48.8 [n=33] 40.0 : 50.1 [n=10]
25k 48.0 : 49.7 [n=2875] 43.3 : 49.9 [n=238] 42.9 : 50.1 [n=217] 48.1 : 50.4 [n=81]
20k 49.5 : 49.7 [n=11865] 46.3 : 49.8 [n=789] 46.3 : 49.8 [n=588] 48.1 : 49.7 [n=339]
15k 50.9 : 49.8 [n=21775] 47.6 : 49.7 [n=1270] 48.1 : 49.5 [n=1082] 47.0 : 50.0 [n=636]
10k 50.4 : 49.8 [n=25944] 48.1 : 49.6 [n=1260] 48.6 : 49.7 [n=1135] 54.6 : 49.9 [n=685]
5k 52.3 : 50.2 [n=15142] 43.9 : 49.3 [n=668] 47.5 : 50.0 [n=501] 40.9 : 49.2 [n=308]
1d 52.5 : 50.4 [n=3553] 35.3 : 49.1 [n=119] 48.5 : 49.7 [n=66] 34.6 : 51.9 [n=26]
6d 49.0 : 51.4 [n=351] -- -- --
ALL 50.7 : 49.9 [n=81887] 46.3 : 49.6 [n=4373] 47.7 : 49.7 [n=3622] 48.6 : 49.8 [n=2085]
19x19 overall ranks predicting overall
30k 48.4 : 49.6 [n=1504] 42.9 : 49.1 [n=212] 39.2 : 49.4 [n=227] 34.3 : 49.7 [n=67]
25k 45.9 : 49.5 [n=18333] 40.9 : 49.6 [n=4517] 40.0 : 49.7 [n=4083] 41.2 : 49.8 [n=857]
20k 47.3 : 49.5 [n=98224] 42.4 : 49.2 [n=20335] 41.3 : 49.2 [n=14927] 42.2 : 49.3 [n=4369]
15k 49.0 : 49.4 [n=235420] 43.5 : 48.7 [n=40980] 43.3 : 48.6 [n=26238] 45.5 : 48.9 [n=10026]
10k 49.6 : 49.3 [n=344796] 45.0 : 48.6 [n=45871] 46.3 : 48.5 [n=23862] 47.3 : 48.9 [n=9851]
5k 49.7 : 49.5 [n=245545] 44.8 : 48.7 [n=17596] 44.7 : 48.4 [n=8277] 41.8 : 48.4 [n=3881]
1d 49.2 : 49.8 [n=33502] 42.6 : 49.5 [n=2086] 47.5 : 50.0 [n=1100] 47.5 : 50.0 [n=589]
6d 48.9 : 50.6 [n=1585] 54.1 : 49.4 [n=181] 70.0 : 50.3 [n=50] 76.9 : 53.3 [n=26]
ALL 49.2 : 49.4 [n=978909] 43.9 : 48.8 [n=131778] 43.9 : 48.8 [n=78764] 45.0 : 48.9 [n=29666]
Considering all games
Blitz ranks predicting blitz
Handicap 0 Handicap 1 Handicap 2 Handicap 3
30k 38.5 : 49.1 [n=78] 33.3 : 48.2 [n=6] 0.0 : 46.3 [n=1] --
25k 47.4 : 49.6 [n=1933] 43.9 : 49.3 [n=196] 49.4 : 49.8 [n=79] 62.1 : 50.6 [n=29]
20k 43.4 : 49.1 [n=21095] 43.2 : 49.5 [n=2188] 47.2 : 49.4 [n=788] 39.1 : 49.8 [n=322]
15k 45.5 : 49.2 [n=131528] 43.6 : 49.6 [n=9565] 44.4 : 49.5 [n=3662] 42.1 : 49.7 [n=1957]
10k 47.8 : 49.5 [n=136510] 44.5 : 50.0 [n=10457] 45.7 : 49.9 [n=4458] 43.0 : 50.0 [n=2779]
5k 47.6 : 50.0 [n=36841] 44.4 : 50.1 [n=3434] 43.0 : 49.6 [n=1621] 42.1 : 50.2 [n=744]
1d 50.3 : 50.4 [n=1271] 53.7 : 48.7 [n=123] 54.7 : 49.6 [n=64] 58.1 : 50.8 [n=43]
6d 52.0 : 48.7 [n=25] 0.0 : 51.9 [n=4] 100.0 : 59.8 [n=1] --
ALL 46.5 : 49.4 [n=329281] 44.1 : 49.8 [n=25973] 45.0 : 49.7 [n=10674] 42.6 : 49.9 [n=5874]
Live ranks predicting live
30k 49.6 : 49.8 [n=20325] 46.8 : 49.3 [n=5587] 43.5 : 49.1 [n=437] 56.7 : 49.8 [n=171]
25k 48.3 : 49.6 [n=39715] 45.5 : 49.1 [n=28644] 42.5 : 49.1 [n=2850] 39.9 : 49.4 [n=479]
20k 47.6 : 49.5 [n=149652] 45.0 : 49.0 [n=81606] 40.7 : 49.1 [n=14237] 44.0 : 49.4 [n=2973]
15k 48.2 : 49.3 [n=398094] 44.7 : 48.9 [n=104484] 42.1 : 48.9 [n=30877] 42.8 : 49.2 [n=9499]
10k 49.1 : 49.3 [n=483024] 44.6 : 49.0 [n=73713] 44.3 : 48.9 [n=32129] 45.3 : 49.4 [n=10203]
5k 49.4 : 49.3 [n=318943] 45.0 : 49.1 [n=21888] 45.2 : 49.0 [n=9504] 43.2 : 48.8 [n=3456]
1d 48.9 : 49.7 [n=43212] 42.4 : 49.5 [n=2279] 46.7 : 49.7 [n=1225] 43.7 : 49.5 [n=762]
6d 48.0 : 50.7 [n=1479] 40.7 : 49.8 [n=86] 50.0 : 54.7 [n=12] 50.0 : 58.8 [n=6]
ALL 48.7 : 49.4 [n=1454444] 44.9 : 49.0 [n=318287] 43.0 : 49.0 [n=91271] 43.9 : 49.3 [n=27549]
Corr ranks predicting corr
30k 60.7 : 48.8 [n=28] -- -- --
25k 51.5 : 49.4 [n=643] 35.7 : 49.3 [n=28] 48.5 : 49.2 [n=33] 41.4 : 49.3 [n=29]
20k 47.6 : 49.5 [n=10436] 44.3 : 49.5 [n=575] 47.1 : 49.7 [n=376] 43.6 : 49.7 [n=291]
15k 48.5 : 49.8 [n=34322] 42.6 : 49.7 [n=2050] 44.9 : 49.9 [n=1353] 44.2 : 49.7 [n=872]
10k 50.4 : 49.8 [n=48021] 45.6 : 49.9 [n=2192] 47.5 : 50.0 [n=1648] 48.9 : 50.2 [n=876]
5k 51.6 : 50.2 [n=29561] 42.6 : 50.1 [n=756] 47.8 : 50.3 [n=604] 46.0 : 49.9 [n=402]
1d 52.8 : 50.7 [n=5150] 47.4 : 50.4 [n=76] 59.6 : 52.2 [n=52] 54.5 : 51.6 [n=22]
6d 54.8 : 51.7 [n=283] -- -- --
ALL 50.1 : 49.9 [n=128444] 43.9 : 49.8 [n=5677] 46.8 : 50.0 [n=4066] 46.1 : 49.9 [n=2493]
Overall ranks predicting blitz
Handicap 0 Handicap 1 Handicap 2 Handicap 3
30k 49.3 : 49.8 [n=2830] 48.1 : 49.5 [n=183] 56.6 : 50.2 [n=53] 46.4 : 49.7 [n=28]
25k 47.5 : 49.5 [n=8120] 47.1 : 49.4 [n=900] 44.8 : 49.2 [n=210] 43.8 : 49.5 [n=112]
20k 46.3 : 49.2 [n=29016] 44.8 : 48.8 [n=3119] 49.3 : 48.9 [n=766] 47.6 : 49.0 [n=416]
15k 48.5 : 49.2 [n=112384] 44.5 : 48.7 [n=8508] 45.6 : 48.7 [n=3111] 41.8 : 48.5 [n=1850]
10k 50.1 : 49.5 [n=139556] 45.5 : 49.0 [n=11968] 45.8 : 48.7 [n=5262] 43.6 : 48.8 [n=3529]
5k 49.3 : 49.7 [n=73036] 44.4 : 48.3 [n=8027] 44.2 : 48.0 [n=3911] 43.0 : 48.1 [n=2182]
1d 50.7 : 50.2 [n=10274] 40.1 : 48.6 [n=893] 47.6 : 49.1 [n=525] 43.2 : 49.0 [n=257]
6d 48.3 : 49.6 [n=385] 63.5 : 49.6 [n=115] 75.0 : 49.7 [n=44] 91.7 : 51.4 [n=24]
ALL 49.1 : 49.5 [n=375601] 44.9 : 48.7 [n=33713] 45.7 : 48.5 [n=13882] 43.4 : 48.6 [n=8398]
Overall ranks predicting live
30k 49.4 : 49.8 [n=17236] 48.7 : 49.2 [n=4345] 41.9 : 49.1 [n=382] 56.4 : 49.8 [n=172]
25k 48.6 : 49.6 [n=48592] 45.5 : 48.9 [n=32577] 41.1 : 48.9 [n=3398] 39.7 : 49.4 [n=584]
20k 47.9 : 49.5 [n=158757] 44.6 : 48.7 [n=90988] 40.1 : 48.8 [n=16091] 42.0 : 49.0 [n=3351]
15k 48.6 : 49.4 [n=415690] 44.4 : 48.4 [n=115293] 41.6 : 48.5 [n=33868] 42.6 : 48.9 [n=10423]
10k 49.4 : 49.4 [n=514601] 44.2 : 48.5 [n=81733] 43.5 : 48.4 [n=36146] 44.2 : 48.9 [n=11341]
5k 49.6 : 49.4 [n=344875] 44.2 : 48.5 [n=24837] 44.7 : 48.3 [n=11061] 43.3 : 48.4 [n=4201]
1d 49.3 : 49.8 [n=49597] 42.2 : 49.1 [n=2564] 46.5 : 49.3 [n=1378] 44.7 : 49.0 [n=756]
6d 49.5 : 51.0 [n=1676] 41.7 : 50.3 [n=103] 40.0 : 52.6 [n=15] 66.7 : 54.8 [n=9]
ALL 49.0 : 49.4 [n=1551024] 44.5 : 48.6 [n=352440] 42.4 : 48.5 [n=102339] 43.3 : 48.9 [n=30837]
Overall ranks predicting corr
30k 48.6 : 49.6 [n=1900] 44.5 : 49.7 [n=146] 52.5 : 49.3 [n=59] 52.0 : 49.6 [n=25]
25k 47.4 : 49.7 [n=5706] 45.2 : 49.6 [n=682] 50.9 : 49.6 [n=214] 46.5 : 49.3 [n=101]
20k 49.0 : 49.6 [n=15178] 46.8 : 49.6 [n=1571] 45.7 : 49.6 [n=775] 43.7 : 49.9 [n=394]
15k 50.4 : 49.8 [n=37649] 47.2 : 49.8 [n=2492] 48.7 : 49.9 [n=1626] 43.7 : 49.7 [n=931]
10k 50.7 : 49.8 [n=49630] 48.1 : 49.7 [n=2397] 49.3 : 49.6 [n=1818] 50.9 : 49.9 [n=971]
5k 51.8 : 50.1 [n=31968] 45.7 : 49.5 [n=1069] 47.7 : 49.9 [n=832] 42.4 : 49.6 [n=498]
1d 52.6 : 50.6 [n=6854] 41.5 : 49.3 [n=224] 46.7 : 49.8 [n=120] 50.0 : 50.1 [n=56]
6d 50.4 : 50.7 [n=671] 50.0 : 55.2 [n=2] -- --
ALL 50.6 : 49.9 [n=149556] 46.8 : 49.7 [n=8583] 48.4 : 49.7 [n=5444] 46.1 : 49.8 [n=2976]
Overall ranks predicting overall
30k 49.3 : 49.8 [n=21966] 48.5 : 49.2 [n=4674] 44.7 : 49.2 [n=494] 54.7 : 49.8 [n=225]
25k 48.3 : 49.6 [n=62418] 45.5 : 48.9 [n=34159] 41.9 : 48.9 [n=3822] 41.2 : 49.4 [n=797]
20k 47.8 : 49.4 [n=202951] 44.6 : 48.7 [n=95678] 40.7 : 48.9 [n=17632] 42.7 : 49.1 [n=4161]
15k 48.7 : 49.4 [n=565723] 44.5 : 48.5 [n=126293] 42.2 : 48.6 [n=38605] 42.6 : 48.9 [n=13204]
10k 49.6 : 49.4 [n=703787] 44.5 : 48.6 [n=96098] 44.1 : 48.5 [n=43226] 44.5 : 48.9 [n=15841]
5k 49.7 : 49.5 [n=449879] 44.3 : 48.5 [n=33933] 44.7 : 48.3 [n=15804] 43.1 : 48.4 [n=6881]
1d 49.8 : 49.9 [n=66725] 41.6 : 49.0 [n=3681] 46.8 : 49.2 [n=2023] 44.6 : 49.1 [n=1069]
6d 49.6 : 50.7 [n=2732] 53.2 : 50.0 [n=220] 66.1 : 50.4 [n=59] 84.8 : 52.3 [n=33]
ALL 49.2 : 49.5 [n=2076181] 44.6 : 48.6 [n=394736] 43.1 : 48.6 [n=121665] 43.5 : 48.9 [n=42211]
3. Does it make sense to use ranks below 25k?
To answer this we looked at the win rates of handicap games between people that would be in the 30-25k range and compared them to other ranks:
This hump in the purple line, the 30-25k players, is black winning an unexpected amount when given handicap stones. One possibility I explored a fair amount is that our rank mapping could be improved in this range to smooth that out to something more expected; however, I was unable to find any fit that didn’t result in similar amounts of chaos. I think this is because there are other dominating factors beyond just the number of stones given: white doesn’t necessarily know how to approach handicap games yet, seeing that many stones probably psyches them out a bit, and blunders most likely still matter more than a few extra stones. So basically I think the main purpose of the rank (being able to use it to calculate how many handicap stones you should give) begins to fall apart in this range.
That being said, it’s not all that bad, and there are some benefits to having those ranks beyond strictly correct computation of handicaps, namely that people can see their progress. I don’t know that we should go any lower than 30k, but we might very well bring back ranks down to 30k just so people don’t feel like they are perpetually stuck at 25k. We may still stick with not giving handicap stones automatically to players in that range, or reduce how many are given, but all of that is to be determined later.