Were you mostly sampling from live or correspondence games? Or did you get a balanced mix of both? I suspect that there are fairly large subsets of players that mostly play only live or only correspondence. It would be interesting to see if there is a significant difference between those two groups, or if those two cliques even firmly exist.
I’m not too sure what it would say if a statistically significant difference were found. Maybe those two cliques don’t really mix enough (via a third group of players that play both), so any difference could just be due to independent drift.
Maybe OGS should have some automated procedure that regenerates this histogram every 6 months or so?
OGS has a lot of new players, and it would be interesting to see whether the mode of the distribution (the peak) moves to the right with time as players improve. (Assuming that the ranking system doesn’t keep the average rank of all users constant, as some broken ranking systems do.)
There’s a very outdated histogram for KGS here: https://senseis.xmp.net/?KGSRankHistogram
I know that was a rhetorical question, but I’m pretty sure the actual answer is that some find it a useful transition from 9x9 to 19x19, so it’s “worth having”. There are also some people out there who simply like variety, and they play other board sizes as well, which aren’t even on your radar.
(I can’t get my head around one board size and the infinite variety there, let alone others, but to each their own!)
Yes, I try to play 13x13 and know all too well that no one wants to play it. I bet that these 14% of players play 13x13 just because it exists as an official ranked board size (and it has its own correspondence ladder). My point is that only a very small percentage of people actually want 13x13; it’s more or less “hey, it exists, why not play it?” For example, if OGS didn’t have 19x19 as a ranked board, people would want it implemented; if OGS didn’t have 9x9, people would want it. If OGS didn’t have ranked 13x13 (and it was just an irregular board size), no one would care in particular.
Edit: I wonder what numbers we would get if we excluded all ladder and automatic tournament games.
Notice that since I count players here, not games, the percentage of correspondence is higher. It probably means that corr fans play fewer games each, while live players pump out a lot of games.
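To make the players-versus-games distinction concrete, here is a minimal sketch. It assumes the sample is a list of hypothetical `(player_id, speed)` records, one per game a player took part in, and it counts a player once for every speed they appear in. That counting rule is an assumption for illustration, not necessarily how the pie charts were built.

```python
from collections import Counter, defaultdict


def speed_shares(games):
    """Return (share_by_games, share_by_players) for each speed.

    `games` is an iterable of (player_id, speed) pairs. This record
    format is hypothetical, chosen only for the illustration.
    """
    game_counts = Counter()
    players_by_speed = defaultdict(set)
    for player_id, speed in games:
        game_counts[speed] += 1
        players_by_speed[speed].add(player_id)

    total_games = sum(game_counts.values())
    # A player who plays both speeds is counted once under each speed.
    total_slots = sum(len(p) for p in players_by_speed.values())

    by_games = {s: n / total_games for s, n in game_counts.items()}
    by_players = {s: len(p) / total_slots for s, p in players_by_speed.items()}
    return by_games, by_players


# One busy live player and two correspondence players with one game each.
sample = [("a", "live")] * 8 + [("b", "corr"), ("c", "corr")]
by_games, by_players = speed_shares(sample)
print(by_games)    # live dominates when counting games
print(by_players)  # corr dominates when counting players
```

With this toy input, live makes up 80% of the games but only a third of the players, which is the same direction of effect described above.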
I also wonder whether the way I download games affects this in any way. I go through game ids with a certain big step. Corr players often start games in a tournament simultaneously, so games are created in batches, and my script can step over whole tournaments. Should be fine in the end, of course, but I keep my suspicions.
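One cheap way to test that suspicion would be to compare the fixed-step sample with a jittered one, where each step-sized window contributes one id picked at random instead of always the first. This is only a sketch: the function names are made up for the example, and the actual download script may work differently.

```python
import random


def stride_ids(first_id, last_id, step):
    """Fixed-step sampling: first_id, first_id + step, ... up to last_id."""
    return list(range(first_id, last_id + 1, step))


def jittered_ids(first_id, last_id, step, rng=None):
    """Pick one random id inside each window of `step` consecutive ids.

    The sample size matches stride_ids, but the position inside each
    window varies, so the sample is not locked to one regular grid.
    """
    rng = rng or random.Random()
    ids = []
    for start in range(first_id, last_id + 1, step):
        end = min(start + step - 1, last_id)
        ids.append(rng.randint(start, end))  # randint is inclusive on both ends
    return ids


print(stride_ids(1, 10, 3))
print(jittered_ids(1, 10, 3, random.Random(0)))
```

If the live/correspondence split looks about the same under both schemes, the fixed step is probably not distorting anything.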
Interestingly, despite your common insistence that almost all new players will be closer to 25 kyu than 13 kyu, looking at that graph, it seems the 9 ranks above 13 kyu are all much more populated than the 9 ranks below 13 kyu.
(note that the y-axis represents number of players, so it isn’t simply a matter of “stronger players play more games”)
The high number of players at specifically 13 kyu likely represents the players who have played 0-1 ranked games + the number of actual 13 kyu players
@GreenAsJade, I think you’re joking a bit here. But why are there so many 13k indeed? Because stable 13ks won’t give you this spike. This spike is caused by all the unstable 13ks, in other words new players. Even one ranked game will throw you off the starting 13k rating. So these people never played ranked.
@BHydden, we can check that! We can ask OGS what the rating of each player was after, say, 20 or 30 games, and we could build a histogram of that.
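A rough sketch of that check, assuming we already have each player's rating history as a plain list with one numeric rank value per ranked game. The `histories` structure and the numeric rank scale are hypothetical; how to get this data out of OGS is not shown here.

```python
import math
from collections import Counter


def histogram_after_n_games(histories, n=20):
    """Count players per rank bin, using their rating after the n-th game.

    `histories` maps player_id -> list of numeric rank values, one per
    ranked game, in order (hypothetical format). Players with fewer
    than n ranked games are skipped.
    """
    hist = Counter()
    for history in histories.values():
        if len(history) >= n:
            hist[math.floor(history[n - 1])] += 1
    return dict(hist)


histories = {
    "a": [13.0, 14.0, 15.2],
    "b": [13.0, 12.0],        # too few games for n=3, skipped
    "c": [13.0, 20.0, 15.9],
}
print(histogram_after_n_games(histories, n=3))
```

Skipping players below the game threshold is what should remove the artificial spike at the starting rank.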
It makes sense: I play almost only correspondence. I can play many games at a time, but they usually take months to end. So eventually I’m pretty sure I play fewer games than a live player that plays one game at a time but regularly.
I really like your charts and would like to make my own. Could you explain how you retrieve the data?
What are the units on your pie charts, especially the first one? I assume it is the number of games played.
If you could get the “time spent thinking” on every game and compare that, it might make for a better statistic. What I mean is the time that I am looking at the game in my browser window, regardless of whose clock is running.
Another refinement for your rank chart: as mentioned before, the 13k spike is due to accounts without (m)any rated games. You might want to filter the data by some threshold of rank uncertainty, and then also distribute every data point among the rank “baskets” according to their probability (e.g. a 2k ±1 might be 3k or even 1k).
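Here is a minimal sketch of that refinement. It assumes each account comes as a hypothetical `(rank, sd)` pair on a continuous numeric rank scale where every integer interval `[k, k+1)` is one rank basket, and it models the uncertainty as a normal distribution. Both the scale and the normal model are assumptions for illustration, not a description of how OGS stores ratings.

```python
import math
from collections import defaultdict


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def soft_rank_histogram(players, max_sd=2.0):
    """Spread each player over rank baskets according to rank uncertainty.

    `players` is an iterable of (rank, sd) pairs (hypothetical format).
    Players with sd above `max_sd` are dropped as too uncertain.
    Every kept player contributes a total weight of exactly 1.
    """
    hist = defaultdict(float)
    for rank, sd in players:
        if sd > max_sd:
            continue  # filter out accounts with few or no rated games
        if sd <= 0:
            hist[math.floor(rank)] += 1.0
            continue
        # Cover roughly +/- 4 standard deviations around the rank.
        lo = math.floor(rank - 4 * sd)
        hi = math.floor(rank + 4 * sd)
        weights = {
            b: normal_cdf(b + 1, rank, sd) - normal_cdf(b, rank, sd)
            for b in range(lo, hi + 1)
        }
        total = sum(weights.values())
        for b, w in weights.items():
            hist[b] += w / total  # renormalise the truncated tails
    return dict(hist)


players = [(10.5, 1.0), (13.0, 5.0), (20.5, 0.5)]
hist = soft_rank_histogram(players, max_sd=2.0)
print(sorted(hist.items()))
```

In this toy input the very uncertain account at 13.0 is filtered out, and the other two are smeared over neighbouring baskets instead of each landing in a single one.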