It makes sense: I play almost only correspondence. I can play many games at a time, but they usually take months to end. So in the end I’m pretty sure I play fewer games than a live player who plays one game at a time but regularly.

I really like your charts and would like to make my own. Could you explain how you retrieve the data?

What are the units on your pie charts, especially the first one? I assume it is the number of games played.

If you could get the “time spent thinking” on every game and compare that, it might make for a better statistic. What I mean is the time that I am looking at the game in my browser window, regardless of whose clock is running.

Another refinement for your rank chart: as mentioned before, the 13k spike is due to accounts without (m)any rated games. You might want to filter the data by some threshold of rank uncertainty, and then also distribute every data point among the rank “baskets” according to their probability (e.g. a 2k ±1 might be 3k or even 1k).
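To make the “baskets” idea concrete, here is a rough sketch of how one data point could be spread over neighbouring ranks. It rests on assumptions that are not from the chart's author: the ± value is treated as the standard deviation of a normal distribution, ranks sit on a made-up numeric scale where 2.0 means 2k and larger numbers are weaker (0 and below stand in for dan ranks), and the names `spread_over_bins` and `soft_histogram` are hypothetical.

```python
import math
from collections import defaultdict


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def spread_over_bins(rank, uncertainty, lo=-9, hi=30):
    """Split one player into fractional weights per integer rank bin.

    Assumption: `uncertainty` is the standard deviation of a normal
    distribution centred on `rank`. Bin b covers [b - 0.5, b + 0.5).
    """
    if uncertainty <= 0:
        # No spread at all: the whole player goes into the nearest bin.
        return {math.floor(rank + 0.5): 1.0}
    weights = {}
    for b in range(lo, hi + 1):
        w = normal_cdf(b + 0.5, rank, uncertainty) - normal_cdf(b - 0.5, rank, uncertainty)
        if w > 0:
            weights[b] = w
    total = sum(weights.values())
    # Renormalise so that each player still counts as exactly one player.
    return {b: w / total for b, w in weights.items()}


def soft_histogram(players, max_uncertainty=2.5):
    """Filter by uncertainty first, then add up the fractional weights."""
    hist = defaultdict(float)
    for rank, uncertainty in players:
        if uncertainty > max_uncertainty:
            continue  # too uncertain, left out entirely
        for b, w in spread_over_bins(rank, uncertainty).items():
            hist[b] += w
    return dict(hist)
```

With these assumptions, `spread_over_bins(2.0, 1.0)` puts roughly 38% of the player in the 2k bin and about 24% each in 1k and 3k, with the rest further out.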

I think filtering by uncertainty would be a great move.

In fact, it’d be interesting to see one histogram for ranks with uncertainty over 2.5 and another for ranks with uncertainty under 2.5. (I chose that number because that’s about the biggest uncertainty I’ve seen for people who are playing regularly and have stabilised, but it could be any number in that sort of range.)

@Animiral, @GreenAsJade, “time spent thinking” wouldn’t work with correspondence. And I’m not sure what else we’d get from that. The 13k spike is basically because of people with 0 ranked games (even 1 game is enough to throw you off 13k). Distributing ranks to bins according to the probability that they have one rank or another sounds way too complicated, to be honest. Simple filtering is better, but I don’t think it matters much in these particular graphs. For example, for the overall rank histogram I separated them in post №1. In terms of filtering I think it’s best to use rating points; after all, Glicko deals with ratings and ranks are there just for convenience. A deviation cutoff of 100 sounds about right, maybe a bit higher, like 125.

Looks like it’s almost impossible to get a deviation lower than 60. The point on the far right is GnuGo, with a deviation of 59 and 173831 played games. And you can see a high point with a deviation of 355, which is higher than the starting 350; kinda weird how that would work.

Hmm - I’m not sure what Animiral was imagining, and I agree that thinking time is not really measurable especially for correspondence.

But what I had in mind was “a histogram of the distribution of ranks for all players whose rank uncertainty is less than 2.5”.

I.e. only include in the histogram those players who have a rank that we are moderately sure about.

And then I thought that the opposite would also be interesting (in fact, even more interesting). I would like to see a histogram of rank distribution counting only those players whose rank uncertainty is > 2.5. This is “the rank distribution of players who are quite new”.
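For what it's worth, the two histograms described above could be counted roughly like this. It is only a sketch: the input format (a list of `(rank, uncertainty)` pairs on some numeric rank scale) and the function name `split_histograms` are assumptions, not how the charts in this thread were actually made. A player sitting at exactly 2.5 is put in the “uncertain” group here, which is an arbitrary choice.

```python
import math
from collections import Counter


def split_histograms(players, threshold=2.5):
    """Count players per rounded rank, separately for low and high uncertainty.

    `players` is an iterable of (rank, uncertainty) pairs; the numeric rank
    scale is whatever the data source uses (an assumption of this sketch).
    """
    settled = Counter()    # uncertainty below the threshold
    unsettled = Counter()  # uncertainty at or above the threshold
    for rank, uncertainty in players:
        target = settled if uncertainty < threshold else unsettled
        # floor(x + 0.5) rounds halves up, avoiding Python's round-half-to-even.
        target[math.floor(rank + 0.5)] += 1
    return settled, unsettled
```

The same function would work with a rating deviation instead of a rank uncertainty, with a threshold such as 100 or 125.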

The main problem with “go game” is that it also might refer to Counter Strike: GO, Pokémon or people searching for something related to the programming language Go.

I’m not sure of that, it’s months after the match with Lee Sedol, which was probably the main peak of general interest in AlphaGo. To compare with searches for Lee Sedol:

I thought I could make a graph that shows how much time/games it takes to gain one rank. It wouldn’t account for differences in effort, but it’s still a fun little thing to do. However, OGS ranks are really volatile. Players often gain or lose several ranks in a really short time. So if someone stayed at 7k for a month, and then got to 6k and 5k on the same day, then on paper 7k->6k takes a month and 6k->5k takes almost nothing, which is counter-intuitive. I still made a graph, but I have no idea if it even makes sense. Maybe someone can think of something better.
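To illustrate the calculation and the problem with it, here is a minimal sketch of the naive “first time each rank was reached” approach. The input format (a day-sorted list of `(day, kyu)` pairs with integer kyu values) and the function names are hypothetical, and this is not necessarily how the graph was actually produced.

```python
def first_reached(history):
    """Map each kyu rank to the first day the player was at that rank or stronger.

    `history` is a list of (day, kyu) pairs sorted by day; a smaller kyu
    number means a stronger player. Drops back to weaker ranks are ignored.
    """
    first = {}
    best = None
    for day, kyu in history:
        if best is None:
            best = kyu
            first[kyu] = day
        elif kyu < best:
            # A jump over several ranks marks every skipped rank as reached today.
            for k in range(best - 1, kyu - 1, -1):
                first[k] = day
            best = kyu
    return first


def days_per_rank(history):
    """Return {(from_kyu, to_kyu): days} for each single-rank step."""
    first = first_reached(history)
    ranks = sorted(first, reverse=True)  # weakest rank first
    return {(a, b): first[b] - first[a] for a, b in zip(ranks, ranks[1:])}
```

For the example above, `days_per_rank([(0, 7), (30, 6), (30, 5)])` gives `{(7, 6): 30, (6, 5): 0}`: a month for 7k->6k and nothing for 6k->5k, which is exactly the counter-intuitive result described.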

BTW the results look intuitively right. We have this phrase “asymptotic DDK” which describes how most DDKs’ graphs look. That is reflected in the increasing time to rank as you approach SDK.