Statistics from a 27M game sample

Maybe the obvious things would be histograms of:

  • wins and losses by Black or White, possibly split by komi, handicap etc. (the summary data is probably already in one of anoek's rating update threads, though)
  • wins by score (and by how much), or by resignation etc.
  • lengths of games, which would have various bumps at common time settings.
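A rough sketch of how the first two bullets could be tallied. It assumes each game carries an SGF-style result string (the `RE` property, e.g. `B+R`, `W+3.5`, `B+T`); I don't know what the sample actually stores, so `parse_result` and `tally` are hypothetical helpers you'd adapt to the real fields.

```python
from collections import Counter

# Suffixes the SGF RE property uses for non-score wins.
KINDS = {
    "R": "resign", "Resign": "resign",
    "T": "time", "Time": "time",
    "F": "forfeit", "Forfeit": "forfeit",
}

def parse_result(re_value):
    """Split an SGF-style result into (winner, kind, margin)."""
    value = re_value.strip()
    if value in ("0", "Draw"):
        return ("draw", "draw", 0.0)
    if len(value) < 2 or value[0] not in ("B", "W") or value[1] != "+":
        return (None, "unknown", None)  # "Void", "?", empty, etc.
    winner, tail = value[0], value[2:]
    if tail in KINDS:
        return (winner, KINDS[tail], None)
    try:
        return (winner, "score", float(tail))
    except ValueError:
        return (winner, "unknown", None)

def tally(results):
    """Count outcomes by (winner, kind) and score wins by margin."""
    by_outcome = Counter()
    margins = Counter()
    for result in results:
        winner, kind, margin = parse_result(result)
        by_outcome[(winner, kind)] += 1
        if kind == "score":
            margins[margin] += 1
    return by_outcome, margins

sample = ["B+R", "W+3.5", "W+Resign", "B+T", "W+3.5", "Draw", "Void"]
by_outcome, margins = tally(sample)
```

The two counters can go straight into whatever plotting tool you like. Splitting by komi or handicap would just mean adding those values to the counter key.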

I think various things were done in the past with random samples (edit: my mistake), like rank histograms etc. (Unofficial OGS rank histogram 2021 - #23 by DVbS78rkR7NVe), but I suppose those things work better with snapshots, not necessarily with a whole list of games. Pulling out the ranks won’t be useful unless it’s for checking whether the “stronger” player won etc.
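For the "did the stronger player win" check, something like this might do as a starting point. It assumes each game can be reduced to a `(black_rating, white_rating, winner)` tuple, which is my own made-up layout, and it ignores handicap and komi, so you'd probably want to filter to even games first.

```python
from collections import defaultdict

def favourite_win_rate(games, bucket=100):
    """Fraction of games won by the higher-rated player, per rating-gap bucket.

    games: iterable of (black_rating, white_rating, winner), winner 'B' or 'W'.
    Games with equal ratings or no clear winner are skipped.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket start -> [favourite wins, games]
    for black, white, winner in games:
        if black == white or winner not in ("B", "W"):
            continue
        favourite = "B" if black > white else "W"
        key = int(abs(black - white) // bucket) * bucket
        totals[key][1] += 1
        if winner == favourite:
            totals[key][0] += 1
    return {key: wins / n for key, (wins, n) in sorted(totals.items())}

games = [
    (1500, 1450, "B"),
    (1500, 1450, "W"),
    (1800, 1500, "B"),
    (1500, 1800, "W"),
]
rates = favourite_win_rate(games)
```

Each key is the lower edge of a rating-gap bucket, so you can see whether the win rate climbs with the gap the way the rating system predicts.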

I guess one could try to look at the popularity of fuseki over time. That is, look for some named openings like the Sanrensei, low/high Chinese and Kobayashi, and make a histogram over time (by start time of the game?) of how frequently these are played by Black or White etc. I suppose the openings are mainly named for Black though, so just check Black’s first three moves and maybe who won the game :stuck_out_tongue: (although winning would probably not be just because of the opening :slight_smile: )
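One way the opening lookup could work is to normalise Black's first three moves under the 8 rotations and reflections of the board, then look the result up in a table of known shapes. This is only a sketch: moves are assumed to be 0-based `(x, y)` pairs on a 19x19 board, the table only contains the sanrensei (three star points along one side), and it ignores move order and whatever White played. Other openings would be added to `PATTERNS` the same way.

```python
from collections import Counter

N = 18  # largest 0-based coordinate on a 19x19 board

TRANSFORMS = (
    lambda x, y: (x, y),
    lambda x, y: (N - x, y),
    lambda x, y: (x, N - y),
    lambda x, y: (N - x, N - y),
    lambda x, y: (y, x),
    lambda x, y: (N - y, x),
    lambda x, y: (y, N - x),
    lambda x, y: (N - y, N - x),
)

def canonical(points):
    """Smallest sorted tuple among the 8 symmetric images of the points."""
    return min(
        tuple(sorted(t(x, y) for x, y in points)) for t in TRANSFORMS
    )

# Hypothetical pattern table; only the sanrensei is filled in here.
PATTERNS = {
    canonical([(15, 3), (15, 9), (15, 15)]): "sanrensei",
}

def classify(black_moves):
    """Name of the opening formed by Black's first three moves, or None."""
    first_three = black_moves[:3]
    if len(first_three) < 3:
        return None
    return PATTERNS.get(canonical(first_three))

def popularity_by_year(games):
    """games: iterable of (year, black_moves). Counts (year, opening) pairs."""
    counts = Counter()
    for year, black_moves in games:
        counts[(year, classify(black_moves))] += 1
    return counts

games = [
    (2015, [(15, 3), (15, 15), (15, 9)]),
    (2015, [(3, 3), (15, 15), (9, 9)]),
    (2019, [(3, 3), (9, 3), (15, 3)]),
]
counts = popularity_by_year(games)
```

Dividing each count by the number of games in that year gives the frequency over time. Adding the winner to the key would cover the "who won" part.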

If you can think of some AI fuseki as well, one could check those over time and see if they appear much earlier than expected etc.

In theory you could try to look for all kinds of funny stuff. For example, of the games that went to scoring, how many groups were there on the board for each color? That assumes you wrote a bit of code to count the number of groups or, what is probably the same, the distinct areas belonging to each player in an area-scoring sense. If you had that bit of code, you could even count the number of small living groups, worth X points in area scoring for, say, X < 10 or 15 or something. EDIT: (although maybe that’d be a hassle, basically making a scoring tool :P)
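The group counting part is basically a flood fill. A minimal sketch, assuming you already have the final position as a list of strings with `B`, `W` and `.` (getting there means replaying the moves with captures, which isn't shown). Note it counts solidly connected chains of stones, which is a simplification: loosely connected stones that a player would call one group are counted separately, and dead stones aren't removed.

```python
def count_groups(board):
    """Count 4-connected chains of stones per colour.

    board: list of equal-length strings using 'B', 'W' and '.'.
    """
    rows = len(board)
    cols = len(board[0]) if rows else 0
    seen = set()
    counts = {"B": 0, "W": 0}
    for r in range(rows):
        for c in range(cols):
            colour = board[r][c]
            if colour not in counts or (r, c) in seen:
                continue
            # New chain found: flood fill to mark all of its stones.
            counts[colour] += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and (ny, nx) not in seen
                        and board[ny][nx] == colour
                    ):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return counts

position = [
    "BB.W.",
    ".B.WW",
    ".....",
    "B..W.",
    "B....",
]
groups = count_groups(position)
```

Running the same fill over the empty points and checking which colour borders each region would get you the areas, and from there the small living groups, but at that point it really is most of a scoring tool.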
