Statistics from a 27M game sample

That would maybe be a good idea (although there would still be some “garbage” in ranked games too).

Ah, right! I was mostly interested in this question for high-level (or at least “reasonable”) games. I wonder what would be the best way to explore this with just the OGS dataset.

One could calculate game length divided by area for each game and make a histogram like above. I guess this would give a normal-ish distribution but it wouldn’t really verify or falsify the hypothesis. So better would be to plot average length against area - but we have so much more data on 9x9, 13x13 and 19x19 than the other sizes, so different datapoints would have very different confidence. But I think this plot could be interesting, I’ll probably try that at some point.
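A minimal sketch of that plot's underlying numbers, assuming each game has already been reduced to a `(width, height, move_count)` tuple (a hypothetical format, not the actual OGS JSON layout). It reports the number of games and the standard error next to each mean, since the board sizes have very different sample counts:

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def length_per_area_by_size(games):
    """Summarise moves/area per board area.

    games: iterable of (width, height, move_count) tuples (assumed format).
    Returns {area: (game_count, mean_ratio, standard_error)}.
    """
    ratios = defaultdict(list)
    for width, height, moves in games:
        area = width * height
        ratios[area].append(moves / area)

    summary = {}
    for area, values in sorted(ratios.items()):
        n = len(values)
        # Standard error of the mean; undefined when only one game exists.
        sem = stdev(values) / sqrt(n) if n > 1 else float("nan")
        summary[area] = (n, mean(values), sem)
    return summary


# Made-up sample, only to show the call shape.
sample = [(9, 9, 50), (9, 9, 60), (19, 19, 240), (19, 19, 260), (13, 13, 120)]
summary = length_per_area_by_size(sample)
for area, (n, avg, sem) in summary.items():
    print(f"area={area:4d} games={n} mean moves/area={avg:.3f} sem={sem:.3f}")
```

Plotting the mean with the standard error as an error bar would make it visible which board sizes have too few games to say much.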

1 Like

I got curious after the last post how many games use AGA rules. Not that many, it turns out:

6 Likes

Wow, does literally no one use Ing? The distinction is probably not so relevant anyway: I guess OGS doesn’t actually implement the Ing ko rules, since I doubt that anyone can implement them.

This is only necessary when using territory counting with AGA rules, and it does not need to be implemented as an actual additional pass, just as one additional pass stone handed over (and done only once, after all resumptions and life/death disputes have been resolved).

I’m not sure OGS actually does it that way for AGA rules; I think it just uses area counting to get the equivalent score without having to account for pass stones.

1 Like

This was the data I plotted:

[image: pie chart of games by ruleset]

Seems like Google Sheets just left Ing out from the pie chart since the slice would be < 0.1% :stuck_out_tongue:

Before pasting into Sheets, I filtered out even smaller entries to get rid of some garbage from uploaded SGFs. Here is what the full output looked like:

{'aga': 180273, 'japanese': 19531122, 'nz': 37072, 'chinese': 7120701, 'korean': 201751, 'ogs': 266, 'finn': 1, 'ing': 14192, 'aga (area)': 194, 'simple': 69, '1': 46, 'ing rules': 1, 'old chinese': 1, 'jp': 427, 'jpn': 11, 'relay game': 2, 'uchikomi': 96, 'w wins jigo': 34, 'w gives komi': 7, 'white gives ': 1, 'taiwanese (a': 1, 'free placeme': 1, 'free handica': 1, 'aga (territo': 28, '日本': 9, 'aga (fläche': 1, 'японск': 1, 'подсче': 2, 'ja': 1, '': 16, 'aga (地)': 2, '地を数え': 22, 'tang (japane': 1, 'japanische r': 4, 'china': 1, 'stone': 7, 'american': 1, 'kosimplescor': 5, 'japonesas': 2, 'japenese': 1, 'zh': 45, 'mine': 1, 'cn': 8, "ikeda's area": 1, 'egf': 1}
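A rough sketch of the filtering step, using a subset of the counts above. The cutoff of 1000 games is an assumption for illustration only (the actual threshold used isn't stated), and `filter_and_share` is a made-up helper name:

```python
# Subset of the counts from the full output above.
counts = {
    "aga": 180273, "japanese": 19531122, "nz": 37072, "chinese": 7120701,
    "korean": 201751, "ogs": 266, "ing": 14192, "jp": 427, "uchikomi": 96,
}


def filter_and_share(counts, min_count=1000):
    """Drop rare entries, then return each ruleset's share of what is left.

    min_count is an assumed cutoff, not the one used for the pie chart.
    """
    kept = {name: n for name, n in counts.items() if n >= min_count}
    total = sum(kept.values())
    ordered = sorted(kept.items(), key=lambda item: -item[1])
    return {name: n / total for name, n in ordered}


shares = filter_and_share(counts)
for name, share in shares.items():
    print(f"{name:10s} {share:.3%}")
```

With this cutoff, Ing comes out at about 0.05% of the remaining games, which fits the slice being too small for the chart to show.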

I also think so, but there is a “white_must_pass_last” property in the JSON which seems to be true whenever the rules are AGA. So maybe an extra pass is just automatically added to the end of the move list in those cases?

I don’t really care enough to dive in and figure out the details, I’ll leave that to someone else :stuck_out_tongue:

3 Likes

I wonder what % Japanese rules would have if it wasn’t set as default on OGS.

UPDATE: much better version: Statistics from a 27M game sample - #29 by stone_defender


top 10 moves: Statistics from a 27M game sample - #94 by stone_defender


Repainted your diagram:
odd-numbered moves are Black’s, after all,
and I replaced the least popular moves with the board background.

first 20 moves:

14 Likes

At least as far as I can see on the website, games with AGA rules do not show white necessarily passing last or any pass stones. Example:

At least it seems to get the handicap adjustment right!

1 Like

I love this.

I love how it clearly shows the stages of the game

  1. 0-10: corner moves
  2. 10-30: side moves
  3. 30-150: central moves
  4. 150+: endgame

It also points out that you don’t play on the first line for the first ~150 moves of the game, which is nice for beginners to know.

2 Likes

OGS implements neither pass stones nor white passing last. The games are scored using the area method, so the outcome isn’t affected.

4 Likes

again based on images from @le_4TC :

placed the most popular black moves from move 1
placed the most popular white moves from move 2 from coordinates that were not already placed
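One way to read the two steps above as code, offered only as a sketch: for each move number, take the most popular coordinate that is still free. It assumes the per-move counts are already available as a list of `Counter` objects (a hypothetical input format) and that odd-numbered moves are Black's, which only holds for non-handicap games:

```python
from collections import Counter


def place_popular_moves(per_move_counts):
    """Greedily place the most popular free coordinate for each move number.

    per_move_counts[i] is a Counter of coordinates for move number i + 1
    (assumed input format). Returns {coord: (move_number, color)}.
    """
    occupied = {}
    for index, counter in enumerate(per_move_counts):
        color = "B" if index % 2 == 0 else "W"  # move 1 is Black's
        for coord, _ in counter.most_common():
            if coord not in occupied:
                occupied[coord] = (index + 1, color)
                break  # one stone per move number
    return occupied


# Made-up counts, only to show the behaviour.
per_move_counts = [
    Counter({"Q16": 5, "D4": 3}),
    Counter({"Q16": 4, "D4": 2, "D16": 1}),
    Counter({"D4": 6, "Q4": 1}),
]
board = place_popular_moves(per_move_counts)
print(board)
```

In the made-up data, move 2's favourite point is already taken by move 1, so the second most popular point is used instead.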

new style moves 1-24

11 Likes

It’s all really interesting, but now that the board is back, why not use stones (and a grid, by the way)?

I’m not a big fan of this QR-code style.

1 Like

I wonder what the differences would be between different ranks (I would assume stronger players move toward the center earlier), or between different years (earlier 3-3 and more corner play after the start of the AI era?).

3 Likes

What is the widest path (within 19x19, non-handicap games)?

It would also be interesting to see this answer for particular subsets, such as over date ranges (specific years, or pre/post-AlphaGo), for different ranks, etc.

2 Likes

V3.0
made the board with grid - use it as the first layer

and transparent circles grid - use it as the last layer

So move 1 now looks like this:
[image: move 1]

move 2:
[image: move 2]

moves 3 - 60:
[images: one board per move, moves 3 through 60]

11 Likes

Would it be possible to create a game where the “most commonly played spot” is played for each move? I’m sure there would be some weird stuff or nonsensical moves in the midgame, but still…

It would be the game.

Or even a printout of the top 3 most common moves at each…move.
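The top-3 printout could be sketched roughly like this, assuming each game is just a list of coordinate strings in play order (a hypothetical format; real game records would need parsing first):

```python
from collections import Counter


def top_moves_per_turn(games, k=3):
    """Return the k most common coordinates for each move number.

    games: iterable of move lists, e.g. ["Q16", "D4", ...] (assumed format).
    """
    counters = []
    for moves in games:
        for index, coord in enumerate(moves):
            if index == len(counters):
                counters.append(Counter())  # first game to reach this depth
            counters[index][coord] += 1
    return [counter.most_common(k) for counter in counters]


# Made-up games, only to show the call shape.
games = [["Q16", "D4", "Q4"], ["Q16", "D16"], ["D4", "Q16", "Q4"]]
top = top_moves_per_turn(games)
for number, ranking in enumerate(top, start=1):
    print(number, ranking)
```

Note that this counts each move number independently of what was played before, so consecutive "most common" moves need not come from the same games.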

4 Likes

Really cool stuff.

To make the game, it could be like this: take the most common first move, then reduce the dataset to all games where that move was played, then take the most common second move from that dataset, then reduce the dataset again to only those games where that second move was played, then take the most common third move from that dataset, etc.

Of course it would also be an idea to just see how certain openings (for example, the first 4 moves all being 4-4 point) continue.

I’m very curious how far you could get that way before coming to the point that only a single game is left in the dataset.

Hm, I know a bit of programming but probably not enough to work through a database of millions of games.
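The narrowing procedure described above could look roughly like this. It is only a sketch: games are assumed to be lists of coordinate strings (a hypothetical format), and mirrored or rotated openings are treated as different, which a real analysis would probably want to normalise:

```python
from collections import Counter


def most_common_line(games):
    """Follow the most common move, keeping only the games that played it.

    Returns (line, sizes): the moves chosen, and how many games remain
    after each choice.
    """
    remaining = list(games)
    line, sizes = [], []
    depth = 0
    while True:
        # Only games that actually have a move at this depth can vote.
        candidates = [g for g in remaining if len(g) > depth]
        if not candidates:
            break
        move, _ = Counter(g[depth] for g in candidates).most_common(1)[0]
        remaining = [g for g in candidates if g[depth] == move]
        line.append(move)
        sizes.append(len(remaining))
        depth += 1
    return line, sizes


# Made-up games, only to show the behaviour.
games = [
    ["Q16", "D4", "Q4"],
    ["Q16", "D4", "D16"],
    ["Q16", "D16"],
    ["D4", "Q16"],
]
line, sizes = most_common_line(games)
print(line, sizes)
```

The `sizes` list answers the "how far can you get" question directly: the first position where it drops to 1 is where a single game is left.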

2 Likes

You can do that on waltheri for pros though for the first 1/3rd of the game or so before the sample size becomes small.

4 Likes

Nice!

Might be a good way to get some inspiration for the opening stage of the game!

2 Likes

It’s how I studied the opening, and one of the main resources that got me to Dan.

I can’t recommend it enough.

5 Likes

I get the feeling that pros agree on early openings more than most players do (although the openings evolve over time) and can follow complicated joseki deep into a limited set of variations, whereas amateurs try all kinds of openings and diverge fast because they are not familiar enough with joseki.

I also wonder how the number of games played at each rank compares with the number of players at that rank. Do some ranks contribute disproportionately?

3 Likes