Statistics from a 27M game sample

I love this.

I love how it clearly shows the stages of the game

  1. 0-10: corner moves
  2. 10-30: side moves
  3. 30-150: central moves
  4. 150+: endgame

It also points out that you don’t play on the first line for the first ~150 moves of the game, which is nice for beginners to know.

2 Likes

OGS implements neither pass stones nor the white-passes-last rule. The games are scored using the area method, so the outcome isn’t affected.

4 Likes

again based on images from @le_4TC :

placed the most popular black moves from move 1
placed the most popular white moves from move 2 from coordinates that were not already placed

new style moves 1-24
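
For anyone who wants to reproduce this kind of picture, here is a rough Python sketch of the placement rule described above. It assumes each game is simply a list of coordinate strings such as "Q16" (a toy format for illustration, not how OGS stores games); the name `popular_sequence` and the `fallback` flag are made up for this example.

```python
from collections import Counter

def popular_sequence(games, max_moves, fallback=True):
    """For each move number, place the most popular coordinate.

    games: list of games, each a list of coordinate strings (assumed format).
    fallback=True: if the top coordinate is already placed, take the next
    most popular coordinate that is still free.
    fallback=False: skip that move number entirely instead.
    Returns a list of (move_number, coordinate) pairs.
    """
    placed = []
    occupied = set()
    for i in range(max_moves):
        # Count what was played at move i+1 across all games long enough.
        counts = Counter(g[i] for g in games if len(g) > i)
        for coord, _ in counts.most_common():
            if coord not in occupied:
                placed.append((i + 1, coord))
                occupied.add(coord)
                break
            if not fallback:
                break  # top choice is taken, so this move number is skipped
    return placed

games = [
    ["D4", "Q16"], ["D4", "Q16"], ["D4", "Q4"],
    ["Q16", "D4"], ["Q4", "D4"], ["D16", "D4"],
]
print(popular_sequence(games, 2))                  # [(1, 'D4'), (2, 'Q16')]
print(popular_sequence(games, 2, fallback=False))  # [(1, 'D4')]
```

In the toy data D4 is the most popular coordinate at both move 1 and move 2, so the second placement either falls back to Q16 or is dropped, depending on the flag.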

11 Likes

It’s all really interesting, but now that the board is back, why not use stones (and a grid, btw)?

I’m not a big fan of this QR code style.

1 Like

I wonder what the differences would be between ranks (I would assume stronger players move toward the center earlier), or between years (3-3 played earlier and more corner moves after the AI era?).

3 Likes

What is the widest path (within 19x19, non-handicap games)?

It would also be interesting to see this answer for particular subsets, such as over date ranges (specific years, or pre/post-AlphaGo), for different ranks, etc.

2 Likes

V3.0
made the board with grid - use it as the first layer

and transparent circles grid - use it as the last layer

So move 1 now looks like this:
[image: move 1]

move 2:
[image: move 2]

moves 3 - 60:
[images: one board diagram per move, moves 3 to 60]

11 Likes

Would it be possible to create a game where the “most commonly played spot” is played for each move? I’m sure there would be some weird stuff or nonsensical moves in the midgame, but still…

It would be the game.

Or even a printout of the top 3 most common moves at each…move.

4 Likes

Really cool stuff.

To make the game, it could be like this: take the most common first move, then reduce the dataset to all games where that move was played, then take the most common second move from that dataset, then reduce the dataset again to only those games where that second move was played, then take the most common third move from that dataset, etc.

Of course it would also be an idea to just see how certain openings (for example, the first 4 moves all being 4-4 point) continue.

I’m very curious how far you could get that way before reaching the point where only a single game is left in the dataset.
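
A rough sketch of that procedure in Python, assuming the games are already loaded as lists of coordinate strings (a toy format for illustration; `most_common_line` is a hypothetical name). It keeps everything in memory, so it would need rework before going anywhere near millions of games.

```python
from collections import Counter

def most_common_line(games):
    """Follow the most common move at each step, then keep only the
    games that played it. Stops when at most one game is left or no
    remaining game has another move.
    Returns a list of (move, number_of_games_that_played_it) pairs.
    """
    remaining = list(games)
    line = []
    depth = 0
    while len(remaining) > 1:
        counts = Counter(g[depth] for g in remaining if len(g) > depth)
        if not counts:
            break  # every remaining game has ended
        move, n = counts.most_common(1)[0]
        line.append((move, n))
        # Reduce the dataset to the games that followed this move.
        remaining = [g for g in remaining if len(g) > depth and g[depth] == move]
        depth += 1
    return line

games = [
    ["Q16", "D4", "Q4"],
    ["Q16", "D4", "D16"],
    ["Q16", "D4", "Q4"],
    ["D4", "Q16", "Q4"],
]
print(most_common_line(games))  # [('Q16', 3), ('D4', 3), ('Q4', 2)]
```

The second number in each pair shows how quickly the sample shrinks, which is the "how far can you get" question in practice.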

Hm, I know a bit of programming but probably not enough to work through a database of millions of games.

2 Likes

You can do that on waltheri for pro games, though only for roughly the first third of the game before the sample size becomes small.

4 Likes

Nice!

Might be a good way to get some inspiration for the opening stage of the game!

2 Likes

It’s how I studied the opening, and one of the main resources that got me to dan.

I can’t recommend it enough.

5 Likes

I get the feeling that pros agree on early openings more than most players do (although openings evolve over time) and are able to follow complicated joseki deep into a limited set of variations, whereas amateurs try all kinds of openings and diverge fast because they aren’t familiar enough with joseki.

I also wonder how the number of games played at each rank compares with the number of players at that rank. Would some ranks contribute proportionally more than others?

3 Likes

I’m quite interested in @claire_yang’s suggestions, just wondering if they are easy to search for.

move 1 - 133
moves that play at the same coordinates that are already on the board are skipped
only 37 unique moves left

The 4-4 point low approach is the most popular move at almost every move number in the fuseki:

move 1 - 34 no skip


then it keeps playing the same boring spots in the center…

so without increasing the range there is nothing more to generate, but you have already seen the increased range here:
https://forums.online-go.com/t/statistics-from-a-27m-game-sample/39850/24?u=stone_defender

7 Likes

I would like to share this information with you:

When writing numbers larger than 999 in figures, there is a convention that you mark off every third column to make it easier to see the size of the number. English speakers usually do this with a comma, thus:

26,935,658

Having information is one thing; communicating it effectively, another.

Also, I don’t understand your diagrams.

To my knowledge this only applies to the US, everybody else just uses a (thin) space, if anything at all.

In 2003, the General Conference on Weights and Measures (yes, that’s a thing) decreed thusly: “Numbers may be divided in groups of three in order to facilitate reading; neither dots nor commas are ever inserted in the spaces between groups”, as stated in Resolution 7 of the 9th CGPM, 1948. Now, the US, France and GB are members of that conference, but still they all use different formatting rules for larger numbers.

PS: working in i18n is fun. Honest.

4 Likes

In East Asian languages the grouping is by myriads (10000s). Numbers there are broken up as

12 3456 7890.

Indian numerals are broken up in 3 at the tail of the number, but otherwise in groups of 2, there you’d write:

1 23 45 67 890.

So not everybody else separates by groups of 3. In fact, about half the world population doesn’t separate by groups of 3 :slight_smile:
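
The grouping schemes mentioned in this thread can be sketched with one small Python helper. The name `group_digits` and its `sizes` parameter are made up for this example; it only handles non-negative integers and is an illustration, not a replacement for proper locale-aware formatting.

```python
def group_digits(n, sizes, sep=" "):
    """Group the digits of a non-negative integer from the right.

    sizes: group sizes starting at the right-hand end; the last size
    repeats for all remaining groups, e.g. [3] for thousands,
    [4] for myriads, [3, 2] for the Indian system.
    """
    digits = str(n)
    groups = []
    i = 0
    while digits:
        size = sizes[min(i, len(sizes) - 1)]
        groups.append(digits[-size:])   # take the rightmost group
        digits = digits[:-size]         # and drop it from the number
        i += 1
    return sep.join(reversed(groups))

print(group_digits(1234567890, [3]))        # 1 234 567 890
print(group_digits(1234567890, [4]))        # 12 3456 7890
print(group_digits(1234567890, [3, 2]))     # 1 23 45 67 890
print(group_digits(26935658, [3], ","))     # 26,935,658
print(group_digits(26935658, [3], "."))     # 26.935.658
```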

5 Likes

In Germany, we write big numbers like this: 26.935.658 :wink:

2 Likes

Brits also use the comma between groups of three in communications aimed at native English speakers. The subject exercising the General Conference in 2003 was whether to use a decimal point to mark off decimals or the decimal comma as used by the French (and others) and the resolution adopted simply reaffirms what the 1948 (9th) convocation had resolved:

Symbol for the decimal marker

Resolution 10

The 22nd General Conference,
considering that

• a principal purpose of the International System of Units (SI) is to enable values of quantities to be expressed in a manner that can be readily understood throughout the world,
• the value of a quantity is normally expressed as a number times a unit,
• often the number in the expression of the value of a quantity contains multiple digits with an integral part and a decimal part,
• in Resolution 7 of the 9th General Conference, 1948, it is stated that “In numbers, the comma (French practice) or the dot (British practice) is used only to separate the integral part of numbers from the decimal part”,
• following a decision of the International Committee made at its 86th meeting (1997), the International Bureau of Weights and Measures now uses the dot (point on the line) as the decimal marker in all the English language versions of its publications, including the English text of the SI Brochure (the definitive international reference on the SI), with the comma (on the line) remaining the decimal marker in all of its French language publications,
• however, some international bodies use the comma on the line as the decimal marker in their English language documents,
• furthermore, some international bodies, including some international standards organizations, specify the decimal marker to be the comma on the line in all languages,
• the prescription of the comma on the line as the decimal marker is in many languages in conflict with the customary usage of the point on the line as the decimal marker in those languages,
• in some languages that are native to more than one country, either the point on the line or the comma on the line is used as the decimal marker depending on the country, while in some countries with more than one native language, either the point on the line or comma on the line is used depending on the language,

declares that the symbol for the decimal marker shall be either the point on the line or the comma on the line,

reaffirms that “Numbers may be divided in groups of three in order to facilitate reading; neither dots nor commas are ever inserted in the spaces between groups”, as stated in Resolution 7 of the 9th CGPM, 1948.

It’s clear that the 1948 resolution was to avoid the anglophone use of commas as separators in large numbers confusing/irritating francophones.

I would agree that for international communications (such as ours) using English as the medium of communication, spaces would be the best choice of separators (and point as decimal separator). Not sure how to make “thin” spaces, otoh.
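
On the thin-space question: Unicode has a dedicated character for it, U+2009 THIN SPACE (and U+202F NARROW NO-BREAK SPACE if the number must not wrap across lines). A minimal Python sketch, where `thin_group` is a made-up helper name and only integers are handled:

```python
THIN_SPACE = "\u2009"          # U+2009 THIN SPACE
NARROW_NBSP = "\u202f"         # U+202F NARROW NO-BREAK SPACE

def thin_group(n, sep=THIN_SPACE):
    """Format an integer in groups of three separated by a thin space."""
    # The ',' format option groups by thousands; swap the comma afterwards.
    return f"{n:,}".replace(",", sep)

print(thin_group(26935658))               # 26 935 658 (with thin spaces)
print(thin_group(26935658, NARROW_NBSP))  # same look, but will not line-wrap
```

Whether the forum software keeps these characters when a post is rendered is a separate question that this sketch does not answer.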