Can we get an SGF database dump?

Here is a histogram over all the move numbers (from the 46k 19x19 games) where a move on the first line was played:

First occurrence of a move on the first line:
All moves on first and second line:
First occurrence of a move on the first or second line:

(vertical scale not the same between diagrams)
(not rescaling the horizontal axis is a deliberate choice for better comparison between diagrams, not at all motivated by laziness)
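
For anyone who wants to reproduce counts like these, here is a minimal sketch. It assumes each game is a list of 0-indexed `(x, y)` moves with `None` standing for a pass; `line_of` and `move_number_histograms` are names made up for this example, not part of any OGS tooling.

```python
from collections import Counter


def line_of(x, y, size=19):
    """Return the 1-based line of a 0-indexed point (1 = edge, 10 = tengen on 19x19)."""
    return min(x, y, size - 1 - x, size - 1 - y) + 1


def move_number_histograms(games, min_line=1, max_line=1, size=19):
    """Count the move numbers of moves played on lines min_line..max_line.

    Returns (all_moves, first_occurrence), both Counters keyed by
    1-based move number.
    """
    all_moves = Counter()
    first_occurrence = Counter()
    for moves in games:
        seen = False  # has this game already had a matching move?
        for number, move in enumerate(moves, start=1):
            if move is None:  # assumed encoding of a pass
                continue
            if min_line <= line_of(*move, size=size) <= max_line:
                all_moves[number] += 1
                if not seen:
                    first_occurrence[number] += 1
                    seen = True
    return all_moves, first_occurrence
```

Calling it with `min_line=1, max_line=2` gives the first-or-second-line counts, and `min_line=7, max_line=10` covers the 7th line or higher on a 19x19 board.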


This is no surprise to me: a lot of joseki involve a 2nd-line or even 1st-line stone, which explains the first occurrences. And the peak in the all-moves diagram should be when players start to play small yose. Great to see how they are visualized though, nice job.

But I am actually more curious about the higher ground (higher lines, even tengen): when moves there first appear, and how often they occur.

All moves on the 7th line or higher:

First occurrence of a move on the 7th line or higher:

Not sure if this includes handicap stones actually… if free placement handicaps count as moves, that would give some extra explanation for why starpoints are over-represented in the frequency diagrams above (but I’m pretty sure that even without handicaps, humans just like to play on starpoints).


what about tengen?


Move numbers at which tengen was played:
(again, handicap may be messing up the first few moves)


This might just indicate the height of the mid-game battle waged around the plateau, and the first occurrence would be about the time when mid-game fights start to spread toward the center (I suspect the 6th line might be a better indication of when the mid-game actually starts), disregarding that some players like to play tengen first, or just play tengen to prevent mirror go. The occurrence and first appearance of tengen might help to get a better picture of a player’s tendencies.

I am surprised that there are still move numbers over 250 among all the games, which suggests that at least a few of them use Chinese rules and so have to fill dame, maybe?

What about the first occurrence?

The first move should be black (not a pass), the second white (not a pass). Filter out all other games.


  • with handicap n>1, fixed placement: first move is white
  • with handicap n>0, free placement: first n moves are black
  • otherwise first move is black

Continuation is always alternating between black and white.
Fixed handicap stones were not in the API’s move tree last time I checked it.
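
The rules above can be sketched as a small function that assigns a color to each entry of the move list. It assumes, as noted, that fixed handicap stones are not part of the move list; `move_colors` and its parameters are illustrative names, not fields of the OGS API.

```python
def move_colors(num_moves, handicap=0, free_placement=False):
    """Return the colors ('B' or 'W') of the first num_moves entries of the move list."""
    if handicap > 0 and free_placement:
        # Free placement: the first n entries are black's handicap stones.
        colors = ["B"] * handicap
        next_color = "W"
    elif handicap > 1:
        # Fixed placement: stones are assumed absent from the list, white starts.
        colors = []
        next_color = "W"
    else:
        # Even game (or handicap 1 without free placement): black starts.
        colors = []
        next_color = "B"
    # After the opening entries, colors strictly alternate.
    while len(colors) < num_moves:
        colors.append(next_color)
        next_color = "W" if next_color == "B" else "B"
    return colors[:num_moves]
```

A game whose actual colors disagree with this sequence (or whose first two entries are passes) could then be filtered out before computing first-occurrence statistics.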


No way!!!
Am I the only fool playing first move on 5-4???

Well, at least there’s two of us!
Actually three: me, my opponent (I was white in this game) and katago.


Yeah, I think people play the center in 9x9 so disproportionately often compared to other openings that it gets boring sometimes.

Every move in the center 3x3 of a 9x9 board is equally good, it is entirely up to your style preference, and if you like variety you can and should try all of them.

Moves outside the center 3x3 like 3-5, 3-4, 3-3 may at high levels of play be a bit worse. Of these, computer evidence suggests 3-5 is the best. One or two of these may also still be game-theoretically optimal (depending also on the rules), so in some sense could be equal rather than worse compared to moves in the center 3x3, but need more precise play thereafter to hold that result. Of course, you can try them too especially if you like to explore variety - they are still very good moves, and at human levels of play what is optimal or not may not matter nearly as much as playing in a way that suits your style.


From this nice 9x9 opening reference by @mark5000:

The 4-5 point

This changes the character of the 9x9 game into a fighting game. It creates subtle influence over the shorter edge of the board and aims at the opposite direction.


@anoek I’m updating the dataset for 2023. Could you increase my rate limit, as previously? My user-agent includes “za3k”. Thanks!

Internally I’m limiting to something reasonable in addition to whatever you do, same rate limit as before.


Whatever I did back then I did for everyone, and it should still be working the same, so you should be all set!


@za3k I’m happy you made this dataset and continue to update it.

I also have a small request for improvement concerning the file names. It is very unfortunate that the files are named after the players. OGS users have all kinds of characters in their names, many of which are not easy to type in a terminal.

I had mixed success extracting them on an NTFS filesystem, where many characters are not allowed in file names, such as the colon and angle brackets. Some names are so confusing to tar that it starts creating weird new subdirectories instead of extracting the file.

The issue could be resolved if you replaced the players’ names by their OGS user ID. This would also eliminate the potential for confusion if two users changed their names such that the same name becomes associated with two different accounts at different times.
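
To illustrate the idea, here is a rough sketch of ID-based naming plus a check for the characters that Windows/NTFS rejects in file names. The `game_id`-`black_id`-`white_id` layout is purely hypothetical, not the dataset’s actual scheme, and the check ignores reserved device names such as `CON`.

```python
# Characters not allowed in Windows/NTFS file names, plus ASCII control characters.
NTFS_FORBIDDEN = set('<>:"/\\|?*') | {chr(c) for c in range(32)}


def is_ntfs_safe(name):
    """Return True if name has no forbidden character and no trailing space or dot."""
    if not name or name.endswith((" ", ".")):
        return False
    return not any(ch in NTFS_FORBIDDEN for ch in name)


def id_based_name(game_id, black_id, white_id):
    """Build a file name from numeric IDs only (hypothetical naming layout)."""
    return f"{int(game_id)}-{int(black_id)}-{int(white_id)}.sgf"
```

Names built only from digits, hyphens and the `.sgf` suffix pass the check on any common filesystem, whereas a player name containing a colon or angle brackets does not.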



I tried to process some of the SGFs for computer training, and I noticed that a lot of the SGFs for handicap games have illegal moves. The reason appears to be an oversight I made when writing the conversion logic: it seems OGS free placement and fixed placement games are handled in different ways in the json, and the old code didn’t account for both. Also, some old OGS games might lack the “initial_state” field and seem to require the converter to have its own logic for which board points the handicap stones should be placed on. I pushed fixes for these issues in GitHub - lightvector/ogstosgf: Converting raw ogs game jsons to sgfs


I was still getting some illegal moves after the handicap game fix, so I looked at it some more. The result: “oh no”.

Different games use different placements for 2 and 3 stone games!
And the OGS json from the download doesn’t always have the initial positions of the handicap stones, which is why you get illegal moves (when the script fills them in and it’s a mismatch).

I wonder what can be done. Is there a date cutoff for handicap stone placement differences?


I see a comment here about the bug, but no hint on dates:

Neither of the games you posted has this “ogs_import” field though :thinking: update: I was looking at the REST response, not the socket. The socket does include ogs_import.

One approach would be to construct OGS’s implementation (GobanCore or GoEngine) and then export that to SGF…

Yaaaay thanks, it’s fixed. Thanks for the ogs_import clue, I found it in the json.