OGS games dataset for reviewing bot?


#1

Hey All,

So one of my passions in life is machine learning, and I have been loving the research poured into Go recently. One thing I haven’t seen yet, though, is a review bot.

We have seen several neural networks that perform at a mid-dan level without MCTS and so are able to play in milliseconds. We have also seen, in Crazy Stone, neural networks trained to play at different levels. What I want is a bot that can basically give you a dan-level review of your game.

I do work with natural language bots at work, but that isn’t the direction I want to go here. I envision a bot trained on a large dataset of games from players ranging from 20 kyu to dan level. In particular, I want to train a bunch of networks to recognize and generate moves at a given level.

Imagine I have a network trained at each kyu rank. When a 15 kyu player submits a game for review, I want the bot to recognize that, out of the 200 moves in the game, these 5 moves were 10 kyu level and should be commended, and these 5 moves were boneheaded 20 kyu moves. In addition, I think the bot could suggest 5 more moves, at 12 kyu level, that would have most changed the game had they been played.
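A minimal sketch of that per-move labelling, under some loud assumptions: no trained networks exist here, so each move is represented only by a hypothetical dict mapping a kyu rank to the probability that rank's network would assign to the move actually played. The names `estimate_move_level`, `review_game`, and the `margin` threshold are made up for illustration, and taking the rank with the highest probability is just one naive way to label a move.

```python
from typing import Dict, List, Tuple


def estimate_move_level(prob_by_kyu: Dict[int, float]) -> int:
    """Return the kyu rank whose (hypothetical) network rates the played move as most likely."""
    return max(prob_by_kyu, key=prob_by_kyu.get)


def review_game(
    move_probs: List[Dict[int, float]],
    player_kyu: int,
    margin: int = 4,  # assumed threshold: how many ranks away counts as notable
) -> Tuple[List[int], List[int]]:
    """Split move numbers into (commended, boneheaded) relative to the player's rank."""
    commended: List[int] = []
    boneheaded: List[int] = []
    for move_number, prob_by_kyu in enumerate(move_probs, start=1):
        level = estimate_move_level(prob_by_kyu)
        # A smaller kyu number means a stronger player.
        if level <= player_kyu - margin:
            commended.append(move_number)
        elif level >= player_kyu + margin:
            boneheaded.append(move_number)
    return commended, boneheaded


# Toy input for a 15 kyu player: three moves scored by 10, 15 and 20 kyu networks.
toy_game = [
    {10: 0.6, 15: 0.3, 20: 0.1},
    {10: 0.2, 15: 0.7, 20: 0.1},
    {10: 0.05, 15: 0.15, 20: 0.8},
]
good_moves, bad_moves = review_game(toy_game, player_kyu=15)
```

With the toy numbers above, move 1 looks like a 10 kyu move and move 3 like a 20 kyu move, while move 2 is about what you would expect from a 15 kyu player and is not flagged.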

Currently you can review with a bot by manually stepping through the tree, but a 12 kyu player doesn’t get much benefit from the bot demonstrating cool dan-level life-and-death sequences that they would be unable to find in their own games. If, however, the bot could recognize what level a move was, then it could suggest moves that are just a few stones stronger and easier for the player to comprehend.
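The "just a few stones stronger" idea could look roughly like this. Again a sketch only: `policy_by_kyu` is a hypothetical stand-in for the output of per-rank networks on one board position (move coordinate mapped to probability), and `suggest_moves`, `stones_stronger`, and `top_n` are illustrative names, not part of any existing bot or API.

```python
from typing import Dict, List


def suggest_moves(
    policy_by_kyu: Dict[int, Dict[str, float]],
    player_kyu: int,
    stones_stronger: int = 3,  # assumed gap between the player and the "teacher" rank
    top_n: int = 3,
) -> List[str]:
    """Return the top moves of the network closest to a few stones above the player."""
    target_kyu = player_kyu - stones_stronger  # smaller kyu number = stronger
    # Use the available rank nearest to the target, since not every rank may have a network.
    nearest_kyu = min(policy_by_kyu, key=lambda kyu: abs(kyu - target_kyu))
    policy = policy_by_kyu[nearest_kyu]
    return sorted(policy, key=policy.get, reverse=True)[:top_n]


# Toy policies for one position from 12 kyu, 9 kyu and 1 kyu networks.
toy_policies = {
    12: {"D4": 0.5, "Q16": 0.3, "C3": 0.2},
    9: {"R14": 0.6, "D4": 0.3, "K10": 0.1},
    1: {"S18": 0.7, "R14": 0.2, "T19": 0.1},
}
suggestions = suggest_moves(toy_policies, player_kyu=12, top_n=2)
```

For a 12 kyu player this consults the 9 kyu policy rather than the 1 kyu one, so the suggestions stay within reach instead of showing sequences the player could not find on their own.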

In order to do this, though, I was hoping to get a large volume of games from lower-ranked players. I have GoGoD and the KGS high-dan dataset, but without a large volume of games from lower kyu players I can’t train networks to recognize the different levels of moves.

Is there some way to retrieve/slurp up the OGS dataset of games? I would ideally like the whole set so that I can accurately suggest moves that are relevant. This will all be open source and made available as I start making progress, of course.


#2

So we do have an API, but I’ll ask you to hold off on crawling our dataset until Version 5 of the site comes out (which shouldn’t be too long now). We have some problems with bulk downloads in v4, but those have been resolved in v5.

It would probably be a good idea for us to prepare some bulk zips and have them available, as this is a popular thing to do, but we don’t currently have that. Once v5 is live and the kinks are worked out, I can revisit that.


#3

Sounds awesome. I will wait until you give a heads-up.

