OGS Pattern Search Database

Somebody already did a mass

…and Adam beat me to it :slight_smile: kudos

3 Likes

The question is who is going to build it?

I would also like to explore what the common josekis and openings on OGS are. I always have the feeling that different servers have different cultures, and players just favor some over the others.

1 Like

It could be something done entirely off of OGS. Waltheri has already built up the technology to do so (is the code behind that site open source?) and the SGF dump is publicly available. It’s only a matter of putting the two together. However, scaling the processing up to the size of the OGS game collection may be far from trivial.

2 Likes

Yeah, Waltheri has just under 80,000 games, so that’s WELL over an order of magnitude fewer than a potential OGS equivalent, even if we only include the 19x19 games longer than 20 moves.

1 Like

What if you limited the search to the first 100,000 results to reduce overhead?

I just want to pattern search my own games (a few hundred), which is why I bumped this 7-year-old feature request to bulk download one’s own games.

this might be something achievable…

I’m not sure why (by implication) the original request has to be deemed “unachievable” just because it is more ambitious than Waltheri’s.

The fact that it would likely call for more server resources or more “whatever” doesn’t rule it out: AI analysis of our games was a big project that called for more servers, and it was added because it was a good feature.

The real question then is “Why is this a compelling feature to have? And how compelling is it?”

When the latter question was asked about rengo, we found the answer was “surprisingly, this feature is compelling enough for someone to make it happen”. Similarly, an OGS Joseki Explorer was a big project calling for (a small increment) more compute, and it happened.

So we should have our minds open to this possibility…

8 Likes

If you can store 40M games (with metadata!), you can store a game tree with <40M leaf nodes.

Sure it’s resources, but it shouldn’t be any more intense than the stuff OGS is already doing.

Would take a ~~boatload~~ healthy dose of pre-processing though!

2 Likes

anoek said something about the games not just sitting around in storage to be called at any time, but that they are generated on request each time.

I don’t exactly understand what that means or how it works… But I suppose it has something to do with saving space :man_shrugging: might be a speed bump for getting something like this off the ground, especially if you want it to periodically keep up with modern games and not simply stay static at a certain point.

1 Like

The pattern search index files Kombilo generates are orders of magnitude larger than the input SGFs. To provide efficient pattern search you need to create far larger data structures than just the game tree itself. It also takes bloody ages to generate. So the resource demands of this feature seem pretty huge. I’d replace boatload with super tanker. Or maybe super-star-destroyer-ful.

Maybe OGS should invent a cryptocurrency and doing this processing can be the proof of work computational task :slight_smile: Might as well warm the planet doing something slightly useful.

6 Likes

Who is Kombilo?

Maybe I don’t understand what you’re saying, but I don’t think this is true. An efficient structure to hold this in would just be a tree, with each node holding a coordinate and # of games that hit that branch. It’s big (40M x avg number of moves), but it’s not bigger than the SGFs themselves.

Implementation would require thought, but it’s not a computational absurdity by any stretch.
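To make the tree idea concrete, here is a minimal sketch in Python. The names (`Node`, `add_game`, `continuations`) are made up for illustration, and moves are assumed to be SGF-style coordinate strings; this is not how OGS or Waltheri actually store anything. Note that it indexes move sequences, so two games reaching the same position by a different move order land in different branches.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int = 0                                 # games that passed through this node
    children: dict = field(default_factory=dict)   # move string -> Node


def add_game(root, moves):
    """Walk the tree along one game's moves, bumping the counter at each step."""
    node = root
    node.count += 1
    for mv in moves:
        node = node.children.setdefault(mv, Node())
        node.count += 1


def continuations(root, prefix):
    """Return {next_move: number_of_games} after the given opening sequence."""
    node = root
    for mv in prefix:
        node = node.children.get(mv)
        if node is None:
            return {}
    return {mv: child.count for mv, child in node.children.items()}


# Three toy games (hypothetical data, SGF-style coordinates).
root = Node()
add_game(root, ["pd", "dp", "pp", "dd"])
add_game(root, ["pd", "dp", "pq"])
add_game(root, ["pd", "dd"])

print(continuations(root, ["pd"]))
```

Shared openings collapse into shared nodes, which is why the tree should not grow past the total number of moves in the collection.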

Surely accounting for mirrors and rotation would add some amount of increased file size? It surely couldn’t just be a move tree…
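One way to handle mirrors and rotations without storing eight copies is to normalize every game to a canonical orientation before it goes into the tree. A rough sketch, assuming moves are 0-indexed `(x, y)` pairs on a 19x19 board; the `canonical` helper is hypothetical, and color swapping is not handled here:

```python
SIZE = 19
N = SIZE - 1

# The 8 symmetries of the square board: 4 rotations and 4 reflections.
TRANSFORMS = [
    lambda x, y: (x, y),          # identity
    lambda x, y: (N - x, y),      # mirror left-right
    lambda x, y: (x, N - y),      # mirror top-bottom
    lambda x, y: (N - x, N - y),  # rotate 180
    lambda x, y: (y, x),          # transpose (main diagonal)
    lambda x, y: (N - y, x),      # rotate 90
    lambda x, y: (y, N - x),      # rotate 270
    lambda x, y: (N - y, N - x),  # anti-diagonal mirror
]


def canonical(moves):
    """Pick the lexicographically smallest of the 8 equivalent move sequences."""
    return min(tuple(t(x, y) for x, y in moves) for t in TRANSFORMS)


# The same opening played in two different orientations.
a = canonical([(15, 3), (3, 15)])
b = canonical([(3, 3), (15, 15)])
print(a == b)
```

With this approach the symmetric variants share one branch, so under these assumptions accounting for symmetry would shrink the tree rather than grow it. The cost moves to query time, where the search pattern has to be normalized the same way.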

A go database and pattern search program: u-go.net

5 Likes

I didn’t know waltheri had competition, thanks for the link :slightly_smiling_face:

Hah, waltheri is the new kid on the block, and sure is convenient for online searching of pro games away from my home PC. But it’s a smaller database than GoGoD and doesn’t have MY games. Kombilo (or crazy man Frank de Groot’s Moyo Go Studio) is how you do offline pattern search, and against your own sgf library. Which is why I want bulk download of my OGS games because then when I am reminded “Oh, this position is like that OGS game I played 12 years ago” I could whack it in kombilo and find the game. Back when I was teaching Go I would often use my OGS games as illustrative examples of concepts for students.

3 Likes

Also, about the data structures, from Making a hash of it • Life In 19x19 we have

As a reference point, on my machine Kombilo seems to use about 2G on a 110,000-game database that occupies about 15MB

So 40M / 110k * 2 GB ≈ 727 GB

Beefy
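For anyone who wants to plug in other numbers, the same back-of-the-envelope scaling as a snippet. It assumes index size grows linearly with the number of games, which is only a guess; the function name is made up.

```python
def scaled_index_size_gb(target_games, ref_games, ref_index_gb):
    """Linearly scale a measured index size to a larger game collection."""
    return target_games / ref_games * ref_index_gb


# Reference point from the quote: about 2 GB for a 110,000-game database.
estimate = scaled_index_size_gb(40_000_000, 110_000, 2.0)
print(f"{estimate:.0f} GB")  # prints "727 GB"
```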

Also links to Search Algorithms at Sensei's Library

2 Likes

That’s really cool! Thanks for the tip, somehow I never heard about it until now (though I am also a relatively new kid on the block, only learning Go existed and joining OGS in May 2017)

But Kombilo (a magnificent beast) allows you to make searches of rectangles of any size within a cache built (as I infer) from 19x19 hashing.

Ohh it’s saving everything. Now I see what you’re saying.

When I read OP, I envisioned an index of whole board positions (which I think can be optimized pretty well). Pattern matching local positions is indeed a problem that shouldn’t be solved!
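For the whole-board case, one common technique is Zobrist hashing: give every (point, color) pair a random 64-bit number and XOR together the numbers for the stones on the board. A minimal sketch of such an index, assuming positions are already available as collections of `(x, y, color)` stones (replaying SGFs and handling captures is left out); `index_game` and `find_games` are hypothetical names, and nothing here describes how Kombilo or Waltheri actually work.

```python
import random
from collections import defaultdict

SIZE = 19
rng = random.Random(0)  # fixed seed so hashes are stable between runs

# One random 64-bit key per (x, y, color).
ZOBRIST = {
    (x, y, c): rng.getrandbits(64)
    for x in range(SIZE)
    for y in range(SIZE)
    for c in "BW"
}


def position_hash(stones):
    """XOR the keys of all stones; the result does not depend on move order."""
    h = 0
    for stone in stones:
        h ^= ZOBRIST[stone]
    return h


index = defaultdict(set)  # position hash -> ids of games containing that position


def index_game(game_id, positions):
    """Record every whole-board position that occurs in one game."""
    for stones in positions:
        index[position_hash(stones)].add(game_id)


def find_games(stones):
    """Look up the games in which this exact whole-board position occurred."""
    return index.get(position_hash(stones), set())


# Two toy games that reach the same position by different move orders.
index_game(1, [[(3, 3, "B")],
               [(3, 3, "B"), (15, 15, "W")],
               [(3, 3, "B"), (15, 15, "W"), (15, 3, "B")]])
index_game(2, [[(15, 3, "B")],
               [(15, 3, "B"), (15, 15, "W")],
               [(15, 3, "B"), (15, 15, "W"), (3, 3, "B")]])

print(find_games([(15, 3, "B"), (3, 3, "B"), (15, 15, "W")]))
```

Unlike a move tree, this catches transpositions, and each position costs one 64-bit key plus a game id. It only answers whole-board queries, though, so local rectangle search would still need something heavier.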

1 Like

Fwiw, waltheri also does nxn rectangle local pattern matching :stuck_out_tongue: