Ask questions over Sensei's Library ChatGPT-style

GPT doesn’t care about correctness; it predicts what the current character would write next. If it predicts “student:”, it will make mistakes on purpose more often than if it predicts “teacher:”. So to increase the quality of its chess moves, you need to make it believe that it is a grandmaster. If you just ask it something and it fails, that doesn’t mean it is unable to do it. If you add “think step by step”, it can solve math problems that it cannot solve without that phrase.
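For illustration, a minimal sketch of this kind of prompt framing. The helper function and the persona text are my own made-up examples, not anything GPT requires:

```python
def make_prompt(question, persona=None, step_by_step=False):
    """Assemble a prompt that frames the model's role before asking."""
    parts = []
    if persona:
        parts.append(f"You are a {persona}.")
    parts.append(question)
    if step_by_step:
        parts.append("Let's think step by step.")
    return "\n".join(parts)

# Bare question vs. a "grandmaster" framing with a chain-of-thought trigger:
weak = make_prompt("What is the best move for Black?")
strong = make_prompt("What is the best move for Black?",
                     persona="Go grandmaster reviewing a student's game",
                     step_by_step=True)
```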


There are a gazillion records of chess games and go games on the internet, and if they were part of GPT-4’s training dataset, it should be able to reproduce realistic moves, at least for the beginning of the game.

But chess records can easily be part of a conversation on internet forums, since chess players talk in coordinates. Go players are more used to looking at diagrams than to writing and reading sequences of moves in text form. It’s pretty rare to see someone say “Q16 D4 Q3 D16 F17 C14 O17” to describe a go game.

So the question is, did GPT read SGF files as part of its training, or were those dismissed as “not text”?

And if GPT did read SGF files, what is the right prompt to make GPT output realistic go moves? If asked to draw a go diagram in ASCII, it might not make the connection between SGF files and go diagrams, so its knowledge of SGF files will not be used to its full potential.


And that’s actually what I did for the GPT-4 (almost) finished games on 9x9. I gave very specifically designed prompts that asked it to report not only the move but also the complete list of moves, captured stones, a CSS-style game board, etc. For comparison, I also gave it two different sets of manual instructions: one doing a full reflection, the other simply reporting the move. The result, I’d say, was not much of a difference in quality. The reflection may help it avoid repeating an existing stone and delay the point where it starts to hallucinate, but the reflections on the reasoning are a disaster: it messes up even the most basic question of which intersections next to a stone are still free liberties (it would even list non-adjacent intersections as liberties), and ends up giving worse replies where those intersections might already have stones (illegal moves). That costs even more correction prompts to fix the flawed inferences.

And the worst part is that once enough prompts accumulated, it started to forget the starting prompts, disobeying the manual instructions from the beginning prompts (or earlier corrections) and repeating its incorrect inferences again, making the game almost impossible to continue at that point.

Nice experiments 🙂

Why not simply create an agent that “knows” how to invoke a go engine as a tool? When the LLM receives a prompt along the lines of “play a game of Go on a 9x9 board using Chinese rules and a komi of 5.5…”, it invokes a go engine with the input and considers the result in its response.

I actually started playing around with exactly this, this weekend (implementing GitHub - hauensteina/katago-server: Katago REST API as a custom tool for a LangChain agent).

I believe the training data includes webpages about the SGF file format, so it has no trouble “interpreting/reciting” the SGF format, to the degree that it knows the “coordinates” following the metadata in an SGF file are supposed to be moves.

And possibly, at some point, some kinds of SGF game records too, but the move contents likely got phased out during tokenization (after all, the opening combinations in general records are too vast; probably only the first few common opening moves were frequent enough to be preserved).

And although it can interpret the content of an SGF file, it definitely isn’t able to associate it with an ASCII-text go board.

Notice that even when just interpreting the coordinates in text format, after move 7 it started to hallucinate, mapping B[cf] to C5, W[ed] to E4, and B[ce] to C4 (even earlier than with turn-by-turn prompts). And it has absolutely no correlation with the ASCII board.
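The correct mapping is purely mechanical: SGF coordinates are letter pairs counted from the top-left (a = 1), while board notation skips the letter “I” and numbers rows from the bottom. A small sketch of the conversion GPT-4 kept getting wrong:

```python
def sgf_to_board(sgf_coord, size=9):
    """Convert an SGF coordinate pair like 'cf' to board notation like 'C4'.

    SGF columns and rows are letters starting at 'a', counted from the
    top-left corner; board notation skips 'I' and numbers rows from the bottom.
    """
    col = ord(sgf_coord[0]) - ord("a")   # 0-based column
    row = ord(sgf_coord[1]) - ord("a")   # 0-based row from the top
    letters = "ABCDEFGHJKLMNOPQRST"      # note: no 'I'
    return f"{letters[col]}{size - row}"

# On a 9x9 board: 'cf' is C4, 'ed' is E6, 'ce' is C5 -- not the C5 / E4 / C4
# that GPT-4 produced.
```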

Why not simply create an agent that “knows” how to invoke a go engine as a tool? When the LLM receives a prompt along the lines of “play a game of Go on a 9x9 board using Chinese rules and a komi of 5.5…”, it invokes a go engine with the input and considers the result in its response.

Certainly we can, and it won’t be much different from the Twitch Plays Go bot. Once the plug-ins are open for developers to build APIs around, it won’t be a problem.

However, it won’t be GPT-4 playing; it would be GPT-4 reciting moves from an engine over the GTP protocol. Do we really need that? (Like the chatbot playing go on Twitch, it is not popular at all. Even if you can use a chat command to interact with an engine, what is the point? A GUI with a graphical interface does better in any use case. It would be like playing GNU Go with an added redundant interface and lag.)
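For reference, “reciting moves over GTP” amounts to very little code. A minimal sketch that drives any GTP engine over stdin/stdout; `gtp_session` is a made-up helper name, and GNU Go is only assumed as the engine in the example command:

```python
import subprocess

def parse_gtp(reply):
    """Strip the GTP status prefix: '= ' means success, '? ' means failure."""
    line = reply.strip().splitlines()[0]
    if line.startswith("?"):
        raise RuntimeError("GTP error: " + line[1:].strip())
    return line.lstrip("=").strip()

def gtp_session(engine_cmd, commands):
    """Send GTP commands to an engine and collect its answers."""
    proc = subprocess.Popen(engine_cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    answers = []
    for cmd in commands:
        proc.stdin.write(cmd + "\n")
        proc.stdin.flush()
        reply = ""
        while not reply.endswith("\n\n"):    # a GTP reply ends with a blank line
            reply += proc.stdout.read(1)
        answers.append(parse_gtp(reply))
    proc.stdin.write("quit\n")
    proc.stdin.flush()
    proc.wait()
    return answers

# e.g. gtp_session(["gnugo", "--mode", "gtp"],
#                  ["boardsize 9", "komi 5.5", "play black E5", "genmove white"])
```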

The issue with an LLM as an interpreter is never about how to stream content between interfaces (that is purely technical). You can absolutely input moves manually as if you were a game engine (or just copy them from one); the problem is that the LLM has no training on the context of those moves. You can generate as many embeddings and tokens as you like and feed them into a trained LLM, but it will just fill in the gap with nonsense (much like what we experienced with the sensei-GPT: even when it can find the pages, it fills in as much as it likes, since the whole language corpus inside the LLM outweighs all the meaningful combinations).

Without the game engine also outputting meaningful tokens (segments of local shapes, or any indication of the actual content), the LLM wouldn’t have a clue what the moves mean.

If GPT were properly trained on Go, it would be dan level and would be able to talk about positions. But no one except some rich Go player would start such a project.

I can tell you it is very difficult to train a transformer on the correlations between 2D intersections without any explicit design to parse them with convolution layers. The Othello work, despite its small board size, already needed very deep blocks to be able to “expand” a tree-like representation of long correlations. Think of it like expanding a move-history tree and then having to go back to different branches (like Go variations) in order to represent them linearly. We already know the number of go positions (about 10^170) dwarfs the total number of atoms in the observable universe (about 10^80). In principle, with no limits, it could theoretically be done, just as you could theoretically store every variation on a go board if storage were unlimited.

Without seriously modifying the LLM and combining it with other models, it won’t work natively (and at that point it won’t even be an LLM anymore). LLMs are not magic; they have their limits.

Never relevant. KataGo can play very well with 0 playouts. Humans too can play without thinking, on pure intuition.

The design may look like this:

an ASCII position as input,
the coordinate of the next move as output,

and nothing else.

The entire training dataset could be pairs of an ASCII position and the coordinate of Katago’s next move.
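A sketch of what one such training pair could look like. The `board_to_ascii` helper and the X/O/. encoding are my own assumptions, and the target move “G3” is only a placeholder, not an actual Katago output:

```python
def board_to_ascii(stones, size=9):
    """Render a position as an ASCII grid: X = black, O = white, . = empty."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for (col, row), colour in stones.items():
        grid[row][col] = "X" if colour == "black" else "O"
    return "\n".join(" ".join(row) for row in grid)

# One (input, target) training pair; in the real dataset the target move
# would come from Katago.
position = {(4, 4): "black", (2, 6): "white"}
pair = (board_to_ascii(position), "G3")
```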

At least SDK (single-digit-kyu) level with a big enough net is inevitable.

That is not my point; you removed the context of the previous sentence. The “linear extrapolation” takes positions as a sequence of inputs, and a linear model without convolution doesn’t know which coordinates are next to each other. It has to waste capacity finding long-range correlations in deep blocks. Take a one-space jump on a 19x19 board: we see it as a single local shape with no preferred direction, but in a linear input the two stones can be 2 tokens apart (assuming a coordinate parses as one token; if the tokenizer splits it into {A}, {1}, {A1}, you need several times more tokens) or 19+19 tokens apart (roughly twice the board size), and so on. The convolution task that takes a 5x5 filter 25 operations to find a jump would take an LLM at least on the order of the number of board intersections in tokens, and then tens of thousands of token-pair combinations with varied gaps (C(N, 2)) to find. And that is just the “first layer” of one feature. With no guidance, the chance of finding local patterns longer than 10 stones reaches C(N, 10) and above; you can do the math for several blocks deep and see how unrealistic it gets.
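The arithmetic here is easy to check: on a row-major linearized 19x19 board, the same one-space jump lands either 2 or 38 tokens apart depending only on its orientation, and C(361, 2) gives the number of unguided token pairs to sift through:

```python
from math import comb

SIZE = 19

def token_distance(a, b, size=SIZE):
    """Distance between two intersections after row-major linearization."""
    (ax, ay), (bx, by) = a, b
    return abs((ay * size + ax) - (by * size + bx))

# A horizontal one-space jump: the stones are 2 tokens apart...
assert token_distance((3, 3), (5, 3)) == 2
# ...but the same jump rotated 90 degrees is 2 * 19 = 38 tokens apart.
assert token_distance((3, 3), (3, 5)) == 38

# Token-pair combinations with varied gaps, C(N, 2) for N = 361 intersections:
assert comb(SIZE * SIZE, 2) == 64980
```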

It has been tried, and it doesn’t work well. Models can mimic common opening moves but quickly devolve after a few dozen moves, once the “common patterns” are exhausted, and the predictions degenerate into randomness. The number of unique board positions isn’t enough to find long combinations (see my reply above). And SGF files are already part of the training corpus in GPT models. It was never about the input format, but about the inefficient representation of the linear input.

I guess we don’t need it, but in the same vein, why do we need to get GPT-4 to play go when we already have go engines?


It certainly is a good exercise to see how easily we can integrate existing models with each other.

The experiment I did was actually just for testing prompt engineering. My research has a lot to do with how to extrapolate “human subjective perceptions” out of a human knowledge base. An LLM is actually a good “interface” (language itself is more like a protocol), which, given the proper control vectors (tokens, or prompting with the right tokens), can be used to convey real meaning between two minds. Modern game AI engines know how to play the way they were trained, but not how humans play, and they are often useless for conveying their knowledge to human beginners. If we can somehow leverage the LLM protocol as a human interface to a game engine, that will be truly useful: not just showing a move, but the review and human context along with it.

And I have made some progress in modifying open-sourced Go AIs to have value heads and an additional trunk that outputs not just moves, but also the sentiments associated with a move relative to local patterns, trained with supervision (whether it’s a slow or fast move, high or low, big or small, weak or strong, etc.), which are actually associated with in-game positions. With a bridge to a generative model like an LLM, it would start to generate really meaningful output: not just some winrate and score difference, but contextual information to fill in. High-level concepts, like whether a combination of moves is territorial or influence-oriented, are still a big issue though. What we conceive of as a wall or a group isn’t that obvious and not easily defined, and we cannot just adapt existing Go AIs, since a question might involve specific local board positions that are very fuzzy in their wording, without any clear “borders” or “definitions”. (For example, a review might say there is an influence wall facing right in the upper left corner. That is hard to define as global input for a game engine, and harder as output, since there is actually no direction on the input layer, and the engine would have to handle every rotation and mirroring of the board even though they are the same concept: a left-facing influence in the lower right, a right-facing influence in the lower left, etc.)

You might be interested in the data collection methods used in: “On Move Pattern Trends in a Large Go Games Corpus”, Petr Baudiš, Josef Moudrik, 2012

(and the follow-up with the same data: “Evaluating Go Game Records for Prediction of Player Attributes”, Josef Moudrik, Petr Baudiš, Roman Neruda, 2015)

I actually cited them (or their upstream references for the methods) in my thesis, and the major spatial features extracted actually come from another thesis describing Pachi. (I also briefly describe how it is done in my thesis’s appendix.)

Basically, they use a filter-like grid of indexes:

Every subsequent move’s surroundings are marked with the symmetrical grid indexes, so a local pattern can be transformed into a string. Every position in a game is scanned as each move is added, like a convolution layer, and the counts of the string patterns are collected and sorted. This method only helps with local patterns, not high-level whole-board positions, and as the index grid size increases, the computational complexity grows exponentially. The issue is that the unique strings grow longer and longer, with a super long tail, and it is hard to know where the stopping point is. Most local walls and groups exceed a 9x9 area, and which local groups are connected to which others is not easy to detect, since they can be loosely connected.
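A toy sketch of the idea. The fixed canonical ordering below stands in for the symmetrical grid indexes, and I omit canonicalization over the 8 board symmetries for brevity; `pattern_string` is a made-up name, not Pachi’s actual code:

```python
from collections import Counter

def pattern_string(board, move, radius=2):
    """Encode the stones around `move` as a string key, visiting neighbouring
    points in a fixed order of increasing distance from the move."""
    mx, my = move
    offsets = sorted(((dx, dy)
                      for dx in range(-radius, radius + 1)
                      for dy in range(-radius, radius + 1)
                      if (dx, dy) != (0, 0)),
                     key=lambda d: (abs(d[0]) + abs(d[1]), d))
    return "".join(board.get((mx + dx, my + dy), ".") for dx, dy in offsets)

# Counting pattern strings over a corpus of (board, move) pairs would then be:
# counts = Counter(pattern_string(board, move) for board, move in corpus)
board = {(5, 4): "X"}               # one black stone next to the move
key = pattern_string(board, (4, 4))
```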

There is actually work on using grids and graphs to decompose subgraphs, with a better chance of finding the “long features”. I am currently working on a mathematical model based on high-dimensional lattice walks for this purpose (I might create a post later to explain it).


Yes, but this was in 2012.

A lot of things have happened since then.

Has anyone ever just applied a convolutional neural network to a go board (even a pre-trained network such as KataGo’s) and trained it to classify moves as territorial, influence, etc., using this data?
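As a shape check, the forward pass of such a classifier is tiny. The sketch below uses random untrained weights and made-up input planes (black stones, white stones, candidate move) purely to show the plumbing, with two output classes standing in for “territorial” vs “influence”:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    """Valid 2D convolution of a (C, H, W) input with (F, C, k, k) filters + ReLU."""
    F, C, k, _ = w.shape
    H, W = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((F, H, W))
    for f in range(F):
        for i in range(H):
            for j in range(W):
                out[f, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[f])
    return np.maximum(out, 0)

# Hypothetical input planes: black stones, white stones, candidate move.
x = np.zeros((3, 19, 19))
x[0, 3, 3] = 1.0          # a black stone
x[2, 3, 4] = 1.0          # the move to classify
h = conv2d(x, rng.normal(0, 0.1, (8, 3, 3, 3)))        # (8, 17, 17) features
logits = rng.normal(0, 0.1, (2, h.size)) @ h.ravel()   # territorial vs influence
probs = np.exp(logits) / np.exp(logits).sum()
```

A real version would of course stack more layers and take its labels from annotated reviews like the corpus data above; this only shows that the mapping from board planes to a two-class move label is a few lines of plumbing.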

