People seem to be employing the strategy on Foxy.
I wonder what the next AI championship is gonna look like. Would anyone submit an adversarial AI just to mess with opponents?
Gonna be a mess in ratings soon
Probably a good idea. In those championships there have always been bots that played mirror go, for instance. They never win the championship, but they rarely rank last. It’s a good baseline to have. Like a guard dog. If you can’t win against this basic strategy, then you can’t win the championship.
An adversarial AI that focuses on trying to create a special semeai to trick other AIs would probably lose most of its games, since it’s a bad strategy in general, but it might win against other AIs that have this “blindspot”. So it’s also a good guard dog to have in the championship.
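For anyone who hasn’t seen one, the mirror go baseline mentioned above is tiny. Here is a minimal sketch, assuming 0-indexed (x, y) coordinates; the function name `mirror_move` and the choice to return `None` when mirroring is impossible are my own, not taken from any actual championship bot:

```python
from typing import Optional, Set, Tuple

Point = Tuple[int, int]


def mirror_move(opponent_move: Optional[Point],
                occupied: Set[Point],
                size: int = 19) -> Optional[Point]:
    """Reply on the point symmetric to the opponent's move through the center.

    Returns None when mirroring is not possible (opponent passed, opponent
    played the center point, or the mirrored point is already occupied).
    A real bot would need its own fallback for those cases.
    """
    if opponent_move is None:
        return None
    x, y = opponent_move
    reply = (size - 1 - x, size - 1 - y)
    if reply == opponent_move or reply in occupied:
        return None
    return reply
```

A real bot would also have to check legality (ko, suicide), which this sketch leaves out.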
It’s a weakness intrinsic to the architecture and the style of training, but one that can also be fixed by training on the affected positions. KataGo has already made some good progress: positions that previously needed millions of visits to correct themselves, if they corrected at all, now need only around 5k playouts (the KataGo used in the article hasn’t been updated to the latest version yet). It’s still somewhat blind in positions where the victim still has a large number of liberties remaining, but that should also improve with training. So by the next championship it shouldn’t be all that viable a strategy IMO.
I wonder if it’s possible to always force a draw under Japanese rules by creating kos everywhere.
Try it! If you discover something new like that and it works, it’s always a useful discovery.
Leela Zero was created long ago, but scientists still research it like a brain : )
I made a set of basic tactical test positions for cyclic groups for those who are interested in trying KataGo (or other engines!) on cyclic positions.
Download here to try these positions yourself!
cycletestsgfs.zip
Example position:
Another example:
Evolution of KataGo policy on one subset of these tests over the latest training:
Harder to learn this one:
All of these winrates should be 90+% (they’re all for cycle groups that should live!) but are still all over the place. Value is harder to learn than policy, also you see the big dip in a lot of those lines in the middle as the net “over-learned” that cycle-groups should die when in fact many of them shouldn’t, maybe further training will continue reverting that over-learning:
So I am no expert, but I’m still trying to keep up with these AIs.
We found some inconsistencies and we are looking for ways to fix them. But the fixes look more like patches applied case by case than a rethinking of the global process. Am I right?
That puts quite a new perspective on the abilities of AIs and the trust we put in their results.
Yep.
Well, the issue actually is that AIs are learning Go by themselves or by studying human games.
Some situations, such as these cyclic positions, are completely absent from human games (no human player of a decent level would let the opponent build such a situation) and didn’t arise from the AIs’ random play either (since they need purpose, not randomness).
So, yes: that’s a blind spot. But a very difficult one to find!
Now the solution is quite simple: if we put these games into the training data, the next neural networks will be able to deal with them.
You can see it as putting patches, but it also could be seen as discovering new unknown territories in the legal positions space.
This sounds so human. When I practice tsumego, there are very often situations where I think “I don’t know if I can kill the White group, but if I can, then I’m sure the first move must be this.”
Do you have some insight on why KataGo had a weakness on those positions? So far I’ve heard two possible explanations, do you know if one of them is likely the cause?
Also, does KataGo exhibit a weakness only when it has a cycle group, or also when the opponent has a cycle group?
The LessWrong article linked earlier, if you haven’t checked it yet?
It’s definitely not this one because whether or not the opponent has a living group or a group with many liberties inside has nothing to do with whether the group is cyclic, because the opponent having a strong group inside doesn’t require there to be a cycle.
Positions like A, B, C (and variants of A,B,C where the groups inside might also share inside liberties with the surrounding group instead of entirely filling the space inside) would all fit your description of having an “eye” filled from inside, but only A has a problem, neither B nor C ever have any issue.
So it really has to be something specifically to do with the fact that there’s a cycle, not merely surrounding things and having something on the inside because you can surround things and have insides without having a cycle. That’s also why the issue is so rare in normal games - you surround things all the time in real games where those things then do in fact live and make two eyes, or they don’t and you race against them, and they may or may not become your eye. But it’s not the surrounding that matters or that they may or may not be your eye that matters. It’s having an unbroken cycle around that thing, and that’s very rare.
Also, does KataGo exhibit a weakness only when it has a cycle group, or also when the opponent has a cycle group?
I bet you could already deduce the likely answer to this! But depending on how you’re thinking about it, maybe you need to think about it differently than right now to do so. In particular, I recommend you stop thinking of it as a weakness! A “weakness” conjures ideas of situationalness and arbitrariness, which gets in the way of the ability to reason about it. Instead you should think of it as a consistent misevaluation. This conception is better because in general if a person or agent is misevaluating something, once you know what that misevaluation is, you can easily guess the likely behavior that will flow from it in any situation (namely, that they will act in the ways they would if that belief were true).
The misevaluation is to evaluate the cyclic group as more “alive” than it should be (consistent with the hypothesis of overcounting liberties, or if the group has an eye in addition to the cycle, overcounting that eye multiple times the same way it would overcount liberties). So you can expect that no matter who has the cyclic group, the likely behavior will be to act in all the ways that would make sense if the group were strong or alive, even if it isn’t actually so. From there, you can start thinking about cases where the opponent has the cycle and what kinds of misplays can happen, every bit as well as I can.
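As a reference point for what “overcounting” is measured against, here is a minimal sketch of the true liberty count of a group: the number of distinct empty points adjacent to any of its stones. The names (`group_and_liberties`, `neighbors4`) and the set-based board representation are my own assumptions for illustration. This is not a claim about what KataGo’s net computes internally; it only shows the correct answer, and that a traversal needs to remember visited stones so that going around a cycle doesn’t count anything twice.

```python
from collections import deque
from typing import Iterator, Set, Tuple

Point = Tuple[int, int]


def neighbors4(p: Point, size: int) -> Iterator[Point]:
    """Orthogonally adjacent points that are on the board."""
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield (nx, ny)


def group_and_liberties(start: Point,
                        own: Set[Point],
                        enemy: Set[Point],
                        size: int) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing `start` and collect its liberties.

    `own` holds all stones of the group's color, `enemy` all opposing stones.
    Both the group and the liberties are sets, so each stone and each empty
    point is counted once, even when the group loops back on itself.
    """
    group = {start}
    liberties: Set[Point] = set()
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for q in neighbors4(p, size):
            if q in own:
                if q not in group:  # visited check: stops laps around a cycle
                    group.add(q)
                    queue.append(q)
            elif q not in enemy:
                liberties.add(q)
    return group, liberties
```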
So I am no expert, but I’m still trying to keep up with these AIs.
We found some inconsistencies and we are looking for ways to fix them. But the fixes look more like patches applied case by case than a rethinking of the global process. Am I right?
That puts quite a new perspective on the abilities of AIs and the trust we put in their results.
Here is some food for thought. The below isn’t arguing in a coherent direction (some of the ideas are opposed and argue in opposite directions!), it’s just an offering of a variety of different directions from which one might approach this, and what you might compare it to.
In a lot of other applications of machine learning, thinking that ML systems are brittle has always been the standard expectation. Look no further than the recent news about our chatbot overlords and how they can be amazingly capable in some cases, and utterly fail at other tasks or misbehave. Or see the years-old news about the brittleness of image classification neural nets, e.g. “Breaking neural networks with adversarial attacks” by Anant Jain (Towards Data Science).
Suppose a human amateur were to practice a bunch of tsumego on a concept that they had been bad at before (perhaps the J group https://senseis.xmp.net/?JGroup or big eyes in capturing races Eye versus Eye Capturing Race at Sensei's Library or something else). Then in the next several hundred games the concept never occurred in those games to be practiced (KataGo is mixing in these cycle training games at a rate measured in mere tenths of a percent). Then suppose the next game it does occur, but after such a long time the human amateur misses the concept or forgets how to apply it, because it doesn’t exactly look the way it did in the tsumego. But then you show them the exact tsumego again and they remember and solve that one just fine. Would that be an example that humans can also be unable to do “global rethinkings”, and instead have to gradually train patches to eventually become consistent to recognize all the cases?
On neural net learning - when people play a game, they learn from it by thinking about the game far beyond just the moves in the game and the fact of the literal result. They think about alternative variations, key points where mistakes happened and what the corrections should be, and also try to retroactively work out why different factors in the position led to that result. When neural nets learn from a game via AlphaZero’s algorithm, they don’t learn by thinking “about” the game. The perception of the moves that happened and the raw perception of the result they correspond to are the entirety of the “thinking” involved. Does that affect how one should think about this question?
On how bots work at runtime - when querying a bot to analyze any position, if you’re not exactly copying some preexisting game or opening - then every split second of every moment, it is always perpetually like they had never seen that position before. This is a lot sharper than you might think, because “never seen that position before” also applies to them not understanding anything about their own ongoing analysis even a split second prior, within the very same ongoing query you’re asking right that moment! Like, imagine you asked a human pro to analyze a position for you by playing out some example variations on a board and giving their opinions on each result and what they would recommend is good or bad, but every 0.5 seconds the pro’s memory of any variations or prior opinions was erased and whatever moves they were trying out and giving an opinion on would always be their fresh I’ve barely-even-had-time-to-process-the-board opinion. Clearly you would be massively leaning on the pro’s split-second instinct to be accurate (and, I’ve seen pro teachers giving live game reviews, e.g. at US Go congress not always having the right instantaneous instinct in a fight and having to work it out a little, even when reviewing weak-amateur-dan games). Is “global rethinking” even possible in such a way of thinking?
What if we were to call the entire selfplay training process itself to be the ongoing global rethinking process? What exactly is global rethinking, if it’s not a gradual tracing of all the implications of one’s prior mistaken belief, as those implications get more and more distant and case-specific? (Supposing that the rethinking itself is one of the rethinkings that takes time, rather than being instant).
Depending on what you mean by “case by case”, perhaps there are in fact many cases? Like, even which local shapes count as an eye or not changes when you have a cycle group (two-headed dragon life), so surely a singular insight wouldn’t be enough without much more relearning of what all the different shapes mean and how they affect the fight? There are also some interesting sekis that can happen depending on how big an eye or “false” eye different groups have and which liberties are shared or not.
Depending on what you mean by “case by case”, perhaps there are in fact not that many cases? For example, if you look at the graphs of the evaluations I posted above, lots of the lines are all moving together in relatively similar ways, even if their vertical positions are offset a bit from one another. Does that reflect that there is actually some amount of globalish rethinking of a few different concepts?
I see someone has succeeded in beating bots at AI-Sensei. But… has anyone implemented it to beat bots at OGS?
fixed!
Wow, speedy!
What I find interesting about KataGo misevaluating a group with a cycle is that a group with a single eye can easily have a cycle in a graph-theoretical sense, and come to that so can a dango, but neither of these cases has been noticed to cause a problem. It seems to be only when the cycle separates the graph and both regions contain significant numbers of enemy stones (or are fairly big?) that KataGo does not notice.
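The distinction drawn above (a cycle that merely exists versus one that cuts the board in two with enemy stones on both sides) can be made concrete. Below is a rough sketch of one way to test it; the function names, the set-based board representation, and the `min_stones` threshold are all my own assumptions, and “significant” is left as a parameter because nobody has pinned it down. The trick is to flood-fill the points outside the group using 8-neighbor connectivity: a region can only end up sealed off from the board edge that way if an orthogonally connected loop of the group’s stones runs all the way around it, so a dango or a one-point eye with an open diagonal corner doesn’t qualify.

```python
from collections import deque
from typing import Iterator, List, Set, Tuple

Point = Tuple[int, int]


def neighbors8(p: Point, size: int) -> Iterator[Point]:
    """Orthogonal and diagonal neighbors that are on the board."""
    x, y = p
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if (dx, dy) != (0, 0) and 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny)


def on_edge(p: Point, size: int) -> bool:
    return p[0] in (0, size - 1) or p[1] in (0, size - 1)


def enclosed_regions(group: Set[Point], size: int) -> List[Set[Point]]:
    """Regions of non-group points fully ringed by the group's stones."""
    seen = set(group)
    enclosed = []
    for x in range(size):
        for y in range(size):
            if (x, y) in seen:
                continue
            region = {(x, y)}
            seen.add((x, y))
            queue = deque([(x, y)])
            while queue:
                p = queue.popleft()
                for q in neighbors8(p, size):
                    if q not in seen:
                        seen.add(q)
                        region.add(q)
                        queue.append(q)
            # A region that reaches the board edge is not ringed by a cycle.
            if not any(on_edge(p, size) for p in region):
                enclosed.append(region)
    return enclosed


def cycle_separates_enemy(group: Set[Point], enemy: Set[Point],
                          size: int, min_stones: int = 2) -> bool:
    """True if the group rings a region and enemy stones sit on both sides."""
    inside: Set[Point] = set().union(*enclosed_regions(group, size))
    enemy_inside = enemy & inside
    enemy_outside = enemy - inside
    return len(enemy_inside) >= min_stones and len(enemy_outside) >= min_stones
```

On a 2x2 dango or an ordinary one-point eye with an open diagonal corner, `enclosed_regions` returns nothing; on a full ring of stones around a 3x3 interior it returns that interior. Whether this matches exactly the set of positions KataGo misjudges is untested.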