We used Tromp-Taylor rules, modified to remove opponent stones from within groups that can be shown to be unconditionally alive via Benson’s algorithm.
Scoring then proceeds with regular Tromp-Taylor rules.
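For anyone unfamiliar with the Benson's algorithm referenced in the quoted rules: a chain is unconditionally alive ("pass-alive") iff it has at least two vital regions, where a region is vital to a chain if every empty point of the region is a liberty of that chain. Below is a minimal illustrative sketch in Python, using my own board encoding (a dict of `(row, col) -> 'b'/'w'/'.'`); this is not KataGo's actual implementation.

```python
def neighbors(p, size):
    """Orthogonal neighbors of a point, clipped to the board."""
    r, c = p
    return [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < size and 0 <= c + dc < size]

def connected_components(points, size):
    """Maximal 4-connected components of a set of points."""
    points = set(points)
    comps = []
    while points:
        seed = points.pop()
        comp, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            for q in neighbors(p, size):
                if q in points:
                    points.remove(q)
                    comp.add(q)
                    frontier.append(q)
        comps.append(frozenset(comp))
    return comps

def pass_alive_chains(board, size, color):
    """Benson's algorithm: iteratively discard chains with fewer than two
    vital regions, and regions touching a discarded chain, until a fixpoint.
    Returns the set of `color` chains proven unconditionally alive."""
    stones = {p for p, v in board.items() if v == color}
    others = {p for p in board if p not in stones}
    chains = set(connected_components(stones, size))
    regions = set(connected_components(others, size))
    while True:
        # A region is vital to a chain if all its empty points are liberties
        # of that chain.
        vital = {x: 0 for x in chains}
        for r in regions:
            empties = {p for p in r if board[p] == '.'}
            for x in chains:
                libs = {p for p in empties
                        if any(q in x for q in neighbors(p, size))}
                if empties and libs == empties:
                    vital[x] += 1
        dead = {x for x in chains if vital[x] < 2}
        if not dead:
            return chains
        chains -= dead
        removed_stones = set().union(*dead)
        regions = {r for r in regions
                   if not any(q in removed_stones
                              for p in r for q in neighbors(p, size))}

# Example: a black group on a 5x5 board with two one-point eyes.
SIZE = 5
board = {(r, c): '.' for r in range(SIZE) for c in range(SIZE)}
for r in range(3):
    for c in range(SIZE):
        board[(r, c)] = 'b'
board[(1, 1)] = board[(1, 3)] = '.'
```

With two eyes the group is proven pass-alive; remove one eye and the algorithm (correctly) proves nothing, since pass-alive is a stronger condition than ordinary life.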
This seems like the essence of the problem. How is it valid at all to change the rules after passing?
White wins under the “modified” rules it was supposedly playing with.
Edit: I think I was confused about how these modified rules work. Still seems strange to try to apply these rules in a situation with no unconditionally alive groups.
I think they should have the adversarial net play against KataGo with equal hardware time instead of an arbitrarily low fixed visit count. Right now they are giving the adversarial net 600 playouts while KataGo only gets a single one. KataGo's policy isn't even how KataGo selects moves in the first place; that's normally the job of the search and value head. I don't care how strong the policy is supposed to be by itself (they claim top-1000 professional level); equal time on equal hardware is the only fair way to compare two different nets.
It’s frankly surprising that it got as much attention as it did, considering it is mind-numbingly boring from both the go and academic perspective.
I mean, it has a great combination of topics – the game of Go, fancy deep learning techniques (adversarial networks), authors from MIT. If you don't look beneath the surface, it looks interesting.
It is also unclear which stones they remove, as “within” is not defined in the rules. If it is defined with respect to the edges of the board, there can be a pass-alive group within another pass-alive group, but it would be absurd to remove the contained group.
(Actually when I first came across Go in an encyclopedia I thought that was how it worked and played several games on that basis with fellow-pupils. To be alive was to be connected to the edge of the board)
A new version of the article in question is out!
Link: Adversarial Policies in Go - Game Viewer
First updated version:
We hardcode a defense for KataGo by making the victim not pass until it has no more legal moves outside its territory. With more training, we are able to find another attack against the victim, achieving a win rate of 99.8%. The adversary gets the victim to form a circular structure and then kills it.
Second updated version:
With 2048 visits, KataGo's latest network plays at a superhuman level, but our adversary still achieves a 72.6% win rate.
Looks pretty damn weird, huh.
Someone, analyze these new games with OGS Kata.
Can you help upload them? I can run the level IV analysis, but I'm on my phone right now, so it's annoying to manage the SGF files.
Works on my home Kata. I think white is winning and can play wherever, even though once black retakes the ko white is dead. Do I see that correctly?
2048 visits Kata vs Adversary, 6 games where Kata lost
Thanks for uploading the games. Analysis started and most are still churning through, but it looks like our analysis engine is also similarly affected by these attacks. The score estimation graphs have some stunning jumps, when the engine finally realizes (much too late) that the game is not what it seems. Or maybe it is just showing a stunning blunder on the part of the victim in some cases. Need to check the analysis more carefully to understand.
This does look like a satisfying adversarial attack. Anyone wanna try it against katago themselves? I wonder if the arrangement is very specific or a human can learn it.
OGS uses a different network, so obviously it's not gonna match perfectly. But you can find the positions where it's mistaken too.
Like here it appears to say the best move is bottom-right but once black retakes ko white can’t do anything.
Seems like this oversized ko on the inside is the thing that does it, huh.
This game (#4 in the series above) is quite weird:
The adversary manages to create two chances to capture black’s dragon. On the first chance, the adversary blunders by playing elsewhere, but then a few moves later gets another chance to capture, which it finally takes.
On the second chance, here’s the OGS Kata analysis of the position, which does not seem to make much sense:
The AI analysis does recognize that the played move Q6 is a massive blunder, since it does not resolve the massive ko situation in the top-left to save the dragon. However, it thinks that R8 is a reasonable move, even though (I think) that should also be considered a massive blunder for letting the dragon die in ko. I guess this disparity might be explained by the analysis engine not focusing enough playouts on variations rather than the path of the game? It does not even seem to suggest any reasonable moves, like A14, to save the dragon.
Would be interesting to see what would now happen if they froze the adversary and allowed KataGo to learn how to counter the strategy. Seems like the strategy still revolves around giving KataGo a massive lead, making its value head almost useless since all moves have the exact same winrate (100%). When that happens, KataGo is unable to differentiate between moves, and it considers a ton of candidates at once, each only getting a couple dozen playouts at best. Wonder if adjusting komi internally after every move to make the game seem even would help.
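The "all candidates look identical" effect can be seen in a toy simulation of AlphaZero-style PUCT move selection. The numbers below are made up for illustration (they are not KataGo's actual parameters or exploration constant): when the value estimates separate the moves, search piles visits onto the best one; when every move's value saturates at the same winrate, the Q term cancels out and visits are allocated by the policy prior alone, spread thinly across all candidates.

```python
import math

def run_puct(priors, q_values, n_playouts, c_puct=1.5):
    """Greedily distribute playouts by the PUCT rule:
    score = Q + c_puct * P * sqrt(N_total) / (1 + N_child)."""
    visits = [0] * len(priors)
    for _ in range(n_playouts):
        total = sum(visits)
        scores = [q_values[i]
                  + c_puct * priors[i] * math.sqrt(total + 1) / (1 + visits[i])
                  for i in range(len(priors))]
        visits[scores.index(max(scores))] += 1
    return visits

# Ten candidate moves with mildly peaked policy priors.
priors = [0.2, 0.15, 0.12, 0.1, 0.1, 0.09, 0.08, 0.06, 0.05, 0.05]

# Normal position: the value head separates the moves.
separated = run_puct(
    priors, [0.55, 0.50, 0.48, 0.45, 0.45, 0.44, 0.43, 0.42, 0.41, 0.40], 600)

# Saturated position: every move reads as a 100% win.
saturated = run_puct(priors, [1.0] * 10, 600)
```

In the separated case the top move ends up with the bulk of the 600 playouts; in the saturated case no move gets much more than its prior's share, which is the thinning-out effect described above.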
Very nice, it seems the adversary is always building multiple locally dead groups inside a huge territory and then KataGo neglects the capturing race. I wonder if it would pass if it were allowed to (but I suspect it wouldn’t).
And it’s a nice example of how science works. Their previous work was flawed in some ways (few visits, ruleset discussions) and they took it to their heart and came up with something better (as far as I can tell).
Yep, this is much more interesting!
The issue is not one of capturing races at all, or ko, or anything else like that. It’s specifically cyclic-topology groups.
It’s a known issue that is common to all AlphaZero-trained bots, as far as I’m aware - I’m pretty sure convnets consistently learn the incorrect algorithm for determining whether a group is alive, one that only works on tree-topology groups and not groups that have cycles. As of a week or two ago, I’ve already been talking with the paper authors about it.
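To make "cyclic topology" concrete: treat each stone of a chain as a graph node and each orthogonally adjacent pair of stones as an edge. A connected graph is a tree iff it has exactly nodes − 1 edges, so a chain whose adjacency graph has at least as many edges as nodes contains a cycle. A quick sketch (board encoding is my own, not anything from the paper):

```python
def chain_has_cycle(stones):
    """stones: set of (row, col) coordinates of one connected chain.
    Count each adjacency edge once by only looking down and right."""
    edges = sum(1 for (r, c) in stones
                for q in ((r + 1, c), (r, c + 1))
                if q in stones)
    # Connected graph with edges >= nodes must contain a cycle.
    return edges >= len(stones)

# A straight line of three stones: tree topology, no cycle.
line = {(0, 0), (0, 1), (0, 2)}

# A ring of eight stones surrounding (1, 1): cyclic topology.
ring = {(0, 0), (0, 1), (0, 2),
        (1, 0),         (1, 2),
        (2, 0), (2, 1), (2, 2)}
```

The ring is the shape class at issue: a single chain that encircles a point (or an opponent group), which tree-shaped chains never do.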
See here for the earliest prior discovery of this systematic weakness I’m aware of, it has also been reported in other places several times since then, and I’ve tested enough AZ reproductions back when it was originally discovered to think that it probably applies to all AZ agents that use convolution-based nets.
Anyways, although the weakness is not new, this is now the first time we have a way to systematically generate examples, so I think the result is very interesting and cool. And we now have a way to generate enough natural examples to try to see whether a net can be trained to understand cyclic-topology groups, which we couldn’t do before. I’ll be trying that at some point in the next months.
KataGo also estimates the score advantage, giving that additional value signal to base moves on.
Even if a bot were only looking at the relative winrates of moves, it should still avoid the massive blunders that this adversary has elicited.
The score head is not very reliable with extreme score differences either. Not to mention that the search is mostly guided by the value head anyway, causing KataGo to consider dozens of moves on the board at once, which means the visits are spread across all of those moves, weakening its play.
One open question is - are these weaknesses really like the weaknesses you get in image classification neural networks, where there isn't really any good way to "defend" against the attacks, and even training on adversarial examples only protects you against those exact noise patterns, leaving you just as vulnerable to slightly different but qualitatively identical noise patterns?
Or are these actually learnable? The fact that every attack so far has fallen into a clean conceptual category (e.g. passing, low-liberty policy priors, cyclic groups) suggests that, unlike image classification, there's actually an end to it - that you may actually be able to learn each thing one by one until there isn't anything left (anything obvious, that is; obviously you will always be missing things until you literally reach optimal play).
This relies on the idea that search should be able to handle things once the raw net has at least a partial handle on the relevant concept. The raw net doesn't have to be literally correct everywhere; it just has to learn enough to have the generally right idea, and then search will fill in the gaps and cracks.