Potential rank inflation on OGS or How To Beat KataGo With One Simple Trick (free)!

This Ars Technica article, like some other coverage, is misleading in that it never gives the context that the exploit only works on a specific ruleset and only at low playouts (for now!): fewer than a few hundred playouts, a level that most experienced AI users on this forum and in the Go community would never have considered particularly reliable in the first place.

That obviously doesn’t rule out that further improvements might make it work at much higher playouts - I think that would be scientifically pretty interesting to see! I’ve been pretty supportive of the authors in the sense that I think their research itself is reasonable from an ML perspective. I would not be surprised if, in a few months or so, they extend their methods to work at higher playouts, although I would actually hope that they pivot slightly to experiments aimed at broader understanding rather than just chasing a higher number (e.g. which other rulesets or entirely different games are vulnerable, whether you can also successfully attack the obvious ways to code a defense against it, etc). I’m definitely curious to see if there’s any further work. :slight_smile:

I admit that I’m not all that satisfied with the way the paper authors have chosen to communicate their results. Firstly, they haven’t taken extra care in public communications to highlight that they are legitimately exploiting an oversight in the net under one particular computer ruleset, rather than any of the rulesets that people normally think of in Go. This has led to a lot of pointless misunderstanding, where many people think the result isn’t legitimate, when in fact it is legitimate; you just have to be clear about these particular rules!

Secondly, on the flip side (overselling, rather than the prior issue of accidentally underselling themselves): they haven’t clearly communicated that so far it only works with lower amounts of search than most would rely on in practice. KataGo capped to 64, or even 1000, playouts is not a “world-class AI” to begin with. I do understand why they would downplay this, and indeed maybe in a few months they’ll extend it to work at higher playouts. But in general, the part of any field where you’re trying to market yourself, play up your strengths, downplay the caveats, and fight for recognition and attention leaves a bad taste in my mouth - it’s why I initially went into industry, in a role where I could focus much more on just doing the engineering, rather than going into academia.

It’s not fully the responsibility of the authors what a given news outlet ends up saying versus cutting, but given that most of the article draws directly on an interview with them, I do hope that they did try to help the interviewer understand better and that it was the outlet’s issue rather than theirs.
