Potential rank inflation on OGS or How To Beat KataGo With One Simple Trick (free)!

This makes me wonder if it has to do with cycles in the boundaries, rather than cycles in the stones: the problematic groups have two long cyclic boundaries.

I would guess a single point eye or a small eye is easy to “perceive” locally via the cooperation of at most a few layers, so the neural net will have an easy time implementing a hack for that. Imagine you were looking at the board with a magnifying glass, zoomed in so far that you could not see most of the board at once and had to manually scroll your viewpoint around. It would still be easy to perceive small cycles (eyes). But sufficiently large cycles would be hard to perceive. At any given spot, such a cycle would look locally exactly like any normal non-cyclic dragon that is bordered by the opponent’s stones. So there is an expectation that for a sufficiently large cycle, the neural net’s algorithm will probably behave locally about the same as if it wasn’t a cycle.

Suppose that algorithm is "locally this group has a branch that extends north and is reporting 8 liberties so far from that direction, and it has a branch that extends south that is also reporting 8 liberties from that direction, therefore I should add 8+8 and behave as if I have 16 liberties". That’s usually fine, unless those 8 liberties from each branch are actually the same 8 liberties double-counted because the two branches connect back around to each other in a loop way outside the radius of easy perception.
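To make the double-counting concrete, here is a toy sketch in Python. It is only an illustration of the arithmetic, not a claim about what the net actually computes: the board representation, the function names, and the "sum the branches" heuristic are all assumptions made up for this example. The naive count adds up the liberties reported by each branch leaving a chosen stone; the true count takes the union of liberties over the whole group.

```python
def neighbors(p):
    """Orthogonal neighbours of a point (row, col)."""
    r, c = p
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def liberties(stones, empty):
    """Set of empty points adjacent to any stone in `stones`."""
    return {q for p in stones for q in neighbors(p) if q in empty}


def branch(start, first, group):
    """Stones reachable from `first` without stepping on `start`."""
    seen = {first}
    stack = [first]
    while stack:
        p = stack.pop()
        for q in neighbors(p):
            if q in group and q != start and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def naive_count(start, group, empty):
    """Hypothetical local heuristic: own liberties plus the sum over branches."""
    total = len(liberties({start}, empty))
    for first in neighbors(start):
        if first in group:
            total += len(liberties(branch(start, first, group), empty))
    return total


def true_count(group, empty):
    """Actual liberty count: each empty point is counted once."""
    return len(liberties(group, empty))


# 7x7 board, no opponent stones, to keep the example small.
points = {(r, c) for r in range(7) for c in range(7)}

# A straight chain: the two branches from the middle stone are disjoint.
line = {(3, c) for c in range(1, 6)}
# A ring around the empty point (3, 3): the two branches are the same stones.
ring = {(r, c) for r in range(2, 5) for c in range(2, 5)} - {(3, 3)}

line_naive = naive_count((3, 3), line, points - line)
line_true = true_count(line, points - line)
ring_naive = naive_count((2, 3), ring, points - ring)
ring_true = true_count(ring, points - ring)
```

On the straight chain the branch sum agrees with the true count, while on the ring both "branches" turn out to be the same seven stones, so the naive count comes out at twice the true one.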

And here it is: the rank inflation I warned you about. Contained, for now, as it is an experiment, but for how long…?

Reddit: I guess I have mastered the AI attack : baduk

User, who mastered the approach: fallingsnow88

and no one is going to fix Leela Zero or ELF

Pro accounts on OGS can play ranked games with normal users. The rank of the normal user then changes, but the rank of the pro account does not. The same should be done for bots: the rank of a human should not change if they play a bot. Then we will at least have accurate human ranks.

Exploits have been known for many bots for years already. It’s just that KataGo seemed to be more immune to them until recently.
I suppose that experimenting to confirm some exploit is acceptable, especially when those games are unranked. However, using some exploit over and over in ranked games comes down to rank manipulation, which is not allowed on OGS.

IIRC the argument has been made before that ranked games against bots serve a purpose: they help new players on OGS to get a rank, because many established players avoid players with a provisional rank.
So maybe this proposal of making bot games unranked should only apply to players who have an established rank (this proposal and the amendment have also come up before IIRC).

The one-eye high liberty case continues to be the hardest to learn when the group is large. The value function is changing very gradually in that case (depending on the size of the group, larger is harder to “perceive”). It is now able to detect this move as maybe letting black win, whereas a few weeks ago it wouldn’t consider black to have a large chance to win even after playing this cut.

Cutting here is not really part of the policy yet (a very tiny %), so this move wouldn’t be seen if it came up deeper in the search. It needs to be learned now that the value function is good enough to judge it as good. So continues the back and forth between policy and value learning, each one enabling the other to progress.
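As a rough illustration of why a tiny policy prior keeps a move out of the search, here is a sketch of a generic AlphaZero-style PUCT selection rule in Python. This is an assumption-laden toy, not KataGo's actual search: the function name, the constant `c_puct`, the fixed per-move values, and treating unvisited moves as having value 0 are all choices made for this example.

```python
import math


def simulate_puct(priors, values, total_visits, c_puct=1.5):
    """Allocate visits among root moves with a simplified PUCT rule.

    priors: policy probability of each move (hypothetical numbers).
    values: fixed value returned every time a move is visited (a toy
            stand-in for the value net plus deeper search).
    """
    visits = [0] * len(priors)
    for _ in range(total_visits):
        parent_visits = sum(visits)

        def score(i):
            # Unvisited moves get value 0 here (an assumed first-play rule).
            q = values[i] if visits[i] > 0 else 0.0
            u = c_puct * priors[i] * math.sqrt(parent_visits + 1) / (1 + visits[i])
            return q + u

        best = max(range(len(priors)), key=score)
        visits[best] += 1
    return visits


# Move 0 is the "normal" move, move 1 is the cut, which is actually better
# (value 0.9 vs 0.5) but has almost no policy prior.
tiny_prior = simulate_puct([0.999, 0.001], [0.5, 0.9], total_visits=100)

# Same values, but the policy has learned to give the cut a real prior.
learned_prior = simulate_puct([0.8, 0.2], [0.5, 0.9], total_visits=100)
```

In this toy, with a prior of 0.001 the cut receives no visits at all in 100 playouts even though its value is higher, so the value function never gets the chance to vouch for it. Once the prior is raised, the cut is tried early and then takes most of the visits.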

very long circle

long1

long2

long3

Making human vs bot ranked games count as ranked for the bot and for a provisional human, but as unranked for a human with an established rank, would be good.
Currently there are humans who play only against bots and then believe the rank they get from that. I guess an SDK rank earned against bots may actually be a DDK rank against humans. Hopefully, with such a limitation, more people would try playing against people.
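The proposed rule can be written down as a small predicate. This is only a sketch of the proposal as stated in this thread, not how OGS actually handles ratings; the `Player` class and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    is_bot: bool = False
    is_provisional: bool = False


def rank_should_change(player: Player, opponent: Player) -> bool:
    """Whether `player`'s rank moves after a ranked game against `opponent`."""
    if player.is_bot:
        # The game stays ranked for the bot.
        return True
    if opponent.is_bot and not player.is_provisional:
        # Established human vs bot: treated as unranked for the human.
        return False
    # Human vs human, or a provisional human vs a bot.
    return True


bot = Player("some-bot", is_bot=True)
newcomer = Player("newcomer", is_provisional=True)
regular = Player("regular")
```

With these definitions, `rank_should_change(newcomer, bot)` is true, so new players can still get a rank from bot games, while `rank_should_change(regular, bot)` is false.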

While I agree with you that it is good to encourage more human-human games, if someone wants to play ranked games against bots, is that really such a bad thing? After all if they play honest games, why shouldn’t it count as ranked?

It feels like a needless limiting of options to me, and in contradiction to the OGS cockpit design ethos.

Massive change in the eval of the latest b18 network (s5703M) compared to any prior nets in some positions.

Previous net:
image

New net:
image

Also true of the raw winrate of the net without search. Previous net was 0.1% for white win, new one is variously 13-22% for white win depending on random symmetry.
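For anyone wondering what "depending on random symmetry" means: as I understand it, the raw net is evaluated on one of the 8 rotations and reflections of the board, and since the net is not exactly symmetric, the output varies with which one is picked. Here is a minimal Python sketch of that idea. The evaluator `toy_raw_winrate` is a made-up, deliberately asymmetric stand-in for a net, and none of this is KataGo code.

```python
def rotate90(board):
    """Rotate a square board (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def mirror(board):
    """Reflect a board left to right."""
    return [row[::-1] for row in board]


def all_symmetries(board):
    """The 8 rotations and reflections of a square board."""
    out = []
    current = [list(row) for row in board]
    for _ in range(4):
        out.append(current)
        out.append(mirror(current))
        current = rotate90(current)
    return out


def toy_raw_winrate(board):
    """Hypothetical evaluator whose output depends on board orientation."""
    n = len(board)
    total = sum(board[r][c] * (r * n + c) for r in range(n) for c in range(n))
    return total / (n * n * n)


board = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]

# Evaluate the same position under every symmetry.
winrates = [toy_raw_winrate(b) for b in all_symmetries(board)]
lowest, highest = min(winrates), max(winrates)
averaged = sum(winrates) / len(winrates)
```

The spread between `lowest` and `highest` plays the role of the 13-22% range above, and averaging over all 8 symmetries is one simple way to get a single, orientation-independent number.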

You answered yourself already.

Because I want to encourage human vs human.
Because honesty shouldn’t be a parameter of a ranking system.

White is so far ahead that the level IV review has gone crazy.

image


image

Pretty cool talk

Any recent katago game /:

Nice improvement for the latest b18 (S5865M), score prediction is more realistic and the raw net now gives white 50-70% winrate depending on symmetry.
KG B18

I updated katago-micro with a recent b18 net that has a basic idea, some of the time, about wanting to fight for escape or for eyes for the cyclic group. I only know of one player who’s been trying games regularly against it now, and they still win once per several attempts, but I would be curious to know how easy other people find it now.

Especially if you can find ways to win in different ways! Different patterns in how to get it to form the cycle, or bigger or smaller interior groups with different eyeshapes. For example, almost all attempts in the last few weeks are by players who only ever tried to make sure the interior group independently got 2 eyes, but it used to also be the case that players would get by with simply making the inner group not alive but have a large eyespace (square 4, or bulky 5) and lots of shared liberties with the cycle, and then win the race via big-eye-beats-small-eye (Big Eye Wins Semeai at Sensei's Library). Some test cases suggest that there may still be huge evaluation holes in big eye vs cycle capturing races, but it seems like nobody has tried recently. Can anyone still make that, or other kinds of variations on the attack, succeed?

As another example, some test cases suggest that the larger the interior group in the cycle, the harder it is for the bot to recognize the danger. I’m not sure I’ve seen people specifically try to make it larger, though. Maybe this or other variations would lead to success even more often than once per several attempts.

If you give it a try, feel free to link it here! It might be fun to see how often and in how many ways you can make it work now that katago-micro puts up a small bit of resistance sometimes, and over the next few weeks as we update the b18 net more. :slight_smile:

Nick Sibicky just published a video on the exploit: NSGL #508 - How to Beat Superhuman AIs - YouTube

No mentions of this thread, sadly :crying_cat_face:

Left a comment in that video. :smiley:
The newest b18 net has improved to where I’m starting to get genuinely interested in seeing more people try and report practical attempts at the exploit on the newest model and what sub-methods continue to work most easily. I updated katago-micro again, the most recent net s5926 is another step up (after s5865 was a step down and s5896 was a step back to parity). I’d also be interested to see how it plays out in practice as people who run higher playout versions (like the one @JBX2010 runs) eventually update to the newest nets too.
