In this thread, I will be posting hand-crafted KataGo self-play games I’ve played on ZBaduk with New Zealand rules (7 komi). In each position, I played KataGo’s favourite move once it either was preferred to its second-favourite move by at least 0.3 pp decision and 0.3 pp winrate and had at least 12,000 playouts, or had at least 528,000 playouts (completely arbitrary values).
The latest KataGo misreads the flying knife joseki, so it undervalues 3-3 invasions. It plays B instead of A in most cases. You have to walk it out to here before the valuation starts to become accurate (and even then it’s wrong), so it generally won’t play 3-3 on its own like LeelaZero and ELF OpenGo would.
I wouldn’t know about its take on the flying-knife joseki, but I’ve added five 3-3 variations from game 1 to the demo board, some of which even include a second 3-3 variation. They range from 0.23 to 0.94 pp winrate loss compared to the game moves - not too dramatic, IMO.
What does “hand-crafted” and “on ZBaduk” mean? Are you using some web-service to provide the bot for you? Couldn’t running these self-play games be highly automated to yield a lot more than 3 games over 2 weeks?
“Couldn’t running these self-play games be highly automated to yield a lot more than 3 games over 2 weeks?”
Probably, if I had the hard- and software, but where’s the fun in that?
“What’s the aim of this study?”
Observing superhuman AI play itself, browsing through all the variations, comparing against Leela Zero (when I play with Chinese rules / 7.5 komi)… different strokes for different folks
"KataGo models a tie as being half of a win and half of a loss (this is actually configurable though!), and behaves accordingly, and the winrate will reflect this.
I’ve had one user request an explicit modeling of the probability of a tie. I never got around to doing this, unfortunately, since it would be some work and some complexity to code to track this separately from just the winrate, so it’s just folded into the winrate."
Very interesting @mark5000. I wonder how you found out…?
Somehow I was hoping that KataGo would eventually show a bias because of the shortcuts in the learning process. Just like the first AlphaGo had a bias because it was trained with human game records.
That’s just a thought. I would be very interested to hear if an explanation has been found for that issue.
I also love the idea of allowing draws in AI self-play. I am thinking that it might get us closer to perfect play (as opposed to taking a risk to reverse a game lost by 0.5pt).