What is the "Strong bot based fair komi" for 19x19 using Japanese-like (for example, KataGo) rules?

Sorry to revive this old thread, but this was the only mention I could find of strong bots trained with Japanese rules under different komi values.

Just to check: based on your results, the “strong-bot-based fair komi” seems to be 6 for Japanese rules (as bots trained with Japanese rules prefer different colors at 6.5 and 5.5), and similarly 7 for Chinese rules.

Am I interpreting your statement correctly? Do you have the initial win percentages that, for example, KataGo assigns to the different komi values, for the Japanese-like and the Chinese-like rules?


@elsantodel90 you may find this thread interesting…

Thanks! I came across that thread. I did not read every reply there, but it only covers games under New Zealand rules, right? I am primarily interested in the Japanese-like rules case, since the case for a komi of 7 under Chinese-like rules has been much stronger for much longer.

Does anybody have KataGo installed by any chance? I tried on my Linux machine but could not get the binaries to run. Basically, I cannot find anywhere on the internet the win% for black in the initial empty-board position for different values of komi (say, 4, 4.5, 5, 5.5 up to 8). I have read that KataGo has LeelaZero-like strength, implements “Japanese-like” rules and accepts different komi values (unlike LeelaZero), so it should be able to produce these numbers. I am very surprised that I cannot find them posted online already. They seem very interesting to me, as they should be our “best estimate so far” of the “theoretical perfect komi” (the one giving jigo under perfect play).

It’s 6 points. Under KataGo’s implementation of Japanese rules and 6 komi, the latest KataGo nets favor Black by 0.3 points. That’s less than half a point. So KataGo thinks that 6 is the most even integer komi under Japanese rules.


Thank you!

So the superhuman bots would suggest (say, “for their own play” :slight_smile: ) komi values of 6 and 7 for Japanese and Chinese rules, respectively. I find it quite amazing that this matches the currently used values of 6.5 and 7.5 (with the 0.5 serving only as a tie breaker favouring white), which were found through professional play long before strong amateur bots appeared :open_mouth: Then again, if the general direction of big moves is about right for professional players and little mistakes from both players cancel out, one would expect a similar result.

Still, I find it very flattering to professionals that powerful superhuman bots agree with their mortal thoughts about komi :slight_smile:


Actually if you’re going by KataGo, 6 isn’t right. 6.5 is actually estimated as slightly more fair than 6 right now as of the tip of the current run.

Obviously a non-integer value isn’t going to be fair under theoretical perfect play, but it could easily be more fair in practice. In other words, at 6 komi black presumably wins enough more of the non-drawn games than white does that counting the drawn games for white actually makes the match fairer.

This is not surprising at all, really. We know from parity considerations that, unless something “weird” happens, Japanese rules will cost black either one or zero points compared to Chinese rules, each with 50% probability. So until the level of play becomes so fantastically strong that bots are predicting the final parity of the game better than chance even in the opening and early midgame (they are very far from this right now), we should expect that if an integer is found to be a highly fair komi in Chinese, then that integer minus 0.5 should be highly fair in Japanese, and the more fair the former, the more fair the latter. Since 7 is very close to fair in Chinese, we should expect 6.5 to be very close to fair in Japanese.
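The parity argument can be sketched as a toy calculation. This is only an illustration and not anything KataGo computes: it assumes black's margin under Chinese rules is exactly 7 and that Japanese scoring costs black 0 or 1 point with equal probability. The function name `japanese_outcomes` is made up for this example.

```python
from fractions import Fraction

def japanese_outcomes(chinese_margin, komi):
    """Toy model: (black_win, draw, white_win) probabilities under Japanese rules.

    Assumption: Japanese scoring costs Black 0 or 1 point relative to the
    Chinese margin, each with probability 1/2.
    """
    half = Fraction(1, 2)
    black = draw = white = Fraction(0)
    for cost in (0, 1):
        diff = Fraction(chinese_margin - cost) - Fraction(komi)
        if diff > 0:
            black += half
        elif diff == 0:
            draw += half
        else:
            white += half
    return black, draw, white

for komi in ("6", "13/2", "7"):
    print(komi, japanese_outcomes(7, komi))
```

In this toy model 6.5 komi splits the wins evenly, 6 komi gives black half wins and half draws, and 7 komi gives white half wins and half draws.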

Actually, KataGo slightly favors white at 7 komi under Chinese rules, and therefore it also slightly favors white at 6.5 under Japanese rules. But that is less than it favors black at 6 under Japanese rules.

Japanese rules black winrates for empty board top move on a very very recent 40 block network (not even released yet, but soon), with a few thousand playouts:

5.0 komi - 58.7%
5.5 komi - 55.2%
6.0 komi - 51.9%
6.5 komi - 48.9%
7.0 komi - 45.7%
7.5 komi - 42.1%
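As a rough way to read these figures, one can linearly interpolate where black's winrate would cross 50%. This is only a sketch: winrate is not really linear in komi, integer komi values involve draws, and the interpolated value is not a playable komi. The function name `crossing_komi` is made up for this example.

```python
# Black winrates (percent) quoted above: Japanese rules, empty board.
winrates = {5.0: 58.7, 5.5: 55.2, 6.0: 51.9, 6.5: 48.9, 7.0: 45.7, 7.5: 42.1}

def crossing_komi(table, target=50.0):
    """Linearly interpolate the komi where Black's winrate equals `target`."""
    points = sorted(table.items())
    for (k0, w0), (k1, w1) in zip(points, points[1:]):
        if w0 >= target >= w1 and w0 != w1:
            return k0 + (k1 - k0) * (w0 - target) / (w0 - w1)
    return None  # no crossing inside the table

print(round(crossing_komi(winrates), 2))  # 6.32
```

The crossing lands nearer to 6.5 than to 6, which agrees with 48.9% being closer to 50% than 51.9% is.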


Very interesting considerations!

Of course, since we know “theoretical-komi” should be an integer, my approach was to look at the .5 percentages and see where they change: Since the bots prefer black with 5.5 komi and white with 6.5 komi, this suggests that “komi is 6” (instead of 5 or 7).
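That bracketing approach can be written down in a few lines. It is a minimal sketch using only the half-point figures quoted earlier in the thread, and it assumes the half-point komi values are spaced one point apart; `bracketed_integer_komi` is a made-up name.

```python
# Black winrates (percent) at half-point komi, from the figures quoted in the thread.
half_point_winrates = {5.5: 55.2, 6.5: 48.9, 7.5: 42.1}

def bracketed_integer_komi(table):
    """Return the integer komi between the two half-point komi where the favored color flips."""
    komis = sorted(table)
    for low, high in zip(komis, komis[1:]):
        if table[low] > 50.0 > table[high]:
            return (low + high) / 2
    return None  # the favored color never flips inside the table

print(bracketed_integer_komi(half_point_winrates))  # 6.0
```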

These are of course just estimates, and the situation might change with perfect (or much stronger) bots, but this is probably as close as we will ever get to discovering the true theoretical komi.

Sorry to revive this old thread, but I feel that my follow-up question fits perfectly here and is just a slight variation on what has already been said, so it is tidier for someone searching to find it all together here.

The quoted percentages are what the bot “thinks” the win% is when asked to evaluate the initial position. I imagine that this might in principle differ from the actual win% of the bot in self-play at that komi.

Of course, measuring that % is much more CPU intensive, as we would probably need around a million self-play games for an accurate % if the numbers are really that close to 50% (the standard error of a win rate near 50% is about 0.5/sqrt(n)). But do we have any measurement at all, even with a thousand or ten thousand self-play games?
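To get a feel for how many games such a measurement needs, here is a small sketch of the usual binomial standard error. It assumes the games are independent and treats each game as a plain win or loss (draws are ignored), and the two-standard-error threshold in `games_needed` is an arbitrary choice; both function names are made up for this example.

```python
import math

def standard_error(p, n):
    """Standard error of an observed win fraction p over n independent games."""
    return math.sqrt(p * (1 - p) / n)

def games_needed(margin, z=2.0):
    """Games needed so that z standard errors at p = 0.5 fit inside `margin`."""
    return math.ceil((z * 0.5 / margin) ** 2)

for n in (1_000, 10_000, 1_000_000):
    print(n, round(100 * standard_error(0.5, n), 3), "percentage points")

# Games needed to resolve a 1.9-point gap from 50% at two standard errors.
print(games_needed(0.019))
```

Under these assumptions the standard error is roughly 1.6 percentage points at a thousand games, 0.5 at ten thousand and 0.05 at a million, so a gap of a couple of points might already be visible with a few thousand games.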

To clarify, I mean something like this:

  1. Make KataGo self-play lots of games with Japanese rules, 6 komi. See what percentage were actually won by black / draw / won by white.
  2. Repeat with 6.5 komi

Maybe test with 5.5 and 7 komi as well, but if KataGo’s own “opinion” is right, the gap between 6 and 6.5 will be the tipping point where black’s win rate crosses 50%. This would be the true “KataGo fair komi for self play”, which might in principle not coincide with what KataGo “thinks” it should be, judging by its own evaluation of the empty board :slight_smile:
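The bookkeeping for such an experiment could look like the sketch below. It does not run KataGo: it assumes the game results have already been collected somehow as a non-empty list of `'B'`, `'W'` and `'D'` strings, which is an encoding invented for this example, as is the function name `tally`.

```python
from collections import Counter

def tally(results):
    """Summarize results encoded as 'B' (Black win), 'W' (White win), 'D' (draw)."""
    counts = Counter(results)
    n = len(results)
    black, white, draw = counts["B"] / n, counts["W"] / n, counts["D"] / n
    # Black's score counts a draw as half a win, so 0.5 means an even match.
    return {"black": black, "white": white, "draw": draw,
            "black_score": black + draw / 2}

# Made-up results purely to show the call; not real KataGo data.
example = ["B", "W", "D", "B", "W", "W", "D", "B"]
print(tally(example))
```

Running this once per komi value and comparing `black_score` against 0.5 would show on which side of even each komi falls.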


I guess even with all the computing power available, there would still be some uncertainty in Kata’s estimate, and even with a billion games there would be some variance in the outcome. Not to mention other bots! For example, ELF thinks black is ahead when komi is 6.5!

Errr, the self-play training is exactly these millions of games. The network is trained to give the winrate you want.

Indeed, the network is trained to predict the number I described. But does the net really measure it? We would be assuming that training and learning work perfectly. The experiment described actually measures it, giving in principle the actual most-fair komi for, say, KataGo self-play, even if the network had learned totally wrong values :slight_smile: (They are probably quite accurate, though.)

To expand: the network has an even heavier burden, as it is tasked not only with giving the very specific number that I describe (which would correspond to the evaluation of the empty board), but also with learning weights that give good evaluations for ANY board.

So there is even more reason for the net to give a different evaluation for the empty board than the one the experiment yields. If one were forced to choose (probably both are fairly accurate), it would make much more sense to prioritize accuracy in the many other, more interesting positions than in the very specific empty position.

Neural nets don’t have trouble memorizing things that are a large fraction of their training data. The empty board and positions with just a few corner moves are among the most common positions in the training data. Unlike arbitrary positions, where the net sees that exact position only once and it never happens again, the net sees these exact positions thousands and thousands of times, so they should have winrates that are pretty reflective of the actual proportion of wins and/or draws in the training data.

If you want to run this test yourself, feel free. It’s not impossible you’ll find a discrepancy, but I consider it likely enough that there are no huge surprises here that it is not worth the effort to test.