AI review strength increase

anoek · March 31, 2022, 3:50pm

Hi All,

tl;dr - The AI reviews for supporters should be both faster and stronger now. If you’re not already a site supporter, you can sign up here to gain access to the full AI reviews for all of your games.

The fine folks who develop KataGo have been busy improving the engine and networks, and with the recent upgrade I finally switched us over to the 40b network, up from the 20b network. Doing this doubles the number of computations done per move, and until fairly recently it was still better to spend those computations doing twice as many visits with the 20b network than half as many visits with the 40b network. However, the 40b network has better intuition, so to speak. With the advances in the engine, some recently added hardware, and some re-tuning, I was able to switch us over to the 40b network and keep the playouts the same, so effectively every review now has twice the number of computations allocated to it as before.
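As a rough illustration of the trade-off described above, here is a minimal sketch that assumes the cost of one visit scales linearly with the number of blocks in the network. That linear model and the visit count are assumptions for illustration only, not measurements from OGS or KataGo.

```python
def relative_cost(blocks: int, visits: int) -> int:
    """Rough compute cost of analysing one move.

    Assumption: one visit costs an amount proportional to the number of
    blocks in the network, so total cost is blocks * visits.
    """
    return blocks * visits


VISITS = 1000  # hypothetical visit count, not the real OGS setting

old_setup = relative_cost(20, VISITS)              # 20b network, full visits
same_budget_40b = relative_cost(40, VISITS // 2)   # 40b network, half the visits
same_visits_40b = relative_cost(40, VISITS)        # 40b network, same visits

# Halving the visits keeps the budget unchanged; keeping the visits doubles it.
print(same_budget_40b == old_setup)
print(same_visits_40b / old_setup)
```

Under this simplified model, keeping the same number of visits on the 40b network costs twice as much as the old 20b setup, which matches the "twice the number of computations" in the post.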

As part of the re-tuning process I increased how many threads process each move. This makes reviews process faster at the expense of a slight reduction in strength. However, that reduction is more than offset by the increased strength of the 40b network, so the net effect is that reviews complete faster and are better than they were before. All plans will feel this, though it is probably most noticeable for those on the Meijin plan - you should find that your game reviews now complete in a couple of minutes, and the interactive reviews are more responsive.

I hope you enjoy the upgrade. Please let me know if you see any oddities, and a big thank you to all of our players, especially our site supporters who make OGS possible.

dreamtower · March 31, 2022, 4:07pm

Noticed this week that the AI gives some weird scores. In the screenshots you can see that the last stone was played by the color that is ahead on the score, but the next move by the other color completely changes the score in their favor.

Edit: was not allowed to submit the screenshots

anoek · March 31, 2022, 4:11pm

If you could shoot me a link to the game I’d greatly appreciate it.

I’ll note too that on Monday or Tuesday we had some notable glitches with some very strange results as a result of me mixing networks and engines, but those should all be behind us. If the game is from then, we can re-run the analysis to use the current configuration; if it’s more recent, then that sounds like something to look into more.

dreamtower · March 31, 2022, 4:31pm

This is actually from today, after I wrote the comment. At move 47 it says that black is ahead, then white plays 48 and suddenly white is ahead.

anoek · March 31, 2022, 4:48pm

Great example, I think I know what’s going on, I’ll tune the fast analysis some more. Thanks!

sanderl · March 31, 2022, 6:34pm

Which 40b network are you using? Is it the latest from distributed training, or the one released at the time of the latest 20b?

anoek · March 31, 2022, 6:49pm

The latest from distributed training

Mulsiphix1 · March 31, 2022, 7:12pm

Anoek you continue to blow away any expectations I have for this website. Your continued focus on improving quality, especially in areas one might not conceive further improvements were even necessary, shows in every loving detail you make a reality. The passion of those who support your vision, having contributed their own time and expertise to bettering this website, is something I am also deeply grateful for. Their contributions are a direct reflection of the quality community that has amassed here. A thriving network that you helped make possible by building this wonderful place for us all to gather.

Messages like this one help me feel like I am a part of the process somehow. Because you take the time to explain what you are doing and why. You often include the community in your decision process, which helps me to feel like I am a part of your process. For the positive experiences in my Go life that you have seeded, for everything that you make possible supporting and enabling this community, for envisioning an effective framework others can utilize to pursue Go in a variety of ways, and for helping advance the digital presence of Go in the West, I thank you. From the bottom of my heart, know you make my life richer and brighter. And I strongly suspect I am not alone

rooklift · March 31, 2022, 9:41pm

I believe there’s still this issue in AI-generated SGF files:

github.com/online-go/online-go.com

SGF with AI review - two problems

opened 08:28PM - 20 Feb 22 UTC

rooklift

Two problems with these SGF files as generated by the **"SGF with AI review"** button. Firstly, there's some issue where moves in the variations have a good chance of being the wrong colour, i.e. black for white and vice versa. An example can be found in the review for [this game](https://online-go.com/game/41434979) at the very first branch (i.e. closest to the start). Secondly, the variations use `AB` and `AW` properties rather than `B` and `W`. But this is a bad practice: `AB` and `AW` are for setup only, and a properly conforming SGF program will not even remove zero-liberty groups that are created this way. Moves should be made with `B` and `W` only, even in variations. These have certainly been [reported before](https://github.com/online-go/online-go.com/issues/1135) but given the lack of attention that received I suspect they must have been fixed and are now re-broken somehow, so I hope you'll forgive me for reposting.

This seems important: the wrong colors quite often get put into the variations.
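For illustration, here is a minimal sketch of how a variation could be written with `B` and `W` move properties in strictly alternating colours, as the issue asks. The function name `variation_to_sgf` and the sample coordinates are hypothetical; this is not the code OGS uses to export reviews.

```python
def variation_to_sgf(points, first_color="B"):
    """Render a variation as SGF move nodes using B/W (not AB/AW).

    points: SGF coordinates such as "pd", in the order they are played.
    first_color: the colour of the first move in the variation.
    Colours alternate strictly, one move per node.
    """
    if first_color not in ("B", "W"):
        raise ValueError("first_color must be 'B' or 'W'")
    nodes = []
    color = first_color
    for point in points:
        nodes.append(f";{color}[{point}]")
        color = "W" if color == "B" else "B"
    return "(" + "".join(nodes) + ")"


# A three-move variation in which White plays first.
print(variation_to_sgf(["pd", "dp", "pp"], first_color="W"))
# (;W[pd];B[dp];W[pp])
```

Because each stone is a real move node, a conforming SGF viewer applies captures as usual, which it would not do for stones placed with the setup properties `AB` and `AW`.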

bdemoss · April 1, 2022, 12:07am

KataGo seems to misevaluate this simple position, too (moves 23->24):

rooklift · April 1, 2022, 12:22am

To clarify, the point is that after White move 24, Black is still winning, but the graph says White is winning?

Strange. I can’t seem to reproduce this locally though.

trohde · April 1, 2022, 12:23am

Unless I misunderstand something (and note I’m only ~6k):

How so? What would B do after F6?

bdemoss · April 1, 2022, 12:47am

Oh, what I was getting at is that B is losing on move 23, and KataGo should already recognize that. It shouldn’t need W to play 24 to understand it.

I think previous versions with lower threads would search correctly here and spot that without needing W to demonstrate it. Just a guess, though. Seems fishy that such an “obvious” move would change the win rate that drastically…

_Sofiam · April 1, 2022, 12:47am

You didn’t see white F6

Edit: I didn’t read the last 2 posts before sending this. I’m sorry

trohde · April 1, 2022, 1:02am

OIC … oh well, I understand nothing of these things

bdemoss · April 1, 2022, 1:36am

Ah - there’s a new review there now that looks more correct! You can see 23 correctly evaluated as a bad move now, W’s response of 24 was “obvious”, and the new review reflects that.

Sighris · April 1, 2022, 3:36am

Thanks anoek and the OGS team!

And I want to second (agree with fully) the FULL post by Mulsiphix1 and in particular —> you continue to [rock my WeiQi world (in a good way)]… I’m grateful to those who support your vision, and have contributed their own time and expertise to bettering this website… Messages like this one help me feel like I am a part of the process somehow. Because you take the time to explain what you are doing and why. You often include the community in your decision process, which helps me to feel like I am a part of your process. For the positive experiences in my Go life that you have seeded, for everything that you make possible supporting and enabling this community, for envisioning an effective framework others can utilize to pursue Go in a variety of ways, and for helping advance the digital presence of Go in the West, I thank you. From the bottom of my heart, know you make my life richer and brighter. And I strongly suspect Mulsiphix1 and I are not alone.

maalla · April 1, 2022, 6:37am

Hi anoek

Thank you for the update.

I have a suggestion for presenting the critical moves. Instead of showing win probabilities, present the user with odds ratios. Because the graph (and the details on the board) show the exact difference in win probability percentage, the results are skewed towards the first mistake.

An odds ratio would give a better “score” that would allow the user to compare different moves.

I also think that the odds ratios could be used in the graph, but that would be a bit more difficult: we would need to move from a line chart to a bar chart, as we are looking for distinct spikes.

Best regards,
Jari

P.S. I’m new to the site, so I’m not sure if this is the right place for posting suggestions.

richyfourtytwo · April 1, 2022, 6:55am

Just use points instead of win %. Or are you aware of that option and still miss something?

maalla · April 1, 2022, 7:31am

I’m aware of it. It’s not the same thing. Of course it’s closer to what I’m looking for, but it’s still lacking. In a close game, point differences can be small while the effect on the win probability is large. I’m interested in identifying critical moves, and I think that comparing odds ratios is the best metric (of the three discussed; there might be others).
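To make the odds-ratio suggestion concrete, here is a small sketch comparing it with the plain difference in win probability. The win rates are made-up numbers for illustration, and the helper names `odds` and `odds_ratio` are hypothetical, not part of any OGS or KataGo API.

```python
def odds(p: float) -> float:
    """Convert a win probability (strictly between 0 and 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(before: float, after: float) -> float:
    """How many times worse the player's odds are after the move."""
    return odds(before) / odds(after)


# Hypothetical game: a first mistake drops the win rate from 50% to 10%,
# and a later mistake drops it from 10% to 1%.
first = (0.50, 0.10)
second = (0.10, 0.01)

for before, after in (first, second):
    drop = (before - after) * 100
    print(f"drop: {drop:.0f} points, odds ratio: {odds_ratio(before, after):.1f}")
# drop: 40 points, odds ratio: 9.0
# drop: 9 points, odds ratio: 11.0
```

Measured in percentage points the first mistake looks far larger, while the odds ratios rate both mistakes as roughly equally damaging, which is the skew towards the first mistake described above.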