tl;dr - The AI reviews for supporters should be both faster and stronger now. If you’re not already a site supporter, you can sign up here to gain access to the full AI reviews for all of your games.
The fine folks who develop KataGo have been busy improving the engine and networks, and with the recent upgrade I finally switched us over to the 40b network, up from the 20b network. At the same number of visits, this doubles the amount of computation done per move, and up until somewhat recently it was still better to spend that computation doing twice as many visits with the 20b network than half as many visits with the 40b network. However, the 40b network has better intuition, so to speak. With the advances in the engine, some recently added hardware, and some re-tuning, I was able to switch us over to the 40b network and keep the playouts the same, so effectively every review now has twice the computation allocated to it as it did before.
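To make the trade-off concrete, here is a rough back-of-the-envelope sketch. The 2x cost per visit comes from the description above; the visit count is a made-up number for illustration only, not the real OGS setting, and `compute_per_move` is a hypothetical helper.

```python
# Relative cost of evaluating one visit, taking the 20b network as the unit.
# The 2x figure follows the post; real costs depend on hardware and batching.
COST_PER_VISIT = {"20b": 1.0, "40b": 2.0}


def compute_per_move(network: str, visits: int) -> float:
    """Relative amount of computation spent on one move."""
    return COST_PER_VISIT[network] * visits


visits = 1000  # hypothetical visit count, NOT the actual OGS configuration

old_budget = compute_per_move("20b", visits)
new_budget = compute_per_move("40b", visits)

# Keeping the visit count fixed while moving to 40b doubles the work per move.
budget_ratio = new_budget / old_budget

# The older alternative: hold the budget fixed and halve the visits on 40b.
visits_at_old_budget = int(old_budget / COST_PER_VISIT["40b"])

print(f"compute per move grows by a factor of {budget_ratio:.1f}")
print(f"40b visits that fit in the old budget: {visits_at_old_budget}")
```

The second number is the option that used to lose out: the same budget spent on half as many 40b visits.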
As part of the re-tuning process I increased how many threads process each move. This makes reviews process faster at the expense of a slight reduction in strength. However, that reduction is more than offset by the increase in strength of the 40b network, so the net effect is that reviews complete faster and are better than they were before. All plans will feel this, though it is probably most noticeable for those on the Meijin plan - you should find that your game reviews now complete in a couple of minutes, and the interactive reviews are more responsive.
I hope you enjoy the upgrade. Please let me know if you see any oddities, and a big thank you to all of our players, especially our site supporters who make OGS possible.
Noticed this week that the AI gives some weird scores. In the screenshots you can see that the last stone at the current score was played by the color that is ahead, but the next move by the other color completely changes the score in their favor.
If you could shoot me a link to the game I’d greatly appreciate it.
I’ll note too that on Monday or Tuesday we had some notable glitches with some very strange results, caused by me mixing networks and engines, but those should all be behind us. If the game is from then, we can re-run the analysis with the current configuration; if it’s more recent, then that sounds like something to look into more.
Anoek, you continue to blow away any expectations I have for this website. Your continued focus on improving quality, especially in areas where one might not imagine further improvements were even necessary, shows in every loving detail you make a reality. The passion of those who support your vision, having contributed their own time and expertise to bettering this website, is something I am also deeply grateful for. Their contributions are a direct reflection of the quality community that has amassed here. A thriving network that you helped make possible by building this wonderful place for us all to gather.
Messages like this one help me feel like I am a part of the process somehow, because you take the time to explain what you are doing and why. You often include the community in your decision process, which helps me to feel like I am a part of your process. For the positive experiences in my Go life that you have seeded, for everything that you make possible supporting and enabling this community, for envisioning an effective framework others can utilize to pursue Go in a variety of ways, and for helping advance the digital presence of Go in the West, I thank you. From the bottom of my heart, know you make my life richer and brighter. And I strongly suspect I am not alone.
Oh, what I was getting at is that B is losing on move 23, and KataGo should already recognize that. It shouldn’t need W to play 24 to understand it.
I think previous versions with lower threads would search correctly here and spot that without needing W to demonstrate it. Just a guess, though. Seems fishy that such an “obvious” move would change the win rate that drastically…
Ah - there’s a new review there now that looks more correct! You can see 23 correctly evaluated as a bad move now, W’s response of 24 was “obvious”, and the new review reflects that.
And I want to second (agree with fully) the FULL post by Mulsiphix1 and in particular —> you continue to [rock my WeiQi world (in a good way)]… I’m grateful to those who support your vision, and have contributed their own time and expertise to bettering this website… Messages like this one help me feel like I am a part of the process somehow. Because you take the time to explain what you are doing and why. You often include the community in your decision process, which helps me to feel like I am a part of your process. For the positive experiences in my Go life that you have seeded, for everything that you make possible supporting and enabling this community, for envisioning an effective framework others can utilize to pursue Go in a variety of ways, and for helping advance the digital presence of Go in the West, I thank you. From the bottom of my heart, know you make my life richer and brighter. And I strongly suspect Mulsiphix1 and I are not alone.
I have a suggestion for presenting the critical moves. Instead of showing win probabilities, present the user with odds ratios. Because the graph (and the details on the board) present the exact difference in win probability percentage, the results are skewed towards the first mistake.
An odds ratio would give a better “score” that would allow the user to compare different moves.
I also think that the odds ratios could be used in the graph, but that would be a bit more difficult: we would need to move away from a line chart to a bar chart, as we are looking for distinct spikes.
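To illustrate what I mean, here is a minimal sketch. The win rates are made-up numbers rather than output from a real review, and `odds` and `odds_ratio` are just helper names I picked for the example. Odds are taken as p / (1 - p), and the ratio compares the mover's odds before and after the move.

```python
def odds(p: float) -> float:
    """Convert a win probability (strictly between 0 and 1) to odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("win probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_before: float, p_after: float) -> float:
    """Factor by which the mover's odds shrank; values above 1 mean a losing move."""
    return odds(p_before) / odds(p_after)


# Hypothetical win rates for one player around two of their own mistakes.
mistakes = {
    "first mistake": (0.50, 0.20),
    "second mistake": (0.20, 0.05),
}

for name, (before, after) in mistakes.items():
    drop_points = (before - after) * 100
    ratio = odds_ratio(before, after)
    print(f"{name}: win rate drop = {drop_points:.0f} points, odds ratio = {ratio:.2f}")
```

With these numbers the first mistake costs 30 percentage points and the second only 15, so a plain win rate difference makes the first look twice as bad. In odds terms the second is slightly worse (about 4.75 versus 4.0), which is the kind of comparison I would like to be able to make.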
Best regards,
Jari
P.S. I’m new to the site so I’m not sure if this is the right place for posting suggestions.
I’m aware of it. It’s not the same thing. Of course it’s closer to what I’m looking for, but it’s still lacking. In a close game, point differences can be small while the effect on the win probability is large. I’m interested in identifying critical moves, and I think that comparing odds ratios is the best metric (of the three discussed; there might be others).