Use KataGo for Scoring

yebellz · July 15, 2020, 3:35pm

There are two separate contexts to consider this suggestion within:

Human vs Bot games
Human vs Human games

They each have different circumstances that affect considerations, so I will discuss them separately.

Human vs Bot Games

Currently, these games are automatically scored by the system after both players pass. Neither the human nor the bot can provide any input on the life/death status of the stones. I understand that this is to prevent humans from abusing manual scoring against bots, and also because the bots might not even be programmed or set up to suggest life and death status at the end of the game.

Since bot games are currently restricted to Chinese rules, the players should play out at least part of an encore to capture their opponent’s dead stones, so that the system does not mistakenly treat them as alive. However, players often choose not to play out the encore, perhaps because they don’t know that they should, or because they find it boring to play a bunch of strategically unnecessary moves.

For these bot games, since the system already performs a fully automated life and death estimation, which occasionally causes issues, I think it is reasonable and uncontroversial to use a strong bot like KataGo instead to estimate life and death to score such games.
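To make the idea concrete, here is a minimal sketch of how a per-point ownership estimate (which KataGo can report) might be turned into dead-stone suggestions and a Chinese-style area count. Everything in it is an assumption for illustration: the function names, the board encoding ("X" for Black, "O" for White, "." for empty), the sign convention (positive ownership means Black), and the 0.5 threshold are all hypothetical, and this is not how the site currently scores games. Komi is left out.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]  # (row, column), hypothetical encoding


def suggest_dead_stones(board: List[str],
                        ownership: List[List[float]],
                        threshold: float = 0.5) -> Set[Point]:
    """Return stones whose point is confidently owned by the opponent.

    Assumes ownership values lie in [-1, 1], with positive meaning Black.
    """
    dead: Set[Point] = set()
    for r, row in enumerate(board):
        for c, stone in enumerate(row):
            own = ownership[r][c]
            if stone == "X" and own <= -threshold:
                dead.add((r, c))  # Black stone sitting in White's area
            elif stone == "O" and own >= threshold:
                dead.add((r, c))  # White stone sitting in Black's area
    return dead


def area_score(board: List[str],
               ownership: List[List[float]],
               threshold: float = 0.5) -> Tuple[int, int]:
    """Count area (live stones plus owned points) for Black and White."""
    dead = suggest_dead_stones(board, ownership, threshold)
    black = white = 0
    for r, row in enumerate(board):
        for c, stone in enumerate(row):
            if stone != "." and (r, c) not in dead:
                # A live stone always counts for its own color.
                if stone == "X":
                    black += 1
                else:
                    white += 1
            elif ownership[r][c] >= threshold:
                black += 1  # empty point or dead stone in Black's area
            elif ownership[r][c] <= -threshold:
                white += 1  # empty point or dead stone in White's area
            # Otherwise the point is left uncounted (neutral or unsettled).
    return black, white
```

Points whose ownership falls below the threshold are left uncounted, which is one way to flag positions that are not really finished instead of guessing.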

There is already an open GitHub issue suggesting better automated scoring for bot games:

Further, the imperfections in the current automated scoring give humans some potential to get unfair wins against bots that don’t clean up all of the dead stones. If a bot does not always capture all of the human’s dead stones, occasionally the scoring system will make errors in favor of the human, while the human can avoid such errors by cleaning up. The human player could even make various failed invasions and pointless throw-ins at the end of the game, just to give the scoring system more chances to make an error in their favor.

Human vs Human Games

Human vs human games are different since the players can and should provide input to mark dead stones for scoring. My suggestion is NOT to automate away the marking of dead stones, but rather just to use KataGo as an alternative tool for estimating life and death at the end of the game, instead of the current estimation provided by the “auto-score” button (which by the way only generates a suggestion and does not take away control from the humans). Basically, the score estimation by the bot would just be a suggestion for the life and death status, as a matter of convenience, which the humans must approve and could potentially change.
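As a rough illustration of "suggestion only, humans decide", the sketch below keeps the bot's suggested dead stones separate from the players' manual changes and requires both players to accept the result. The class and method names are hypothetical and not taken from any existing codebase.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Point = Tuple[int, int]  # (row, column), hypothetical encoding


@dataclass
class ScoringPhase:
    """Dead-stone marking where the bot only proposes and players approve."""
    suggested_dead: Set[Point]                            # bot's proposal
    toggled: Set[Point] = field(default_factory=set)      # manual changes
    accepted_by: Set[str] = field(default_factory=set)    # "black", "white"

    def toggle(self, point: Point) -> None:
        """Flip a stone's status; any change voids earlier acceptances."""
        self.toggled ^= {point}
        self.accepted_by.clear()

    def dead_stones(self) -> Set[Point]:
        """Suggestion with the players' toggles applied on top."""
        return self.suggested_dead ^ self.toggled

    def accept(self, player: str) -> None:
        self.accepted_by.add(player)

    def is_final(self) -> bool:
        """The marking only stands once both players have accepted it."""
        return self.accepted_by >= {"black", "white"}
```

Clearing the acceptances on every toggle means neither player can end up having agreed to a marking they never saw.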

As discussed earlier, I think the use of a strong bot like KataGo to assist in scoring in this manner ultimately comes down to a choice between two priorities:

1. Competition purity, i.e., the desire to not have AI tools play any role in the outcome of the game.
2. User convenience, i.e., the desire to have some help marking dead stones and potentially detect and highlight cases of score cheating.

I think it is a valid stance to put the first item at a higher priority, but such a stance should also support removing any automated life/death suggestions, and even removing the in-game score estimation feature. However, I think that user convenience is also a pressing concern, especially for helping beginners and potentially detecting cheaters.

@Vsotvep seemed to make the point above that the current system is an unhappy compromise that does not satisfy either priority:

Games are already potentially influenced by the life and death estimation at the end of the game (and players might choose to resume and change strategy as a result of it).
The current scoring estimation can mess up and confuse the players into accepting the wrong outcome.

I agree with their proposition that there should either be no life/death estimation at all (fully manual scoring) or the system should be improved to use a strong bot like KataGo. That way, it is a matter of choosing between the two priorities rather than falling short of satisfying either. Maybe the availability of this tool could even be a custom choice for each game (just like disabling analysis).