Autoscore is still overriding players’ skill level

Conrad_Melville · September 10, 2021, 12:16am

Acceptance of the score is completely beside the point, as I have already explained.

Feijoa · September 10, 2021, 3:48am

I think I understand what you’re saying, but it seems like you would have a stronger argument if you could clearly say “Autoscore marked these four stones dead. But under Japanese rules, they are actually alive.” Isn’t that the case?

Verrius · September 10, 2021, 6:46am

Perhaps a different example of the same thing is this game, where the autoscore showed a weakness that neither I nor my opponent had noticed:

Since they were well ahead even without the stones that autoscore chose to mark as dame, we just accepted it as it was (without correcting the dame markings); but it really does seem undesirable for the scoring mechanism to be deciding these things. White had closed off an area here and black had not chosen to attack.

Groin · September 10, 2021, 7:48am

Note that W can live with T14 and still win. It’s a matter of some yose points (5 or 6?)

dragon-devourer · September 10, 2021, 10:07am

Game was Chinese rules. But that doesn’t matter.

I don’t know whether autoscore marked them as dead but it looks that way, yes. As for whether they are actually alive, yes, absolutely they are alive. Under no ruleset are any of White’s bottom stones dead.

While it is true that black can theoretically capture some white stones along the bottom, they haven’t done so. Black has not played the tesuji. This is not like a bent 4 in the corner situation where the debate is whether the surrounded group with a bent 4 in the corner eye shape is alive with territory or dead as it stands. There, if the group is dead, then it is removed from the board before counting. Here we are talking about capturing part of a group and, in doing so, pushing up the boundary between the two players’ areas. The stone removal phase is literally just that - removal of dead stones from within the opponent’s territory - and does not include the rebuilding of boundaries.

Under Chinese rules (The Chinese Rules of Go)

So White is alive, no question.

To specifically address the question Re Japanese rules (Article 1. The game of go), here’s how it would go under Japanese rules:

Black: “They’re dead”

White: “Nope”

Black: “We should resume”

White: “I don’t want to resume. I win as it is”

Black: “Well I want to resume”

White: “OK”

(White has to resume because black wants to. But white gets to play first because it’s black who wants to resume. So white protects the cut)

White: “See, alive”

antonTobi · September 10, 2021, 11:58am

Under Japanese rules I believe that both players would lose if they found out after passing that the result would depend on who plays first (see for instance Article 13 on the linked page).

Personally I don’t like this, but the new proposal here is actually not that different from official Japanese rules (though of course, it would be perfectly possible for both sides not to notice the problem if they didn’t get the help of a stronger player).

Let’s say two Japanese professionals play, and both miss a decisive sequence when passing (extremely unlikely, but not impossible). They agree on the result, but in the post-game discussion someone else points out the problem to the players. Would the result be changed to both players lose? I guess not, since the problem was not discovered by one of the players?

dragon-devourer · September 10, 2021, 12:23pm

Good points @le_4TC. This highlights some of the subtleties of Japanese rules.

Yeah, pretty much. In the example above, if the game was closer so the sequence is decisive, white doesn’t want to resume because then black can play first and white dies. But then black doesn’t want to play first because then white can protect the cut and black loses anyway. So neither player wants to resume but they need to resume because there is a decisive disagreement, so both players lose (they should have played it out before passing). In the game as it was, the sequence is not decisive so white will happily resume, lose some points, but still win the game.

Exactly. Article 10.4:

That’s fine for Japanese pros in OTB tournaments. However, this needs some interpretation in the context of OGS due to issues with score-cheaters, sandbaggers, etc. Essentially, results cannot be changed unless foul-play is detected, in which case the game is annulled to protect the integrity of the rating system.

AdamR · September 10, 2021, 5:09pm

If the point of the thread was to convince us that these situations are not ideal, I think we would mostly all agree. However, the solution is probably not simple.

To my (though admittedly limited) knowledge, we are simply using KataGo for scoring. KataGo (again as far as I understand) does not use any logical formulas or algorithms - it does not “recognize” alive, dead, or unsettled groups in the way that we humans do. It simply plays out the game to the best of its abilities and outputs this as a result.

Thus there is no simple way - that I know of - to add an “area fill algorithm”, it’s just not how the program operates. We could probably stack up our own algorithm on top of this, as we take the KataGo output and then can further work with it, but even then such an algorithm would be much more complicated than you might think. Consider some simple situation like:

KataGo would feed W as dead, we detect there are no B stones in the territory and reverse to W alive? Obviously not. So run the check only for areas bigger than X? How much is X? In any case this would probably mean some cases of small 1st or 2nd line captures will get messed up again. So we add an algorithm to check for 2 eyes? Then we need to also somehow identify false eyes… It quickly becomes a maze of if and else statements. That should more or less work, but will immediately break on sekis again (as our old algorithm did for years), because those are pretty much dependent on being able to visualise how the scenario would play out. It would basically revert back to trying to program a completely independent algorithm that would be able to correctly judge the board position - which is what we tried all those years before KataGo was so readily available. Turns out it is easier said than done.

And even if we did in fact manage all that, there is still no guarantee that it is what the players thought the score would be like. “¯_(ツ)_/¯“ There is no way to precisely measure the player’s knowledge and adjust the scoring to that level. At the moment, I do not think the few cases that get messed up are worth the huge amount of work that would go into this. Players can always fix the result for themselves if they care enough, and honestly it is a great learning opportunity to see that your area was not secure IMHO.

Nevertheless I was considering an alternative approach that might be easier and might help “solve” the issue, maybe.

It would probably be possible to analyze the two passes by KataGo and throw up a warning / force manual scoring / request resumption of the game IF the act of passing changed the score significantly - thus indicating an unsettled group (or in fact changed the winner of the game - which would be the most important case).
That would also potentially solve the problem of mis-scoring games that were simply not finished (unfinished borders and such).

I am not sure how much strain that would put on the servers; it might potentially slow down the scoring process significantly, as it would have to wait for the analysis to finish first.
It might be even more confusing to beginners and annoying to experienced players…
There might be other issues I did not identify; it is just an idea I have been pondering.
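
The pass-check idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the three inputs are score estimates from Black's point of view (before the first pass, after the first pass, after the second pass) that would have to come from an engine analysis, and the function name, the return labels, and the 5-point threshold are all hypothetical, not anything OGS or KataGo actually uses.

```python
def check_passes(lead_before: float,
                 lead_after_pass1: float,
                 lead_after_pass2: float,
                 threshold: float = 5.0) -> str:
    """Classify the end of a game from three engine score estimates.

    Each argument is Black's estimated lead in points (negative means
    White leads). The values and the threshold are assumptions made for
    this sketch; a real server would obtain the estimates from analysis.
    """
    leads = [lead_before, lead_after_pass1, lead_after_pass2]
    steps = list(zip(leads, leads[1:]))

    # Most important case: a pass changed who is winning.
    if any((prev > 0) != (cur > 0) for prev, cur in steps):
        return "resume"

    # A large swing without a change of winner hints at an unsettled group
    # or an unfinished border.
    if max(abs(cur - prev) for prev, cur in steps) >= threshold:
        return "warn"

    return "autoscore"
```

For example, `check_passes(20.5, 12.5, 12.5)` yields `"warn"`: the winner is unchanged, but the first pass cost eight points, which is what an unprotected cut would look like.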

jlt · September 10, 2021, 5:21pm

IMO AI shouldn’t be used to score a game. Otherwise it gives a hint that a group is killable, which the players may not have noticed.

The players should mark dead stones actively. Then the simple scoring algorithm determines which area is black, which one is white, and which one is neutral. If they disagree, then they may resume the game.
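
A minimal sketch of the kind of simple scoring pass described above, assuming the board is a list of strings using `B`, `W` and `.`, and that the players have already agreed on a set of dead stones. The function name and the board encoding are made up for illustration, and the sketch knows nothing about seki or eyes: it only flood-fills empty regions and looks at which colors border them.

```python
from collections import deque

def classify_empty_points(board, dead=frozenset()):
    """Label every empty point 'B', 'W' or 'N' (neutral).

    `board` is a list of equal-length strings; `dead` is a set of (row, col)
    stones the players marked dead, which are treated as empty points.
    """
    rows, cols = len(board), len(board[0])
    grid = [['.' if (r, c) in dead else board[r][c] for c in range(cols)]
            for r in range(rows)]
    owner = {}
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != '.' or (r, c) in owner:
                continue
            # Flood-fill one connected empty region, recording border colors.
            region, borders = [], set()
            queue, seen = deque([(r, c)]), {(r, c)}
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if grid[ny][nx] == '.':
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                        else:
                            borders.add(grid[ny][nx])
            # One bordering color means territory; anything else is neutral.
            label = next(iter(borders)) if len(borders) == 1 else 'N'
            for point in region:
                owner[point] = label
    return owner
```

Points labelled `N` are exactly the places where a border is still open or a stone has not been marked, so they are also the natural candidates for a "please resume or mark stones" prompt.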

AdamR · September 10, 2021, 5:30pm

Well, yeah, but you need to first identify which stones are alive… then factor in sekis.

Granted, I have never been much of a coder, but I honestly would not even know where to start with such a project.
If you pardon my possible cheekiness, if anyone finds it simple (or at least doable) I believe the code is here waiting for you ~~GitHub - online-go/score-estimator: The score estimator and dead stone removal suggester used on online-go.com.~~ GitHub - online-go/goban: A JavaScript library for exploring and playing the game of Go

shinuito · September 10, 2021, 5:34pm

I think the point is: how do you write an algorithm (a bunch of steps) that can distinguish between an example like

and an example like

In both cases KataGo is telling you some stones are dead, and that’s the information you have to work with. So the rule has to be: call what KataGo says is dead dead, except…

Maybe there is a simple distinction, but I believe this was the point anyway.

(I think with the four point nakade white will be dead even if white moves first so pass dead in a sense, but suppose there’s a three or five or six point eye space that lives or dies depending on who moves first, take that example instead )

yebellz · September 10, 2021, 5:36pm

You dropped an arm \

It should look like this:
¯\_(ツ)_/¯

Produced by this code:
¯\\\_(ツ)\_/¯

AdamR · September 10, 2021, 6:14pm

Start with? Yes. The problem as I imagine it is that you cannot end with this, that the possible situations and exceptions snowball way beyond the scope of the original (admittedly simple) thought. That’s why we waited all the years for neural networks to finally make high-level computer play possible… Simple logical algorithms might simply not be enough.

Already you are basing the starting point of the analysis on the very information you wanted to “check”… KataGo assigns the probability of the color based on the expected continuations. Even a simple case like this would immediately be scored “wrong” (or rather differently than we want). Neither of the B groups will be considered safe. And we are still dealing with the most basic scenarios imaginable. Things can get much more convoluted.

I am not sure how to word this without sounding rude (and I really don’t mean to), but simply put - I myself would not be able to code anything remotely bulletproof without dedicating months’ worth of time (if at all). If anyone thinks it is doable, by all means please prove me wrong (I will be happy to admit my own ignorance then), the repository is right there, but I won’t believe you until I see it in action.

square.defender · September 10, 2021, 6:25pm

Look at the black group above. The border has no holes, but part of the territory is painted and part is not. In such cases 100% of the territory should be painted in one color. In seki that partial painting never happens; it makes sense nowhere. So it’s easy to fix.

The territory below has fewer problems; it’s not partially painted at least. Though these blue dots are weird. Why not just get rid of this (blue/no color) distinction?
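
The partial-paint rule above could be sketched as a post-processing step like the one below. It assumes a board given as strings of `B`, `W` and `.`, plus a `paint` dictionary mapping empty points to the color the engine assigned them; both the encoding and the function name are hypothetical. It only handles empty regions fully enclosed by one color and does not attempt to judge whether the enclosing group is itself alive.

```python
def fill_partial_paint(board, paint):
    """Return a copy of `paint` where partially painted regions are completed.

    A region qualifies only if it is a connected set of empty points whose
    bordering stones are all one color, and at least one (but not every)
    point in it is already painted in that color.
    """
    rows, cols = len(board), len(board[0])
    fixed = dict(paint)  # leave the caller's mapping untouched
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if board[r][c] != '.' or (r, c) in seen:
                continue
            region, borders, stack = [], set(), [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if board[ny][nx] == '.':
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                        else:
                            borders.add(board[ny][nx])
            if len(borders) != 1:
                continue  # open border or shared region: leave as painted
            color = next(iter(borders))
            painted = [p for p in region if paint.get(p) == color]
            if painted and len(painted) < len(region):
                for point in region:
                    fixed[point] = color
    return fixed
```

Regions with no paint at all are deliberately left alone here, since an entirely unpainted enclosed area may be the engine's way of saying the surrounding group is in trouble.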

AdamR · September 10, 2021, 7:10pm

You are right, sorry about that. It seems to have all been moved to goban I think? At least that’s my best guess from the quick look. GitHub - online-go/goban: A JavaScript library for exploring and playing the game of Go

Do we? I was absent for a while and am still not up to date. That would be awesome. That was bothering me

Feijoa · September 10, 2021, 7:45pm

If we’re up for post-processing, I thought @Vsotvep’s algorithm sounded great:

Did someone find a major flaw?

Vsotvep · September 10, 2021, 7:55pm

My ideal way would be:

Nothing is marked, the players have to mark the dead stones, or resume the game if territory is still open. Ideally open borders get marked very clearly, e.g. with a big red dot for any dame point.
When the players mark their dead stones and both accept the score, the board is scored as they accepted it.
Instead of marking their dead stones themselves, there is an auto-score button which uses KataGo + post-processing to determine the dead stones. It should be logged that this button was pressed, to see the difference between players cheating and KataGo making weird decisions.
When neither player accepts the score, and the timer runs out or both players leave, AI scores the game automatically (this is necessary due to the large number of people not accepting the score in correspondence, and a relatively large number even in live games).
If only one player accepts the score and the other player leaves, this is the score that is accepted. Score cheating can be dealt with in retrospect, and in a way the opponent should’ve stayed to make sure the correct score gets accepted.
When the players cannot agree, and keep clicking the board, or resuming the game, there should be an option to force auto-score by AI.
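
Read as a decision procedure, the list above might look something like this sketch. The `ScoringState` fields, the `resolve` function and the returned labels are all invented for illustration; they do not correspond to anything in the OGS code, and situations the list does not cover simply fall through to `"keep_waiting"`.

```python
from dataclasses import dataclass

@dataclass
class ScoringState:
    """Hypothetical snapshot of the stone removal phase."""
    black_accepted: bool
    white_accepted: bool
    black_present: bool
    white_present: bool
    timer_expired: bool
    force_autoscore_requested: bool = False

def resolve(state: ScoringState) -> str:
    # Both players accepted: score the board exactly as they marked it.
    if state.black_accepted and state.white_accepted:
        return "score_as_accepted"

    # Players cannot agree and one of them asks for the AI to decide.
    if state.force_autoscore_requested:
        return "ai_autoscore"

    # Exactly one player accepted and the other one left.
    if state.black_accepted and not state.white_present:
        return "score_as_accepted"
    if state.white_accepted and not state.black_present:
        return "score_as_accepted"

    # Nobody accepted, and the timer ran out or both players left.
    nobody_accepted = not state.black_accepted and not state.white_accepted
    both_left = not state.black_present and not state.white_present
    if nobody_accepted and (state.timer_expired or both_left):
        return "ai_autoscore"

    # Anything else (including cases the list does not spell out).
    return "keep_waiting"
```

Keeping the rules in one function like this would also make the logging suggested above straightforward, since every outcome passes through a single place.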

Conrad_Melville · September 12, 2021, 5:31pm

Another example that I witnessed live: Pretorian2016 vs. Kazobie. The system marked the L7 stone dead at the instant of scoring. Had it been White’s move, he could have rejected the score, restarted, and protected the cut, winning the game.

As I have said, the autoscore should not be violating site rules by giving game-play help to the players before the game is formally over.

square.defender · September 12, 2021, 6:06pm

Yes, referees (judges) exist in real tournaments. The AI score is an artificial moderator. The problem is not that it exists. The problem is that it gives advice BEFORE anyone has called for help.

Groin · September 14, 2021, 10:32am

Full proof we need to talk if we go for autoscoring some day. What a mess.