Score Estimate vs Result

SouthernGoPlayer · December 17, 2023, 6:37am

At the end of this game (following a large accident) the final score is awarded as W+0.5.

Despite this the score estimate and the AI review both say that the result will be Black by around half a point. Can anybody explain this difference?

I thought I had traced this back to the three remaining dame points (B gets two and W one), combined with the score estimate and AI using Chinese counting with Japanese komi; with 7 komi the result would then be a draw. My understanding, however, is that playing out the snapback or playing the dame should not change the score under Japanese rules. This doesn’t seem correct, however, as playing inside white’s territory reduces the score estimate by one point.

Alternatively maybe something to do with White’s internal point inside the seki?

teapoweredrobot · December 17, 2023, 9:02am

It’s this. That point in seki should have been marked as dame since it’s Japanese rules. I’m afraid all I can do at this point is annul the game for having the wrong result.

Seki is hard for computers and humans to get right!

SouthernGoPlayer · December 17, 2023, 9:30am

Does this imply there is a flaw with the actual scoring system on OGS? It has clearly identified this corner as a seki (my S1 stones are not marked as dead) so should be able to make all territory in both groups dame at that point. This ought to be correctable on that basis, I would think.

SouthernGoPlayer · December 17, 2023, 9:33am

Annulled, or even awarded to black, both seem reasonable results to me. Black had the game well in the bag apart from falling asleep in any case.

teapoweredrobot · December 17, 2023, 9:48am

I’m not an expert on the computational bits, but I guess it’s one thing to work out whether stones are alive, and a different thing to work out that stones determined to be alive are surrounding points that are not territory.

I think it’s a known and hard thing to work out every edge case automatically. That’s why the system is that there is an automatic suggestion of dead stones and territory but the purpose of the scoring phase is to allow the player to agree what’s dead/territory etc.
I’m not sure how to word this in the most customer-service-appropriate way, but if the players don’t check and just click “agree” then what can be done?!

Maybe the problem is that the automatic system is too reliable in most cases and so lulls players into a false sense of security that everything will be done automatically and correctly in all cases!

teapoweredrobot · December 17, 2023, 9:51am

I might add that my personal opinion is that there should be no automation of the scoring phase whatsoever. The players should just always mark stones and territory themselves so that there is no doubt about what players thought was dead or territory or whatever. But I know this is not popular in this age of computers doing things for us as far as possible!

GreenAsJade · December 17, 2023, 9:53am

Although wordy, this does try to be clear that the estimate is just an estimate, and it’s up to the players to agree on the score.

SouthernGoPlayer · December 17, 2023, 10:12am

Maybe I’m slightly unfamiliar with the scoring phase interface. I know it’s possible to change whether a group is agreed alive or dead. I didn’t know how to mark a seki as zero points, however. I can’t seem to just mark that point as dame in the interface now, and so the only choice seems to be to mark White’s seki group as dead (which also gives the wrong score).

The scoring needs to mark every stone as one of alive (two eyes), alive (large eye space), dead, or alive in seki. This seems to me to be the complicated part. Since it’s got that right automatically, it should be able to mark all the internal points of a seki group as dame, just the same way it marks the internal points of an alive group as territory.

I think the flaw is just that it presently has only one way of finding dame points (they contact both sides’ living groups) and doesn’t implement this second way (they are inside this seki group). But the components of this must already be implemented for scoring to work in general.
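To make the two ways of finding dame concrete, here is a minimal sketch in Python. It is not the OGS implementation: the function names, the board encoding (`B`, `W`, `.`) and the `seki_stones` parameter are all assumptions made for illustration, and it takes for granted that some earlier step has already decided which stones are alive in seki. The board is a toy position, not the one from my game.

```python
from collections import deque

def empty_regions(board):
    """Yield (points, bordering_stones) for each connected empty region."""
    rows, cols = len(board), len(board[0])
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if board[r][c] != "." or (r, c) in seen:
                continue
            region, borders, queue = [], set(), deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if board[ny][nx] == ".":
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                    else:
                        borders.add((ny, nx))  # remember the stone itself
            yield region, borders

def japanese_territory(board, seki_stones=frozenset()):
    """Count territory; seki_stones is a hypothetical input from seki detection."""
    score = {"B": 0, "W": 0}
    for region, border_stones in empty_regions(board):
        colours = {board[y][x] for y, x in border_stones}
        if len(colours) != 1:
            continue  # first way: region touches both colours, so it is dame
        if any(stone in seki_stones for stone in border_stones):
            continue  # second way: region is enclosed by a group alive in seki
        score[colours.pop()] += len(region)
    return score

# Toy 2x5 position: White's corner point at (0, 4) stands in for T1.
board = [".B.W.",
         "BB.WW"]
print(japanese_territory(board))
print(japanese_territory(board, seki_stones={(0, 3), (1, 3), (1, 4)}))
```

With no seki information the white corner point is counted for White; once the white group is passed in as alive in seki, that point becomes dame and White’s count drops by one, which is the size of the discrepancy in this game.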

teapoweredrobot · December 17, 2023, 10:19am

You should just be able to click on points to toggle between black, white, dame while in the scoring phase. I haven’t done it in a while myself but if you click on territory, it toggles all contiguous points and if you shift+click then you can mark one point at a time if necessary.

SouthernGoPlayer · December 17, 2023, 10:52am

I read that in the interface documentation just now as well.

The score-estimator documentation here seems to imply leaving this as territory is left up to the rule set and T1 would have to be marked as dame by the rule set (or the players). I couldn’t have explained the rule here at the time this game finished however (I could now).

The intention seems to be not to mark these as dame for whatever reason, but a correct implementation of Japanese rules will need to do that for such seki points in any case.

gennan · December 17, 2023, 10:58am

I think the problem is that automatic seki detection is not as simple as it may seem, while it is needed to automatically exclude seki points from scoring under Japanese rules. So it is left to the players, assuming they know the potential pitfalls of the Japanese rules (as it would be when scoring an OTB game under Japanese rules).

Edit: KGS seems to have pretty good seki detection to correctly score games under Japanese rules (under the conditions that all neutral points are filled and dead stones are marked correctly by the players), but the creator of KGS (wms) didn’t seem to be 100% confident that his algorithm is always correct. See his 2005 challenge to come up with a seki that would be incorrectly scored on KGS: wms's Dame-Seki Challenge at Sensei's Library

If I understand correctly, in 2013 someone came up with a position that broke wms’s algorithm (winning a KGS+ membership with it):

I’m not claiming to understand why this position won the prize

hexahedron · December 17, 2023, 5:13pm

Yes, KGS had extremely high-quality seki detection, as well as automatically not counting points for “false” eyes and certain other kinds of necessary protections, so long as players simply marked stones alive/dead correctly (and without the help of modern AI too!).

Unfortunately, I don’t know of an open source implementation of seki detection that is comparable in quality to KGS. Figuring out how to implement such an algorithm and releasing it under a sufficiently permissive open source license would probably be of benefit to the community. I’ve sorta wanted to do this in the past, but I never got around to actually studying how such an algorithm would work.

SouthernGoPlayer · December 17, 2023, 7:57pm

The difficulty of identifying every seki seems a little bit beside the point here.

The final board state, which was scored incorrectly, identified these two groups in the bottom right as alive in seki. It seems to be an intentional part of the OGS score estimator (and probably the result scoring implementation as well) that it doesn’t mark the eyes of these groups as dame, as the algorithm avoids specifics of the rule set being applied during this phase.

But if it’s going to correctly score the Japanese rule set, it needs to mark as dame any territory of either group alive in seki. Often this will not change the score, as in many seki both groups have the same number of eye/false eye points of the same size.

The risk of making this change (under Japanese/Korean rules) seems to be that other groups are misidentified as seki and not made territory, and that this presently occurs unnoticed. The score estimator implementation is purportedly looking for groups with the same number of liberties. But if the seki detection doesn’t over-fire, then correctly scoring the ones it does identify would seem to be an improvement, as a lot of players are unfamiliar with the fine details of the counting rules.

gennan · December 17, 2023, 8:32pm

I’m confused. How can a scoring algorithm for Japanese/Korean rules dame any territory of groups in a seki, when it can’t detect seki?

Or do you mean that the OGS scoring algorithm for Japanese/Korean rules needs to have seki detection, but it doesn’t have to be perfect (or even as good as KGS’s seki detection)?

Is it? I don’t know how the (in-game) score estimator works, but it’s quite crude and inaccurate (by design) in its life and death assessments, so its estimates can be off by dozens of points.

There is also the post-game AI review that should give an accurate score assessment (because it’s run by a strong AI that is quite capable of scoring under Japanese rules, including dame in seki).

There is also the score that the players agreed on when finishing the game. This is the score that counts for the game result. If neither player made a mistake while finishing the game, the agreed score should closely match the score assessment according to the post-game AI review.

teapoweredrobot · December 17, 2023, 8:34pm

Yes

I think the answer to this is no

SouthernGoPlayer · December 17, 2023, 8:43pm

In the final board position in my game the two groups in the bottom right are both alive (we both accepted the default score). The scoring must have detected the seki to reach this conclusion, so it should be able to mark T1 as dame when it draws this conclusion.

I was assuming it’s the same as the in-game score estimator, but every position I’ve used the estimate score button for in this game gave the correct score. The only incorrect score was the one used to assess the result (which gets T1 wrong).

SouthernGoPlayer · December 17, 2023, 8:45pm

Is there another way for the bottom right groups to be both alive? Neither of us adjusted the default scoring at the end.

gennan · December 17, 2023, 8:51pm

I don’t think it did. The algorithm that premarks dead stones during scoring is not very sophisticated, so as not to give players illegal hints about the status of groups before the game finishes. I think it basically eyeballs the position and makes some educated guesses of what might be dead or alive, think 20k level tsumego. It may not even know the difference between real eyes and false eyes.

It may guess correctly in many cases, but both players should check the premarkings carefully. You can’t really trust it.

Personally, I’d be in favour of not giving any premarkings at all and leaving the marking of dead stones (and dame in seki under Japanese/Korean rules) fully to the players (like KGS, except lacking seki detection).

Another option may be to score games fully automatically with a strong AI, without involving the players in the process. (like I assume FlyOrDie does in recent years)

As it is now, I feel OGS’s premarking assistance is somewhere in between those good solutions, often leading to confusion.

teapoweredrobot · December 17, 2023, 9:00pm

It’s not that there is another way for them to be alive, it’s that

As I’ve said, I’m no expert but it seems to me that there are two totally distinct processes:

1. Some crude playing out to see if stones are capturable or not
2. Working out what’s black territory, white territory or dame based on the stone status (is it alive or is it dead) from step 1.

i.e. there’s no distinction between how or why stones are alive or dead, they are just marked as one or the other and the next step applies.
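As a rough sketch of what that second process might look like (an illustration only, not the actual OGS code; the function name, board encoding and `dead` parameter are assumptions), here is some Python that takes the alive/dead verdict as given and derives territory from it. Note that nothing in it knows why a group is alive.

```python
def score_from_status(board, dead):
    """Step 2 only: derive territory from step 1's alive/dead verdict.

    board: list of equal-length strings using 'B', 'W' and '.'.
    dead:  set of (row, col) stones that step 1 judged dead.
    """
    rows, cols = len(board), len(board[0])
    # Dead stones are lifted off the board and counted as prisoners.
    grid = [["." if (r, c) in dead else board[r][c] for c in range(cols)]
            for r in range(rows)]
    score = {"B": 0, "W": 0}
    for r, c in dead:
        score["W" if board[r][c] == "B" else "B"] += 1

    seen = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "." or (r, c) in seen:
                continue
            # Flood-fill one empty region, noting which colours border it.
            stack, size, colours = [(r, c)], 0, set()
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if grid[ny][nx] != ".":
                            colours.add(grid[ny][nx])
                        elif (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
            # Bordered by one colour only: territory. Otherwise: dame.
            if len(colours) == 1:
                score[colours.pop()] += size
    return score

# Toy position: the white stone at (0, 0) is judged dead by step 1.
board = ["WB.W.",
         "BB.WW"]
print(score_from_status(board, dead={(0, 0)}))
```

Because the only input is a set of dead stones, an eye belonging to a group that lives in seki is treated exactly like an eye of a two-eyed group and gets counted as territory unless a player changes it by hand.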

It’s not very clever which is why players must check.

Note that I’m not saying this is how it should be or that it is ideal! Just how I see it being now.

SouthernGoPlayer · December 17, 2023, 9:06pm

Are you able to point me to the implementation of the actual scoring on GitHub? I’ve been looking at the score-estimator project but don’t know if it’s used for this case. It does try to detect seki.

If the dead stone scoring is simplistic, it’s highly successful as far as I can see as well. I won a game where it detected a group’s sente life/death at counting, which we seemed to have agreed was alive as neither played another stone there. I changed the assessment to alive without fully realizing what this meant. But if my opponent had gone back to the game and played there, it was dead.