Score Estimate vs Result

I have never checked what’s on GitHub, but it’s only the source code for the front-end. The backend is closed source.

I don’t know to which degree the premarking algorithm during scoring shares source code with the in-game score estimator, nor do I know which parts of their source code live in the back-end or the front-end.

Note that this case also illustrates a problem with premarking stones during scoring: the premarking basically gave a hint, which is technically illegal outside assistance. That’s why I think it’s better to drop it altogether. It gives hints (and those hints can even be wrong).

1 Like

I did find this after a quick look into the topic. Even if this is a bit much, GNU Go (which is open source) purportedly has an implementation which might be suitable for scoring purposes.

I agree that identifying the life and death status of the groups is the issue with automatic scoring by default. But once that is done, getting the dame in seki right, without changing the life and death hints already given, seems more proper.

I am pretty sure that autoscore just uses the strong AI score estimator available to spectators, marking all the stones dead or alive based on KataGo’s ownership estimate. So it’s actually extremely smart and surely understands on some deep level that those stones are in seki. However, that information is not available to the front end code, which is just told that the stones are alive and makes no attempt to distinguish seki from normal unfilled dame.
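As a rough illustration of what "marking all the stones dead or alive based on an ownership estimate" could look like, here is a minimal sketch. It is not OGS's actual code. The function name, the board encoding, the sign convention (+1 = Black's point, -1 = White's point) and the 0.5 threshold are all assumptions made for the example.

```python
def mark_dead_from_ownership(board, ownership, threshold=0.5):
    """Return the set of (row, col) stones to mark dead.

    board: list of equal-length strings, 'X' = Black, 'O' = White, '.' = empty.
    ownership: nested list of floats in [-1, 1] with the same shape.
               Assumed convention: +1 = surely Black's, -1 = surely White's.
    threshold: hypothetical cut-off for calling a point "owned".
    """
    dead = set()
    for r, row in enumerate(board):
        for c, stone in enumerate(row):
            if stone == '.':
                continue
            own = ownership[r][c]
            # A stone is marked dead when its point is predicted
            # to end up belonging to the opponent.
            if stone == 'X' and own <= -threshold:
                dead.add((r, c))
            elif stone == 'O' and own >= threshold:
                dead.add((r, c))
    return dead


board = ["XO.",
         "XXO"]
ownership = [[0.9, 0.8, 0.7],
             [0.9, 0.9, -0.1]]
dead = mark_dead_from_ownership(board, ownership)
```

Note that the output is only a dead/alive flag per stone. A stone in seki and a stone in an ordinary living group both come out as "alive", which matches the point above that the seki information never reaches the code doing the final count.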

In fact, the Japanese rules don’t even distinguish seki from unfilled dame, right? So if we tried to follow the rules more precisely, there would not be any territory in the upper-right, either:

image

Does KGS require filling dame before scoring?

I agree that it would be better to have seki detection during scoring on OGS. That way, players would only need to mark dead stones and there would be no need to unmark dame in seki under Japanese rules.

As far as I remember, it usually works well without filling all dame. It even detects many teire points, so even when you pass before filling all the dame to force an unwilling opponent to defend a teire, chances are it still doesn’t mark those teire as territory (and this algorithm existed long before AI).

Also, from my previous post about KGS’s seki detection, it looks like it even understands hanezeki.

3 Likes

I spent the last week coding and trying my hand at such an algorithm. :slight_smile:

6 Likes

Had another such seki, where the OGS scoring mis-scored the seki. Unfortunately I forgot about this and didn’t correct it. Fortunately the result doesn’t change due to this one.

Your example program got the seki correct for this case, but stuffed up a bunch of other life and death cases so the score was way off.

2 Likes

What do you mean “stuffed up a bunch of other life and death cases”?

(Edit: did you forget to mark dead stones? You can click on them to mark them.

The algorithm I implemented explicitly does NOT attempt to determine life and death. It requires that someone mark the dead stones. This makes it much more flexible - for example it could be used for Go software or servers that aren’t backed by AI. The markings could also be supplied by AI if you wanted to do that, but it doesn’t have to. The demo page is testing just the scoring logic, and it’s running entirely locally within your browser so it doesn’t have access to any AI.)

5 Likes

In two groups there were some clearly dead opponent stones contained in a large eye. The scoring treated these two areas as undecided. Not sure if you collect the SGF files uploaded to the site, but otherwise you should be able to get it from my OGS game link (if you want to reproduce it). OGS automatic scoring only got the T19 point in the seki wrong.

Reading your fuller post, I can see what the expectations should be regarding the scoring of that system. I had assumed it would determine seki via playouts, which are also used to decide life and death. I guess you’re using a different approach for seki detection, however.

Yep, you need to click on the stones that are dead to mark that they are dead. The demo page is NOT supposed to be testing a full automated scoring. It’s testing only the part: “Given correct markings for all the living/dead stones, detect sekis and compute the correct score”. Distinguishing which living groups are seki or not if they’re known already to be alive turns out to be a vastly easier problem than determining in general what groups are alive or not in the first place, and can be solved for 99+% of cases with a deterministic algorithm that traces groups and looks at the properties of their eyespaces.

If you wanted to turn it into a full automated scoring, you would take something like OGS’s marking of dead stones (or better yet, Feijoa’s algorithm which does a much better job respecting the players’ borders when newer players omit some defenses), and then use this algorithm for the final scoring given those dead stones rather than OGS’s current final scoring.

4 Likes

Interesting. I’m going to summarize the algorithm as filling the free points inside and around a group with fewer than two eyes. When it hits a live stone of the other color, this makes it seki.
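To make that summary concrete, here is a small sketch of the rule as I read it. It is not the actual implementation being discussed. The function name `looks_like_seki`, the board encoding and the idea of treating marked-dead stones as free points are assumptions for illustration. It also only follows a single chain of stones, whereas a real group can consist of several chains sharing eyes.

```python
from collections import deque


def _neighbors(r, c, h, w):
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            yield rr, cc


def looks_like_seki(board, start, dead=frozenset()):
    """Apply the summarized rule to the live chain containing `start`.

    board: list of equal-length strings, 'X', 'O' or '.'.
    start: (row, col) of a live stone. dead: set of marked-dead stones.
    """
    h, w = len(board), len(board[0])
    colour = board[start[0]][start[1]]

    def is_free(p):
        # Empty points and marked-dead stones both count as free.
        return board[p[0]][p[1]] == '.' or p in dead

    # 1. Trace the chain and collect the free points next to it.
    chain, frontier, liberties = {start}, deque([start]), set()
    while frontier:
        p = frontier.popleft()
        for q in _neighbors(*p, h, w):
            if is_free(q):
                liberties.add(q)
            elif board[q[0]][q[1]] == colour and q not in chain:
                chain.add(q)
                frontier.append(q)

    # 2. Fill each free region touching the chain.
    seen, eyes, touches_enemy = set(), 0, False
    for lib in liberties:
        if lib in seen:
            continue
        seen.add(lib)
        frontier, region_is_eye = deque([lib]), True
        while frontier:
            p = frontier.popleft()
            for q in _neighbors(*p, h, w):
                if is_free(q):
                    if q not in seen:
                        seen.add(q)
                        frontier.append(q)
                elif board[q[0]][q[1]] != colour:
                    region_is_eye = False  # the fill hit a live enemy stone
        if region_is_eye:
            eyes += 1
        else:
            touches_enemy = True

    # 3. Fewer than two eyes plus a shared free region: treat as seki.
    return eyes < 2 and touches_enemy


one_eye_each = [".X.O.",
                "XXOOO"]   # one eye each, one shared liberty
two_eyes = [".X.XO.",
            "XXXXOO"]
```

On `one_eye_each` the Black chain has one eye and shares the point between the groups with White, so the rule reports seki. On `two_eyes` the Black chain has two separate eyes and is reported as independently alive. Because the rule only looks at shared free points, it cannot tell a true seki from ordinary unfilled dame next to a one-eyed chain.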

I did find one problem case. Take the example from Sensei’s Library called ‘Seki with partly filled eye space.’

The scoring marks this correctly when it is a seki, however I can also mark the internal three stone group as dead. If you want a more correct example the three stone group could be expanded to four stones in which case Black’s group is alive with points. I could not however get this group to be marked as territory, regardless of changing the single eye and marking the internal group as dead. Splitting this into two eyes gets this black group to score territory however, even if these eyes are filled. This will be down to the Black group’s external liberties which eventually reach the surrounding White group.

Maybe there is an adjustment that determines separately, for the internal and the external liberties of the Black group, which of them are scored.

Edit: Scratch that suggestion, I think that will go back to awarding points for all empty eyes in seki.

2 Likes

I think this can be fixed in the scoring with an additional special case. As well as a group with two eyes being alive without seki we can also score territory for a group which is alive because it wins an internal capturing race resulting in a living eye shape.

To detect this you identify all the one-eye groups where that eye contains a dead group having a living shape. This could probably be identified from the eye shape rather than the shape of any contained group. The scoring seems to work if the dead group in the eye is broken up in some way. Unfortunately this involves some complexity to identify if the shape is dead or alive in the single eye, though ultimately it’s still relying on the group in the eye being correctly marked as dead or alive to begin with.

There might also be a simpler rule, also relying on life and death of groups being correct. I think the eye scores territory if it contains more than one dead stone. Of course if some shapes are marked incorrectly as dead then this will get the score wrong, but scoring is already relying on group life markings being correct to begin with. I have not found any counter cases where two or more dead stones are captured inside a seki without disturbing the seki.
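Written out, the simpler rule is just a count. This is only a sketch of the proposal above, with a made-up function name and a hypothetical `min_dead` parameter so the threshold is explicit:

```python
def eye_scores_territory(eye_points, dead_stones, min_dead=2):
    """Heuristic: an eye holding two or more marked-dead stones scores.

    eye_points: iterable of (row, col) points forming one enclosed eye.
    dead_stones: iterable of (row, col) stones marked dead.
    All names here are illustrative assumptions.
    """
    dead_inside = set(eye_points) & set(dead_stones)
    return len(dead_inside) >= min_dead


eye = [(0, 0), (0, 1), (0, 2)]
```

With `eye` as above, marking `(0, 0)` and `(0, 1)` dead makes the eye count, while a single dead stone inside it does not. The rule trusts the markings completely, so a wrongly marked group changes the score directly.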

2 Likes

Thanks for finding this case! I pushed a change to Territory Scoring Test that should fix it.

It’s very much cases like the one that you found that I’m concerned with getting to behave intuitively as much as possible, rather than the more bizarre cases that people would consider “rules beasts” such as all the weird examples in the Japanese official rules commentary. I find the “weird” cases are often easy to handle algorithmically precisely because we can assume life and death is marked for us. It’s the ones where the position is incomplete, or not all the dame are filled, and/or where one has to make a judgment call as to how much work the algorithm tries to do in those cases versus leaving it to the players, that it becomes much harder.

Anyways, you can see the new behavior here in the screenshot:

The case you found with a straight 6 being filled with a straight 4 marked dead (also the upper right corner in the screenshot), is an interesting one because black does actually owe a move! Two moves, even. Unless black actually captures the “dead” stones before the dame are filled, the position will become a seki, so black’s final territory should be two points smaller than it would if we simply just counted the region as territory. But if it’s not played out, and the stones are marked dead, given the choice between scoring it as territory and not scoring it, scoring it is clearly the more intuitive choice, and it would be up to white to fill the dame before the end of the game to force black to make the 2 additional moves.

There are cases where two stones can be captured without disturbing a seki and where the player has no ability to fill dame to force the opponent to make the capture, and if you relax the requirement that all the fillable dame are filled you can construct cases where more stones can be captured without disturbing a seki (but in the ones I can find, filling the dame will force things along and/or the players actually want to continue play rather than stop).

I kind of also want to make the scoring algorithm “AI-resistant”, in the sense that if there are sekis where there are “dead” stones that are freely capturable, then even though under Japanese rules technically those stones should be marked as alive (Japanese rules to my understanding require you to physically capture such stones in sekis before ending play to gain those points), in practice I would expect many sources of automated marking such as from AIs would mark them as dead in the expectation that they will be captured even if the players don’t actually do so. This means that it’s desirable for the scoring algorithm to avoid counting territory in those cases of technically-wrong marks.

Due to general nervousness about such cases, for now I adjusted it to only count territory in these cases when it thinks the capture of the marked dead stones would leave a living shape.

Let me know if you find any more weird cases, it’s really helpful.

4 Likes

I couldn’t find any more problem cases. I agree about the two point loss, but the onus is on the player to fill the dame when the opponent loses points, so awarding the 6+ points of territory seems fully correct.

I was wondering however if even a dead eye shape should be awarded territory when it’s marked as dead. This appears to be what the players agreed with the life and death status, even if they incorrectly concluded that the group in the eye could be captured. On the other hand the AI or the players should correctly mark a seki as alive in those cases when the points should not be awarded.

I’m also interested in the cases when 2/3 stones can be captured and a seki remains, as I don’t know of any. Was thinking something like a thousand year ko, but this should be resolved to seki before counting anyway.

There are also examples like this,

but this appears to be a problem during play, not in counting itself.

It’s my own expectation that any AI would never mark (say) 3 stones in the only eye as dead just because of external liberties. It certainly wouldn’t move to capture them during a game, as this would be a massive flaw in its play, and it’s the same algorithms marking groups as calculating out continuations.

I’d agree with this.
But there may be one minor issue with the recently added special OGS anti-stalling rule that a player can (under specific circumstances) force the game to end and have it autoscored by passing 3 times in a row.

In this case of the upper right situation, when white tries to force black to capture those 4 stones and black passes 3 times in a row to trigger the anti-stalling rule, I wonder how the autoscore would evaluate black’s territory.

1 Like

If my suggestion for when triggering anti-stalling is acceptable were used, this would be reportable.