Improving OGS' scoring system

gennan · May 4, 2024, 6:28pm

Why would you want to mark dead stones if the autoscore can do it correctly all the time?

So far, the only objection I have heard to having all games decided by a near-perfect autoscore after 2 passes is that players might have missed an open border and would be deprived of a means to fix that by resumption.

Groin · May 4, 2024, 6:30pm

I didn’t answer the poll because I think there should be 2 different processes: one for beginners who are still learning to close their boundaries (with an arbitrary limit, let’s say 20k?) and one for other players.

square.defender · May 4, 2024, 6:30pm

Not possible. “Correct” makes no sense when players pass too early.

gennan · May 4, 2024, 6:31pm

Do you mean that the system should provide additional information to beginners, on top of group status, territories and score?

Groin · May 4, 2024, 6:32pm

The system should help at the lowest levels, and not beyond that.

It’s like a face-to-face meeting, where we help beginners finish the game when necessary.

gennan · May 4, 2024, 6:32pm

I think it is possible. The correct score may not be what those players expected, but that doesn’t make the score incorrect.

Also, what would be a better solution for such cases? Annul the game? Disable the “pass” button?

gennan · May 4, 2024, 6:34pm

Which additional positional information should the scoring system provide to beginners during the marking phase?

The system can’t just summon a human teacher whenever 2 beginners struggle to finish their game.

Groin · May 4, 2024, 6:43pm

Well, there are technical limits for sure. It seems difficult to mimic the help a human could give, like saying “I’m not sure you closed everywhere” and judging (for teaching purposes) when to give a few more hints. The procedure needs to balance a reasonable involvement of the players against their lack of experience.

The reliable but heavyweight way is to encourage the beginner to call a human for help when the result makes no sense, with enough flexibility that they don’t feel penalized by an AI.

Groin · May 4, 2024, 6:48pm

I could imagine a “pause and call for help” button as an alternative to “confirm the SE”.
This would put the game on pause until help arrives. Only for 20k to 25k.
For higher levels, I’d be satisfied with a scoring procedure like the one on KGS.

gennan · May 4, 2024, 6:53pm

Providing such a button seems to be a separate question from the poll question of which positional information the scoring system should show to the players during the marking phase.

I think you’d need a pretty smart AI to automate assistance to beginners in bite-sized chunks. I think that in the foreseeable future, only human teachers are capable of doing that, and I suspect that there are not enough of those around all the time on OGS to assist all beginners who might click that button and expect to be helped within a couple of minutes.

Groin · May 4, 2024, 6:57pm

Well, 90% of OGS players could help beginners if OGS rethought the communication system between the two worlds (for example, a new global alert system, and then the possibility to step into the beginners’ game if they agree and ask).

Feijoa · May 4, 2024, 7:00pm

My estimate is that the current autoscore works 99% of the time assuming closed borders, and it should be easy to get it to 99.9% by doing anything sensible with both black to play and white to play.

But 99.99% seems tough. Don’t you get into weird scenarios at that point, with Japanese rules for example, that KataGo doesn’t even understand? And when there are multiple invasions possible, a group might be surprisingly dead no matter who plays first, which would be hard to code for.

Groin · May 4, 2024, 7:05pm

You don’t even need to go that far. I remember a casual game where both 6d players misread a life-and-death situation just because it looked like a usual shape, when in fact one can fail on something that looks simple. Nothing as weird as the remote corners of the Japanese rules.

gennan · May 4, 2024, 7:06pm

Well, I don’t think a proper autoscore algorithm should work by assuming a resumption of the game. An unclosed border should result in a neutral area. Those are just the rules of the game. It would be incorrect scoring if an unclosed area scored points.

I think the only thing that an autoscore algorithm should do is decide which groups are dead.
From there I think it should just be floodfilling, except for eyes in seki under Japanese rules.

If it is impossible to create a highly reliable group status adjudicating algorithm, I’d vote for leaving all markings to the players.
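The floodfilling step described above can be sketched roughly as follows. This is only an illustration, not OGS's actual code: the board encoding (`'X'` for Black, `'O'` for White, `'.'` for empty), the function name `score_territory` and the `dead` parameter are all assumptions made for the example. It removes the stones judged dead, then flood-fills each empty region and assigns it to a colour only if the region touches stones of that colour alone. Regions touching both colours (for example behind an unclosed border) come out neutral. Prisoners, komi and eyes in seki under Japanese rules are deliberately left out.

```python
from collections import deque


def score_territory(board, dead=frozenset()):
    """Count territory by flood fill.

    board: list of equal-length strings using 'X', 'O' and '.' (assumed encoding).
    dead:  set of (row, col) stones judged dead; they are removed before filling.
    Returns a dict with the empty points owned by 'X', by 'O', and 'neutral'.
    """
    rows, cols = len(board), len(board[0])
    grid = [list(row) for row in board]
    for r, c in dead:
        grid[r][c] = '.'

    seen = set()
    result = {'X': 0, 'O': 0, 'neutral': 0}
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != '.' or (r, c) in seen:
                continue
            # Breadth-first fill of one connected empty region.
            size, borders = 0, set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if grid[ny][nx] == '.':
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                    else:
                        borders.add(grid[ny][nx])
            # A region is territory only if it touches exactly one colour.
            owner = borders.pop() if len(borders) == 1 else 'neutral'
            result[owner] += size
    return result


closed = ["..XO."] * 5
open_border = ["..X.."] + ["..XO."] * 4
print(score_territory(closed))       # {'X': 10, 'O': 5, 'neutral': 0}
print(score_territory(open_border))  # {'X': 10, 'O': 0, 'neutral': 6}
```

In the second board White's wall has a gap at the top, so the whole area on the right touches both colours and scores nothing, which is the "unclosed border gives a neutral area" behaviour.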

Feijoa · May 4, 2024, 7:12pm

Right, but I’m saying deciding which groups are dead perfectly 10,000 times in a row is really hard. There can be a coincidence of two very rare situations in that many games, like maybe bent four plus unremovable ko threats? Or just two unseen weaknesses.

gennan · May 4, 2024, 7:20pm

I think KataGo is capable of adjudicating bent four plus unremovable ko threats, because that is just a situation where it doesn’t matter who gets to play first. If the players pass with that situation on the board, it should be scored as a seki.

I think the harder cases are games that aren’t properly finished, and the status of some group depends on who gets to play first. In that case the algorithm needs to be clever and do some topological boundary analysis in combination with consulting KataGo about group status.

But I think that is more or less what you created, right?
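One way to picture the "depends on who gets to play first" check is the sketch below. It is not Feijoa's algorithm and it does not call KataGo. It assumes that two ownership estimates per point are already available from some engine, one with Black to move and one with White to move, as numbers in [-1, 1] where positive means Black (an assumed convention). The function name `classify_points` and the `threshold` value are made up for illustration.

```python
def classify_points(own_black_first, own_white_first, threshold=0.6):
    """Compare two ownership maps and flag points whose status depends on the move.

    Both arguments map (row, col) -> ownership in [-1, 1], positive = Black
    (assumed convention). A point is settled only if both estimates agree
    on the same owner with at least `threshold` confidence.
    """
    status = {}
    for point, b in own_black_first.items():
        w = own_white_first[point]
        if b >= threshold and w >= threshold:
            status[point] = 'black'
        elif b <= -threshold and w <= -threshold:
            status[point] = 'white'
        else:
            status[point] = 'unsettled'
    return status


black_first = {(0, 0): 0.95, (0, 1): 0.90, (5, 5): -0.97}
white_first = {(0, 0): 0.93, (0, 1): -0.80, (5, 5): -0.99}
print(classify_points(black_first, white_first))
# {(0, 0): 'black', (0, 1): 'unsettled', (5, 5): 'white'}
```

Points that come out as `'unsettled'` are the ones where the game was arguably not finished, so an autoscore would have to either leave them to the players or apply some further boundary analysis.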

gennan · May 4, 2024, 7:28pm

Did you test your algorithm on a large random sample of OGS games, compare its results with the actual game results and verify it did better than the players in most cases where it differs?

Feijoa · May 4, 2024, 7:31pm

Not at all, I’m totally guessing about the numbers.

gennan · May 4, 2024, 7:41pm

I don’t know how many games are played daily on OGS, but I assume it is some 10s of 1000s?

I suppose it would be undesirable to have dozens of daily reports from users having a valid complaint about the autoscore algorithm incorrectly adjudicating group status, if it cannot be corrected by the players.

That’s why I stated that 99.99% correctness requirement before moving to fully automatic scoring. I think 99.9% won’t be enough for that.
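The arithmetic behind that threshold can be made explicit with a small sketch. The figure of 20,000 games per day is purely an assumption standing in for "some 10s of 1000s", and it is treated as if every game ended by scoring, so the results are an upper bound rather than a measurement.

```python
def expected_misjudged_games(games_per_day, accuracy):
    """Expected number of games per day in which the autoscore gets it wrong."""
    return games_per_day * (1.0 - accuracy)


GAMES_PER_DAY = 20_000  # assumed value, not an OGS statistic

for accuracy in (0.99, 0.999, 0.9999):
    wrong = expected_misjudged_games(GAMES_PER_DAY, accuracy)
    print(f"{accuracy:.2%} correct -> about {wrong:.0f} wrong results per day")
```

Under that assumption, 99.9% correctness still leaves about 20 wrongly scored games per day, while 99.99% brings it down to about 2.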

gennan · May 4, 2024, 7:59pm

I took the liberty of taking this discussion out of the original topic, because it doesn’t have much to do with that anymore.

I also considered splitting off a topic about assistance for beginners in finishing their games, but there aren’t that many posts about it yet.