Use AI scoring to eliminate score cheating

pwsiegel · December 27, 2023, 9:27pm

“Score cheating” is a form of trolling in which one player intentionally mismarks the status of one or more stones during the scoring phase in order to trick the other player into accepting a loss. See here for a recent example: link.

The victim has no recourse other than to call a moderator, but the moderating team is often not staffed to respond to these issues in real time, so in practice all they can do is take corrective action after the fact.

This post is a proposal for preventive measures, with the goal of mitigating the threat of score cheating and reducing the burden on the moderators.

During game setup, add a checkbox called “Use AI scoring”, or something similar. Add an indicator to the game list so that players know whether or not the setting is being used before accepting a game.
When the game ends, players are presented with the same view as in the status quo: the game board with the results of autoscoring presented, as well as the options to accept the score or abort and continue the game. Crucially, they are NOT given the opportunity to change how any groups or territory are marked.
As in the status quo, a timer runs down during the scoring phase. If the timer runs out before either both players have accepted the autoscoring or one player has resumed the game, autoscoring is used by default.
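The three steps above can be sketched as a small state machine. This is only an illustration of the proposed flow, not OGS code: the class, method names, and score strings are all hypothetical. Note that there is deliberately no method for editing stone or territory marks.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Outcome(Enum):
    PENDING = auto()   # scoring phase still running
    SCORED = auto()    # game ends with the autoscore result
    RESUMED = auto()   # a player chose to continue the game


@dataclass
class ScoringPhase:
    """Hypothetical model of the proposed 'Use AI scoring' phase."""
    auto_score: str                      # e.g. "W+2.5", produced by the autoscoring tool
    time_left: float                     # seconds remaining on the scoring timer
    accepted: set = field(default_factory=set)
    outcome: Outcome = Outcome.PENDING

    def accept(self, player: str) -> None:
        # Players may only accept the autoscore; they cannot change it.
        if self.outcome is not Outcome.PENDING:
            return
        self.accepted.add(player)
        if self.accepted == {"black", "white"}:
            self.outcome = Outcome.SCORED

    def resume(self, player: str) -> None:
        # Either player may abort scoring and continue the game.
        if self.outcome is Outcome.PENDING:
            self.outcome = Outcome.RESUMED

    def tick(self, seconds: float) -> None:
        # On timeout the autoscore is used by default.
        if self.outcome is not Outcome.PENDING:
            return
        self.time_left -= seconds
        if self.time_left <= 0:
            self.outcome = Outcome.SCORED

    def result(self):
        return self.auto_score if self.outcome is Outcome.SCORED else None
```

With this shape, a player who stalls or disappears cannot change the result: once the timer runs out, `result()` returns the autoscore regardless of who accepted.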

In rare cases the autoscoring tool will get the score wrong one way or the other, and the aggrieved player will have no recourse but to make their case to a moderator. This is superior to the status quo in which a troll can win games by stubbornly refusing to accept the score of a game until their victim gives up and leaves, because all of the troll’s games require moderator attention versus the comparatively rare cases where the autoscoring tool gives incorrect results. It also provides the site’s developers with the ability to reduce the burden on moderators by improving the autoscoring tool over time, whereas in the status quo there are no software solutions for preventing trolling.

Groin · December 27, 2023, 9:39pm

1k? I would be more in the mood to ban at this level.

Note that I still feel it is a bit rude to denounce someone publicly, even if it’s cheating; you could have omitted the pic or at least hidden the names.

You were probably the victim of internet lag, which can lead to confusion when accepting a score. This step could be improved to give more security in the first place.
In a recent thread I proposed an extra confirmation screen showing the final position the server received from both players, to be sure and clear.

And I am not very enthusiastic about trading one evil for another, as wrong AI estimates are awful even if rare.

pwsiegel · December 27, 2023, 10:13pm

Note that I still feel it is a bit rude to denounce someone publicly, even if it’s cheating; you could have omitted the pic or at least hidden the names.

You were probably the victim of internet lag, which can lead to confusion when accepting a score.

I felt comfortable citing this example explicitly because the player is a well-known troll who has engaged in the same pattern of behavior over and over again, often creating new accounts when old accounts are banned. If you click through their games you will see other rule-breaking behaviors, such as sandbagging (intentionally losing to bots in rated games to lower their rating) and verbal abuse. So no, this was not due to internet lag.

And I am not very enthusiastic about trading one evil for another, as wrong AI estimates are awful even if rare.

I am proposing an optional change. If you would prefer the risk of being trolled to the risk of AI scoring errors, you would not be required to play games with the setting enabled. I would prefer the latter risk, so I would enable the setting.

Conrad_Melville · December 27, 2023, 10:16pm

Sorry, but this isn’t true. Due to the autoscore bug, which I have been talking about in various threads for the last three years, the autoscore fails to mark obviously dead stones with alarming frequency, which means the adjacent territory also turns to unmarked status. Sometimes the autoscore fails to mark fully enclosed territory with no dead stones in it; this was the original form of the bug, which still occurs with less frequency, and it showed that the bug was distinct from someone score cheating. Players cannot distinguish between the bug and a cheater, which requires consultation of the game log by a mod.

I have slowly come around to supporting the idea that the game should be frozen, awaiting a mod decision, when there is a dispute. The drawback to that is the potential for trolling.

Groin · December 27, 2023, 10:19pm

I said internet lag because you accepted the wrong score, so I guessed it was unintentional.
Maybe I am wrong; it’s just my guess (a feeling from the in-game chat) that you didn’t even call for moderation but instead chose to resist, which is a dangerous game. Now I do understand that you don’t appreciate this player’s attitude, of course.

pwsiegel · December 27, 2023, 10:20pm

I have encountered cases where the autoscore tool makes mistakes, but I have always been able to address them by resuming the game and filling dame. If there are more severe bugs, I don’t think I’ve encountered them - do you have any examples handy?

In any event, I’d imagine that the autoscore bug is fixable, if that’s currently the only blocker to adopting the proposal. But it would be unreasonable to expect any system to make no errors, so the stakes of the discussion are how to most efficiently use precious moderator time.

pwsiegel · December 27, 2023, 10:24pm

No, I did not accept the score. I made a few attempts to correct, and then moved on when I realized it was a troll. Eventually the scoring system defaults to the currently set score if one player disappears, a feature which works in favor of the trolls. I am proposing that the scoring system should default to a neutral judgement, i.e. the AI score, instead.

Yes, I did call for moderation during the game, but they were not able to respond until well after the game was over.

Groin · December 27, 2023, 10:28pm

OGS is trying to improve moderation; I can testify to this.
Meanwhile, I lean more toward @Conrad_Melville’s idea of freezing disputed games until a mod can fix them.

pwsiegel · December 27, 2023, 10:36pm

I think the OGS moderator team is very strong overall, given the complexity of the job and the fact that they work on a volunteer basis. But a moderation-based approach to this problem simply does not scale well as the number of active users grows, and @Conrad_Melville’s proposal unfortunately would make this worse.

More popular sites like Tygem and Fox rely on automation to solve these sorts of problems out of necessity, and most likely OGS is going to have to do the same eventually. OGS’ adoption of scoring automation as an anti-stalling measure shows that the folks who run the site have already acknowledged this.

So unless somebody can cite a concrete reason why correcting auto-scoring errors is a less efficient use of moderator time than dealing with disputes involving trolls, I think OGS should adopt more scoring automation sooner rather than later.

Groin · December 27, 2023, 10:40pm

There are other popular servers, closer to OGS in size than Fox or Tygem, that don’t yet use AI autoscoring.

Conrad_Melville · December 27, 2023, 10:53pm

Please don’t put words in my mouth. The idea of freezing a disputed game goes back for years, and it was not my idea. I was opposed to it in the past precisely because I feared that the button might be trolled. However, after further thought and observation over the years, I have, as I said, come around to favoring it.

Actual trolls, BTW, in the sense of dedicated, persistent saboteurs and vandals, are not very common.

Here is an example of the autoscore bug: Freundschaftsspiel

Normally, I wouldn’t know whether this was the bug or score cheating, but a moderator explicitly told me that it was the bug, because the game log showed no cheating. In any case, this is what the bug LOOKS like, so sometimes when people think they have been cheated, they have, and sometimes it is only the bug. And as I said, this happens a lot.

Groin · December 27, 2023, 11:00pm

Maybe that was mine (no pun intended)

hoctaph · December 27, 2023, 11:06pm

pwsiegel, did the Accept predicted winner button not come up?

Anti-escaping and Anti-stalling features

pwsiegel · December 27, 2023, 11:19pm

Do you have a sense of how common the error is? Curiously, the score estimation algorithm gets the score correct, reinforcing my optimism that the bug is fixable.

Actual trolls, BTW, in the sense of dedicated, persistent saboteurs and vandals, are not very common.

I disagree. Consider the troll in the game that I linked to in my original post, for example. I have encountered this troll at least once per month over the past year or so, and sometimes several times in the same week (all under different accounts, since their accounts tend to get banned quickly). That’s just one troll - when you add in the others, I’d estimate that I encounter at least 2 or 3 examples of deliberate score cheating every month. I certainly hope that other players don’t get targeted as often as I do, but when it happens I usually check the troll’s game history, and I know I’m not alone.

Meanwhile, as above, I don’t recall a single example in my thousands of games when the autoscore algorithm made an error that could not be fixed by playing a few additional moves. In the example you gave, white could easily get the correct score simply by capturing the two black stones if area scoring is used. (If my proposed autoscoring feature were available, I would probably only enable it with rulesets that use area scoring to be safe.)

Groin · December 27, 2023, 11:27pm

One thing that comes to mind is that autoscoring works better at your level, and less well at the other end of the rating scale.
With unfinished boundaries, for example, or with some life-and-death statuses, it should be the players’ assessment that prevails, not the AI’s. And let’s say the players agree they failed to close a boundary, but at the same time they discover (from the AI prescoring) that they were wrong about some L&D: how will they resume play to reach a final score? Wouldn’t the AI then be interfering in the game? I won’t compare the amount of trouble caused by a couple of bad-spirited people with the failures of weak players to produce a finished game that an AI can count.

There are already long threads debating your proposal and the problems with AI use; maybe have a look there first before going further in the discussion?

Conrad_Melville · December 28, 2023, 12:37am

The autoscore bug is more common than score-cheating trolls, but less common than the entire universe of score cheaters (which includes beginners who don’t know how to score and frustrated losers who resort to cheating; about half of these people reform when called to account, so I do not consider them trolls).

You are right that the score estimator appears not to be subject to the bug, an important clue IMHO.

You and I may have different definitions of trolls. In the 2.5 years when I moderated, I encountered perhaps two dozen trolls, of whom only five or six were score cheats. I was in many “ban wars” with trolls, including the guy you encountered (I recognize his MO in his game history), in which I banned 50 or 60 new accounts in the course of 90 minutes, so I know something about this subject.

At least 2/3 of score-cheated games don’t get reported, I think, and some mods think the percentage is even higher. However, your rate of encounters with that particular troll seems high, suggesting to me that he is targeting you. Also, you may not have seen the bug because you have a higher percentage of wins and losses by resignation than, say, DDK players.

Yes, people can and should correct the board in the stone removal phase, but some don’t notice the mistake (as in the case I cited), and DDKs often don’t know how to correct the board or even that they are allowed to do so. Resuming play and capturing the stones is, of course, not an option in a close game under Japanese rules.
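The difference between the two rulesets here comes down to simple arithmetic. The sketch below is a simplified model of one enclosed region, under several assumptions: the opponent passes while the dead stones are being captured, every capturing move is played on an empty point inside the player's own region, and no pass stones are involved. The function names are made up for illustration.

```python
def territory_score(empty_points: int, dead_stones: int, capture_moves: int) -> int:
    """Japanese-style (territory) value of one region for its owner.

    Dead stones become prisoners whether they are removed in scoring or
    captured on the board, but each capturing move fills one point of
    the owner's own territory.
    """
    territory = empty_points + dead_stones - capture_moves
    prisoners = dead_stones
    return territory + prisoners


def area_score(empty_points: int, dead_stones: int, capture_moves: int) -> int:
    """Area-style value of the same region: empty points plus own stones."""
    remaining_empty = empty_points + dead_stones - capture_moves
    own_stones_added = capture_moves
    return remaining_empty + own_stones_added
```

For a region with 10 empty points and 2 dead stones, spending 2 moves to capture them costs 2 points under territory scoring (14 becomes 12) and nothing under area scoring (12 either way), which is why the workaround is only safe in a close game when area scoring is used.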

I play few games online, yet I have seen the autoscore bug three times in rengo. The board was corrected in two of those games and left uncorrected because it didn’t matter in one game. Similarly, because I watch a lot of games, I have seen it pop up in countless games and get corrected, so there is no record of it in the game history.

Feijoa · December 28, 2023, 2:21am

I believe it happens more often in rengo because of something about the order of turns; maybe the player who did the final pass is not allowed to trigger autoscore and it waits for one of the next players to visit the game.

Conrad_Melville · December 28, 2023, 2:27am

No, I think rengo has nothing to do with it. It has been happening for 3.5 years, long before rengo was enabled. Moreover, I see it in plenty of games regularly. It just happens that I have been playing mostly rengo in the past year.

One of these days, if I feel ambitious, I will start a thread on the bug, lay out the full history, and post the 20+ games I have not bothered posting before. I have posted others already in the bad-scoring thread.

benjito · December 28, 2023, 2:57am

I’m not really interested in YetAnotherOption™ but I think this part is interesting:

Could autoscore just be applied to all games where two players cannot come to an agreement?

fuseki3 · December 28, 2023, 3:18am

If this is the case, is it not possible to use a more accurate version, e.g. of KataGo, to check the score when force-autoscoring, if the autoscoring itself contains bugs?

For example, the score displayed in AI analysis at the end of games which are completely finished always seems very accurate.
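One way such a check might work, as a rough sketch only: assume the engine (KataGo or similar) supplies a per-point ownership estimate between -1 and 1, with positive values meaning Black is expected to own the point. The sign convention, the 0.6 threshold, the board encoding, and the function name are all assumptions made for this example, not how OGS or any particular engine is wired up.

```python
def dead_stones(board, ownership, threshold=0.6):
    """Return the (row, col) of stones the ownership estimate considers dead.

    board:     list of strings using 'B', 'W', and '.' for empty points
    ownership: same-shaped grid of floats in [-1, 1]; positive = Black (assumed)
    threshold: how strongly a point must favor the opponent (assumed value)
    """
    dead = set()
    for r, row in enumerate(board):
        for c, stone in enumerate(row):
            value = ownership[r][c]
            if stone == "B" and value <= -threshold:
                dead.add((r, c))   # black stone on a point White is expected to own
            elif stone == "W" and value >= threshold:
                dead.add((r, c))   # white stone on a point Black is expected to own
    return dead
```

The resulting set could then be compared against the marks produced by autoscoring, and any disagreement flagged or used as the forced result. Points with ownership near zero (unsettled groups, open boundaries) are left alone, which is where weaker players' unfinished games would still need care.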

Apart from the autoscoring suggestion –

Maybe this should be in a tutorial or FAQ given how common it is, and given that it can lead to scoring disputes when both players haven’t finished closing all of the territories’ boundaries.

(perhaps both in a tutorial section for beginners such as “finishing and scoring the game”, and in an FAQ section related to scoring/cheating issues)