Use AI scoring to eliminate score cheating

benjito · December 28, 2023, 3:22am

I think it would have to be. Autoscore already uses KataGo to mark dead groups, but that fails badly when there are open or weak boundaries - I’d be in favor of using Kata’s actual score and heat map instead.

Feijoa · December 28, 2023, 5:13am

This is my proposal for a slightly smarter algorithm that works well even when there are some vulnerable groups or unclosed boundaries:

https://pdg137.github.io/autoscore/v3.html?game=60105020

IMHO using a KataGo score estimate directly would make the scoring even more surprising and mysterious in such cases.

Conrad_Melville · December 28, 2023, 6:17am

Among stronger players who are honest, the autoscore bug errors are corrected as a matter of course, so what you are seeing is a cleaned-up version. How many DDK and TPK games do you see?

pwsiegel · December 28, 2023, 1:30pm

One thing coming to mind is that autoscoring works better at your level. Less at the opposite side of the rating.

This is a reasonable point - I’d agree that AI scoring is not a good idea for players who have not learned how to evaluate the statuses of groups that come up in their games. But this would be mitigated by making the feature optional - perhaps even with an “advanced players only” or “use at your own risk” warning.

pwsiegel · December 28, 2023, 1:40pm

However, your rate of encounters with that particular troll seems high, suggesting to me that he is targeting you. Also, you may not have seen the bug because you have a higher percentage of wins and losses by resignation than, say, DDK players.

I’m certainly forced to admit that I’m drawing conclusions from biased data. But checking these trolls’ game records, it is clear that I am not the only player facing this problem. I can’t speak for the others, but for me it’s annoying enough that every time it happens I’m one step closer to giving up and moving to Fox, where I don’t have to bug the moderators every other week. (Not at all, in fact!) As far as I can tell, the only substantive difference between OGS and Fox along these lines is that Fox lets you force AI scoring if somebody tries to stall.

pwsiegel · December 28, 2023, 1:45pm

Could autoscore just be applied to all games where two players cannot come to an agreement?

This would be superior to the status quo for me, even if the autoscoring algorithm has bugs!

hoctaph · December 28, 2023, 1:49pm

Since it seems to have been missed, I am bringing up

https://forums.online-go.com/t/anti-escaping-and-anti-stalling-features/49174

again.

_KoBa · December 28, 2023, 2:49pm

The current autoscore system is pretty reliable (tho it still makes mistakes occasionally) for games which are actually done and where there’s nothing game-changing that either player can achieve by placing more stones, but for games that are still unfinished it’s far from great. Another issue is that the autoscoring system sometimes gives points for seki eyes and teire even under Japanese rules, which isn’t usually a big error but can potentially screw up a close game.

The devs are just implementing some new systems to help with scorecheating, let’s hope those will at least mitigate the impact of cheating. Sadly with scorecheating, there is only very limited time to act before the game ends with an incorrect score, and as you said, since mods are all volunteer hobbyists doing moderating in our spare time, we’re not able to keep an eye out for new reports all the time :<

I think it’s worth noting that when talking about “score cheating”, in a lot of cases it seems to be something honest like this. Or maybe players do know how to score, but neither of them sees some cut or other aji which is obvious to stronger players and to the auto-scoring algorithm.

Or one of the players can’t see a seki or a one-eyed group being dead, or doesn’t yet know about bent-4-in-corner or something like that, and it can very well be the player reporting who has gotten it wrong.

I’m not sure if it’s implemented yet, but yeah iirc anoek liked the idea of just forcing the autoscore if players resume the game from scoring but both just pass instead of playing any stones. I wonder if it would make sense to do the same for games where players keep marking the same stones dead/alive back and forth?
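To make the two triggers concrete, here is a rough sketch in Python. It is purely illustrative and not how OGS does it: the class, the method names and the threshold of three flips per group are all made up for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScoringWatch:
    """Hypothetical tracker for events after a game first enters scoring."""
    max_flips: int = 3                           # assumed threshold, not an OGS setting
    moves_since_resume: Optional[list] = None    # None until play is resumed
    flips: Counter = field(default_factory=Counter)

    def on_resume(self):
        # Play was resumed from the scoring phase: start recording moves.
        self.moves_since_resume = []

    def on_move(self, move):
        # move is "pass" or a coordinate such as "D4"
        if self.moves_since_resume is not None:
            self.moves_since_resume.append(move)

    def on_toggle(self, group_id):
        # A player flipped a group between dead and alive.
        self.flips[group_id] += 1

    def should_force_autoscore(self):
        # Trigger 1: play was resumed and both players passed straight away.
        if self.moves_since_resume == ["pass", "pass"]:
            return True
        # Trigger 2: some group has been flipped back and forth too often.
        return any(n >= self.max_flips for n in self.flips.values())
```

If anyone places a stone after the resume, the first trigger no longer fires and the game simply goes through normal scoring again after the next two passes.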

Feijoa · December 28, 2023, 3:44pm

Easy solution: only allow the force-autoscore option for area scoring rulesets until we have a better Japanese autoscore.

pwsiegel · December 28, 2023, 4:51pm

The devs are just implementing some new systems to help with scorecheating, let’s hope those will at least mitigate the impact of cheating.

I’m glad to hear that software solutions are on the way! I’m sure there are ways to deal with the problem other than my proposal - I’ll keep an eye out.

Sadly with scorecheating, there is only very limited time to act before the game ends with an incorrect score, and as you said, since mods are all volunteer hobbyists doing moderating in our spare time, we’re not able to keep an eye out for new reports all the time :<

Indeed, allow me to emphasize that this post is not a complaint or a criticism of the moderator team. You and the others almost always get back to me in less than a day, which is already an impressive response time. For issues that are best dealt with in near-real-time, I think we have to look for software solutions.

I think it’s worth noting that when talking about “score cheating”, in a lot of cases it seems to be something honest like this.

I’ll also note that I do occasionally encounter players who mismark groups due to an apparently genuine misunderstanding of the position. I don’t consider this to be trolling, and I don’t report this behavior - instead I add moves to clarify the position, which usually resolves the disagreement. (For this reason, among others, I think it is wise to only use area scoring in online games.)

pwsiegel · December 28, 2023, 6:24pm

Oops, sorry I missed this comment. No, the button did not come up in this game, and it usually doesn’t against score cheaters. I have only ever seen the button against genuine stalling, i.e. when the opponent plays moves that cannot possibly live.

PJTraill · January 3, 2024, 9:39pm

Summary

This should be the process: Explain for beginners, Autoscore, Correct the markings blindly, Resolve any disagreements, Resume play if necessary.

Context

This type of cheating also came up in https://www.reddit.com/r/baduk/comments/18ru5ae/this_ai_scoring_is_bs/ , where, sadly, the OP wrote “My opponent (13k) marked my upper left territory as dead in attempt to score a win off me. But I wouldn’t expect the go community to have people like that”. This was, however, in the Sente app, and there may have been a race condition leading the victim to accept the position just when the malefactor had changed the markings. There, assuming the app receives the updated markings, I first suggested that the app could either:

- notify the user of the change and repeat the question, or
- change the marking back to what the user accepted and await a reaction.

I then went on to consider the rationale for a sensible design, saying:

It seems evident that:

The user must know what they are accepting. I.e.:

The score markings on which the user bases their decision must be those OGS considers accepted. So:

A change by the opponent to the markings shown to the user must be explicitly acknowledged by the user.

Otherwise it is impossible to be sure that they have registered the change.

Anything else is like letting my opponent change their move while I am making mine!

The problem is basically a race condition.

I do not recall gathering whether OGS requires a user to acknowledge a change by the opponent, but I fear not. If not, OGS ought to change, which would also benefit browser users. In the meantime, you as app developer could make this problem a little less likely for your users by:

Not displaying changes to the markings while the user considers them.

If the user accepts the markings, warning them of any change before you forward the acceptance, so they have a chance to correct them and/or reconsider.

If the user makes the same change(s) as the opponent before accepting, silently use those changes.

There is still a danger that your Accept crosses a change, but the above makes the time window for that smaller.
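A minimal sketch of those three client-side rules, in Python for readability. Everything here is hypothetical: `AcceptGuard`, its methods and the representation of markings as a set of coordinates are my own illustration, not the Sente app's code or the OGS API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AcceptGuard:
    """Hypothetical client-side guard around the Accept button."""
    shown: frozenset                       # markings currently on the user's screen
    pending: Optional[frozenset] = None    # newest opponent markings not yet shown

    def on_opponent_change(self, markings):
        # Rule 1: do not redraw while the user is deciding, just remember it.
        self.pending = frozenset(markings)

    def on_user_change(self, markings):
        self.shown = frozenset(markings)
        # Rule 3: the user made the same change themselves, nothing to warn about.
        if self.pending == self.shown:
            self.pending = None

    def on_accept(self):
        # Rule 2: if there is an unseen change, show it and warn instead of
        # forwarding the acceptance.
        if self.pending is not None and self.pending != self.shown:
            self.shown, self.pending = self.pending, None
            return ("warn", self.shown)
        self.pending = None
        return ("send", self.shown)
```

The first Accept after an unseen change only produces a warning and updates the display; the user has to press Accept again, now looking at the opponent's version, before anything is sent.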

An issue that cannot be addressed is resumption of play to resolve disagreements — IIRC it is a limitation of OGS that that is not possible, although it is the ideal solution.

(I gather from the preceding discussion that it is possible.)

You are understandably unhappy with the crude measure of blocking the Accept button, but, given the dodgy scoring behaviour of OGS (and the inherent difficulty of proposing a score for a very wide range of strengths), some inconvenience is inevitable. I feel that my suggestions reduce that to a necessary minimum.

Proposal

What I feel should happen after both pass is:

Explain: An explanation of the ensuing procedure is offered to the players.
- This should be tailored to beginners.
- For more advanced players it should be available but hidden / tucked away.
Autoscore: The position is shown with Autoscore markings (improving that is a separate issue!)
Correct: Both players make any changes to the markings they feel necessary, then Accept.
- These changes are only shown locally until both players Accept.
- It should be possible to repair any weird Autoscore nonsense by toggling the status of vacant spots between Black/Neutral/White, and that should be propagated to all other connected vacant spots.
If both players accept the same markings, the resulting score stands.
Resolve: Otherwise each player is notified of the other player’s markings and offered an explanation of the next step.
- It should be easy to compare one’s own markings with one’s opponent’s.
  - This could be done by letting one go back and forth between the two sets of markings, with the differing spots highlighted, perhaps surrounded by red lines.
  - It could conceivably also be shown statically on a single screen, but it might be hard to design a clear presentation of that.
The players choose among:
- Accept opponent’s markings
- Stand by own markings (Edit: optionally amended)
- Resume play
Again, if both players now accept the same markings, the resulting score stands.
Resume: Otherwise each player is told what the other chose and is offered an explanation of what happens next, and play resumes.
If both immediately pass again, use the Autoscore (however absurd!)
Otherwise (at least one player moves), continue until both pass again, then repeat the above.
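The propagation mentioned under Correct (toggling one vacant spot and having that spread to all connected vacant spots) is essentially a flood fill. A small sketch, assuming the board is a grid of 'B', 'W' and '.' and the ownership markings are a separate grid of 'B', 'W' and 'N'; the function name and both representations are invented for the example, and dead stones are not treated as vacant here.

```python
from collections import deque


def paint_region(stones, owner, start, new_owner):
    """Set `new_owner` on every vacant point connected to `start`.

    stones: grid of 'B', 'W' or '.' (vacant); owner: grid of 'B', 'W' or 'N',
    modified in place. Returns the number of points repainted.
    """
    rows, cols = len(stones), len(stones[0])
    r0, c0 = start
    if stones[r0][c0] != ".":
        return 0                      # only vacant spots can be toggled
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        owner[r][c] = new_owner
        # Visit the four orthogonal neighbours that are on the board and vacant.
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and stones[nr][nc] == "."):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)
```

Cycling a region through Black/Neutral/White would then just mean calling this with the next value each time the player clicks any point in the region.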

I have not addressed timeouts for the players’ choices, network / escaping / stalling issues, time allowance for resumption of play, whether/how to implement the Japanese rule that the score be based on the position before resumption of play — I presume that a lot of that is already in place.

hexahedron · January 4, 2024, 5:10pm

It was suggested somewhere that the OGS API call that accepts a score result also involves the client sending the proposed scoring as part of that call.

Does the OGS backend server in fact reject the mutual acceptance if the proposed scoring sent by the two player clients contains a difference?

If so, then it becomes possible for the client to implement good safeguards. The client can lock out the button briefly when the server reports that the other player has changed something, or similar. The client can also send its current scoring when the current player does click the button, so that even if there is network lag and the other player has accepted a changed scoring that this client has not received yet, the server will still detect the mismatch and reject it.
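If the server does work that way, the check could look roughly like the sketch below. This is only an illustration of the idea under that assumption, not the actual OGS backend or API; `ScoringSession`, `accept` and the return values are hypothetical names.

```python
class ScoringSession:
    """Hypothetical server-side record of what each player accepted."""

    def __init__(self):
        self.accepted = {}  # player -> frozenset of points they marked dead

    def accept(self, player, markings):
        # Each acceptance carries the markings the player was looking at.
        self.accepted[player] = frozenset(markings)
        if len(self.accepted) < 2:
            return "waiting"
        first, second = self.accepted.values()
        if first == second:
            return "agreed"
        # The two players accepted different positions: reject both
        # acceptances so that each has to look at the board again.
        self.accepted.clear()
        return "mismatch"
```

With a check like this, a marking change that crosses an Accept on the network can no longer end the game, because the two submitted sets differ.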

slowthought · January 8, 2024, 9:09pm

How do you identify someone so quickly? IP address?

Conrad_Melville · January 8, 2024, 9:18pm

Ban wars occur with dedicated trolls. IPs alone are unreliable, but there are other “tells” available, which I am not going to discuss.