Accidentally resigning while winning

Uberdude · September 22, 2023, 3:56pm

This thread is already an OGS classic.

Cchristina · September 22, 2023, 3:58pm

I can’t agree with this logic.
It reminds me of shops assuming a wandering customer is a shoplifter.

Half of it should go to a subforum named ‘sandbagging saga’.

Gia · September 22, 2023, 4:01pm

I think in a way there is.

Mods decide if a player would/could have won a game.

Not in cases when someone intentionally went the other way and lost (it’s unbelievable how many times I have to say I don’t mean people actually dropping a game, like I’m speaking an unknown language), but based on what AI says. So, if I lose because I made a mistake “I wasn’t supposed to do” the game is annulled.

I disagree with that practice and it has nothing to do with sandbagging and everything to do with not upsetting the ranks.

gennan · September 22, 2023, 6:11pm

I can’t answer with any accuracy, but I can make some educated guesses.

From …

(Unofficial OGS rank histogram (and graphs) 2022)

… it seems that on average some 30,000 games are started (and finished) daily in recent years.

For a back-of-the-envelope calculation, I’m guesstimating that roughly half of those games are ranked, so some 15,000 ranked games per day.

If roughly some 75 ranked games are annulled by moderators per day, it means roughly 1 in 200 ranked games gets annulled by a moderator (which seems far too low a number to me for any shenanigans other than just following up on user reports).

I have about 200 ranked games in my game history, so I would expect to have about 1 of my games annulled by a moderator.
I think this even happens to be the case (a game from a few years ago, where I reported it and the moderator handling the report annulled the game).

benjito · September 22, 2023, 7:25pm

The ratio of ranked to unranked is usually much higher than 50:50. I just checked the “Watch” page, and filtered on Ranked, there are 124 live/19.5k corr, but only 26 live/3.0k corr for Unranked.

gennan · September 22, 2023, 7:32pm

Yes, I was trying to err on the conservative side (and also account for games that get cancelled by the opponent or annulled by the system).

75 moderator annulments per day is probably an overestimate.

So probably even fewer than 1 in 200 ranked games get annulled by moderators. It may be as few as 1 in 1000.
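The back-of-the-envelope estimate in this exchange can be written out as a minimal sketch. The 30,000 games per day, the 50% ranked share and the 75 annulments per day are the guesses stated above; using the "Watch" page snapshot as a stand-in for the daily ranked share is an extra assumption, since that page counts ongoing games rather than games started per day.

```python
# Rough estimate of how rare moderator annulments are (all inputs are guesses from the thread).
GAMES_PER_DAY = 30_000      # from the rank histogram thread
ANNULLED_PER_DAY = 75       # guess; probably an overestimate


def one_in(ranked_share, annulled_per_day=ANNULLED_PER_DAY, games_per_day=GAMES_PER_DAY):
    """Return N such that roughly 1 in N ranked games is annulled by a moderator."""
    return games_per_day * ranked_share / annulled_per_day


# Conservative guess: half of all games are ranked.
conservative = one_in(0.5)

# ASSUMPTION: treat the "Watch" page snapshot (live + correspondence) as the ranked share.
ranked_now = 124 + 19_500
unranked_now = 26 + 3_000
snapshot_share = ranked_now / (ranked_now + unranked_now)

print(f"50% ranked: about 1 in {conservative:.0f}")
print(f"{snapshot_share:.0%} ranked: about 1 in {one_in(snapshot_share):.0f}")
```

A lower annulment count pushes the figure further out: at 15,000 ranked games per day, 1 in 1000 corresponds to about 15 annulments per day.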

BearAware · September 23, 2023, 1:46am

Just wondering, is it possible some players intentionally resign despite knowing they are ahead because it’s obvious to them that they will win due to current positions or their opponent’s lower skill level, but their opponent isn’t passing or resigning, and they are simply bored and don’t mind letting the weaker player win?

BearAware · September 23, 2023, 1:57am

Ok, I’m relatively new to Go and online gaming psychology, so forgive my ignorance here, but is it possible that there is actually no such thing as sandbagging, and that the term/concept is merely a negative interpretation of accidental or intentional resignation when winning (e.g., out of boredom when there isn’t enough of a challenge) for reasons other than the negative ones assumed? I mean, has there ever been a “sandbagger” who confessed to “sandbagging”? Could alleged sandbaggers be resigning despite its lowering of rank rather than to cause a lowering of rank?

mart900 · September 23, 2023, 8:41am

Can’t speak to the situation on OGS but there have been very obvious cases in other games I’ve played. There would be players insta-resigning say 30 games in a row to drop their rank, and then playing normally until they’re close to their real rank, repeat.

It’s assumed that the reason why is to have easier games against weaker opponents, but I’ve never seen anyone confess to it and say that’s why they do it. So their actual motivation is uncertain, though it seems highly likely to be that.

KAOSkonfused · September 23, 2023, 10:48am

Yes, there (weirdly!) actually are people (though it’s relatively rare) who play nonsense games or resign won games in a row to lower their rank (let’s say, they are 3 k and lower their rank to 8 k), and afterwards, they start playing games with 7 k and 8 k opponents which they of course win (and mostly do not find boring enough to resign, but I guess they enjoy it?), and then they make sure to lower their rank again, or try doing both simultaneously.

However, I’m not able to really explain the motivation of those users. I can somewhat imagine, but it mostly seems quite weird to me. But yeah, it doesn’t happen a lot - just, when it happens, it’s quite annoying for the sandbaggers’ opponents.

From my perception (don’t have numbers), only part of the sandbagging reports are such “real” sandbaggers. Another part of those reports are from people who suspect sandbagging but are wrong about it (of which some have good reasons for their assumption, but some others are just sore losers), and another, not small portion of sandbagging reports turns out to be AI users. So these reports are still very important, even if it’s not actually sandbagging.
Sometimes, someone uses AI and also sandbags to hide the AI use.

So, in my personal opinion, sandbagging behavior per se is not the most harmful misconduct that needs the most policing - I’d say I find harassment / insults and AI use the worst.
However, the other stuff (escaping, stalling, score cheating, sandbagging…) also annoys people quite badly, so it’s also important to take care of that - and, as I mentioned, part of the AI cases are “hidden” in the sandbagging reports, so that gives them imho a higher priority than they would otherwise have.

trohde · September 23, 2023, 10:55am

Sandbagging definitely exists, see here: Sandbagging at Sensei's Library
(If you check the page info, you will see that the page was created back in 2001. It is a phenomenon that can be observed on most every Go server, and with other games also.)

The only question is how to distinguish sandbagging from resigning while being unaware that you’re not behind.

gennan · September 23, 2023, 11:42am

My take on it:

When someone is reported for sandbagging, I think it’s a matter of checking their game history and seeing whether it shows a pattern of strange time-outs and resignations in won/undecided positions.

How strange were the resignations, given the apparent level of play and the position where they resigned? Did they have a decisive lead, or did it happen for no apparent reason (especially early in the game with nothing much going on), or might they appear to throw games intentionally by other means?

How strange were the time-outs, given time settings and the complexity of the position where it happened?

Is there a large discrepancy between their apparent level of play and their established rank? Does it vary a lot?

When such strange things (for their apparent level of play) only occur in a small fraction of their games, those things are most likely genuine mistakes/misjudgements/accidents and thus inconsequential.

But it becomes fishy when such strange things occur in a large fraction of their games, especially when it happens in bursts, particularly against (much) lower rated players. They may be purposefully manipulating their rating downward, as many sandbaggers do. Annulling those strange game results thwarts their efforts. That, and a warning, hopefully discourages them from continuing with this behaviour. And if this doesn’t work, a ban is warranted.

And of course there will always be edge cases. Keeping an eye on those for a while will probably clarify which it is.
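The pattern check described above (a small fraction of strange results is noise, a large fraction or a burst is fishy) could be sketched roughly as follows. This is only an illustration: the `ReviewedGame` record, the `strange_loss` flag and both thresholds are hypothetical, and deciding whether a given loss was "strange" is still a human judgement that the code takes as input.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReviewedGame:
    # Hypothetical reviewer flag: the player resigned or timed out in a
    # won/undecided position, oddly for their apparent level of play.
    strange_loss: bool


def strange_fraction(games: List[ReviewedGame]) -> float:
    """Share of games in the history that were flagged as strange losses."""
    return sum(g.strange_loss for g in games) / len(games) if games else 0.0


def longest_burst(games: List[ReviewedGame]) -> int:
    """Length of the longest run of consecutive strange losses."""
    best = run = 0
    for g in games:
        run = run + 1 if g.strange_loss else 0
        best = max(best, run)
    return best


def looks_fishy(games: List[ReviewedGame],
                max_fraction: float = 0.2,   # hypothetical threshold
                max_burst: int = 5) -> bool:  # hypothetical threshold
    """Flag histories with many strange losses overall or in one burst."""
    return strange_fraction(games) > max_fraction or longest_burst(games) >= max_burst
```

For example, one strange loss in 200 games would not be flagged, while 30 strange losses in a row would be. Edge cases near the thresholds are the ones worth watching for a while.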

Conrad_Melville · September 23, 2023, 4:49pm

Games should not be annulled IMHO just because the putative sandbagger was ahead and then resigned the game. I don’t think I ever did that, and I don’t know any other mod during my tenure who did that. Similarly, if a sandbagger threw a game by deliberately playing bad moves, even absurd moves, I would (and did, in a few cases) leave it alone. Those circumstances are too subjective.

A narrower version of the first circumstance is what provides evidence of rank-manipulating sandbagging and leads to annulments: that is, as I have already stated in this thread, when the sandbagger resigns or escapes, usually with a huge lead, near the end of the game. These are actual wins that are abandoned, sometimes literally when one would pass. (Of course, if the winner abandoned the game because the opponent was stalling, in its technical meaning, that is a different situation. Such “frustration wins” are duly annulled.) As gennan said:

And again, if it needs to be repeated, decisions are, or should be, based on large patterns of behavior, not scattered instances.

That is true only for rank-manipulating sandbaggers, as I expect you know. For alt sandbaggers, other means of investigation must be used.

GreenAsJade · September 29, 2023, 12:09am

My impression is that 3 stones is a crushing gap. I recall seeing a table of “probability of win by rank gap”, and I seem to recall that a 3 stone gap was enough to make the chance of lower rank winning pretty rare.

Does anyone know where that table is - it’d be very interesting to have the facts on that!

GreenAsJade · September 29, 2023, 12:12am

No. It is very well established, with evidence of behaviours confirming it, that deliberate sandbagging is a thing. It “makes sense” as well - there is a psychological motivation for “winning” that underpins this behaviour. It is experienced widely in gaming communities.

triangle_fuseki · September 29, 2023, 12:29am

It would be useful if https://online-go.com/rating-calculator could show it, but that hasn’t been implemented yet.

GreenAsJade · September 29, 2023, 12:43am

Yes, that would be a great place to capture it

I’m sure the table was posted here on the OGS forum a year or two ago, but my quick attempts to find it with search haven’t succeeded yet.

I did find this though:

Rating walls for ranks

Rating difference in Elo based rating systems has a definite statistical meaning. It directly correlates to winning probability. For example, a 100 point rating gap means that the higher rated player has roughly 2 to 1 odds to win an even game against the lower rated player.

Elo rating gap   Win probability
0                50%
14               52%
36               55%
72               60%
110              65%
150              70%
240              80%
366              90%

This system was invented by mathematician Arpad Elo to rate chess players (Elo rating system - Wikipedia)

I think this is the first step in getting what we want, but it doesn’t quickly tell us what the percentages are for OGS-(glicko)-rank-gaps.
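As a rough cross-check of the quoted table, here is a minimal sketch using the common logistic form of the Elo expected score, 1 / (1 + 10^(-gap/400)). That formula is an assumption on my part; the quoted post does not say how its table was computed, and the logistic values agree with it only to within about a percentage point (a 366 point gap gives roughly 89% here). The function names are made up for illustration.

```python
import math


def win_probability(gap: float) -> float:
    """Expected score of the higher-rated player for a given Elo gap (logistic form)."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


def gap_for_probability(p: float) -> float:
    """Inverse: the Elo gap that gives the stronger player win probability p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))


# Reproduce the rows of the quoted table.
for gap in (0, 14, 36, 72, 110, 150, 240, 366):
    print(f"{gap:>4}  {win_probability(gap):.1%}")
```

The inverse function is handy for going the other way, for instance turning an observed win rate into an implied rating gap before converting that gap to ranks.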

benjito · September 29, 2023, 3:56am

Please point out the holes in this quick and dirty code/methodology, but I tried counting win % where white is roughly 3 stones stronger than black and it doesn’t appear to be as decisive as I would have thought! ~69%

import json

# Count results of even games where White is about 3 ranks stronger than Black.
white_wins = 0
black_wins = 0
with open('/path/to/sample-100k.json') as f:
    for line in f:  # one JSON game record per line
        obj = json.loads(line)
        players = obj['players']
        white = players['white']
        black = players['black']
        # Skip handicap games and games where either rank is missing.
        if obj['handicap'] != 0:
            continue
        if 'rank' not in white or 'rank' not in black:
            continue
        rank_diff = white['rank'] - black['rank']
        # Keep only games whose rank gap is within 0.1 of 3.0.
        if abs(rank_diff - 3.0) < 0.1:
            winner = obj.get('winner')
            if not winner:
                continue  # no recorded winner
            if winner == white.get('id'):
                white_wins += 1
            elif winner == black.get('id'):
                black_wins += 1

print("white", white_wins)
print("black", black_wins)
print("ratio", white_wins / (white_wins + black_wins))

white 3102
black 1408
ratio 0.6878048780487804

Data set is the 100k JSON file found here: za3k - OGS Go game collection

But maybe this actually checks out. Here is some code that should convert between elo and rank and calculate the difference required to get a 90% win rate. If I’ve got this right, you need a whopping 15 stone advantage to get a 90% winrate. Doesn’t feel right, I could have a bug or some wrong assumptions (for example, rank-rating conversion probably not valid at the extremes, and elo != glicko2)

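import math

# Elo rating gaps for a 99%, 90% and 70% win probability.
P99 = 677  # not used below
P90 = 366
P70 = 149

def rating_to_rank(rating):
    # rating -> rank conversion
    return math.log(rating / 525) * 23.15

def rank_to_rating(rank):
    # inverse of rating_to_rank
    return math.exp(rank / 23.15) * 525

def rank_diff(rank, pdiff):
    # Rank gained when the rating at `rank` is raised by `pdiff` points.
    rating = rank_to_rating(rank)
    return rating_to_rank(rating + pdiff) - rank

# chose -6 because that's around my rank
print(rank_diff(-6, P70))
print(rank_diff(-6, P90))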

7.250310462842305
14.900352821593811
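One possible hole, offered as an assumption rather than a confirmed fix: the conversion formula above maps a rating of 1500 to a rank of about 24, which suggests its rank scale counts upward from 30k = 0, so that 6k would be 24 rather than -6. Under that reading, the same calculation gives a much smaller gap, roughly 5 stones for a 90% win rate instead of 15. The sketch below only reruns the formulas with that starting rank; the `30 - kyu` convention and the helper names are assumptions.

```python
import math


def rank_to_rating(rank: float) -> float:
    """Same conversion as above: rank -> rating."""
    return 525 * math.exp(rank / 23.15)


def rating_to_rank(rating: float) -> float:
    """Inverse conversion: rating -> rank."""
    return 23.15 * math.log(rating / 525)


def stones_for_gap(rank: float, rating_gap: float) -> float:
    """Rank difference that corresponds to `rating_gap` points, starting from `rank`."""
    return rating_to_rank(rank_to_rating(rank) + rating_gap) - rank


# ASSUMPTION: ranks count up from 30k = 0, so 6k = 30 - 6 = 24.
RANK_6K = 30 - 6

for label, gap in (("70%", 149), ("90%", 366)):
    print(label, round(stones_for_gap(RANK_6K, gap), 2))
```

Because the conversion is exponential, the number of stones per rating point depends on where you start, so the result differs for stronger or weaker players.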

Atorrante · September 29, 2023, 5:04am

European Go Database’s winning probability tables

flovo · September 29, 2023, 5:24am

It’s the same. Glicko has the same spread as Elo by design.