My opponent left, I waited 29 min

The OGS team values its players, no argument there. However, the mere fact that the intentions are pure doesn’t mean the current policy is good.

I deem annulling the game a punishment when the wronged party (and most others?) has an expectation to the contrary. It seems a little condescending to tell people what is best for them. I think most people who see their opponents leaving the game would expect the game result to be declared in their favour. The problem lies in the decision not matching a reasonable expectation of the wronged party.

How to determine what is to be considered artificial or not? You are talking about misclicks as if they are established to not be a part of online gaming. Should mods start annulling games that seem to have misclicks involved? I know it is not the same but these current cancellations still require some judgement call regarding intentions and it is easy to get it wrong.

I disagree. Being on the side of status quo is almost always easier.

I am definitely one of the strongest opponents of allowing sandbagging in any way. In case you were referring to me, I’d like to clarify that my arguing against these games being annulled does not mean I have changed my views on that. I personally would prefer banning in cases of multiple strange resigns unless the user has a valid explanation. However, I don’t think annulling these games is discouraging to sandbaggers; on the contrary, it is discouraging to people who happen to face them, which leads to sandbaggers/trolls achieving their goal of frustrating people. And the disturbance to a user who gets their time wasted outweighs the benefit of having ranks fixed ever so slightly via these cancellations.

I went through the accounts and games concerning this particular case and I think it is a good example of how moderation should think twice before annulling such games and instead should take the chat/warn/ban option. Neither of OP’s opponents seemed to be clearly sandbagging to me. One had some strangely resigned games but in that account’s case OP had just captured the supposed sandbagger’s stones and swung the game in their favour (last 5 moves gained OP 30 points). [ 무지개달이 vs. Interestinggame ] It didn’t seem like a game that was resigned to lower rank. It seemed more like the supposed sandbagger had messed up and got mad. In the other case [ 围师 曦 vs. Interestinggame ], the opponent had many games in their history that were fitting their current rank [chirattaphon.b vs. 围师 曦, weeeeeeeeeb ]; again not a sandbagger from what I could tell. So, even with current policy, I would deem both of OP’s wins legitimate, due to the fact that their opponents weren’t clear sandbaggers and/or the resigns didn’t show clear intent to lower rank.

Perhaps I am wrong and missed clear signs of sandbagging but if not, I think the fact that this particular case had such issues, at least partly shows why the current policy can be problematic in more than just a theoretical way.

5 Likes

I support the idea that games are annulled if the result is not conclusive enough. I imagine this is a lot of work for moderators though, and I view it as a service they provide to the OGS go player community. Of course it is difficult to draw the line between which games should be considered non conclusive and which are accepted. There is no doubt in my mind that everybody has their own ideas in this context, so people are bound to disagree.

Regarding rating, I’ve always felt that many people have a strange approach to it. The idea of doing your best in a game in order to gain rating points often seems to lead to frustration if one loses rating points. People seem to think that their work did not pay off. I rather view it in this way: Rating is an approximation for my playing strength (which is an abstract concept), and even if my rating goes down, my efforts are not in vain, because I keep the experience. In theory, provided that one plays enough games, one’s rank converges to some number, and a single win / loss does not change which number it converges to.
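A minimal sketch of that last claim, assuming a plain Elo update with a fixed step size `K` and a single opponent rating. Both are simplifications chosen for illustration and are not necessarily how OGS computes ranks. Two histories are played out that differ only in the result of the first game; the gap between the two ratings starts at `K` points and then shrinks as more games are added.

```python
import random

K = 32  # Elo step size (assumed value, for illustration only)

def expected(r, opp):
    """Elo expected score of a player rated r against an opponent rated opp."""
    return 1.0 / (1.0 + 10 ** ((opp - r) / 400))

def play_out(results, start=1500.0, opp=1500.0):
    """Apply the Elo update for each result (1.0 = win, 0.0 = loss)."""
    r = start
    for s in results:
        r += K * (s - expected(r, opp))
    return r

rng = random.Random(0)  # fixed seed so the run is repeatable
true_strength, opp = 1500.0, 1500.0

# Results depend on the player's true strength, not on the current rating.
results = [1.0 if rng.random() < expected(true_strength, opp) else 0.0
           for _ in range(300)]
# Same history, except the very first game has the opposite result.
flipped = [1.0 - results[0]] + results[1:]

gap_after_one = abs(play_out(results[:1]) - play_out(flipped[:1]))
gap_after_all = abs(play_out(results) - play_out(flipped))
print(gap_after_one, gap_after_all)
```

The gap shrinks because each update pulls a too-high rating down a little harder than a too-low one, so the effect of any single result fades over later games.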

3 Likes

But that’s kind of the point: the game is declared in their favour, it’s just the rating points that are annulled. They have indeed validly won the game, but as far as using the game as a measure of player strength, it seems unsuitable.

Maybe this is the distinction that I wish to make: the ranking system is different from which player won the game and again different from whether one of the players is in the wrong. The purpose of ranking is mainly to assist in matchmaking, secondarily to track personal progress. For either of these purposes it is supposed to be accurate of a player’s skill.

Then, the player who won the game is clear from the game result. If your opponent ended the game at an uncertain state, you win the game, although it is unclear which of you is the better player.

Finally, whether you or your opponent is in the wrong in whatever is happening is yet another subject. That a game is annulled does not necessarily mean that the result of the game is invalid, nor does it mean that both opponents are doing something wrong. It only means that the result is considered an outlier in deciding who is the stronger player.

This is the real problem, as Gia pointed out earlier as well. I’m not sure what my answer would be here. I do believe that games that are lost due to any factor that is not inherently a reflection of the Go skill of a player, be it misclicks, timing out, or trolling, are unsuitable to measure player strength.

I’m not really sure, and as with most moderator decisions, I would personally not annul games where I have doubt about it being unsuitable for deciding matchmaking. Due to our policy on misclicks / undo requests, I would not annul games with ‘obvious mistakes’ either, unless both players request it.

Hence less incentive to voice one’s opinion, right?

2 Likes

Morning! Since we are still going… [but life has intervened and it’s now lunchtime…]

And this is all I’m saying - is the policy correct?

I have very high faith that the mod team makes the right call essentially all the time, given the policies that are in place (and being only human - for now). As I said above, I think the decision was correct given the policy, but is the policy correct? I’m not questioning the skills, judgement or good faith of the mods. I’m merely seeking to ensure that the “system” is as good as it can be.

Also, I’m very sorry if I’ve wound anyone up about this. I am very much enjoying the open debate and the intellectual challenge of thinking about an essentially intractable problem.

Maybe this should all be its own thread but too late now I suppose.

I would like to identify the various issues and drivers, the various interested parties and their views and preferences and consider it all in the round - too much for a forum post really, however:

We have the issue of ranking integrity and the desire for it to accurately reflect playing strength in order to deliver well matched games. This is related to the idea that annulling games is not a “punishment” to the winner because it’s merely making their rank more accurate. I could add a lot more but my main point here is that we have just had a whole readjustment of the entire ranking system to avoid, in part, people getting a more accurate ranking which was lower after they won a game.

So there is precedent for the driver of meeting user expectations being a factor as well as pure accuracy of the ranking system.

My other thought is about the level of the problem - how is this determined? It seems to me that many sandbagging victims will just be happy with the win and will not report it. So I suppose that people report sandbaggers at the point that the sandbagger has lowered their rank and is now playing and mysteriously beating people. I can see that those people feel they have lost unfairly and will complain. I can also see that it’s much better for one’s self image to imagine that you lost to a sandbagger than that you lost due to your own failing.

Another random thought - has the new ranking system affected the level of reporting/complaining? Was part of it due to OGS having “tougher” rankings than other servers, so when new people joined (experienced players with an established rank elsewhere) they saw sandbaggers everywhere?

Anyway, I hate to prolong the discussion - I think it’s served its purpose. The opinions are out there and hopefully the powers that be can factor them into any future thinking about how to deal with these issues. [plus I can’t keep up any more!]

Again I really am not trying to target mods or their work - I hope it’s clear that I value them. Although I realise that this kind of valuing probably feels like I annulled one of their won games!!!

4 Likes

Admission: I haven’t read the middle part of the thread - it got all crazy hard work to follow :slight_smile:

And so something leaps out at me here:

Has a policy been stated? If so, what is it?

I’m not aware of a policy in this area - as far as I know, OGS has very few actual policies. Aside from those, moderation happens based on judgement and principles.

I did read some principles stated. One key one being:

“Ranked outcomes should improve the ability of the ranking system to deliver good skill matches”.

This principle allows for many different outcomes following a timed out game, depending on circumstances. Which in turn would appear to enable good moderation.

I would have thought that it would mean that if a game were escaped early on, it obviously should be annulled, since that game has no value in predicting the outcome of future games. And neither party could argue they were robbed of a real victory.

On the other hand, if a game were escaped where the escaper was clearly losing, I would not have thought that would be annulled, since the result (a loss for the escaper) does appear in line with predicting future outcomes, and the remaining player appears to have “earned” the win.

Hence judgement & moderation seem to be well guided by that principle.

What is the “policy” that I missed and that is being debated?


Note: policies I do know about include:

  • You aren’t allowed to hassle about undos (documented in OGS wiki)
  • You aren’t allowed to abuse people in chat (documented in TOS)
  • You aren’t allowed to use AI assistance in any games (except Turing Tourney, documented in TOS)
  • You have to finish games properly. You aren’t allowed to just abandon them. (Pretty sure it says this in the OGS wiki)

I can’t think of any others offhand…

3 Likes

The policy I was referring to is that we give warnings before banning, which is kind of a moderation policy that is hidden from the public view. I believe this is a good policy, since many (maybe the majority) of the users who do something bad are not aware of doing something bad.

As a result of this, we have to annul sandbagging games to actually solve sandbagging problems, since otherwise these players keep playing at the wrong level after the warning for a while.

Is this the policy @teapoweredrobot was questioning?

I find it hard to see how this could be questionable: it is simple justice.

I thought @teapoweredrobot was questioning some “policy” in regard to how timed out games should be handled.

My understanding is that you had said that those would be handled according to the principle I stated (“Ranked game outcomes support good rank outcomes”), among others presumably.

… and that principle would lead you to this conclusion:

in the way that I described: if someone escapes by time out a game they were clearly winning, it needs to be annulled. This seems obvious.

What can be less obvious is whether a timeout was an “escape” or a genuine “ran out of time thinking” in a live game. That’s where I would expect moderation judgement to apply: surely you could trust moderators to understand that if a person timed out while they were ahead in a blitz game, actually they probably really ran out of time, for example.

I wonder if someone could skillfully state the opposite argument in a way that an impatient dummy like me could understand?

It is impossible to quantify such a thing. I can say from experience that many players do report and ask for annulment of their unjustified win, such as when an opponent has resigned a won game, or has resigned after move 2. I always thank them and commend them for their honesty.

2 Likes

Yes, it seems obvious, but this thread shows substantial disagreement with the idea that thrown games should be annulled. The dissenters do not agree with annulling on the basis of justice, nor on the basis of controlling sandbaggers and preserving the integrity of the ranking system.

I made this point early in the discussion, and my example of the 29 min. in this thread title was disputed. It was, you understand, only an opinion that the 29 min. was not genuine thinking time. Strictly speaking, that is true, but it flies in the face of the “reasonable doubt” that is the basis of U.S. (and I think much of Western) jurisprudence. To take that seriously establishes a standard of metaphysical certitude. If we accept that standard, then we need to stop moderating in escaper complaints that are not disconnection (i.e., at least half of all cases), in rank manipulation cases, and in botting complaints.

2 Likes

I don’t think there is disagreement about this. This whole debate started because it was not felt that the game in question was a “thrown game”.

Absolutely, but again I think the discussion is around there being other possibilities too. In this case it seems the person was neither escaping exactly nor thinking…

I will try and articulate my understanding of the policy but don’t have time right now, I’m afraid.

3 Likes

I would describe the above as a policy and also the below:

Could you please adjust my rank / My rank says “?”

We cannot adjust your rank, but we do not need to! The system will take care of it very quickly on its own.
If you feel like your rank is way off from what it should be, all you need to do is play a few ranked games (preferably live, to make it quick) and it will quickly settle.


I think it’s clear there is a tension here

1 Like

Well, the [?] ranks becoming accurate is due to them having high uncertainty. For somebody with a lot of games, the system wouldn’t catch on that quickly. Also, since it’s a [?], they don’t have a rank yet, so opponents don’t expect the player to play at a certain level.
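To illustrate the uncertainty point, here is a rough sketch using the published Glicko-1 update as a stand-in. The real OGS formula and parameters may differ, and the deviation values 350 (brand-new account) and 50 (established account) are assumptions chosen for the example. The same single win moves the uncertain rating far more than the established one.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant

def g(rd):
    """Glicko-1 factor that discounts a result by the opponent's uncertainty."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)

def glicko1_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; returns (new_rating, new_deviation)."""
    e = 1.0 / (1.0 + 10 ** (-g(opp_rd) * (r - opp_r) / 400))
    d2 = 1.0 / (Q**2 * g(opp_rd) ** 2 * e * (1.0 - e))
    denom = 1.0 / rd**2 + 1.0 / d2
    new_r = r + (Q / denom) * g(opp_rd) * (score - e)
    new_rd = math.sqrt(1.0 / denom)
    return new_r, new_rd

# Same rating, same opponent, same win; only the player's deviation differs.
provisional_gain = glicko1_update(1500, 350, 1500, 50, 1.0)[0] - 1500
established_gain = glicko1_update(1500, 50, 1500, 50, 1.0)[0] - 1500
print(provisional_gain, established_gain)
```

Under these assumptions the provisional account gains many times more points from the one win, which is why a [?] rank settles quickly while an established rank does not.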

How far do annulments go?
I mean, if it’s to preserve that rank system integrity, you might annul games from years ago if a sandbagger hadn’t been reported until now?

There is a difference between questioning the substance of a policy versus challenging the character of the moderators that implement that policy. These two should not be conflated, but I believe that people here can make this distinction, and the nature of this discussion has overwhelmingly been about the former rather than the latter.

In the discussion about what a policy is or should be, I think it is natural to question parts that may seem vague or subjective, which leaves open uncertainty about how consistently it might be implemented, given that everyone is only human. I think we all understand that aiming to reduce ambiguity and subjectivity in a policy should not be viewed as a challenge against the judgement or character of the moderators, but rather just as an effort to improve the policy.

I think nearly everyone would agree with the general goal of keeping the rating system as accurate as possible, and that in some cases, adjustments by the moderators are necessary to maintain and improve the integrity and accuracy. However, the exact details of how and when intervention is carried out is essential to its effectiveness, and it is valuable for the community to understand these details, at least in order to have faith in the rating system.

The annulment of certain games can be viewed as discarding outliers with the aim of reducing the noise in the inputs of the rating system. In principle, careful selection could improve accuracy, however one needs to be very careful how this is done, since it is also possible to inadvertently introduce bias.

Let’s consider another thought experiment… Are players allowed to play RUI (Ranked Under the Influence)?

Suppose someone plays at a certain strength when sober, but also sometimes drinks before playing, which greatly diminishes their skill. Imagine this effectively creates such a disparity in their performance that it seems like sandbagging/airbagging (depending on the relative ratio and whether their opponent encounters this player while drunk vs sober). Should this player be allowed to play ranked games while drunk? Should their drunk games be annulled? If the answer is yes for both questions, then these annulments would cause the rating to reflect only their sober performance, which would not accurately portray their general performance.
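A toy calculation makes the bias concrete. All numbers here are hypothetical: a player who performs at about 1600 in 70 sober games and about 1300 in 30 drunk ones, with "performance" reduced to a simple average instead of a real rating update.

```python
# Hypothetical per-game performance levels, invented for illustration.
sober_games = [1600] * 70
drunk_games = [1300] * 30

all_games = sober_games + drunk_games

# Average over everything the player actually played.
overall = sum(all_games) / len(all_games)

# Average after the drunk games are annulled (only sober games remain).
after_annulment = sum(sober_games) / len(sober_games)

# How far the "cleaned" figure sits above the player's general performance.
bias = after_annulment - overall
print(overall, after_annulment, bias)
```

The filtered figure is less noisy, but it overstates how this player performs on an average day, which is the selection bias described above.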

Of course, the state of drunkenness is just a rhetorical tool, and the broader concept that I want to express is that some people might naturally have highly variable performance in games, while others might be more consistent. While it could be possible to forbid drunken play, it does not seem possible to forbid other causes of inconsistency or variability (like players that are sometimes stressed, sleep-deprived, depressed, hungry, distracted, etc.). Further, if we go back and look at any player with a long enough game history, there are bound to be cases where they have mistakenly resigned or timed out in situations where they were winning, but would it be fair (for their opponent) to annul those games?

I feel that it is unproductive to characterize the issues in this manner. Calling a situation obvious is sweeping a lot of ambiguous situations under the rug. In particular, I think that deciding when something is a “thrown game” or when someone was “clearly winning” is not so straightforward. There are gray and uncertain cases in between the obvious extremes, and this discussion is largely about what to do about those cases.

Further, it’s troubling to broadly label those with different views as dissenters that do not agree with justice, controlling sandbaggers, or preserving the integrity of the ranking system. While I am questioning the substance of policy, please understand that it is not a challenge against the character of those implementing said policy, but rather a good faith effort to understand and improve policy. I think the vast majority of the participants in this discussion are acting in good faith, and that it is unfair and not helpful to characterize them as dissenters that disagree with the good aims.

7 Likes

You are conflating two different issues: (1) whether thrown games (stipulated as such) should be annulled, and (2) how to judge ambiguous cases. Unless there is agreement on the first issue, there is no point in discussing the second issue. Early in the discussion, I was trying to establish, using the Socratic method, the principle that a thrown game (stipulated as such, i.e., thrown, not possibly thrown) should be annulled. To my surprise, I found a lot of disagreement. Perhaps those who disagreed did not understand the distinction I was making. GreenAsJade was the one who characterized the thrown game example as “obvious,” and I agreed with him. Nobody is sweeping anything under the rug when it comes to ambiguous cases. Indeed, I believe I was the first who analyzed in detail the possibility that the 29 min timeout may have been an accident and possibly not rank manipulation.

This is a perfectly ordinary use of the English language. “Dissent” is descriptive and historically not pejorative in the U.S. (except perhaps to the oligarchs). They are dissenters from the view I hold, and I am a dissenter from the view they hold. I have not questioned the “good faith” of the people who disagree. Indeed, they have presented a solid case for not putting the ranking system above the feelings of the people injured by an annulled game. I merely disagree with that position, not necessarily in this specific case, but if it is applied as a principle in thrown games (stipulated as such).

@Conrad_Melville since you are here, how far back do annulments go? I’d really like an answer to that.

I want to clarify that I am not conflating these two things. I do agree that there are situations where games should be annulled, including cases where a game has been clearly thrown for the sake of sandbagging. However, what I am questioning is how to judge the ambiguous cases of when a game has actually been thrown, how a player’s intentions and natural variability factor into that judgement, and whether AI tools should be used to judge these positions.

I’m not challenging the interpretation of the 29 minute delay before the player timed out. I highly doubt that it was actual thinking time, but instead believe that they left the game (while keeping the browser tab open). Whether or not they left intentionally is another question, since (without further information) I think it could be possible that some external factor (like an emergency of some sort) pulled them away from their computer and caused them to accidentally time out. If the leaving was intentional, I think the motivation behind doing so is another question, which is difficult to ascertain without considering a broader context and history of past behavior. On the other hand, I do challenge the interpretation, based on KataGo, that the position was clearly winning (as I discussed in an earlier post). Depending on the specifics of policy and other circumstances, the judgment of whether the situation was a clear win, uncertain, or clear loss for the leaver is important, so that’s why I wanted to question that KataGo interpretation.

However, in this particular case that initiated the thread, I’m not even disagreeing with the final result, since other moderators have further expressed an opinion of sandbagging and @RubyMineshaft even mentioned that this behavior was considered in the context of many (25+) other games. I trust that the decision in this case was made reasonably, and the purpose of my ongoing discussion isn’t to try to overturn or change that case, but to better understand and possibly improve certain concepts and policy that are related to it.

I think framing this discussion with language like “dissenters” carries the negative connotation of viewing it as one team versus another, when it should be viewed as a collaborative effort among a community, which may have some disagreements, but all share some common goals and aims. Further, I think opinions on this issue are so varied and nuanced that people cannot even be so cleanly divided into two teams.

I think the following sentence goes too far in mischaracterizing the aims of those with differing views:

I do care about all of those issues: justice, controlling sandbaggers, and the integrity of the rating system.

3 Likes

In your 20+ years, you certainly seem to be using personal anecdotes as evidence and straw arguments a lot.

Half the problem is we now ALSO have to defend ourselves against the straw argument that we are attacking the mods in some way, while our issue is not that and also legitimate.

I don’t know about your parents, but I had a teacher who regularly misgraded our tests and used her power to shut us down. We can exchange personal stories all day long, it means absolutely nothing.

And, in a community like this, mods are more of a “first among equals” than “parents to discipline unruly kids”, bar extreme cases. If you see it as the second, that’s your issue, but if indeed moderators see it as the second, it is a huge issue.

1 Like

When I started moderating, 2+ years ago, annulments extended only to the most recent 15 games, and the practical application was even less, because if a game was older than 15 back for ONE player, it could not be annulled. This was a technical issue, not really a choice. Some time later (less than a year, I think) anoek made it unlimited. With the recent rank change, the limit is 100 games back for players individually. However, I have experienced a shorter limit, which anoek is investigating.
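As I read the old rule described here, a game could be annulled only if it was still among the most recent 15 games of both players. A small sketch of that check follows; the function name, the history lists and the game ids are all hypothetical and have nothing to do with the real OGS code, and treating the newer 100-game limit the same way is an assumption.

```python
def can_annul(game_id, history_a, history_b, limit):
    """Return True if game_id is within the last `limit` games of BOTH players.

    history_a / history_b are lists of game ids, oldest first (hypothetical data).
    """
    if limit <= 0:
        return False
    recent_a = history_a[-limit:]
    recent_b = history_b[-limit:]
    return game_id in recent_a and game_id in recent_b

# Player A has played 5 games since "g0"; player B has played 20 since.
history_a = ["g0"] + [f"a{i}" for i in range(5)]
history_b = ["g0"] + [f"b{i}" for i in range(20)]

old_rule = can_annul("g0", history_a, history_b, limit=15)   # B is too far past it
new_rule = can_annul("g0", history_a, history_b, limit=100)  # inside both windows
print(old_rule, new_rule)
```

This shows why the practical reach was shorter than 15 games: the more active player of the two decides how quickly a game falls out of range.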

1 Like

Sorry, I didn’t follow the whole discussion, so I’m not sure if I get this right: Would it be possible to postpone the decision on whether a game is annulled or simply counted as a win/loss for some time, to see how the person who quit behaves? If similar cases of escaping/rank manipulation happen shortly afterwards, they could all be ruled on together. If not, it’s just a few rank points won or lost.