Use AI results to decide Serial Timeout ranking results in Correspondence games

I agree with the points that @Uberdude makes, but I want to emphasize and argue that the original suggestion of this thread is based on a mistaken premise:

The AI does not calculate this chance, since it does not take into consideration anything about the strengths and tendencies of the players involved. Instead, the AI considers win rate with respect to its own view of very strong play (the superhuman level of the AI itself). That is, the AI estimates the win rate as if it were to play against itself starting from the given position. This is very different from a calculation for human players, who are much, much weaker.

Of course, @flovo touches upon this issue:

However, the disparity between the AI win rate and the actual human win rate is much larger and broader than that. When the AI estimates something like a 10+ point advantage for one color, it typically judges a very high win rate (98+%), since a strong AI could easily maintain that lead and the large margin lends confidence to its judgement (for hypothetical superhuman play in finishing such a game). On the other hand, mere humans routinely make 10+ point blunders (or a series of smaller mistakes that add up to such a large swing), which can leave the outcome up in the air, and possibly much closer to a coinflip.
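
To put rough numbers on that gap, here is a toy Python sketch. It assumes (purely for illustration) that the net score swing over the rest of the game is normally distributed with mean zero; the function name `win_probability` and the example standard deviations are made up here and are not measured from real games or taken from any AI.

```python
import math


def win_probability(lead: float, swing_sd: float) -> float:
    """Chance the leader still wins, assuming the remaining net score
    swing is Normal(0, swing_sd). A toy model, not how any AI works."""
    if swing_sd <= 0:
        raise ValueError("swing_sd must be positive")
    # Standard normal CDF evaluated at lead / swing_sd.
    return 0.5 * (1.0 + math.erf(lead / (swing_sd * math.sqrt(2.0))))


# Hypothetical near-perfect play: tiny swings, a 10-point lead is safe.
strong = win_probability(lead=10, swing_sd=2)    # above 0.99

# Hypothetical blunder-prone play: large swings, same 10-point lead.
human = win_probability(lead=10, swing_sd=20)    # about 0.69
```

Under these assumed numbers the same 10-point lead is nearly decisive for the strong player but only a modest edge for the error-prone one, which is the disparity described above.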

It’s easy to find many examples where the AI wildly swings (even several times) over the course of a human game:

[5 images]

Ultimately, I think that using AI to judge timed-out games could cause more harm to the rating system by injecting noise in the form of prematurely called games. I’d rather have those games be annulled, even though I’m not a big fan of that either (as seen by my arguments against that in other threads). However, I’ve softened my view on that, since it seems credible that the concerns about abuse could be addressed by punishing those who intentionally game that system.

I don’t know if you are new to OGS or just new to this topic, which has been discussed many times, but let me summarize it again, as the discussion is over 2 years old.

  1. You will NEVER get a win if you are the one who times out, whether you are ahead or behind.
  2. Currently, the one who times out IN A SERIES loses the first game, and the rest are annulled. Many people have different feelings about this.
  3. The suggestion is to use AI. If a player is behind when he times out, he loses the game. If he is ahead when he times out, the game is annulled as before.
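
As a rough illustration of point 3, the proposed rule boils down to a per-game decision like the Python sketch below. The names `Outcome` and `resolve_timeout` and the 50% cutoff for "behind" are assumptions made for illustration, not anything OGS implements. The win rate is assumed to be the AI estimate from the point of view of the player who timed out.

```python
from enum import Enum


class Outcome(Enum):
    LOSS = "loss"          # ranked loss for the player who timed out
    ANNULLED = "annulled"  # game does not count toward ratings


def resolve_timeout(winrate_timed_out: float, cutoff: float = 0.5) -> Outcome:
    """Hypothetical version of the proposed rule for one timed-out game.

    winrate_timed_out: AI win-rate estimate (0..1) for the player who
    timed out. cutoff: assumed boundary between "behind" and "ahead".
    """
    if not 0.0 <= winrate_timed_out <= 1.0:
        raise ValueError("win rate must be between 0 and 1")
    if winrate_timed_out < cutoff:
        return Outcome.LOSS      # behind: counts as an implicit resignation
    return Outcome.ANNULLED      # ahead (or even): annulled, never a win
```

Note that no branch returns a win for the player who timed out, which matches point 1.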

This is not a question of whether AI is perfect or whether AI can predict the result of a game. It’s about using AI to resolve a situation: quite a few people think annulling ALL games is not the best answer to timeouts. Letting AI award some games to the proper winner is an improvement over that.

Neither. I was an active member and admin of OGS back in the late 2000s, when the rule unranking consecutive timeouts was introduced in an attempt to address the problem of people leaving the site for a while and messing up the ratings. I read this thread but did not comment on that underlying aspect, because the current moderators asked to keep that topic out of this thread. That does seem rather restrictive, though: it is only by understanding the problem that rule is trying to solve, and its pros and cons, that an adjustment to use an AI arbiter can be well evaluated. (Is the cure worse than the disease? Has the relative abundance of innocent serial timeouts vs. purposeful abuse for rank manipulation changed? Does OGS now have greater moderator manpower, such that manual intervention in serial timeout cases is preferable to an automated system?)

As @flovo correctly points out, adding such an asymmetrical rule will lead to deflation of the rank of the serial timeouter, which is exactly the problem the current rule is intended to avoid. So if you want to not annul the serial timeouts but still avoid the mass perturbation to the ranking system you need to award both wins and losses, which will be evenly distributed on average (in the case of innocent timeouts rather than intentional rank manipulation).
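
A back-of-the-envelope Python sketch of that argument follows. It assumes each timed-out game is independently "ahead" for the absent player with some probability; `expected_net_results` and its parameters are hypothetical names used only for this illustration.

```python
def expected_net_results(n_games: int, p_ahead: float, award_wins: bool) -> float:
    """Expected (wins - losses) credited to a player who times out of
    n_games, each of which the AI judges them ahead with chance p_ahead.

    award_wins=False models the asymmetric rule (ahead -> annulled).
    award_wins=True models the symmetric rule (ahead -> win).
    """
    if n_games < 0 or not 0.0 <= p_ahead <= 1.0:
        raise ValueError("need n_games >= 0 and 0 <= p_ahead <= 1")
    expected_losses = n_games * (1.0 - p_ahead)
    expected_wins = n_games * p_ahead if award_wins else 0.0
    return expected_wins - expected_losses


# Ten timed-out games, evenly matched on average (p_ahead = 0.5):
asymmetric = expected_net_results(10, 0.5, award_wins=False)  # -5.0
symmetric = expected_net_results(10, 0.5, award_wins=True)    # 0.0
```

With evenly matched games the asymmetric rule hands out five net losses on average, while awarding both wins and losses nets out to zero.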

I know, I can even provide the historical context; if memory serves it was luke, a 5d who timed out to 3k which precipitated this rule.

On principle I don’t like using an AI arbiter, and definitely not one as simple as win rate. MAYBE some of the more refined ideas, such as using point margin and whether the same player has been leading practically all game, could be made to work, but it still feels too interventionist and likely to cause as many problems as it solves. I don’t like the asymmetric lose-only rule, as it defeats the purpose of the rule, which is to avoid rank deflation. And I don’t like awarding wins, as it creates a perverse incentive to time out to force a win on your opponent if you are leading by enough.

This situation feels rather like there is a rule that you need to punch yourself in the face, and now we are debating about how to put on some gloves so it doesn’t hurt so much, rather than asking why we have to punch ourselves in the face and could we stop doing that.

It’s also a large development effort, and I’d rather that precious resource be directed at long-standing deficiencies in core functionality, such as the inability to request an undo on your own move (see the topic "Allow undo request on your move") or the poor placement of the submit move and pass buttons.

So I guess it is back to the no rank deflation (current system) vs. all-the-way deflation (lose all) debate again, now with an in-between deflation (AI).

But I agree, if the admins have more important things to do, then this is not a priority, as no one had brought up the topic for 2 years.

The legitimate application of this rule should be rare, and I, for one, would be more concerned for the well-being of my opponent than for the result of the game.

The real problem, I believe, is the abuse of the rule. And lack of information lies at the heart of the problem. Such abuse is rarely reported and only a little less rarely results in a warning (several cases I know of) or ban (in one or two cases IIRC). In most of the cases I know of, the abuse was discovered accidentally when a game history was being examined for other reasons.

One idea to fill the information gap is to automate a report system, so a system-generated report would be issued whenever the rule was applied. Then a mod could check to see if it was legitimate. Of course, they would probably have to wait several days to see the player’s pattern of activity (or inactivity). Whether such a system would be practical would depend on the volume of reports.

Hello, everyone. I am replying to this initial suggestion by Flovo at this time as I believe the idea from 2019 is greatly bolstered now, in 2021, by two significant OGS updates:

A) The ranking system has had a profound update that greatly improved its stability, so major shifts in rank are less of an issue and counting timeouts should be considered, and…

B) The AI system has just had a significant boost in capacity and power, better permitting its practical use to resolve this issue.

My opinion: I agree with the potential effectiveness of this method to assist with timeouts in correspondence games. I agree with Eugene’s initial reply here that, if a player times out of a game they were losing anyway (as determined by AI), then it’s an implicit resignation and should be ranked and counted.

Any new thoughts on this subject given the new OGS milieu? Thank you.

What about high handicap games, where white is behind by only 10 points at the end of the middle game when he times out?

The AI would probably say that black has a 90+% winning chance, but it’s quite likely that white would have won if the game had continued to scoring.

I am a bit afraid that such a system, using AI to conclude games in the case of mass timeouts, would be an incentive not to play games to the end more often than is the case now. (By the way, using AI to score a game under Japanese rules seems much more interesting to me to implement; see the relevant topic.)

I think KataGo is quite capable of evaluating 99.999% of games under Japanese rules correctly (if given sufficient playouts).