[poll] Balancing the new AI score estimator

UPDATE: 2021-05-17T06:00:00Z

Thanks for all the feedback everyone. The new KataGo Score Estimator will not be made available in any form to players while playing a game, nor to spectators coming from the same IP address (so one can’t simply open another browser/private tab to have access to it). For the time being we will still allow players to use the old score estimator when analysis hasn’t been disabled, just like it’s always been.

Looking forward, I think the longer term answer might be to replace the old score estimator with a proper scoring tool for use in analysis and review mode, one that makes it quick for the user to mark out territory and the status of stones, but is “dumb” in the sense that it doesn’t try too hard to figure out life and death. Some balance between convenience and “too smart” will have to be struck there; we’ll work that out when the time comes.

Rationale:

At the time of this update the Nerf poll received 962 votes from 762 people. Because the poll allowed people to select more than one response, I did some additional queries against the database to try and tease out what to do by separating out the players that wanted some form of score estimation from those that wanted no score estimator ever, from those that selected a mixture of both or “I don’t care”.

45% Want at least one of the score estimator options but did not click "Do not ever allow" or "I don't care"
36% Selected "Do not ever allow" but not any other options
19% Selected "I don't care" and/or "Do not ever allow" and one of the other options
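
For anyone curious how a multi-select poll can be split into these three groups, here is a rough sketch. It is not the actual database query I ran; the option keys, the `classify` and `summarize` helpers, and the sample voters are all made up for illustration, and I am assuming that anyone who fits neither of the first two groups lands in the third.

```python
from collections import Counter

# Hypothetical option keys; the real poll stores its choices differently.
ESTIMATOR_OPTIONS = {"nerf", "current_only_fractional", "analysis_no_fractional", "full"}
NEVER = "never"
DONT_CARE = "dont_care"


def classify(selections):
    """Place one voter's set of selections into one of three groups."""
    wants_estimator = bool(selections & ESTIMATOR_OPTIONS)
    if wants_estimator and NEVER not in selections and DONT_CARE not in selections:
        return "wants_estimator"
    if selections == {NEVER}:
        return "never_only"
    # Assumption: everything else counts as mixed or indifferent.
    return "mixed_or_indifferent"


def summarize(votes):
    """Return the percentage of voters that fall into each group."""
    counts = Counter(classify(s) for s in votes.values())
    return {group: 100 * n / len(votes) for group, n in counts.items()}


# Made-up sample, not the real poll data.
sample = {
    "alice": {"nerf", "full"},
    "bob": {"never"},
    "carol": {"never", "nerf"},
    "dave": {"dont_care"},
}
print(summarize(sample))
```

Grouping by voter rather than by vote is what keeps someone who ticked several boxes from being counted more than once.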

This would seem to indicate that the majority of people do want some form of score estimator. However, the next poll, with 554 unique voters and votes, conveys a slightly contradictory message:

50% Want the score estimator disabled by default
42% Want a nerfed score estimator enabled by default
8% didn't care

So people want some form of a score estimator, they just want it disabled by default, which somewhat implies that they don’t actually want it in their games perhaps.

The other trend I noticed reading through comments is that several folks, including myself, changed their opinion from “Sure whatever if people want it let them use it” to “KataGo’s estimation is too good for this, even nerfed”, whereas I didn’t see anyone change their mind in the other direction.

I think what this tells me is that we shouldn’t have any form of KataGo score estimation for players. However, there is certainly a large fraction of people who do like and use a score estimator sometimes for quick counting and that sort of thing, so for that reason I wanted to keep what we had for them, at least until I can make something better-but-not-too-good for those of us who don’t like counting. There are a lot of problems with the old score estimator, namely that it’s a source of confusion for beginners, which I don’t like, so I will eventually want to figure out something better. For now I think this is probably the right balance: the old score estimator during play when it’s enabled, just like it’s always been; the new AI score estimator for kibitzers and end-of-game scoring; and eventually some better tooling to replace the need for the old score estimator.

[end update]


For those catching up, we have a new score estimator that’s powered by KataGo. This new system is significantly stronger than our old score estimator, so much so that it raises the question of if and when it should be available. Yesterday I threw together a quick poll for some feedback, and as a result a lot of good ideas came up and much discussion was had. You can play with the new score estimator by using the score estimator after a game is over, or during a review or uploaded game. Currently if you use the score estimator during the game, the old “bad” score estimator is used.

In summary, what we’re trying to balance is providing a useful tool for players and spectators without making it so useful that it’s like getting a hint from KataGo as to where the best place to play would be. Here is my current proposal, subject to further modification based on feedback.

Proposal:

Nerf the score estimator for players of ranked games while the game is in progress

While a ranked game is in progress, the players of the game will have a different experience than spectators do, and a different one than they themselves will have after the game is over. Specifically:

  1. The score estimator will only be able to estimate the current position. Specifically, players will not be able to estimate analyzed positions (so a player can’t lay out a few stones in analysis mode and click estimate to get KataGo’s view on what that would do to the game).
  2. The score estimator will not show fractional influence, only a very hard view of predicted territory. You can see the difference illustrated in these two pictures:

    This, I feel, reduces the amount of hinting that the player can get from predicted territory, while still remaining somewhat useful as a general counting / estimation tool.
  3. For unranked games, the score estimator will not be nerfed. The score estimator can still be disabled entirely for these games if desired.
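
To make the “hard view” idea in item 2 concrete, here is a minimal sketch of collapsing fractional influence into plain territory. This is only an illustration and not how the site actually does it: I am assuming a per-intersection ownership value between -1 and 1 with positive meaning Black, and the `harden` function and its 0.5 threshold are made up.

```python
def harden(ownership, threshold=0.5):
    """Collapse fractional ownership into 'B', 'W', or '.' per intersection.

    Assumption: values run from -1.0 (White) to 1.0 (Black). Anything
    weaker than the threshold is shown as neutral instead of shaded.
    """
    rows = []
    for row in ownership:
        cells = []
        for value in row:
            if value >= threshold:
                cells.append("B")
            elif value <= -threshold:
                cells.append("W")
            else:
                cells.append(".")
        rows.append("".join(cells))
    return rows


# Made-up 2x3 corner of a board, purely for illustration.
print(harden([[0.9, 0.2, -0.7], [0.6, -0.1, -0.95]]))
```

The point of the threshold is that a player sees who probably owns an area, but not how shaky that ownership is, which is where most of the hinting would come from.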

Spectators still have access to the “full” score estimator

  1. Spectators will have access to the “full” score estimator, meaning they can estimate analyzed positions and view the fractional influence estimations to facilitate good kibitzing.
  2. Spectators coming from the same IP as a player will have the same nerfed capabilities as a player, meaning you won’t be able to simply open up a different browser or a private tab to view the full score estimator. Players that work around this limitation and use multiple devices from different IP addresses will be considered cheating and thus will be subject to a ban.
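
Put together, the proposed rules for who gets which capabilities could be sketched roughly like this. It is a sketch of the proposal as written above, not actual server code; the `Capabilities` type, the `capabilities_for` function, and its parameter names are all hypothetical, and the case where the score estimator has been disabled entirely for a game is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    estimate_analyzed_positions: bool
    fractional_influence: bool


FULL = Capabilities(estimate_analyzed_positions=True, fractional_influence=True)
NERFED = Capabilities(estimate_analyzed_positions=False, fractional_influence=False)


def capabilities_for(is_player, shares_ip_with_player, ranked, in_progress):
    """Return the estimator capabilities a viewer would get under the proposal."""
    if not in_progress:
        return FULL  # after the game is over, nothing is restricted
    if not ranked:
        return FULL  # unranked games are not nerfed
    if is_player or shares_ip_with_player:
        return NERFED  # players, and spectators on a player's IP
    return FULL  # ordinary spectators keep the full estimator


# A spectator in a private tab on the same IP as a player of a live ranked game.
print(capabilities_for(False, True, True, True))
```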

With these modifications, the open questions I have are: Is the nerf a good idea, does it go far enough, too far, or is it just right? The second question is if we nerf the feature a bit, should it be available during ranked games by default?


Nerf poll

Please select all of the options you think would be acceptable for score estimator use by players in ranked games. Note that the score estimator can be disabled entirely; this poll is about games where the players have allowed the score estimator.
  • The above nerf proposal - restrict score estimation to only the current board state (don’t allow estimation of analyzed moves) and do not show fractional influence
  • Restrict score estimation to non-analyzed moves but show the fractional influence estimations
  • Allow estimating analyzed moves but do not show fractional influence
  • Full score estimator capabilities - Allow estimating analyzed moves and allow fractional influence
  • Do not ever allow score estimation by players at all during ranked games
  • I don’t care



Presuming that the responses above indicate that we should allow some version of a nerfed score estimator for ranked games, should that nerfed score estimator be available by default for automatch and custom games?

Note: for automatch games, the default would actually be “no preference” - meaning that by default if a player clicks automatch, they’ll match with an opponent that has indicated a preference one way or another and that preference will be used - if both players have not indicated a preference, the default will be used, which is what this vote is about. So in any case, everyone should have a pretty easy time finding games regardless of their preference; what we’re voting on is the default experience for new players or players who haven’t specified their preference.

  • Yes the nerfed version of the score estimator should be available for games by default (similar to how things currently are)
  • No, the score estimator should be disabled by default
  • I don’t care
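
The “no preference” matching described in the note above could look roughly like the following. This is only a sketch under my own assumptions: the `resolve_estimator_setting` function is hypothetical, `None` stands for “no preference”, and treating two conflicting preferences as a non-match is my guess, since the note doesn’t spell that case out.

```python
from typing import Optional


def resolve_estimator_setting(
    pref_a: Optional[bool], pref_b: Optional[bool], site_default: bool
) -> Optional[bool]:
    """Decide whether the score estimator is enabled for an automatch pairing.

    None means "no preference". Returns True or False for the game setting,
    or None when the two players should not be paired (assumed behavior).
    """
    if pref_a is None and pref_b is None:
        return site_default  # nobody cares, so the site default applies
    if pref_a is None:
        return pref_b  # defer to the player who has a preference
    if pref_b is None:
        return pref_a
    return pref_a if pref_a == pref_b else None  # conflict: assumed no match


# A new player with no preference meets someone who wants the estimator off.
print(resolve_estimator_setting(None, False, site_default=True))
```

Under this reading the site default only ever decides games between two players who never touched the setting, which is why the vote is mostly about the new-player experience.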



For unranked games, should players have access to the full score estimator?

  • Yes, for unranked games let players use the full score estimator like spectators
  • No, unranked games should have the same nerfed score estimator as ranked games.
  • I don’t care


22 Likes

I understand words available and not available
I understand by default and not by default
But “available by default” seems surreal to me

What does that mean for custom games?
If it will not “be available by default”, does it mean that I can still turn it on when creating a game?

2 Likes

Good work devs!

I guess “by default” implies that it could still be an option that could be changed. However, whatever is default will impact the relative prevalence.

On the other hand, I imagine that the default would also be what is used for ladders and tournaments (at least the automatic site-wide ones)? Or will that be a separate decision to be made?

Should the score estimator be available for ladder and automatic site-wide tournament games?
  • Yes
  • No

Also please don’t make KataGo score estimation (or improved) available for free games. Many IRL tournament games are played as unranked, and this makes cheating easier. We don’t want that, do we? If we want to cherish a culture of learning and improvement, we shouldn’t be pushing in the direction of normalizing reliance on outside help, be it the analyze board or full AI or only AI score estimation.

16 Likes

I think as a tournament director you should specify that in your game creation instructions. I’ve played in tournaments where they say make it unranked, turn analysis off, make sure it’s not private etc.

OK, some people might do it by accident, some intentionally to try to abuse the system, but the tournament should have a plan for these cases. I don’t think it’s enough to say turn this nice feature off for people that want to use it when it won’t affect the rating system, and it shouldn’t affect a properly organised tournament.

6 Likes

Estimating analyzed moves with the new score estimator should not be allowed, because it’s a way to get a pure bot move.
But, I think it should be allowed with old estimator.

That brings me to a next question: Will the choice of one player enforce the choice of his opponent?

Also, if tournament organizers disagree about rules, who takes precedence? Will OGS have an overhaul 5 times a year, each time it hosts AGA, EGF, RGF etc?
OGS should have what it wants, and the game organizers should just host on a platform and manage the games in their tournaments as they see fit, on their own. The same way they claim to use their own tools to catch cheaters etc. The platform should be complementary, not a substitute for organizers ensuring a good experience for the players.

1 Like

So what if OGS wants to host these tournaments?

If you are making a point, I fail to see it.

Honestly, this is just stupid.

Ranked games should replicate real life circumstances, i.e. two players sitting across a board. No analysis, no score estimator, no nothing. Just your eyes, brain, and stones on the board. Otherwise, what is the meaning of the word ‘ranked’, other than the rank of your skill level - unassisted. By allowing any of these functions, we are saying, here at OGS rank doesn’t mean much, it is an arbitrary number we assign to a combination of a player’s ability and various other utilities the server offers. If you want people to take your server seriously (and stick around), you have to take it seriously yourself first.

Seems like a whole bunch of pussyfooting around the obvious.

16 Likes

image

17 Likes

completely agree, and not a minor point

If the option is a shared setting (both play with SE or without) in unranked games, that could help for auto discipline in tournaments

2 Likes

I think we should take off the score counting too at the end, make it so you have to manually take captures off the board and rearrange the territory to count too.

We should also turn off the stone hover on the mouse cursor, and turn off any popups that forbid kos, superkos, suicides etc from being played.

9 Likes

Nice attempt at a rebuttal, but none of those things inform the player as to which moves to make. You might be able to take an example to the extreme, but that doesn’t mean it makes sense within context, which yours doesn’t.

5 Likes

I’m not rebutting that the players might be able to use a score estimator to decide on a move, I’m rebutting your idea that we should be replicating IRL play over a physical board, which I think is a bit silly.

2 Likes

Think the meaning of my post went over your head. The ‘replicate real life’ circumstances has nothing to do with the physical reality of the board, only the tools the players have at their disposal.

4 Likes

So you play a game and a stronger player walks by and tells your opponent “look, you can kill that group” and you think fine, he didn’t say how.

3 Likes