The integrated AI Review feature for OGS

Looking fine on Linux (64-bit)/Firefox 66.0.

[image: graph]

But this would mean that in this case there are many moves (those with letters and no green) that are better than that one but far less explored.
Does that make sense?

What you’re proposing is a refinement, which I’m afraid to discuss right now because it may detract from the more pressing issue that the underlying data seems to have errors. Your comment is also not specific to my table. If you’re saying that winrate change should be “curved” before being interpreted as mistake severity, then why not bring this up in the context of the “Top 4 moves” feature?

If that’s true, then you might have found the bug that I predicted exists. If winrate change is based on 1 playout and not 200, then it means that 99.5% of LZ’s effort is being discarded.
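
The 99.5% figure is just the share of playouts that would go unused. A tiny sketch of the arithmetic, assuming 200 playouts per position with only 1 feeding the winrate change (both numbers are from the post, the variable names are mine):

```python
# Assumption: 200 playouts are run per position, but only 1 is used
# when computing the winrate change.
playouts_total = 200
playouts_used = 1

discarded_fraction = 1 - playouts_used / playouts_total
print(f"{discarded_fraction:.1%} of the search effort is discarded")
```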

You asked if people agreed with your table. If you don’t want to discuss it now, that’s fine too. I thought your ideas were interesting and relevant to the upcoming AI review feature. A system that judges moves as questionable/mistake/blunder would be really cool.

(Emphasis mine.) For game changing moves, using the raw number makes sense to me.

Sure, we can discuss it. First of all, I would distinguish between negative changes and positive changes. Negative changes correspond to player mistakes, and positive changes correspond to LZ confusion. Right now they’re all combined under “Top 4 moves”. After the bug fix, LZ confusion should drop below 10% and rarely appear as a top changing move, so maybe it doesn’t matter, but ideally they should be separated as “Top 4 mistakes” and “Top LZ confusion” because they’re really different. The Top LZ confusion section is optional because it’s more about LZ than about the players, but maybe it’s interesting. Theoretically it should contain positions with complicated fights or ladder assessment errors. Another suggestion would be further separating “Top 4 mistakes” as “Top 4 white mistakes” and “Top 4 black mistakes” in case someone only wants to look at their mistakes and not their opponent’s.
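
To make the proposed separation concrete, here is a rough sketch. It assumes the review produces, for each move, the winrate change from the mover's point of view; the `MoveDelta` type and `split_changes` function are hypothetical names for illustration, not anything OGS exposes.

```python
from typing import NamedTuple


class MoveDelta(NamedTuple):
    move_number: int
    color: str    # "black" or "white": the player who made the move
    delta: float  # winrate change in percentage points, from the mover's view


def split_changes(deltas, top_n=4):
    """Separate player mistakes (negative changes) from LZ confusion (positive)."""
    # Most negative first: these are the biggest mistakes.
    mistakes = sorted((d for d in deltas if d.delta < 0), key=lambda d: d.delta)
    # Most positive first: the mover "gained" winrate, so LZ misjudged earlier.
    confusion = sorted((d for d in deltas if d.delta > 0), key=lambda d: -d.delta)
    return {
        "black_mistakes": [d for d in mistakes if d.color == "black"][:top_n],
        "white_mistakes": [d for d in mistakes if d.color == "white"][:top_n],
        "lz_confusion": confusion[:top_n],
    }
```

With this shape, a player who only cares about their own mistakes can look at a single list.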

Now, what you’re asking is to classify negative changes as questionable/mistake/blunder. I proposed simple cutoff values, but you’re right that this is not perfect. If we want to do better, then a natural approach is to come up with a formula for “mistake severity”, and then apply cutoff values to that. “Curving the table” probably means using f(winrate_after) - f(winrate_before) for some curving function f (to be determined). Maybe the formula could also involve the move number, in case we need to make adjustments depending on the phase of the game (opening, mid-game, or end-game). This is tricky because these phases don’t necessarily occur at a specific move number.
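
A minimal sketch of what "curving, then cutoffs" could look like. Everything specific here is an assumption: the log-odds function is only one candidate for f, and the cutoff values are placeholders that would need tuning against real games.

```python
import math


def logit(winrate_percent):
    """One possible curving function f: log-odds of a winrate given in percent."""
    p = min(max(winrate_percent / 100.0, 1e-6), 1 - 1e-6)  # clamp to avoid log(0)
    return math.log(p / (1 - p))


def curved_change(winrate_before, winrate_after, f=logit):
    """f(winrate_after) - f(winrate_before); negative when the mover lost ground."""
    return f(winrate_after) - f(winrate_before)


# Hypothetical cutoffs on the curved scale, largest first.
CUTOFFS = [(1.5, "blunder"), (0.75, "mistake"), (0.25, "questionable")]


def classify(winrate_before, winrate_after, f=logit):
    """Label a move by how far its curved winrate change drops below zero."""
    severity = -curved_change(winrate_before, winrate_after, f)
    for threshold, label in CUTOFFS:
        if severity >= threshold:
            return label
    return "ok"
```

Passing `f=lambda w: w` recovers the raw winrate change, in which case the cutoffs would have to be restated in percentage points. A move-number term for game phase is left out because there is no obvious way to define it yet.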

What I wrote so far is still pretty open-ended. You mentioned trying to emulate point-loss. This may be a good starting point since we can apply a scientific approach to it (by running experiments in LZ). Alternatively, maybe someone with a lot of experience reviewing games with LZ will have some ideas.

I have asked this question and had it answered this way: the %-to-win that makes you think that the move is “better” is only determined one way (let’s say it comes from the network determining position-goodness). It is like “the % to win if you don’t read ahead”.

The actual best move to play, the one the bot will choose, is the one that had the most subsequent best moves following it, which is the one it has explored the most.

There appear to be some logical holes in this explanation, I’m just telling you what I read and hoping someone else will leap in and correct or fill the gaps.
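
As I understand the explanation, the difference between "highest %" and "most explored" can be sketched like this. The candidate data and function names are made up for illustration; this is not LZ's actual code.

```python
def most_explored(candidates):
    """Pick the move with the most visits (the rule described above)."""
    return max(candidates, key=lambda move: candidates[move]["visits"])


def highest_winrate(candidates):
    """Pick the move with the highest reported winrate, ignoring visits."""
    return max(candidates, key=lambda move: candidates[move]["winrate"])


# Hypothetical search output: move -> visit count and winrate in percent.
candidates = {
    "A": {"visits": 150, "winrate": 52.0},
    "B": {"visits": 3, "winrate": 60.0},
}

print(most_explored(candidates), highest_winrate(candidates))
```

Here move B shows a higher percentage, but it rests on only a few visits, so a visit-based rule still picks A.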

EDIT: actually, let’s have a separate thread on that topic.

Many thanks for your answer! There is still something I don’t understand. When you say that LZ has not seen black 43, how come the first choice it recommends after white 42 is precisely black 43?

After white 42, odds for black are 48.5%, OK, but LZ recommends black 43 which, when played, changes the odds dramatically in favour of black. So I don’t understand where LZ blunders here.

Will this also aid in identifying players who are using bots to gain unfair advantage during games?

I did not know this existed. So lame

I think it will not be much help.

The only bot one can identify with it is LZ with the same (or similar) net as the deployed one. And this only if the user uses the bot for big parts of the game.

It’s not easy to distinguish between original and suggested moves.

The LZ blunder is very apparent from your description. Intuitively, if black can force 90.1% odds by playing black 43, then the odds should already be 90.1%. Mathematically, in the theory of two-player zero-sum games, the value of a position is the maximum value of child nodes when the player gets to play, and the minimum value when the opponent gets to play. The relevant Wikipedia article is https://en.wikipedia.org/wiki/Minimax. The LZ values in the preview severely disobey the max equation: plugging them in gives 48.5 = max(90.1, other_values).
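
The check can be written out as a short sketch. The 48.5 and 90.1 values are from the preview; the other child value and the tolerance are made-up placeholders, and the function names are mine.

```python
def minimax_value(child_values, player_to_move=True):
    """Value of a position: max over children on our move, min on the opponent's."""
    return max(child_values) if player_to_move else min(child_values)


def violates_max_equation(parent_value, child_values, tolerance=1.0):
    """True if the reported parent value is far from the best child value."""
    return abs(parent_value - minimax_value(child_values)) > tolerance


# Black to play after white 42. All values are black's winrate in percent.
reported_parent = 48.5
children = [90.1, 47.0]  # black 43 gives 90.1; the 47.0 is a hypothetical alternative

print(minimax_value(children))                           # what the parent should be
print(violates_max_equation(reported_parent, children))  # inconsistency check
```

Whatever the other values are, the maximum cannot be lower than 90.1, so a reported 48.5 cannot satisfy the equation.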

holy moly this sounds awesome!

hmm,
isn’t this going to be so intense that it might overload the server?

It’s farmed off to other servers.

I see the thing is being implemented on the main site? Pressing the “Full AI Review” button gives me an Error 500 though…

Also, a suggestion: perhaps it would be nicer, instead of giving the top three game-changing mistakes by either player, to give each player their two worst moves.

I’ve squashed a few bugs recently, so you shouldn’t be getting any 500s now. Can you try again and let me know if it’s working for you? If not, shoot me a link to the game and I’ll get it figured out. Sorry about that!

Actually, without clicking anything, it did compute the whole game after all! But I received 3 notifications for it.

I’ve tried another one of them, and it still gives me the 500… :thinking:

Pressing “Full AI Review” still results in a 500.

Ooops, yeah I found another issue… ahh release day. Try one* more time?

* may be more than one more time
