The integrated AI Review feature for OGS

mark5000 · April 30, 2019, 8:17pm

You asked if people agreed with your table. If you don’t want to discuss it now, that’s fine too. I thought your ideas were interesting and relevant to the upcoming AI review feature. A system that judges moves as questionable/mistake/blunder would be really cool.

(Emphasis mine.) For game changing moves, using the raw number makes sense to me.

frolag · April 30, 2019, 9:35pm

Sure, we can discuss it. First of all, I would distinguish between negative changes and positive changes. Negative changes correspond to player mistakes, and positive changes correspond to LZ confusion. Right now they’re all combined under “Top 4 moves”. After the bug fix, LZ confusion should drop below 10% and rarely appear as a top changing move, so maybe it doesn’t matter, but ideally they should be separated as “Top 4 mistakes” and “Top LZ confusion” because they’re really different. The Top LZ confusion section is optional because it’s more about LZ than about the players, but maybe it’s interesting. Theoretically it should contain positions with complicated fights or ladder assessment errors. Another suggestion would be further separating “Top 4 mistakes” as “Top 4 white mistakes” and “Top 4 black mistakes” in case someone only wants to look at their mistakes and not their opponent’s.
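To make the proposed split concrete, here is a minimal sketch, not the way OGS actually builds its list. It assumes each move comes with the change in the mover's own winrate, in percentage points. The `Change` record and the `split_changes` function are hypothetical names made up for this illustration.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical record: one move and its effect on the mover's winrate."""
    move: int     # move number
    color: str    # "B" or "W", the player who made the move
    delta: float  # change in that player's winrate, in percentage points


def split_changes(changes, n=4):
    """Separate player mistakes (negative deltas) from LZ confusion (positive deltas)."""
    # Most negative first: the biggest winrate drops are the worst mistakes.
    mistakes = sorted((c for c in changes if c.delta < 0), key=lambda c: c.delta)
    # Most positive first: the biggest jumps are where LZ misjudged the position.
    confusion = sorted((c for c in changes if c.delta > 0), key=lambda c: -c.delta)
    return {
        "black_mistakes": [c for c in mistakes if c.color == "B"][:n],
        "white_mistakes": [c for c in mistakes if c.color == "W"][:n],
        "lz_confusion": confusion[:n],
    }
```

Keeping the three lists separate means a player can read only their own mistakes, and the confusion list can be hidden without touching the other two.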

Now, what you’re asking is to classify negative changes as questionable/mistake/blunder. I proposed simple cutoff values, but you’re right that this is not perfect. If we want to do better, then a natural approach is to come up with a formula for “mistake severity”, and then apply cutoff values to that. “Curving the table” probably means using f(winrate_after) - f(winrate_before) for some curving function f (to be determined). Maybe the formula could also involve the move number, in case we need to make adjustments depending on the phase of the game (opening, mid-game, or end-game). This is tricky because these phases don’t necessarily occur at a specific move number.
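As a rough sketch of the "curve, then cut off" idea: the curving function f is still to be determined, so the code below plugs in the logit function purely as an example, because it stretches winrates near 0% and 100%. The cutoff values and the names `curved_change` and `classify` are invented for illustration and are not tuned against real games.

```python
import math


def logit(p):
    """Example curving function f; clamps p so 0% and 100% stay finite."""
    p = min(max(p, 0.01), 0.99)
    return math.log(p / (1 - p))


def curved_change(winrate_before, winrate_after, f=logit):
    """f(winrate_after) - f(winrate_before); negative when the move lost ground."""
    return f(winrate_after) - f(winrate_before)


# Hypothetical cutoffs on the curved loss, checked from most to least severe.
CUTOFFS = [(1.0, "blunder"), (0.5, "mistake"), (0.2, "questionable")]


def classify(winrate_before, winrate_after, f=logit):
    """Label a move by how much curved winrate it gave up."""
    loss = -curved_change(winrate_before, winrate_after, f)
    for threshold, label in CUTOFFS:
        if loss >= threshold:
            return label
    return "ok"
```

With this choice of f, a drop from 95% to 85% counts as more severe than a drop from 55% to 45%, even though both are 10 points. Whether that matches point-loss is exactly the open question; swapping in a different f only requires passing another function.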

What I wrote so far is still pretty open-ended. You mentioned trying to emulate point-loss. This may be a good starting point since we can apply a scientific approach to it (by running experiments in LZ). Alternatively, maybe someone with a lot of experience reviewing games with LZ will have some ideas.

Eugene · April 30, 2019, 11:53pm

I have asked this question and had it answered this way: the %-to-win that makes you think that the move is “better” is only determined one way (let’s say it comes from the network determining position-goodness). It is like “the % to win if you don’t read ahead”.

The actual best move to play, the one the bot will choose, is the one that had the most subsequent best moves following it, which is the one it has explored the most.

There appear to be some logical holes in this explanation, I’m just telling you what I read and hoping someone else will leap in and correct or fill the gaps.

EDIT: actually, let’s have a separate thread on that topic.

hqrpie · May 2, 2019, 6:59am

Many thanks for your answer! There is still something I don’t understand. When you say that LZ has not seen black 43, how come the first choice it recommends after white 42 is precisely black 43?

After white 42, odds for black are 48.5%, OK, but LZ recommends black 43 which, when played, changes the odds dramatically in favour of black. So I don’t understand where LZ blunders here.

BHydden · May 2, 2019, 8:40am

liminal · May 2, 2019, 1:52pm

Will this also aid in identifying players who are using bots to gain unfair advantage during games?

hqrpie · May 2, 2019, 2:34pm

I did not know this existed. So lame

flovo · May 2, 2019, 2:36pm

I think it will not be much help.

The only bot one can identify with it is LZ with the same (or similar) net as the deployed one. And this only if the user uses the bot for big parts of the game.

It’s not easy to distinguish between original and suggested moves.

frolag · May 2, 2019, 8:33pm

The LZ blunder is very apparent from your description. Intuitively, if black can force 90.1% odds by playing black 43, then the odds should already be 90.1%. Mathematically, in the theory of two-player zero-sum games, the value of a position is the maximum value of child nodes when the player gets to play, and the minimum value when the opponent gets to play. The relevant Wikipedia article is Minimax. The LZ values in the preview severely disobey the max equation: plugging them in gives 48.5 = max(90.1, other_values).
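The max equation can be checked in a few lines. This is only a sketch of the minimax rule using the two numbers quoted above; the other candidate moves are left out, since adding them could only raise the maximum further. The function names are made up for this illustration.

```python
def minimax_value(child_values, player_to_move=True):
    """Value of a position: max over children on our turn, min on the opponent's."""
    return max(child_values) if player_to_move else min(child_values)


def max_equation_gap(parent_value, child_values):
    """How far the reported parent value falls short of its best child.

    A consistent evaluation gives a gap of zero; a large positive gap means
    the parent position was underestimated.
    """
    return minimax_value(child_values) - parent_value


# Values from the thread: 48.5% for black after white 42, 90.1% after black 43.
gap = max_equation_gap(48.5, [90.1])
print(f"LZ underestimated the position by {gap:.1f} points")
```

A consistent evaluation would give a gap of zero, so a gap of about 41.6 points shows that the misjudgement happened at the position after white 42, before black 43 was played.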

Panchajanya · May 3, 2019, 12:02pm

holy moly this sounds awesome!

QuanLoh · May 7, 2019, 3:33am

Hmm, isn’t this going to be so intense that it might overload the server?

Eugene · May 7, 2019, 5:12am

It’s farmed off to other servers.

Vsotvep · May 14, 2019, 6:56pm

I see the thing is being implemented on the main site? Pressing the “Full AI Review” button gives me an Error 500 though…

Also, a suggestion: perhaps it would be nicer to give each player their two worst moves, instead of giving the top three game-changing mistakes across both players.

anoek · May 14, 2019, 7:29pm

I’ve squashed a few bugs recently, you shouldn’t be getting any 500’s now. Can you try again and let me know if it’s working for you? If not, shoot me a link to the game and I’ll get it figured out. Sorry about that!

Vsotvep · May 14, 2019, 7:39pm

Actually, without clicking anything, it did compute the whole game after all! But I received 3 notifications for it.

I’ve tried another one of them, and it still gives me the 500…

flovo · May 14, 2019, 7:47pm

Pressing full ai review still results in 500.

anoek · May 14, 2019, 7:49pm

Ooops, yeah I found another issue… ahh release day. Try one* more time?

(* may be more than one more time)

Vsotvep · May 14, 2019, 7:54pm

“Analysis started”

Seems to work fine now! And a single notification when finished.

How many playouts does it do again? It’s pretty fast

anoek · May 14, 2019, 7:56pm

Yay! The number of playouts varies; you can see the table at the bottom of the supporter page: https://online-go.com/user/supporter

anoek · May 14, 2019, 7:57pm

(And we also have top-of-the-line hardware dedicated to the reviews.)