Dwyrin is testing the AI / Suggestion for improvement

I agree with everyone here: nobody has talked about teachers.

It could.
But right now it says “look at this big game-changing move”, and when you look at the board you see that the player actually played the move that is both green and “A”. It’s meaningless.

We are all expecting to see mistakes but we don’t (well, sometimes we do, but very often the green + “A” overlaps the triangle).
Is this the “wrong expectation”?

Isn’t it strange that someone could say: “hey, buddy, you made a mistake, and that’s exactly the move I would’ve played and the best one in my opinion!”
Isn’t that nuts?

1 Like

It has never been presented as analysis of mistakes, and is explicitly not coded as such.

It is the moves that had the greatest impact on the game.

So to correct your example:
“Isn’t it strange that someone could say: ‘hey, buddy, you did a move that had a great impact and that’s exactly the move I would’ve played and the best one in my opinion!’
Isn’t that nuts?”

No, that is not nuts.
It is just the case that, looking from any one board position to the next, the biggest move made was a positive one, not a negative one.

If you were losing, and you played an amazing move that put you well ahead… even if you later ruined it again, that amazing move would be worth highlighting.

2 Likes

I think there was sufficient feedback in the AI announcement thread. The move +41.5pp in the example game was a big red flag. Before launch I’d expect either a fix or a successful rebuttal of the concern in the thread. Instead it was just… launched.

Before blaming the limited AI, one should make sure that it’s running correctly. The example game analysis was advertised as using 200 playouts, but its quality was certainly below that, so there is something to fix. If the AI is running correctly and OGS still can’t do a quality analysis for everyone, then how about performing a decent quality analysis of 1 game out of 20 at random (for non-subscribers)? That could be a better advertisement of the feature than a poor analysis of every game.

1 Like

If there was this amazing move available on the board, what does ‘you were losing’ mean?

5 Likes

There are no positive moves.
Win rate starts at 50%. Then you can only keep your win rate or lower it.

When you see a positive delta, it means that white made a mistake. A negative delta means that black made a mistake.

The top game-changing moves are all blunders.

1 Like

I think what nobody is talking about is that the 3 moves are essentially just a Leela hunch at first, from quickly looking over the game at something like 1 playout, and then those 3 moves are analysed a little deeper.

Think of it as Leela saying “these 3 moves are interesting and possibly where you went wrong”, but then, after looking at them a little deeper, finding that one of them was actually good.

Edit: here is anoek saying, I think, pretty much the same thing.

1 Like

I have already found it enlightening. It may be better for kyu players than for high dans.

3 Likes

Well, it’s called the “Top game changing moves”, so you’d expect it to show three moves that had a great impact on the game. I think that perhaps at 4 or 5 dan Leela will stall against opponents with these few playouts, but I’m still easily beaten by LZ even if she only plays 10 playouts or so; her “hunches” are usually pretty close to dan level.

Well, it is coded as such: it searches through the game quickly to find the moves that changed the balance most, and then computes the top three of those moves to see if they actually were game changers. There is no second step if she finds out she made the wrong choice, though, so it pretends to show the game-changing moves, but doesn’t catch the error even after she has found out herself that a move actually was good.

This is not what it does either: I’ve seen it mark moves that are +0.3% or something, virtually neutral moves so to speak.

And from the perspective of game theory, this is not really possible: if you were losing, there is no amazing move that will put you well ahead; that is the definition of losing. If there were such a move, you were simply winning but made a wrong judgement.

We don’t want to show Leela’s wrong judgement, though; we want to gain insight into the actual position. Otherwise we could just as well have picked a 10k-level bot to do the job.

I tend to disagree: Leela’s moves are quite high level, so the suggestions she makes are often only understandable if you’re a high-level dan.

Please don’t post misinformation. In the OGS feature, negative delta always means a mistake (whether white or black). This is easy to confirm by subtracting the winrates before and after.

In the first Dwyrin game analyzed by OGS, moves 64 and 138 (both even, so same player) show deltas of +16.9pp and -36.6pp.
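As a rough illustration of that sign convention, here is a minimal sketch (the function name and the numbers are made up, not taken from the OGS code): deltas are measured from the mover’s perspective, so a mistake is always negative, for either colour.

```python
# Illustrative sketch of the delta convention described above: winrates
# are taken from the mover's perspective, so a mistake shows up as a
# negative delta regardless of colour. Numbers here are made up.

def move_delta_pp(winrate_before_pp, winrate_after_pp):
    """Delta in percentage points, from the point of view of the
    player who made the move."""
    return winrate_after_pp - winrate_before_pp

# A move that improved the mover's position: positive delta.
assert move_delta_pp(30.0, 46.9) > 0

# A move that hurt the mover's position (a mistake): negative delta.
assert move_delta_pp(60.0, 23.4) < 0
```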

1 Like

A positive delta generally just means that the bot misevaluated the previous situation (in terms of the usual alpha-beta tree search that most people think of when reasoning about Go engines).

So really it’s just that this search uncovered some move that was not searched in previous iterations, severely changing the overall result.

2 Likes

Yes, and the high percentage means that it didn’t really judge the previous position well, which is because the score for the previous position is not investigated anymore after the first hunch (right?)

Here’s an efficient way to improve the hunches: use Monte Carlo Tree Search (MCTS).

Let’s call it the Hunch game, and call the game that is being reviewed the Go game.

  • Possible moves in the Hunch game are selecting a move in the Go game. Winning the Hunch game means selecting the largest game changing move in the Go game.
  • After selecting a move in the Hunch game, the simulation phase starts: LZ does another playout in the Go game for the selected move of the Hunch game and for the move before it (so this requires keeping a parallel LZ evaluation for each move in the Go game, which stays inactive until another playout is requested).
  • A simulation is considered a win if it has a winrate delta (between the two winrates that LZ just computed) that is amongst the highest (actually lowest, since it’s negative) 3 of the whole Hunch game.
  • Choosing which move of the Hunch game to explore more is done by the same heuristic as in normal MCTS.

This will converge towards the three most terrible moves of the Go game far more efficiently than the current approach, which spreads a uniform search over the whole Go game and then spends more time reading deeply without improving the Hunch game.

Using my suggested MCTS approach will both be a better heuristic for finding the optimal move in the Hunch game and compute LZ’s deeper reading of those optimal moves.

1 Like

Of course, in the case of a decreasing delta, it can be either

a) a suboptimal move

or

b) the same case of a mistaken hunch

Problem is, in these reviews you can only guess which one it is, based on whether you played their highest rated move or not.

But I like the system of determining the delta between the “winrate of the optimal move” and the “winrate of the actual move”.
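That guessing rule can be sketched in a few lines; the function and the move names are hypothetical, and only the branching logic is the point:

```python
# Sketch of the two possible causes of a decreasing delta described
# above. Move names and the function itself are hypothetical.

def classify_drop(actual_move, engine_best_move, delta_pp):
    """Guess why the winrate dropped after a move, using the only
    signal available in these reviews: whether the player matched
    the engine's highest-rated move."""
    if delta_pp >= 0:
        return "no drop"
    if actual_move == engine_best_move:
        # The player did what the engine wanted, yet the winrate fell:
        # the earlier estimate must have been a mistaken hunch (case b).
        return "mistaken hunch"
    # A different move was rated higher, so this may be case (a).
    return "possibly suboptimal"

assert classify_drop("Q16", "Q16", -12.0) == "mistaken hunch"
assert classify_drop("D4", "Q16", -12.0) == "possibly suboptimal"
```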

Hence my suggestion above for an improved way of finding a decreasing delta that comes from a suboptimal move, and not from a mistaken hunch.

1 Like

Yeah, I just added in an edit that I liked that solution :stuck_out_tongue:

Oh well, now everyone can know it wasn’t in the original post

1 Like

I don’t get it.

For what? And why MC?

What is the Hunch game, if not the reviewed game?

Not Monte Carlo; I meant specifically the search algorithm MCTS, but then a variant of it (much as LZ uses an MCTS variant to find the best move).

By the reviewed game, do you mean the game that was played, i.e. the game of Go between the two players?
I, however, want to pose finding the most game-changing move in terms of another game, since talking about games is a good way of talking about these kinds of algorithms. For example, the MCTS algorithm above is described in terms of game theory.

I suggest finding the most game-changing move by using an MCTS-inspired algorithm that does not rely on Monte Carlo random play in the simulation phase, but instead uses playouts by LZ to determine the winrate change.

I could describe it more step-by-step, if that is helpful.

I think my main problem at the moment is understanding what this other game is. I’m pretty sure it is not fully independent of the game being reviewed, but the way you describe it, it sounds largely independent to me.

Ok, let’s try to put it in other words, and describe it as an algorithm:

  • The goal is to find the most game changing move of a finished Go game.
  • First, we compute LZ’s estimate of the board at each position of the Go game, but we let LZ do only a single playout (i.e. it is a very bad guess)
  • Next, we compute for each position X of the Go game (except the first) the relative winrate: the difference between LZ’s winrate estimate after move X and LZ’s winrate estimate after move X-1 (making sure to look at the winrate of the player who played move X in both cases)
  • Now we repeat the following steps until the result is satisfactory:
    – Take the board position that requires more exploration (this depends on whether the position has been visited rarely by this algorithm, or has a high relative winrate; see the “Exploration and exploitation” section on the MCTS Wikipedia page)
    – Let LZ make another playout in her estimates of the chosen board position and of the one before it, and compute the new relative winrate
    – Update the board position so that it now has one more visit, and add the relative winrate to its total score (a high score means a high relative winrate)
  • After several of these rounds, pick the position that has been visited most.

It’s basically MCTS where the simulation phase starts immediately, and has been replaced by LZ doing an extra playout.
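The steps above can be sketched roughly as follows. This is only a toy illustration: `lz_estimate` is a made-up stand-in for a real Leela Zero call (it just returns a noisy winrate that gets less noisy with more playouts), and the exploration constant is arbitrary.

```python
import math
import random

def lz_estimate(pos, playouts):
    """Hypothetical stand-in for an LZ evaluation: winrate (in pp) of
    position `pos` after `playouts` playouts, noisier when few."""
    rng = random.Random(pos)               # fixed hidden "true" winrate
    true_wr = rng.uniform(20, 80)
    return true_wr + random.gauss(0, 10 / math.sqrt(playouts))

def biggest_game_changer(n_positions, rounds=300, c=20.0):
    # Steps 1-2: one cheap playout per position, initial winrate drops.
    playouts = [1] * n_positions
    wr = [lz_estimate(i, 1) for i in range(n_positions)]
    visits = [1] * n_positions
    total_drop = [0.0] * n_positions
    for x in range(1, n_positions):
        total_drop[x] = max(0.0, wr[x - 1] - wr[x])  # size of the drop

    for t in range(1, rounds + 1):
        # Selection: UCT-style trade-off between a big average drop
        # (exploitation) and few visits (exploration).
        def uct(x):
            return (total_drop[x] / visits[x]
                    + c * math.sqrt(math.log(t + 1) / visits[x]))
        x = max(range(1, n_positions), key=uct)

        # Simulation: one more playout on position x and its
        # predecessor, then recompute the drop with refined winrates.
        for p in (x - 1, x):
            playouts[p] += 1
            wr[p] = lz_estimate(p, playouts[p])
        visits[x] += 1
        total_drop[x] += max(0.0, wr[x - 1] - wr[x])

    # Final choice: the most-visited position, as in standard MCTS.
    return max(range(1, n_positions), key=lambda x: visits[x])
```

With a real engine behind `lz_estimate`, the loop would concentrate its extra playouts on the candidate drops rather than spreading them uniformly over the whole game.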

2 Likes

Talking about Dwyrin’s games, it would be nice if some mod would run the full analysis on those games.