Dwyrin is testing the AI / Suggestion for improvement

Please don’t post misinformation. In the OGS feature, a negative delta always means a mistake (whether by white or black). This is easy to confirm by subtracting the winrates before and after the move.

In the first Dwyrin game analyzed by OGS, moves 64 and 138 (both even, so played by the same player) show deltas of +16.9pp and -36.6pp.
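For anyone who wants to check: the delta is just the mover’s winrate after the move minus the winrate before it. A tiny sketch (the before/after winrates below are made up, chosen only to reproduce the -36.6pp example):

```python
def move_delta(winrate_before, winrate_after):
    """Winrate delta in percentage points, from the point of view of
    the player who made the move; negative always means a mistake."""
    return winrate_after - winrate_before

print(move_delta(55.0, 18.4))  # -36.6 pp: a big mistake
```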

1 Like

A positive delta generally just means that the bot misevaluated the previous position (in terms of the usual alpha-beta tree search that most people have in mind when thinking about go engines).

So really it’s just that the search uncovered some move that was not searched in previous iterations, and that move severely changes the overall result.

2 Likes

Yes, and the large percentage means that it didn’t really judge the previous position well, which is because the score for the previous position is no longer investigated after the first hunch (right?)

Here’s an efficient way to improve the hunches: use Monte Carlo Tree Search

Let’s call it the Hunch game, and call the game that is being reviewed the Go game.

  • Possible moves in the Hunch game are selections of a move in the Go game. Winning the Hunch game means selecting the most game-changing move of the Go game.
  • After selecting a move in the Hunch game, the simulation phase starts: LZ does another playout in the Go game for the selected move of the Hunch game, and for the move before it (so this requires a parallel LZ evaluation for each move in the Go game, each staying inactive until another playout is requested).
  • A simulation is considered a win if its winrate delta (between the two winrates LZ just computed) is among the three lowest of the whole Hunch game (lowest rather than highest, since mistakes have negative deltas).
  • Choosing which move of the Hunch game to explore next is done by the same heuristic as in normal MCTS (see the sketch of that heuristic just below).
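For reference, the “same heuristic as in normal MCTS” is usually UCT: score each candidate by its average reward plus an exploration bonus, and pick the highest. A minimal Python sketch, not tied to any particular engine API:

```python
import math

def uct_score(total_reward, visits, parent_visits, c=1.4):
    """Standard UCT selection: exploit candidates with a high average
    reward, but keep exploring rarely visited ones; c balances the two."""
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)
```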

This will converge towards the three most terrible moves of the Go game far more efficiently than the current approach. What is used now is a uniform search over the whole Go game, followed by more time spent reading deeply without improving the Hunch game.

Using my suggested MCTS approach would both be a better heuristic for finding the optimal move in the Hunch game, and would already compute LZ’s deeper reading of those optimal moves along the way.

1 Like

Of course, in the case of a negative delta, it can be either

a) a suboptimal move

or

b) the same case of a mistaken hunch

The problem is, in these reviews you can only guess which of the two it is, based on whether you played the engine’s highest-rated move or not.

But I like the system of determining the delta between the “winrate of the optimal move” and the “winrate of the actual move”.

Hence my suggestion above for an improved way of finding negative deltas that correspond to a suboptimal move, and not to a mistaken hunch.
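Roughly, that guess goes like this (the decision rule below is only an illustration, not how OGS actually labels moves):

```python
def classify_drop(played_move, engine_top_move, delta):
    """delta = mover's winrate after the move minus before it."""
    if delta >= 0:
        return "no drop"
    if played_move == engine_top_move:
        # Even the engine's favourite move loses winrate here, so the
        # earlier evaluation was probably just a mistaken hunch.
        return "likely a mistaken hunch"
    # Otherwise we can only guess; it may be a genuinely suboptimal move.
    return "possibly a suboptimal move"
```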

1 Like

Yeah, I just added that I liked that solution in an edit :stuck_out_tongue:

Oh well, now everyone can know it wasn’t in the original post

1 Like

I don’t get it.

For what? And why MC?

Call what the Hunch game, if not the reviewed game?

Not Monte Carlo; I meant specifically the search algorithm MCTS, or rather a variant of it (like how LZ uses an MCTS variant to find the best move).

By the reviewed game you mean the game that was played, so the game of go the two players played?
I, however, want to pose finding the most game-changing move as a game in its own right, since talking about games is a good way of talking about these kinds of algorithms. For example, the MCTS algorithm above is described in terms of game theory.

I suggest finding the most game-changing move with an MCTS-inspired algorithm that does not rely on Monte Carlo random play in the simulation phase, but instead uses playouts by LZ to determine the winrate change.

I could describe it more step-by-step, if that is helpful.

I think my main problem at the moment is understanding what this other game is. I’m pretty sure it is not fully independent of the game being reviewed, but the way you describe it makes it sound largely independent to me.

Ok, let’s try to put it in other words, and describe it as an algorithm:

  • The goal is to find the most game-changing move of a finished Go game.
  • First, we compute LZ’s winrate estimate at each position of the Go game, but we let LZ do only a single playout (i.e. it is a very rough guess).
  • Next, for each position X of the Go game (except the first), we compute the relative winrate: the difference between LZ’s winrate estimate after move X and after move X-1 (taking the winrate of the player who played move X in both cases).
  • Now we repeat the following steps until the result is satisfactory:
    – Take the board position that most needs exploration (this depends on whether the position has rarely been visited by this algorithm, or has a high relative winrate; see the “Exploration and exploitation” section on the MCTS Wikipedia page)
    – Let LZ do another playout for the chosen board position and for the one before it, and recompute the relative winrate
    – Update the board position: it now has one more visit, and the size of its winrate drop is added to its total score (a high score means a big, game-changing drop)
  • After several of these rounds, pick the position that has been visited most often.

It’s basically MCTS where the simulation phase starts immediately, and the random rollout has been replaced by LZ doing an extra playout.
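Put as code, the loop might look something like the following minimal Python sketch. LZEval and its playout() method are hypothetical stand-ins for keeping one LZ search tree alive per board position (they are not a real Leela Zero interface), and the reward is the size of the winrate drop:

```python
import math

class LZEval:
    """Hypothetical stand-in for one persistent LZ search tree per board
    position; not a real Leela Zero API."""
    def __init__(self, position):
        self.position = position

    def playout(self):
        """Run one more playout and return the refined winrate estimate
        for the player to move, in [0, 1]."""
        raise NotImplementedError

def most_game_changing(evals, rounds=1000, c=1.4):
    """evals[x] evaluates the position after move x (evals[0] is the
    opening position). UCT-style selection over positions, not moves."""
    n = len(evals)
    visits = [1] * n                    # each position starts with one playout
    score = [0.0] * n                   # accumulated winrate drops
    win = [e.playout() for e in evals]  # the initial one-playout hunches

    def delta(x):
        # Winrate change for the player who played move x. win[x] is from
        # the opponent's point of view (they move next), so flip it.
        return (1.0 - win[x]) - win[x - 1]

    for t in range(1, rounds + 1):
        # Selection: standard UCT over positions 1..n-1.
        x = max(range(1, n), key=lambda i:
                score[i] / visits[i] + c * math.sqrt(math.log(t) / visits[i]))
        # "Simulation": one extra LZ playout on position x and on x-1.
        win[x] = evals[x].playout()
        win[x - 1] = evals[x - 1].playout()
        # Backpropagation: reward large winrate drops (negative deltas).
        visits[x] += 1
        score[x] += max(0.0, -delta(x))

    # As in normal MCTS, the most visited node is the answer.
    return max(range(1, n), key=lambda i: visits[i])
```

Returning the three most visited positions instead of one would give the top-3 variant from my earlier post.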

2 Likes

Talking about Dwyrin’s games, it would be nice if some mod would run the full analysis on those games.

Wouldn’t it be kind of rude though? Unless he mentioned wanting it in the video or something.

Would be rather interesting to see the difference at least.

1 Like

He just didn’t know how the feature worked and played two games to give it a try. He looked at the top 3 moves and then tried to run the full analysis, but he isn’t a supporter, so he couldn’t.

What could be rude though?
Games are public.

2 Likes

I don’t know, maybe I was overthinking it. Just don’t wanna force AI reviews on someone who did not ask for them.

Lately I feel that AI output is taken sort of as gospel, even though I think we can’t really play the same way the AI does, and thus a good move for an AI is not always a good move for a human.
Therefore I was a bit worried about being seen as “poking” at his teachings if the AI disagrees with some of his moves.

That said, if he tried to get the review himself but was unable to (I have not watched those vids yet), I am happy to run it :slight_smile: (and btw he is a supporter, just not on his teaching accounts)

So, here they are I guess :slight_smile:


6 Likes

As a counter-argument, Dwyrin reported the experience of the average non-supporter user.

That is what it is: if it needs to be fixed, then it probably should be fixed.

4 Likes

I guess that depends on how much one has the right to expect from a free service.

Well, if I’m honest, in its current state it feels more useful to turn the free three moves off than on.

It’s basically a lottery that selects three quite arbitrary moves in the game, then gives a complicated readout that goes deeper than is really useful, while leaving in several terrible suggestions.


Also, remember that this will negatively influence the expectations the average non-supporting user has about the full review, thereby reducing the number of people who might have considered supporting in order to get a good review of their games.

6 Likes

Actually, I don’t think that expectations of the receiver have anything to do with it.

The real question is what the service itself is trying to achieve, and whether the features we’re implementing achieve that.

I think we are trying to achieve being a great, well-featured place to play go, for all comers.

If that is true, then a feature that is broken should be fixed, irrespective of whether it is paid or not, and feedback from non-paying users might be best addressed directly, rather than by giving them selected samples of the paid experience.

(That being said, it’d be awesome to know if Dwyrin thinks that the full review fixes the problems he reported :slight_smile: )

GaJ

4 Likes