The integrated AI Review feature for OGS

I checked in Sabaki with 200 playouts on network 9006c708 and looked at the console information from LZ itself:
move 140 w 1.78%
move 141 b 98.04%
move 142 w 2.55%
move 143 b 98.23%
move 144 w 4.63%
move 145 b 97.81%
move 146 w 1.74%
move 147 b 98.11%
move 148 w 2.08%
move 149 b 98.51%
move 150 w 3.12%
move 151 b 98.50%

The bar above the graph is correct. Maybe the Lizzie graph is wrong too.
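For a sanity check, the alternating side-to-move percentages in the console log above can be folded into a single Black-perspective series. This is a minimal sketch, assuming the colors alternate w, b starting at move 140 as the first two labeled lines suggest:

```python
# Convert alternating side-to-move winrates into one Black-perspective
# series. Assumes the console output alternates w, b, w, b, ...
# starting at move 140, as in the log above.
raw = [1.78, 98.04, 2.55, 98.23, 4.63, 97.81,
       1.74, 98.11, 2.08, 98.51, 3.12, 98.50]

black_winrates = []
for i, pct in enumerate(raw):
    color = "w" if i % 2 == 0 else "b"   # inferred from the first two labeled lines
    black_winrates.append(pct if color == "b" else 100.0 - pct)

# Per-move swings in percentage points, from Black's perspective.
swings = [round(b - a, 2) for a, b in zip(black_winrates, black_winrates[1:])]
print(black_winrates)
print(swings)
```

All values stay in a narrow band near 98%, which matches the claim that the bars (the side-to-move numbers) are internally consistent.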

1 Like

Shouldn’t that tell you to have more confidence in yourself :wink:

1 Like

It will be fascinating if it turns out we are debugging Leela :astonished: :smiley:

I re-ran, and took a look at what is going on in more detail in Lizzie.


What we can see is:

  • Lizzie definitely thinks there was a 17% drop in black’s win rate, as shown by bar and graph

  • Lizzie’s bar is one move behind the graph (!)

I could speculate that Lizzie and OGS are both getting misinformation from Leela: information that is inconsistent with what is on the Leela console :astonished:

EuG

(I realised after posting that a more boring explanation is that stone_defender is using a different network that thinks black made no mistakes)

1 Like

Now this is useful information I’m sure @anoek would be interested in.

2 Likes

There is also another bug: I got three notifications that the bot review had ended.
https://online-go.com/game/18175201
And the graph is wrong there again:


In the endgame I have no chance at all against a 2k, yet it says that white is behind.

I use the same network and playouts as OGS.
I checked this game too; the bars are correct.

And please don’t talk about Lizzie; it’s just another complication, and off topic.
The problem is about pure Leela Zero and OGS only.

2 Likes

This should be fixed soon…

FWIW. I will talk about Lizzie if I feel it’s relevant.

The relevance is that Lizzie accesses Leela using similar interfaces to OGS. If Lizzie experiences similar problems to OGS, it sheds light on where the problems are coming from.

EuG

2 Likes

Thanks for the followup report, I had hoped I’d solved the notification thing… I’m trying to get all this fixed up today.

3 Likes

OK, the issue where the graphs were showing data inconsistent with the bars above has been fixed. I was plotting the wrong output value from the AI, but I had both in the data blobs, so all graphs past and present should look correct now.
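For anyone curious what having "both in the data blobs" could look like, here is a purely hypothetical sketch (the field names are invented and are not OGS's actual schema): the fix amounts to graphing the same value the bars are built from.

```python
# Hypothetical per-move analysis record; the field names below are
# invented for illustration and are NOT OGS's actual schema.
records = [
    {"move": 26, "win": 39.7, "other_output": 60.3},
    {"move": 27, "win": 40.7, "other_output": 59.3},
]

# Bars and graph must read the SAME field, or they drift apart.
bar_values   = [r["win"] for r in records]
graph_values = [r["win"] for r in records]   # bug class: reading "other_output" here

assert bar_values == graph_values
```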

8 Likes

Well, it is certainly consistent now… But fixed?

I checked the results in my reference game from my previous post, and it shows the same incompetent analysis.

Could you please verify again that the output actually comes from the strong network analysis? :slight_smile:

1 Like

Not quite sure what you’re looking for Animiral, I can say that the service provided is the service advertised, but beyond that, you’re better at this game than I am. I took the liberty of re-analyzing that game with the 40x256 network in case you find that interesting and/or useful.

1 Like

Thanks for taking a look :slight_smile:
The new graph looks almost, but not exactly, like the previous one.


(new one above in blue, old one slightly lower in red)

The very interesting thing is that both runs show the same weakness, especially visible in the “problem area” around move 81. The drop shows way too late.

Since I already described the symptoms and reproduction steps in depth, I can only re-iterate. Move 81 is a severe blunder that should be highlighted by the analysis, but isn’t.

We can see that the implementation is broken because:

  1. it severely misjudges the win rate
  2. the results vary wildly from my Leela
  3. positive percentage swings are much too frequent and too large

If you still insist after all the evidence, I must give up here.
Fellow users of OGS, do you see what I see? Please tell me I’m not crazy :slight_smile:

4 Likes

I’ve wondered about this myself. LZ is well-known to be superhuman in strength and can beat high dans, including original Leela, on just 1 playout (excluding ladder issues). Its value network has been honed on millions of self-play games and, in my experience, stays more or less constant when following expected paths. Breaks from the expected path are rarely if ever better than LZ’s move. Consequently, any adjustment in win rate is almost always down. That’s how good it is.

THEREFORE, it’s strange that, following Animiral vs. vegmandu starting at, say, move 24, LZ is constantly swinging the win rate in favor of the player who just played, by margins greater than 3%, suggesting both players routinely shocked it with good plays. It really does seem in the nature of a bug. The LZ on my local machine casts much more shade on my moves…
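The pattern described here can be checked mechanically. A sketch (the winrate series below is made up for illustration and is not data from the actual game) that flags moves where the winrate swings in favor of the player who just played by more than 3 percentage points:

```python
# Flag "suspicious" moves: the Black-perspective winrate jumps by more
# than a threshold in favor of the player who just played. With a
# strong net, such jumps should be rare; frequent jumps hint at a bug.
def suspicious_swings(winrates, threshold=3.0):
    """winrates[i] is Black's winrate (in %) after move i; winrates[0]
    is the starting position. Black plays the odd-numbered moves."""
    flagged = []
    for i in range(1, len(winrates)):
        swing = winrates[i] - winrates[i - 1]      # positive = good for Black
        mover_is_black = (i % 2 == 1)
        gain_for_mover = swing if mover_is_black else -swing
        if gain_for_mover > threshold:
            flagged.append((i, round(gain_for_mover, 2)))
    return flagged

# Made-up illustrative series, NOT data from the actual game:
demo = [50.0, 54.0, 49.0, 53.5, 53.0]
print(suspicious_swings(demo))   # moves 1-3 each "surprise" the engine
```

On a healthy analysis this list should be nearly empty; in the game mark5000 describes, it would apparently flag a large fraction of the moves.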

6 Likes

@mark5000 @Animiral I’m keen to understand (for my own benefit) and maybe even help, but regrettably I appear to be thick in understanding the problem you are describing.

The good part of this is that if you can describe it clearly enough for someone dumb like me to understand, then certainly someone smart like anoek will be able to tackle it.

Could you clarify whether the problem is in the analysis that LZ itself is providing (which OGS is making available) or whether OGS appears to be introducing a problem?

When I run LZ under Lizzie on this game I get the same results that OGS displays.

If we take one concrete place in the game:

Move 26 and Move 27.

  • On Move 26 white places Q4.
    OGS tells us that LZ tells us that the current win rate for black is 39.7%, and that black’s next turn will improve black’s position by 1pp.

  • On Move 27 black places R3.
    OGS tells us that LZ tells us that the win rate for black did improve, and is now 40.7%. OGS also tells us that white’s next turn will make things worse for black, by 3.7pp.

I think you are saying that this is not reasonable, but I don’t see why not.

Unfortunately people who can code can’t necessarily play or understand Go that well, so people who can understand Go really well have to explain it to people who can code slowly and clearly so that people who can code can make it happen :slight_smile: :smiley:

1 Like

I was wondering when the program reviews the game does it start from the beginning or end? I typically review from the end first as I do that for my own research projects with backpropagation.

So I was wondering how much the graph would differ if you did the reverse?

It doesn’t matter whether it goes forwards or backwards: each position is evaluated on its own merits (as I understand it).

I think this is true because I can go to any position and do “start analysis” (in Lizzie/LZ) and it comes up with the same result for that position, irrespective of what has been analysed up till now.
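One way to see why traversal order cannot matter: an engine driven over GTP is reset to the exact same position before each evaluation. A sketch, assuming a GTP-speaking engine (the helper name is hypothetical; `lz-analyze` is Leela Zero's GTP analysis extension, whose argument syntax varies by version):

```python
# Build the GTP commands needed to evaluate one position from scratch.
# Because the engine is reset to the same position either way, the
# order in which a reviewer walks the game (forwards or backwards)
# cannot change the per-position result. The helper name is
# hypothetical; lz-analyze is Leela Zero's GTP analysis extension.
def commands_for_position(moves, n):
    """moves: list like [("b", "Q16"), ("w", "D4"), ...];
    n: how many moves to replay before analyzing."""
    cmds = ["clear_board"]
    cmds += [f"play {color} {vertex}" for color, vertex in moves[:n]]
    cmds.append("lz-analyze 100")   # report every 100 centiseconds
    return cmds

game = [("b", "Q16"), ("w", "D4"), ("b", "Q3")]
forward  = [commands_for_position(game, n) for n in range(len(game) + 1)]
backward = [commands_for_position(game, n) for n in reversed(range(len(game) + 1))]
assert forward == list(reversed(backward))   # same commands per position
```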

4 Likes

To my understanding (and I could be wrong), they are saying the problem is that at one point the AI thinks black is winning, and then at another, the AI thinks white is winning, as during this change, there is a steep drop off, implying a blunder. I THINK the problem they’re reporting is that black played what the computer regarded as “best moves” through this “blunder period” and thus, the REAL blunder must have happened earlier, and for some reason the computer is either not picking it up or not displaying it.

Going backwards from about move 180 or so, I didn’t get the crazy scores the AI got in the game position at move 82

(if I’m reading the game scores correctly)


I tested again to make sure, and added the earlier moves too; same result.

1 Like

… if what you are saying is correct, then the criticism is of “Use of AI for analysis” rather than “OGS implementation of AI analysis”.

^^ This is what I have been trying to flush out: where exactly is the problem?

1 Like

As I understand it, in each board position Leela takes a bunch of candidate moves into consideration and estimates the winning probability for each. The winning probability of the original position is then estimated as the best probability among those moves (best for the player whose turn it is).

Now consider a position where it is Black’s move, and Leela tells us that after the move played, Black’s winning percentage increases by a large amount. By my understanding, this would imply that Leela didn’t consider this move (or didn’t see how strong it is). I can certainly see this happening sometimes, but it baffles me how often this happens in the analysis given by OGS.
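This reading of how the search assigns a position its winrate can be sketched as a simplified one-ply view (illustrative only, not Leela Zero's actual code; the candidate moves and numbers are invented):

```python
# One-ply sketch of how a search values a position: the position's
# winrate is the best child value for the player to move. If the move
# actually played was among the considered candidates, the winrate
# after it can never exceed the parent's estimate, so a large upward
# jump implies the move was outside (or undervalued in) that set.
def position_winrate(candidates):
    """candidates: {move: winrate for the player to move, 0-100}"""
    return max(candidates.values())

# Hypothetical numbers for illustration only (Black to move).
candidates = {"Q3": 40.7, "R4": 39.2, "C6": 38.5}
print(position_winrate(candidates))

# If Black then plays a considered move, the new winrate is just that
# move's value, i.e. at most the parent's 40.7. A jump above it means
# the engine had not properly evaluated the move that was played.
```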

1 Like