The integrated AI Review feature for OGS

I am in the “dan supporter” category and have been getting 1600 playout reviews so far. However, I just played a game against another site supporter, and it looks like the automatic review was done at only 200 playouts. I am guessing (though I don’t know) we might be paying different amounts and the system is taking the min category rather than the max category? Is there a way I can rerun the analysis with 1600 playouts? The game is: https://online-go.com/game/16226410

2 Likes

Oh interesting, I’ll delve into why it did that. The intended behavior is to do both and to auto select the max. I kicked off a 1600 review for you while I fix that bug :slight_smile:
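A minimal sketch of the intended "run both, show the max" behavior described above. The record shape and the function name `select_default_review` are hypothetical and are not taken from the OGS code; only the 200 and 1600 playout figures come from this thread.

```python
def select_default_review(reviews):
    """Return the review with the most playouts, or None if there are none.

    `reviews` is assumed to be a list of dicts with a "playouts" key
    (a hypothetical shape chosen for illustration).
    """
    if not reviews:
        return None
    # Selecting the max, not the min, is the intended behavior.
    return max(reviews, key=lambda review: review["playouts"])


# One review per supporter level present in the game.
reviews = [
    {"id": "low", "playouts": 200},
    {"id": "high", "playouts": 1600},
]
default_review = select_default_review(reviews)
```

Using `min` in place of `max` here would reproduce the reported symptom, where the 200-playout review is shown even though a 1600-playout one is available.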

5 Likes

This is a fantastic feature. Thank you very much.

Is it possible to download an SGF with the AI analysis, including the branches?

1 Like

A question about the percentages.

I think in the AlphaGo “world” the percentages given are the share of played-out lines that result in a victory. That’s very different from the percentage chance that a player would win with that first move.

Is this true also with the OGS implementation?

A few more comments about the graph issues: I’ve definitely also noticed some oddities like those pointed out above, where the exact percentages don’t quite line up with the graph. Here’s another example (an embarrassing graph for me, but a good example for a report; my opponent was many stones stronger). These two graphs are one turn / keyboard increment apart:

[Screenshots: two win-rate graphs, one turn apart]

It looks like the right-hand percentages match the graph one step before. With an offset this small, maybe there is some sort of smoothing being applied to the graph that makes exact alignment misleading?

Also, I’ve noticed that the vertical bar starts at turn 0 one step to the left of the y axis, as in this screenshot:

[Screenshot: graph with the vertical bar at turn 0, left of the y axis]

May or may not be intentional, but it could be contributing to alignment issues if not intentional. (Unfortunately this goes the opposite direction of the apparent first misalignment I pointed out above.)
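One way to check an alignment problem like this is to test which shift makes the per-move percentages agree best with the plotted values. The sketch below assumes both series are available as plain lists; the function name `best_lag` and the numbers are made up for illustration and are not real game data.

```python
def best_lag(bar_values, graph_values, max_lag=2):
    """Return the shift k that minimises mean |bar[i] - graph[i + k]|.

    A result of -1 means the bar at turn i matches the graph at turn i - 1.
    Returns None if no shift in range gives overlapping indices.
    """
    best = None
    for lag in range(-max_lag, max_lag + 1):
        pairs = [
            (bar_values[i], graph_values[i + lag])
            for i in range(len(bar_values))
            if 0 <= i + lag < len(graph_values)
        ]
        if not pairs:
            continue
        error = sum(abs(b - g) for b, g in pairs) / len(pairs)
        if best is None or error < best[1]:
            best = (lag, error)
    return best[0] if best else None


# Made-up series: each bar value repeats the graph value from one turn earlier.
graph = [50.0, 48.0, 55.0, 30.0, 35.0, 20.0]
bar = [50.0] + graph[:-1]
lag = best_lag(bar, graph)
```

A nonzero result points to an indexing offset. Smoothing would instead tend to leave a small residual error at every shift.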

Btw, since I’ve been posting a bunch of bug reports…I just wanted to also say that this is an amazing feature and I really appreciate all the work you’ve been doing on it; I also do some development for a game where features get tested in real time on live servers and I know how challenging this kind of thing can be.

5 Likes

This is on the short list of things to do, and it’ll be retroactive for any games you wish to download an SGF for.

The percentage shown is the value output by the Leela Zero evaluator, so you might get a more in-depth answer to your question there?

Thanks for the debugging! I reckon you’re right. I’ll get it fixed up next week when I sit down with the AI review code again.

Thanks! I’m glad it’s as useful as I had hoped it would be. Still lots of improvement to be done and bugs to be fixed, and yeah those bugs only ever seem to really shine in production unfortunately.

7 Likes

The graph seems to be wrong. Black was stable at nearly 100% for the second half of the game, but the graph dropped to nearly 50% several times.

2 Likes

Oye, thanks for the link, I’ve been trying to hunt down why the graph seems to be doing that sometimes

1 Like

I ran Lizzie across that same little section of the game (around move 140) and got the same shape of graph (though I only ran 100 playouts, so not identical).

I don’t think the problem is the graph, I think the problem is the bar chart above the graph, which in stone_defender’s picture is saying that the black winrate at turn 141 is 97.8% when according to the graph (which I think is correctly what Lizzie thinks) it is much lower.

3 Likes

It totally destroyed my confidence when I thought I was losing big time in a mid game and the post game review revealed I was leading all the time. :sweat_smile:

I checked in Sabaki with 200 playouts and network 9006c708, and looked at the console output from Leela Zero itself:
move 140 w 1.78%
move 141 b 98.04%
2.55%
98.23%
4.63%
97.81%
1.74%
98.11%
2.08%
98.51%
3.12%
98.50%

The chart above the graph is correct. Maybe the Lizzie graph is wrong too.
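To compare the console values above with a graph of black's win rate, they first need to be put on one scale. The sketch below assumes each value is reported from the perspective of the colour shown next to it, and that the colours alternate after the two labelled lines (move 140 w, move 141 b). The thread only labels the first two values, so the alternation is an assumption, and `black_winrates` is a hypothetical helper name.

```python
# Console values quoted above, in order, starting at move 140.
raw = [1.78, 98.04, 2.55, 98.23, 4.63, 97.81,
       1.74, 98.11, 2.08, 98.51, 3.12, 98.50]


def black_winrates(values, first_move=140, first_color="w"):
    """Convert alternating per-colour win rates to black's perspective.

    Assumes colours alternate move by move, starting with `first_color`.
    Returns a list of (move_number, black_win_percent) tuples.
    """
    other = "b" if first_color == "w" else "w"
    result = []
    for offset, value in enumerate(values):
        color = first_color if offset % 2 == 0 else other
        # A white value is flipped so that every entry is black's win rate.
        black = value if color == "b" else 100.0 - value
        result.append((first_move + offset, round(black, 2)))
    return result


converted = black_winrates(raw)
lowest = min(percent for _, percent in converted)
```

Under these assumptions every converted value stays above 95%, which matches the observation that black was stable near 100% over this stretch.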

1 Like

Shouldn’t that tell you to have more confidence in yourself :wink:

1 Like

It will be fascinating if it turns out we are debugging Leela :astonished: :smiley:

I re-ran, and took a look at what is going on in more detail in Lizzie.

[Screenshot: Lizzie win-rate bar and graph]

What we can see is:

  • Lizzie definitely thinks there was a 17% drop in black’s win rate, as shown by bar and graph

  • Lizzie’s bar is one move behind the graph (!)

I could speculate that Lizzie and OGS are both getting misinformation from Leela - information that is inconsistent with what is on the Leela console :astonished:

EuG

(I realised after posting that a more boring explanation is that stone_defender is using a different network that thinks black made no mistakes)

1 Like

Now this is useful information I’m sure @anoek would be interested in.

2 Likes

There is also another bug: I got 3 notifications that the bot review had ended.
https://online-go.com/game/18175201
And the graph there is wrong again.


In the endgame, where I have no chance at all against a 2k, it says that white is behind.

I use the same network and playouts as OGS.
I checked this game too, and the bars are correct.

And please don’t talk about Lizzie; it’s just another complication and off topic.
The problem is about pure Leela Zero and OGS only.

2 Likes

This should be fixed soon…

FWIW. I will talk about Lizzie if I feel it’s relevant.

The relevance is that Lizzie accesses Leela using similar interfaces to OGS. If Lizzie experiences similar problems to OGS, it sheds light on where the problems are coming from.

EuG

2 Likes

Thanks for the followup report, I had hoped I’d solved the notification thing… I’m trying to get all this fixed up today.

3 Likes

OK, the issue where the graphs were showing data inconsistent with the bars above has been fixed. I was plotting the wrong output value from the AI, but I had both in the data blobs, so all graphs past and present should look correct now.

8 Likes

Well, it is certainly consistent now… But fixed?

I checked the results in my reference game from my previous post, and it shows the same incompetent analysis.

Could you please verify again that the output actually comes from the strong network analysis? :slight_smile:

1 Like