The integrated AI Review feature for OGS

As far as I’m aware, it does. LZ tends to prefer a safe win over an overwhelming one. If it can remove bad aji while staying ahead, rather than leaving the aji in place for a better chance of winning by a large margin, it will usually choose the safe option. Changing the komi obviously changes when LZ starts playing safely.

I believe the old Leela wasn’t trained for custom komi though.

60/40 split is from 9006c708 network (15x192).

Something like 56.5/43.5 is from recent 40x256 nets.

2 Likes

It was mentioned in chat, I think it deserves a spot on record here: we could do with a way to turn off the AI overlay on the board. Once an AI analysis has been done, it appears at the moment that you can’t analyse the game without seeing what the AI thinks.

3 Likes

I am convinced that there are real technical problems with the AI review as currently implemented. The evaluation is not as good as advertised. This is why some of its outputs are so baffling.

I played this game recently where I started out with a generally favorable opening. Unfortunately, I soon learned that my opponent was great at fighting. :smile:
At a critical point, I made a 60% mistake and lost an important group in a snapback.

Black must save his 10 stones and threaten the 7 white stones below. My move was at the circled point. White answers at the blue point.
The full review by Leela on OGS does not indicate this 60% blunder at all.

This position is judged as 79% for black, close to my own analysis with Leela. Then…

  1. B m6 +11.1pp -> 89.6% (totally wrong!)
  2. W o8 +26.2pp -> 63.4% (still claiming black ahead)
  3. B o7 +4.1pp -> 67.5% (unbelievable)
  4. W p7 +13.9pp -> 53.6% (black is clearly not ahead!)
  5. B n12 -40.0pp -> 13.6% (close)

Only now does the evaluation reflect the death of black, seemingly blaming it on the tenuki.
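To make the arithmetic in the list above easier to follow, here is a small Python sketch that replays the reported deltas. It assumes each delta is given from the perspective of the player who moved (which is how the numbers appear to line up), and it uses a starting value of 78.5% for black, inferred from 89.6 - 11.1 and consistent with the "79%" quoted above. The function name is made up for illustration.

```python
# Reported moves: (player, move, delta in percentage points).
# Assumption: the delta is from the mover's perspective, so a gain
# for white is a loss for black.
moves = [
    ("B", "m6", +11.1),
    ("W", "o8", +26.2),
    ("B", "o7", +4.1),
    ("W", "p7", +13.9),
    ("B", "n12", -40.0),
]


def black_win_rates(start, moves):
    """Return black's win rate (in %) after each move."""
    rates = []
    current = start
    for player, _move, delta in moves:
        # Black's own gains add to black's win rate, white's gains subtract.
        current += delta if player == "B" else -delta
        rates.append(round(current, 1))
    return rates


# 78.5 is inferred (89.6 - 11.1), not a number shown by OGS.
rates = black_win_rates(78.5, moves)
for (player, move, _delta), rate in zip(moves, rates):
    print(f"{player} {move}: black at {rate}%")
```

Read this way, the review claims that four moves in a row improved the mover's position, and only the tenuki at n12 is counted as a loss.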

I downloaded the same network 9006c708 to reproduce these outrageous results (with Leela 0.17). What I found is that Leela quickly and correctly judges m6 to be a mistake. After this move, black is around 20% with only a few hundred playouts, less than half of the 1600 that OGS claims to have used.

Whatever analysis is running on OGS’ servers, I suspect that the numbers get lost along the way and don’t make it into the final output. The percentages shown seem more fitting for a “first-glance” evaluation without any playouts.

The green color highlights are suspicious. Even after a few hundred playouts, my Leela has already invested most of its reading effort into moves A, B and C.

On top of that, I have to agree with some previous criticisms of how moves B onwards are displayed. The fact that these moves are considered does not mean much. Even after a few thousand playouts, Leela still thinks that move F yields 50% for black. It only sees how bad the move is when I actually play it. This is because Leela is not interested in exploring the second and third best moves, so at no point does the engine calculate how bad these alternatives really are. Only the percentage of the top move is informative.
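One way to picture this: a candidate's percentage is only as trustworthy as the share of playouts the engine spent on it. The Python sketch below separates candidates by visit share. The visit counts, win rates (other than the 50% for move F mentioned above), the 10% threshold and all names are invented for illustration; this is not how OGS or Leela actually filter moves.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    move: str
    visits: int      # playouts spent on this move
    win_rate: float  # percent, for the player to move


def split_by_visits(candidates, min_share=0.10):
    """Split candidates into (trusted, untrusted) by share of total visits.

    min_share is an arbitrary illustrative threshold, not an engine setting.
    """
    total = sum(c.visits for c in candidates)
    if total == 0:
        return [], list(candidates)
    trusted, untrusted = [], []
    for c in candidates:
        (trusted if c.visits / total >= min_share else untrusted).append(c)
    return trusted, untrusted


# Hypothetical numbers: most reading effort goes into A, B and C.
candidates = [
    Candidate("A", 1400, 21.0),
    Candidate("B", 350, 19.5),
    Candidate("C", 300, 18.0),
    Candidate("F", 50, 50.0),  # barely explored, so the 50% means little
]

trusted, untrusted = split_by_visits(candidates)
print([c.move for c in trusted])    # ['A', 'B', 'C']
print([c.move for c in untrusted])  # ['F']
```

Under these assumptions, move F's 50% would be flagged as a barely explored estimate rather than shown next to the well-read moves.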

If the AI review were working correctly, we would only very rarely see positive percentage changes, because the evaluation before a move already assumes best play. It appears as if Leela is constantly mindblown by all our genius moves - not a great impression from a supposedly strong Go player :wink:

8 Likes

I was looking at something similar this morning. I think that the graph itself is correct, but it appears as if there is a bug along the way of getting those numbers out.

In my case, I was looking at a black mistake that shows as a spike downwards in the graph at move 79. Stepping through the analysis at that point, the evaluation for white, which shows above the graph, never gets above 51% in white’s favour, but the graph shows that white was about 75% to win at that moment. The deltas showing for each move seemed right though?

2 Likes

Interesting, I’ll look into this Thursday. I suspect the graphing library is doing something there, smoothing etc. Should be easily fixed.

5 Likes

I don’t know if this was mentioned before but I think it’s worth a note:

When you click on one of the AI’s choices (i.e. the green A, B, C), it will show you the AI’s sequence of moves.

But if your next move (the triangle) is actually one of the AI’s choices, then the AI’s sequence won’t show. It will just become the next move’s analysis.

It would be nice if the AI’s sequence could also be shown where your move matches the AI’s.

2 Likes

My reasoning for not playing out the sequence when you click the triangle is that you have a bit more information available once you play the triangle, namely all the options generated for that position. To get the equivalent information, you’d click the triangle, then click one of the variations you’re interested in from there.

4 Likes

Except when you’re not a subscriber and you don’t have the following variations.

4 Likes

But in that case the move shouldn’t be in the top 3 anyway.

3 Likes

Ah yes, that use case. I’ll ponder on that

5 Likes

Given that the AI review is tiered on supporter level, but also that games can be reviewed retroactively and also by people who didn’t play in the game…

Is there a possibility that someone with a higher tier will be able to override a low tier review?

Yeah, that system could use some interface improvement, but the way you do it is by selecting the top 3 moves, then clicking the full review button again. This will create a new review if you don’t already have one. The list is sorted by depth of the review, so in theory the best one should always be on top and come up first.
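As a rough illustration of "sorted by depth", here is a Python sketch that puts the deepest review first. It assumes depth means a full review ranks above a top-3 review, with playouts as the tie-breaker. The field names, labels and sample values are hypothetical and not taken from the OGS code.

```python
from dataclasses import dataclass


@dataclass
class Review:
    review_id: str
    kind: str      # "top3" or "full" (hypothetical labels)
    playouts: int  # playouts per move


def sort_by_depth(reviews):
    """Deepest review first: full reviews before top-3, then more playouts."""
    return sorted(
        reviews,
        key=lambda r: (r.kind == "full", r.playouts),
        reverse=True,
    )


# Made-up sample data for illustration only.
reviews = [
    Review("r1", "top3", 1600),
    Review("r2", "full", 400),
    Review("r3", "full", 1600),
]

ordered = sort_by_depth(reviews)
print([r.review_id for r in ordered])  # ['r3', 'r2', 'r1']
```

With an ordering like this, a deeper review started by someone else would automatically come up first for everyone viewing the game.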

5 Likes

Oh I didn’t know that was the case! I’ll have to test that out :slight_smile:

I thought about it too and came up with this as an algorithm that would not take up more computation and get a better result for the best 3 moves:

I didn’t test it (I could, if I had time to spare), but I don’t see why it wouldn’t work better than the existing system.

1 Like

I’ve actually been attempting this for a rengo game I played with smog, Cactus Juice, and DROmedary: rengo

Before clicking: [screenshot]

And basically the same interface after clicking. The info says 1600 playouts, but I’m not getting the full review option.

that’s strange… i started a full review for you, but it’s weird you couldn’t see the button…

1 Like

thanks

1 Like

BHydden, mods can kick off reviews anywhere to help fix things up. Otherwise, you have to be a player in the game.

3 Likes

I really need to learn to check my privilege :wink:

1 Like