The best AI move is -7.5 here. That seems strange to me. Isn’t it supposed to be close to zero? Does that look right?
I think it typically happens in sharp positions and with low playouts.
The more playouts there are, the less likely it is to happen, but it still happens.
Essentially, at this move in the game the AI estimates a particular score by splitting its playouts across a few different move options.
Then, one move later, when it puts all of its playouts into the position after the blue move, it finds that the earlier score was off by quite a bit.
So that’s a level II review with 1000 playouts; if we run a level IV review on the game, which has 12k playouts, for comparison, the mistake seems to drop down to -4.
I think if you could run it for even longer, typically the estimate before the move and after the move would converge, and so the difference would be much closer to 0.
Ok, I understand! The low-playout analysis is the problem, though mind you, I’ve got the Hane tier with 1000 playouts.
On AI Sensei the blue circle is always “0”, and every other move is labelled relative to the blue circle.
OGS writes something less useful
Let’s bear in mind that I only have the free, low-playout tier of AI Sensei.
AI Sensei might claim that the blue move is zero, but when you look at the evaluations, you notice that the score estimate still changes:
- The score after move 58 is B+13.8,
- the blue move is played, which is claimed to be -0,
- the score after that move is played is B+15.5 (so the “0” is kind of a lie),
- the next move is claimed to be a 2-point mistake, and we see the score change to B+17.5, which makes sense.
So AI Sensei might claim that the blue move loses or gains no points, but actually it does; it’s just not being displayed to you properly.
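To make that concrete, here’s the arithmetic from the list above in a few lines of Python (the move numbers and scores are just the ones quoted; the variable names are mine):

```python
# Score estimates (positive = Black leads) after each move, from the review above.
score_after = {58: 13.8, 59: 15.5, 60: 17.5}

# The actual point swing of a move = estimate after it minus estimate before it.
blue_move_swing = score_after[59] - score_after[58]  # +1.7, despite the "0" label
mistake_swing = score_after[60] - score_after[59]    # +2.0, matching the "-2" label
```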
Edit:
I also find it very hard to parse on AI Sensei, because it looks like the move isn’t on the board yet, but the score being shown assumes that move is on the board. It’s a design decision.
I think the OGS one is more honest, so I like it
They just use different definitions, and both use KataGo.
It does not compare to previous moves; it compares the moves available in the current position.
I don’t understand what you mean.
You can see above that when a mistake is played, like move 60, it says it loses 2 points, and the score changes from B+15.5 to B+17.5, which is consistent with a move by White that loses 2 points. It also says -2 on hover.
My point is it doesn’t do this for the blue moves: whether they gain or lose points it always says 0.
If you prefer that behaviour then sure, but it doesn’t align with what is actually happening with the changes in score.
when something “loses 2 points”, it does not mean that “Black had 17 points and then Black has 15 points”
it means that if you play this other move, you will get a score of 15 on the next move
and if you play the blue move, you will get a score of 17 on the next move
(for example)
it compares two possible different futures; it does not compare past and present
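Something like this toy sketch of that definition (the numbers are made up; scores are from the perspective of the player to move):

```python
# Hypothetical score estimates for two candidate moves, both evaluated
# in the SAME current position (higher = better for the player to move).
candidates = {"blue_move": 17.0, "other_move": 15.0}

best = max(candidates.values())

# AI-Sensei-style labels: every move is compared to the best move in this
# position, so the blue move is 0 by definition.
labels = {move: score - best for move, score in candidates.items()}
# -> {"blue_move": 0.0, "other_move": -2.0}
```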
Sure, it does in theory compare everything to the best possible play.
There are a lot of situations where it does mean that, like if you could capture two stones but instead play a move worth 0, like a dame or a pass. You just lose 4 points that you could have had and should have.
You could in theory say that all the score estimates should be the same as the blue moves’ estimates. That’s probably true at really high playouts, but it’s not always the case.
Let’s say we’re in position A, and most, but not all, of the playouts go into the blue move. Some small weight will be given to the score estimates of the non-blue moves, and that can affect the output.
Then in position B, where the blue move from position A has been played, all the variations and playouts are spent evaluating only the branches in which that blue move has been played. In theory the search is now looking one move deeper into the variations, so it can be more accurate than it was one move before (see the sketch below).
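A rough sketch of that effect (the numbers are invented, and a plain visit-weighted average is a simplification of KataGo’s actual value backup):

```python
# Hypothetical search tree at position A: visits and score estimates per
# candidate move, from the player to move's perspective.
children = [
    {"move": "blue", "visits": 900, "score": 15.5},
    {"move": "alt1", "visits": 80,  "score": 12.0},
    {"move": "alt2", "visits": 20,  "score": 10.0},
]

# The root's estimate blends every branch, weighted by visits, so the
# weaker branches drag it below the blue move's own 15.5...
root_score = (sum(c["visits"] * c["score"] for c in children)
              / sum(c["visits"] for c in children))  # ~15.1

# ...whereas once the blue move is actually played, all 1000 playouts go
# into that branch alone, one move deeper, and the estimate can shift again.
```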
That depends, I think, on whether you imagine analysing a position statically vs trying to analyse a whole game.
If a game has been played, there is a past and a future that you can compare to. Those can contain blue and non-blue moves.
If the game has no past or future to compare to, then sure, the only thing you can really take abstractly is the score of the position, which is going to be weighted more toward the blue move’s score.
But that’s only true with enough playouts, where the score of the blue move in the current position doesn’t change when you switch to analysing the position in which the blue move has been played.
The silly example is a ladder when the engine can only see a limited distance ahead (which maybe doesn’t apply as much to KataGo, but let’s say Leela Zero etc.). If you can’t see the end of the ladder, you might think the best move is to escape; but one move later, when you can see the end of the ladder, you realise you can’t escape. So escaping on the previous move was bad, and you only realise it one move later.
At least that would be my understanding of it.
The most dramatic example might be in really complicated games like Lee Sedol’s broken ladder game.
Let’s again use a low-playout KataGo, like the free tier of AI Sensei: AI Sensei | Hong Jansik vs Lee Sedol
From move 75 to move 96 they’re all blue moves, yet:
the score changes from B+9.4 to B+35.6 with only blue moves.
With low playouts it simply makes no sense to say both that every blue move loses or gains 0 points and, at the same time, that the score estimate has changed by over 26 points.
That’s what I’m trying to get across. Playouts are limited because AI reviews need server time, and running servers costs money.
The effect of limited playouts is that the score estimate isn’t always accurate, so trusting the blue move to be a 0-point loss doesn’t make sense if you can’t even trust the score estimate in the first place.

if we play this bad move and then the next AI move, the score is:
if we instead play the AI move and then the next AI move, the score is:
43.8 - 16.8 = 27
that is where it comes from
If you can link your AI Sensei review, it would help, so that we work with the same numbers if we want to focus on that game.
But let’s say I look at your example with move 74. For me, on the free version, move 74 is a -26.6 mistake.
If you want to know where that -26.6 comes from, simply check one move before.
The score was B+17.5 after Black’s AI (blue) move 73. I showed above that the tooltip/hover explains this.
Then White makes a 26.6-point mistake with move 74, which brings the score from B+17.5 to B+44.1, the score after White’s move 74.
44.1 - 17.5 = 26.6
It’s genuinely the difference in score estimate before and after the move.
Now you might argue that it should be approximately the same as something like your two-futures calculation above, or something more convoluted, but
A) it’s not even as close - you’re off by 0.2 in your example, and
B) in positions where the blue moves lose really close to 0 points, you can play long strings of blue moves without changing the score and pretend that that’s the move or method you should compare against.
Anyway, I think AI Sensei’s design choice was bad, because it’s not at all easy to interpret as it stands.
Anyway, I want to stress that:
It is possible that the way the AI Sensei code computes the values it displays is something like
game move score estimate - blue move score estimate
and that’s why the blue move is always zero. I can’t say that that’s not what’s happening.
But I think that if you can’t guarantee the blue move gains/loses 0 points, then it doesn’t really make sense to say it loses 0, even though you can choose to display it that way.
that is what I meant from the beginning
Sure, but I’m suggesting that, given the many examples above, it isn’t more useful to write that the blue move loses zero and then, at low playouts, have the score change all the time when the blue move is played.
I think it’s depressingly ironic that we went from not knowing what we were doing to using these fantastic new tools - but without knowing what they are doing or what these evaluations mean.
I mean we know what they mean.
KataGo explains what its parameters mean (to a certain extent) here.
For example:
scoreMean
- Same as scoreLead. “Mean” is a slight misnomer, but this field exists to preserve compatibility with existing tools.

scoreLead
- The predicted average number of points that the current side is leading by (with this many points fewer, it would be an even game).
and the scoreLead is a mixture of KataGo’s neural net prediction (which has learned to estimate who is winning and by how much) and the results of its tree search, combined as some kind of weighted average.
I mean I don’t understand the fine details and the code, but you can have a rough idea what’s happening.
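For instance, you can query KataGo’s analysis engine for scoreLead yourself. A minimal sketch in Python (the binary, model, and config paths are placeholders; the JSON fields come from the analysis engine protocol):

```python
import json
import subprocess

# Placeholder paths -- point these at your own KataGo binary, model, and config.
KATAGO = ["katago", "analysis", "-model", "model.bin.gz", "-config", "analysis.cfg"]

def score_leads(moves, visits=1000):
    """Return KataGo's scoreLead estimate after every move of a game."""
    proc = subprocess.Popen(KATAGO, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    query = {
        "id": "demo",
        "moves": moves,  # e.g. [["B", "Q16"], ["W", "D4"], ...]
        "rules": "japanese",
        "komi": 6.5,
        "boardXSize": 19,
        "boardYSize": 19,
        "analyzeTurns": list(range(len(moves) + 1)),
        "maxVisits": visits,
    }
    proc.stdin.write(json.dumps(query) + "\n")
    proc.stdin.flush()
    leads = {}
    for _ in range(len(moves) + 1):
        # Responses can arrive out of order, so index by turnNumber.
        response = json.loads(proc.stdout.readline())
        # Whose perspective scoreLead is reported from depends on the
        # reportAnalysisWinratesAs setting in the analysis config.
        leads[response["turnNumber"]] = response["rootInfo"]["scoreLead"]
    proc.stdin.close()
    proc.wait()
    return leads
```

Comparing consecutive leads across a blue move at two different maxVisits values is essentially the low- vs high-playout experiment discussed above.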
We know points are points, and in some simple cases it’s relatively easy to see why you might lose a point or two here or there.
In other cases it’s much harder to see exactly why one move is 3 points better, but we wouldn’t have any good method of estimating that ourselves anyway.
On the other hand, when someone makes a closed-source app, unless they explain exactly what’s happening in the background, of course we don’t know exactly what they’re doing; we just have to take a best guess.
I think both approaches are valid.
Considering the last move’s score-lead prediction as 0 and showing how the suggested next moves change it tells the user more about why the lead prediction is as high as it is. That is important for judging the whole-board situation and how comfortable one player’s lead is.
Considering the best next move’s score lead prediction as 0 and showing how other suggested moves compare to that focuses on where to play next and how bad it is to deviate from the move KataGo ranks highest.
Another approach would be to show the predicted score lead just as KataGo returns it: if you play here, you’re X points ahead/behind. That might make moves a bit harder to compare when one player has a big lead, because more digits are shown on the screen, which is harder for our brains to process.
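All three conventions can be derived from the same KataGo response; roughly like this (the moveInfos/rootInfo field names are KataGo’s, the function and everything else is my sketch):

```python
def annotate(prev_root_lead, root_lead, move_infos):
    """Three ways of labelling the same position from one KataGo response.

    prev_root_lead -- rootInfo["scoreLead"] before the last move was played
    root_lead      -- rootInfo["scoreLead"] in the current position
    move_infos     -- KataGo's moveInfos list for the current position
    """
    # KataGo's top-ranked (blue) move is the one with order == 0.
    best_lead = min(move_infos, key=lambda m: m["order"])["scoreLead"]

    return {
        # 1. The last move relative to the previous estimate (OGS-style).
        "last_move_delta": root_lead - prev_root_lead,
        # 2. Each candidate relative to the best move (AI-Sensei-style):
        #    the blue move is 0 by construction.
        "losses": {m["move"]: m["scoreLead"] - best_lead for m in move_infos},
        # 3. The raw lead, just as KataGo returns it.
        "raw_lead": root_lead,
    }
```

Which of the three a review tool displays is purely a UI choice; the disagreement in this thread is really about which choice degrades most gracefully at low playouts.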