Is spectators’ territory estimation supposed to be this bad?

Allerleirauh · April 13, 2023, 4:58pm

Game in question:

At this point in the game they are playing ko and white just played a ko threat.

Estimate score shows this picture:

And this is a very suspicious result. First, it’s highly doubtful black would sacrifice the center. But if black does sacrifice it to win the ko, then the top-left should be black’s (or are they going to sacrifice for nothing?). What gives? And of course the value estimation is correct, black+6.5 or so, so it knows black wins, and yet the territory estimation gives the impression that white wins. Does it make sense? It doesn’t.

triangle_fuseki · April 13, 2023, 5:09pm

The AI estimation doesn’t show the score that a single simulation would give. It shows something like an average over many different simulations, so sometimes it may make no sense.

Feijoa · April 13, 2023, 5:45pm

Maybe related:

Jon_Ko · April 13, 2023, 8:13pm

I agree that it is unintuitive. It looks like most territory is going to be white and yet black is said to have a lead. But note that the size of the squares is relevant too. Black has some potential on the right for example, the white squares aren’t that big there.

Probably there are some trades possible.

Imagine there are three equally big white groups and black can capture any one of them, but capturing one will allow white to save the other two. What should the score estimator show? Each of the groups is more likely to be white than black, so everything is shown as white. But in actual play one of the white groups will end up as black territory.
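A minimal sketch of that three-group thought experiment. The group size and the assumption that each capture is equally likely are made up for illustration, and +1 / -1 stand for black / white ownership. It compares the averaged ownership per group with what any single outcome looks like.

```python
from fractions import Fraction

GROUPS = 3        # hypothetical: three white groups
GROUP_SIZE = 10   # hypothetical: points at stake per group

# One outcome per group black chooses to capture: +1 = black's, -1 = white's.
outcomes = [[1 if g == captured else -1 for g in range(GROUPS)]
            for captured in range(GROUPS)]

# Average ownership of each group over the equally likely outcomes.
avg = [Fraction(sum(o[g] for o in outcomes), len(outcomes))
       for g in range(GROUPS)]

# Net score (black minus white) in each single outcome.
per_outcome = [GROUP_SIZE * sum(o) for o in outcomes]

# Net score read off the averages, and off a display that only
# shows the likelier colour for each group.
from_averages = GROUP_SIZE * sum(avg)
from_colours = GROUP_SIZE * sum(1 if a > 0 else -1 for a in avg)

print(avg)            # every group averages -1/3, so each is drawn as white's
print(per_outcome)    # [-10, -10, -10]
print(from_averages)  # -10
print(from_colours)   # -30
```

In this toy setup the averaged numbers still add up to the real margin; only the "which colour is likelier" reading overstates white's area.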

To answer the question in the thread title: No, the estimation is supposed to be good, and in most cases it is good indeed. But that doesn’t mean it’s always easy to understand, some positions are just complicated.

jlt · April 14, 2023, 4:56am

I don’t know precisely how Katago works. Possibly the network that estimates the score, and the one that estimates territory ownership, are independent, and the second one needs enough playouts to become reliable.
FWIW my weaker version of Katago gives a different answer after 1k playouts or so.
[Screenshot: ScreenShot_14_04_2023_06_54_16]
I don’t know how to get fewer playouts: when I press EstimationKata and stop it after a fraction of a second, I already get more than 1k playouts.

Lys · April 14, 2023, 8:23am

My poor understanding of the game at that point is: black is leading by 7 pts. It doesn’t matter whether he wins or loses the ko; either way he’s able to win. If black wins the ko, white captures the bottom right corner and keeps territory on the right but loses the left. If black gives away the ko, he can capture the three stones in the bottom right corner and expand toward the top or, as an alternative, capture two stones on the left border, which also leads to a black win.
So, under perfect play, black is winning this game.

Thus a very white board in the SE is quite misleading.

In the above image the small squares aren’t “small enough” to suggest that black will eventually capture some white stones. Quite the opposite: the black stones in the middle are marked as dead, while they aren’t and won’t be.

It seems that the board mapping with black and white squares doesn’t actually match the outcome.
Sorry if I just reworded the OP, but I don’t see any real answer to this statement.

Jon_Ko · April 14, 2023, 11:29am

The squares are just visualizations of underlying numbers. Those numbers should be easy to verify once obtained. If they add up to the correct result (black+6.5), maybe the visualization is misleading? Maybe the mapping from numbers to square size should be changed?

Jon_Ko · April 14, 2023, 4:24pm

Okay, so I found out that it is pretty easy to obtain the numbers, but they don’t seem to add up to a black win. Their sum seems to be about -20, which should mean white is 20 points ahead on the board (without komi).

Numbers:

{"ownership": [
	[-0.536, -0.524, -0.47, -0.517, -0.504, -0.614, -0.684, -0.698, -0.698],
	[-0.543, -0.441, -0.446, -0.449, -0.424, -0.799, -0.699, -0.643, -0.665],
	[-0.642, -0.925, -0.389, -0.38, -0.996, -0.996, -0.721, -0.608, -0.61],
	[-0.797, -0.949, -0.963, -0.973, -0.724, -0.995, -0.816, -0.519, -0.536],
	[-0.697, -0.944, -0.714, -0.709, -0.73, -0.684, -0.927, -0.331, -0.422],
	[-0.676, 0.825, -0.777, -0.714, -0.82, -0.813, 0.896, -0.415, -0.353],
	[-0.421, 0.778, 0.029, 0.457, 0.826, 0.27, 0.869, -0.425, -0.266],
	[0.253, 0.814, 0.632, 0.745, 0.759, 0.767, 0.857, -0.454, 0.04],
	[0.536, 0.688, 0.782, 0.804, 0.821, 0.81, 0.794, 0.826, 0.283]
]}
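For anyone who wants to re-check the sum, here is a small sketch that adds up the numbers above. It assumes, as in my reading, that positive values mean black and negative values mean white; the sign count is just an extra way to look at the same data.

```python
import json

raw = """
{"ownership": [
  [-0.536, -0.524, -0.47, -0.517, -0.504, -0.614, -0.684, -0.698, -0.698],
  [-0.543, -0.441, -0.446, -0.449, -0.424, -0.799, -0.699, -0.643, -0.665],
  [-0.642, -0.925, -0.389, -0.38, -0.996, -0.996, -0.721, -0.608, -0.61],
  [-0.797, -0.949, -0.963, -0.973, -0.724, -0.995, -0.816, -0.519, -0.536],
  [-0.697, -0.944, -0.714, -0.709, -0.73, -0.684, -0.927, -0.331, -0.422],
  [-0.676, 0.825, -0.777, -0.714, -0.82, -0.813, 0.896, -0.415, -0.353],
  [-0.421, 0.778, 0.029, 0.457, 0.826, 0.27, 0.869, -0.425, -0.266],
  [0.253, 0.814, 0.632, 0.745, 0.759, 0.767, 0.857, -0.454, 0.04],
  [0.536, 0.688, 0.782, 0.804, 0.821, 0.81, 0.794, 0.826, 0.283]
]}
"""

board = json.loads(raw)["ownership"]
values = [v for row in board for v in row]

# Sum of all ownership values (assumed: positive = black, negative = white).
total = sum(values)

# How many intersections lean towards each colour, ignoring the magnitude.
black_points = sum(1 for v in values if v > 0)
white_points = sum(1 for v in values if v < 0)

print(f"sum of ownership: {total:.2f}")   # about -20
print(f"leaning black: {black_points}, leaning white: {white_points}")
```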

Numbers source:

thomazjunior2001 · April 14, 2023, 4:29pm

Everything is actually handled by the same network, just different heads. And the score estimation also improves with more visits.
Also, your score estimation looks wrong, even though the ownership prediction lines up with the latest katago net (shown below). Make sure the ruleset and komi are set correctly (I suspect komi on your end is set to 0). To get fewer visits you can use the auto analysis function and set a visit limit.

richyfourtytwo · April 14, 2023, 4:31pm

I’m not 100% sure they have to match up. The predicted total score and the predicted ownerships may be independent output values. Also, the ownerships are not independent of each other. I’m not saying I have an explanation, just that I’m not 100% sure whether there has to be a bug or whether our intuitive way of interpreting the data doesn’t match what it means.

(I’d still put my money on a bug though.)

Feijoa · April 14, 2023, 4:35pm

Interestingly the komi is not specified in that request to the score API. I wonder what it’s assuming.

jlt · April 14, 2023, 4:46pm

Analysis with 1 playout with my local 15-block Katago (N.B. the komi is necessarily set up correctly since it’s in the file I downloaded directly from OGS).
[Screenshot: ScreenShot_14_04_2023_18_44_54]

thomazjunior2001 · April 14, 2023, 4:57pm

It says 0.0 komi on your image. Older versions of Lizzie make it a pain to change komi/ruleset. I use Lizzieyzy and recommend upgrading to it if you can (updating KataGo to the latest 18B should also give you a ~1300 Elo improvement).
[Image: 0 komi]

triangle_fuseki · April 14, 2023, 5:03pm

Imagine some territory that is surely white and more territory that is surely black, while everything else is unclear.
If we count the graphics, black is ahead. That doesn’t mean black is more likely to win. The visualization doesn’t necessarily have to match up.
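A toy version of this with made-up numbers, assuming the unclear area goes entirely to one side: black is ahead on the settled areas, but who wins depends on the unclear part.

```python
from fractions import Fraction

# All numbers are hypothetical, chosen only to illustrate the point.
SURE_BLACK = 12                          # points that are surely black
SURE_WHITE = 10                          # points that are surely white
UNCLEAR = 20                             # contested points, all-or-nothing
P_BLACK_TAKES_UNCLEAR = Fraction(2, 5)   # black gets the unclear area 40% of the time

# Counting only what looks settled, black leads.
settled_lead = SURE_BLACK - SURE_WHITE

# Two possible outcomes as (probability, black-minus-white score).
outcomes = [
    (P_BLACK_TAKES_UNCLEAR, settled_lead + UNCLEAR),
    (1 - P_BLACK_TAKES_UNCLEAR, settled_lead - UNCLEAR),
]

p_black_wins = sum(p for p, score in outcomes if score > 0)
expected_score = sum(p * score for p, score in outcomes)

print(settled_lead)     # 2
print(p_black_wins)     # 2/5
print(expected_score)   # -2
```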

jlt · April 14, 2023, 5:04pm

No it’s not the komi, it’s the score (except that it probably didn’t compute anything yet since I used 1 playout). Although I don’t understand the exact difference with the number on the graph. These two numbers become 8.3 and 8.4 respectively after 10k playouts.
I didn’t bother updating everything since my version of Katago is already pro level, that’s good enough for my personal use.

thomazjunior2001 · April 14, 2023, 5:15pm

The score is already calculated by katago with one playout, just don’t expect it to be 100% accurate. I still think your komi is set to zero, because your katago is estimating territory correctly but the score is way different. Either way it’s getting kind of off topic, so it’s best not to continue, but if you want an easy KataGo upgrade you can install lizzie-improvements, which comes preinstalled with 18b. Download the non-rtx version, extract, and you’re set: Release "Apply KataGo v1.12.3 and speed up the CPU-version engine" · hope366/Lizzie-improvements · GitHub

GreenAsJade · April 14, 2023, 11:12pm

Do you mean “hey, OGS developers, this is what should be used”?

I’m trying to ask “who is ‘you’ in your recommendation”?

If it’s OGS, can you clarify in what way this will be better?

thomazjunior2001 · April 14, 2023, 11:37pm

Do you mean me? If so, I’m specifically speaking to jlt here. It kind of borders on off-topic territory, if I’m being entirely honest.

Feijoa · April 15, 2023, 3:43am

I really think though that you’re onto something with the komi. How do you think KataGo could possibly be used by OGS without knowing the komi of the specific game?

nadoss · April 15, 2023, 6:34am

They say you can’t count the score during a fight. That’s not entirely true, but you definitely can’t use a score estimator during a fight.