[poll] Balancing the new AI score estimator

zefutogi · October 27, 2021, 6:21pm

Solution: Analysis off by default + full and simplified (nerf proposal) estimator versions available as options when enabled. Add an eligible challenge column in custom games list for “Analysis” that shows off/full/limited.

LetsFightLOL · August 21, 2023, 5:33am

I think even with nerfs, it needs to be accurate near the endgame.

Just before the game was over, the score estimator told me that White had won by 4.5 points.
So I (black) resigned, and afterwards the AI told me that black won by 4.3 points. I: ???
This feels so bad, I really hope it’s accurate in its estimates at least when both sides have clear borders.

Feijoa · August 21, 2023, 6:05am

Welcome to the forums @LetsFightLOL!

During the game you have the old non-AI score estimator. I believe it uses nearly random play to close the borders. It’s certainly not going to play an optimized endgame; you have to do that yourself.

And even if you computed it perfectly and knew you were losing by 4.5 points, you could still hope to win. I see 7 or 8 unfinished areas in that game, and at 7k I don’t have a clue about what order to play them in; I’d probably lose at least a few points in the endgame.

benjito · August 21, 2023, 6:28am

You should not resign a 4 pt game

JethOrensin · August 21, 2023, 6:43am

I do not know which is the correct/exact thread for it, but I am a bit more worried that the AI analysis sometimes really dreams up some odd/unnatural moves that produce spikes that make no sense. E.g.

According to the AI, it was move H9 that causes -94% win chances. So, I click on the variation and I got this image. Why would someone ignore move 3 to play a move that does not work, like 4? Why play 10, when responding to 3 is much more urgent and is worth a lot more points?

Even worse, the opponent’s response (H8, which actually defeats my overplay) also has a -94% according to the AI and even though some other exchanges follow and those moves in this variation are still possible later, the AI just never proposes/thinks of this again. Very odd.

benjito · August 21, 2023, 6:55am

This game is beyond my level, but I know AI will often suggest forcing moves even if they don’t matter. Sort of wasted ko threats, but maybe it knows there aren’t any kos coming.

Anyway, I just try to ignore those moves and stitch together the local variation.

Uberdude · August 21, 2023, 7:16am

A good lesson that you should count yourself and not rely on score estimators.

gennan · August 21, 2023, 10:50am

Perhaps estimates from the in-game score estimator should state that they should be taken with a lump of salt. They can easily be off by dozens of points, and this is by design.

hexahedron · August 21, 2023, 12:26pm

Does OGS label when a given score estimate comes from a decent AI versus a weak random playout? Or does it look roughly the same in each case and require the user to just know, or to have memorized the details from some documentation?

Uberdude · August 21, 2023, 4:27pm

The big swing is because both players are missing the important point of q15: if white connects there then s18 becomes sente endgame (which is big to get for free) due to the follow up at r12 following the p12 push (so it does work: it creates the liberty shortage for r12 to work following q15 and s18).

JethOrensin · August 21, 2023, 7:45pm

While this is true (I had noticed that it and p16 were getting increasingly dangerous, but I never realised that I could have played q15 and posed a significant threat), it doesn’t change the fact that the AI seems to make odd moves in many variations where Black plays there.

In the example, is move 10 really worth enough points to ignore move 11? It is not as if White gains sente with it; White still has to play R4 immediately afterwards. E.g. I honestly do not see how this would have been worse:

qnpnpmqppnp · August 21, 2023, 9:33pm

The rule is simple (players can only see a very weak AI during a game; good AI is only for spectators, or for everyone after the game), but I don’t think there is any label, no. It’s in the documentation, but I can easily imagine someone missing it.

Uberdude · August 21, 2023, 10:33pm

White did take sente with 10, by not taking gote to preserve the lower right points and thus stopping black 3 from gaining points in sente. If you are behind, this kind of mutual destruction endgame, rather than obediently answering the opponent’s moves, might be the best chance to win. But also these variations with only 50 playouts in total or whatever don’t have many left once you are 10 moves deep, so don’t put too much trust in them. It’s mostly policy.

Groin · August 22, 2023, 10:37pm

When OGS got that new estimator with AI integrated, a debate arose about using it during the game. And for watchers too.

This took place here on the forum and the final decision is as you see now, mostly because OGS doesn’t want external help during a game.

LetsFightLOL · August 22, 2023, 10:49pm

If so, I’d like it to tell me a range, not an exact number.
If it’s estimating so inaccurately, I wish it could tell me black or white, win-x.5~x.5 something like that.
After all… most people who use it only care about whether they should resign.
You know, I play correspondence games.
I really don’t want to spend a month finishing a game that I’ve already lost.

Groin · August 23, 2023, 3:31am

That seems not so obvious to program anyway.

richyfourtytwo · August 23, 2023, 5:21am

Especially without giving away the ‘true’ value.

If that range should be more meaningful than the estimate today, it should always include the true value (= estimate from a strong estimator). You could always show true value ± something, obviously giving away the true value. Or you could give a randomised range, always containing the true value. Call that a few times and you’ll basically know the true value again.
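
A minimal sketch of that last point, assuming a hypothetical estimator that returns a fixed-width window placed at a uniformly random offset around the true score. The names `randomized_range` and `intersect`, the width of 10 and the true score of 4.5 are made up for illustration; nothing here is something OGS implements. Intersecting the windows from repeated calls narrows quickly towards the true value:

```python
import random

def randomized_range(true_score, width=10.0, rng=random):
    """Hypothetical estimator: a window of fixed width that always contains true_score."""
    offset = rng.uniform(0.0, width)  # how far the lower bound sits below the true score
    low = true_score - offset
    return low, low + width

def intersect(ranges):
    """Intersection of intervals that all contain a common point."""
    return max(lo for lo, _ in ranges), min(hi for _, hi in ranges)

rng = random.Random(0)
true_score = 4.5  # assumed value; hidden from the caller in a real setting
calls = [randomized_range(true_score, rng=rng) for _ in range(50)]
low, high = intersect(calls)
print(f"after {len(calls)} calls the true score lies in [{low:.2f}, {high:.2f}]")
```

Each single window is 10 points wide, but the intersection of many of them is much narrower, which is why a randomised range alone would not hide the true value.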

gennan · August 23, 2023, 9:23am

The error depends on the quality of its life and death evaluations in a particular position. And this quality is low by design. The in-game score estimator may not even know about the concepts of groups and group statuses.

qnpnpmqppnp · August 23, 2023, 9:26am

You have ample time to count by yourself in correspondence though. If not an exact count, at least just a rough sanity check of the estimator since anyway the difference must be significant if you’re considering resigning.

stone.defender · August 23, 2023, 9:45am

It just shouldn’t give a value. It should display “you are 20 or more points behind”, but not tell you 20.0 or 100.0. The answer to 19.9 points behind and 100 points ahead should be the same: “continue”.
The estimator should be a tool to know when to resign so as not to waste time, not a tool to know the score estimate.
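
A rough sketch of that idea, assuming a 20-point threshold and a hypothetical `resign_hint` function (neither is an existing OGS feature). The raw estimate stays hidden and only a coarse message comes back:

```python
def resign_hint(estimated_margin, threshold=20.0):
    """Return a coarse hint without revealing the estimate itself.

    estimated_margin: estimated score from the player's point of view
    (negative means the player is behind). The threshold is an assumed value.
    """
    if estimated_margin <= -threshold:
        return f"you are {threshold:g} or more points behind"
    return "continue"

# 19.9 points behind and 100 points ahead get the same answer.
for margin in (-100.0, -20.0, -19.9, 100.0):
    print(margin, "->", resign_hint(margin))
```

The hint is only as reliable as the estimate fed into it, so with an estimator that can be off by dozens of points the threshold would need to be generous.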