As I understand it, LZ chooses the move it explored most. But many comments on this forum do distinguish between the move which is explored most and the move LZ would play, implying they can be different.
My main source for LZ's move selection algorithm is the AlphaGo paper.
It should play the most explored move, and generally this will also be the move with the highest winrate estimate. If a node has a higher winrate but fewer visits, the search will give it priority anyway, so its visit count catches up.
What can happen is that it finds out only relatively late that a move is actually bad (e.g. because of a ladder), and in that case it first needs to do more playouts in this bad node to get the score down before it starts considering less visited nodes. Hence, after the score has dropped, there can be nodes that are less explored but have a higher score.
The thing is, those less visited nodes have not been read out as deeply, so they could be equally bad or even worse than the original high-scoring node. The engine has more certainty that its evaluation of the most explored node is accurate. On top of that, at equal reading depth the much explored node must have looked better than the less explored one, otherwise it would not have attracted the extra visits in the first place.
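A small sketch may make this concrete. The following is not LZ's actual code, just an illustration of AlphaGo-style PUCT selection with invented numbers: during search, a child with a higher winrate but fewer visits gets selected until its visit count catches up, and the final move is then simply the most visited child. The exploration constant and the pretend playout results are assumptions for the example.

```python
import math

# Hypothetical child statistics: prior P, total value W, visit count N.
# The numbers are invented for illustration.
children = {
    "A": {"P": 0.4, "W": 90.0, "N": 180},  # most visited, winrate 0.50
    "B": {"P": 0.3, "W": 33.0, "N": 60},   # fewer visits, winrate 0.55
}
C_PUCT = 1.5  # exploration constant (value is an assumption)

def puct_score(child, total_visits):
    # Q (average winrate) plus U (exploration bonus that shrinks with visits)
    q = child["W"] / child["N"] if child["N"] else 0.0
    u = C_PUCT * child["P"] * math.sqrt(total_visits) / (1 + child["N"])
    return q + u

def select(children):
    total = sum(c["N"] for c in children.values())
    return max(children, key=lambda m: puct_score(children[m], total))

# During search, the higher-winrate but less-visited child B keeps winning
# the selection, so its visit count catches up...
for _ in range(200):
    m = select(children)
    children[m]["N"] += 1
    children[m]["W"] += 0.55 if m == "B" else 0.50  # pretend playout results

# ...and the final move is simply the most visited child.
best = max(children, key=lambda m: children[m]["N"])
print(best, {m: c["N"] for m, c in children.items()})  # → B {'A': 180, 'B': 260}
```

So under this toy model the "most explored move" and "the move it would play" coincide in the end; they only diverge transiently, while the search is still redistributing visits.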
I do not know anything about the subject, but just for your information: in a recent review of a game of mine, at a certain point Leela's most explored move (labeled C) was a ladder that did not work, and the full subsequent sequence clearly showed that the stones of the running ladder were dead. So I guess that Leela would have picked move A:
As a human you simply count the ko threats for each colour.
The AI has to play out all variants for this fight. So it seems reasonable to assume that AIs tend to avoid ko fights.
To read out a ko fight involving 3 ko threats for black and 3 for white, the AI has to read through at least 36 variations with 18 playouts each. The resulting board positions look the same to a human, but for the exploring AI they are 36 different branches (board positions). All 36 branches have the same chance to win (they end in the same board position), so the AI has to spend its remaining playouts equally on all 36 branches.
For LZ with 1600 playouts this results in 1600/36 − 18 ≈ 26 playouts per branch.
To read out a ko fight involving 4 threats per colour creates 576 branches of 24 moves each, i.e. at least 13,824 playouts (far more than LZ has to spend).
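The arithmetic above can be checked with a short script. It follows the simplified model from the post (each colour orders its threats freely, giving (n!)² interleavings, and each threat costs 3 moves: threat, answer, retake); the function name and the model itself are just this thread's assumptions, not how LZ actually reasons.

```python
from math import factorial

def ko_fight_cost(n_threats, total_playouts=1600):
    """Branch count and playout budget for a ko fight with n_threats
    per colour, under the simplified model above."""
    branches = factorial(n_threats) ** 2      # (n!)^2 threat orderings
    depth = 3 * 2 * n_threats                 # moves to read one branch out
    per_branch = total_playouts / branches    # playouts available per branch
    return branches, depth, per_branch - depth

# 3 threats per colour: 36 branches, 18 moves deep, ~26 spare playouts each
print(ko_fight_cost(3))
# 4 threats per colour: 576 branches, 24 moves deep, negative budget
print(ko_fight_cost(4))
```

For 3 threats this reproduces the 1600/36 − 18 ≈ 26 figure; for 4 threats the per-branch budget goes negative, i.e. LZ cannot even read each branch out once.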
A ko threat is a move plus a response that doesn't really change the win rate for either player (ignoring possible future ko fights), so playing a ko threat immediately could look like the best move on the board: after all, a move can only reduce one's winning chances, and the ko threat doesn't seem to reduce them.
Besides, if the AI explores the ko threat, it halves its playouts by exploring two almost identical board positions (identical in their chance to win).
Thinking about it, if its estimated chance to win goes down with each move it reads deeper, playing the ko threat looks better, since the effective reading depth is 2 moves lower.
On the other hand, LZ's neural net could learn to play ko threats only when one is needed (i.e. when a ko is active), but it doesn't seem to care.
Doesn’t LZ have a way to merge equal game states when they arise in two branches? If it doesn’t, it should surely be implemented, as that would be a big improvement in general, not just for dealing with ko threats.
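Engines typically do this kind of merging with a transposition table keyed by a Zobrist hash of the position, so that two move orders reaching the same board share one entry. A minimal sketch (Python, illustrative only — as far as I know LZ keeps a plain tree and does not merge transpositions, and naive merging is subtle in Go because superko makes the legal continuations depend on the path taken):

```python
import random

SIZE = 19
# One random 64-bit key per (intersection, colour); XOR-ing keys in and out
# gives an incremental hash that is identical for transposed move orders.
random.seed(0)
KEYS = [[random.getrandbits(64) for _ in range(2)] for _ in range(SIZE * SIZE)]

def zobrist(moves):
    """moves: list of (point index, colour) pairs, colour 0=black, 1=white."""
    h = 0
    for point, colour in moves:
        h ^= KEYS[point][colour]
    return h

table = {}  # hash -> shared search statistics

# Two different move orders reaching the same final position...
order_a = [(10, 0), (55, 1), (72, 0)]
order_b = [(72, 0), (55, 1), (10, 0)]
assert zobrist(order_a) == zobrist(order_b)  # ...hash to the same entry

# Both branches would then read and update the same statistics record.
stats = table.setdefault(zobrist(order_a), {"visits": 0, "wins": 0.0})
stats["visits"] += 1
```

The point indices and the statistics record here are made up for the example; the hashing trick itself is the standard one.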
Having studied a few AI games, I strongly disagree with this. AlphaGo Zero and LZ handle ko fights at a scale that is far beyond what humans can read. I am not an expert but my understanding is that those AIs don’t have to play out all variants to reach a conclusion.
Check out my post here, where I share an ELF vs. LZ game I ran on my computer:
I don’t believe this either, and even if it were the case, we humans would not be able to judge. What I believe is that when the AI throws away a ko threat, it’s because it estimates that it won’t have an impact on the result.