For anyone curious, LZ progress continues unabated. While the increase to 20-block nets didn't yield an initial strength improvement, predictions that the increased network size would allow for further improvement seem to have been correct. There have now been two natural promotions of stronger net-to-net trained Leelas, with a self-play Elo approaching 12,000.
Interestingly, there's also an incredibly strong 40-block network that's been trained from a combination of the Elf and Leela games, and is on par with the new version of Elf in strength. While the project isn't quite ready to switch from 20-block networks to 40-block networks just yet, in part because 40-block networks are twice as slow, game data generated by the 40-block network will be used to speed up progress in the 20-block series, in a similar manner to how Elf data is being used.
I am not sure if everybody realises this, but Google removed ladder knowledge in Zero just to show it can self-learn all concepts, including ladders. Not because it is technically better, but because it is more beautiful and elegant in terms of self-learning.
The Master version had input features encoding which moves were ladder breakers and ladder captures. Here is the original feature information, i.e. what goes into Master's network to determine which moves are good and who is ahead:
The Leela Zero creator has copied Google's method as closely as he could. But there is nothing technically preventing a Leela-Almost-Zero version that is ladder-aware.
So, not being able to read ladders is a design choice, not a restriction imposed by the technique or method.
I'm not sure I would call it "removed ladder knowledge". There are two input planes telling the neural network where ladders are and how to capture or escape from them.
What remaining knowledge does the network still have to learn about ladders?
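To make the "two input planes" concrete: a Master-style (or hypothetical Leela-Almost-Zero) input could append two binary 19x19 planes, one marking moves that are successful ladder captures and one marking successful ladder escapes. This is only a sketch of the plane layout; the point sets would come from a conventional ladder reader, which is not shown, and the function name and shapes here are illustrative assumptions, not actual LZ code.

```python
import numpy as np

SIZE = 19  # standard board size

def ladder_planes(capture_points, escape_points):
    """Build two binary feature planes (illustrative, not LZ's real input):
    plane 0 marks moves that are successful ladder captures,
    plane 1 marks moves that are successful ladder escapes.
    The (row, col) sets would be produced by an ordinary ladder reader."""
    planes = np.zeros((2, SIZE, SIZE), dtype=np.float32)
    for r, c in capture_points:
        planes[0, r, c] = 1.0
    for r, c in escape_points:
        planes[1, r, c] = 1.0
    return planes

# Example: one hypothetical capturing move and one hypothetical escaping move.
planes = ladder_planes({(3, 4)}, {(15, 2)})
print(planes.shape, planes.sum())  # (2, 19, 19) 2.0
```

These planes would simply be stacked onto the existing board-history and colour planes before the first convolution, so the rest of the network is unchanged.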
LZ is probably near AG Master in strength. See DeepMind's chart below, where the x-axis is the number of self-play games in millions. LZ is currently at 11.4 million self-play games. Since LZ uses more or less the same algorithms as AGZ, the growth curve should be at least in the same ballpark.
I have a question for those who know Leela Zero and also Leela 0.11
I asked in another thread too but had no response.
So, if you think it's OT here, please post your answer there:
What exactly does "playouts" mean for LZ? Is it the same as "nodes" or "simulations" in Leela?
I can see Leela change its mind about the "best move" while doing more and more simulations, so I let it run for tens of thousands of simulations (30k, 50k, even more) on a single move before considering it done.
On LZ I can see people talking about 200 / 800 / 1000 playouts, so I'm not sure I understand correctly.
I believe the only difference from old-fashioned Monte Carlo tree search is the method by which the next node to explore is selected. In "classic" MCTS this is done by a relatively simple function that looks at how much a node has been explored and how good its chances are, but LZ does it with a function that also consults a neural network trained to spot the best move. The network replaces the simulation step too: classic MCTS keeps playing until the game is decided, randomly or using heuristics that aren't based on deep learning, and then updates the parent nodes, whereas LZ just asks the network to evaluate the leaf position and backs that value up.
So a playout is essentially the same thing as a simulation, but LZ runs a much more complicated neural net for each one, so a single playout takes a lot more time (and is far more accurate).
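The loop described above can be sketched in a few dozen lines. This is a toy, not LZ's actual implementation: the "game" is fake, the network is a random stand-in, and the exploration constant and uniform priors are assumptions. What it does show accurately is the shape of one playout: descend the tree by the network-guided selection rule, expand the leaf with a single network evaluation, and back the value up, so "800 playouts" means 800 such traversals.

```python
import math
import random

random.seed(0)

# Toy "game": a position is the tuple of moves played; games end after 4 moves.
def legal_moves(pos):
    return [] if len(pos) >= 4 else [0, 1, 2]

# Stand-in for the policy/value network (hypothetical, NOT LZ's real net):
# returns a prior over moves and a value in [-1, 1] for the side to move.
def network(pos):
    moves = legal_moves(pos)
    priors = {m: 1.0 / len(moves) for m in moves} if moves else {}
    return priors, random.uniform(-1.0, 1.0)

class Node:
    def __init__(self, prior):
        self.prior = prior      # P(s, a) from the policy head
        self.visits = 0         # N(s, a)
        self.value_sum = 0.0    # W(s, a); Q(s, a) = W / N
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

C_PUCT = 1.5  # exploration constant; value chosen arbitrarily for the sketch

def select_child(node):
    # PUCT-style rule: argmax of Q + c * P * sqrt(N_parent) / (1 + N_child).
    # The prior P is where the policy network steers exploration.
    total = sum(c.visits for c in node.children.values())
    return max(node.children.items(),
               key=lambda mc: mc[1].q() +
                   C_PUCT * mc[1].prior * math.sqrt(total) / (1 + mc[1].visits))

def playout(root, root_pos):
    # One playout: walk down by the selection rule, expand the leaf with ONE
    # network evaluation (no random rollout to game end), then back up.
    node, pos, path = root, root_pos, [root]
    while node.children:
        move, node = select_child(node)
        pos = pos + (move,)
        path.append(node)
    priors, value = network(pos)
    for move, p in priors.items():
        node.children[move] = Node(p)
    for n in reversed(path):        # backup, flipping sign each ply
        n.visits += 1
        n.value_sum += value
        value = -value

root = Node(prior=1.0)
for _ in range(800):                # "800 playouts"
    playout(root, ())
print(root.visits)                  # one root visit per playout -> 800
```

Reading the code this way, the visit counts at the root are exactly what the playout numbers (200 / 800 / 1000) refer to, and each playout costs one network evaluation rather than a full random game.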