DeepMind announces AlphaGo Zero: Learning from scratch

jgk · October 18, 2017, 5:54pm

They’ve trained a version of AlphaGo that doesn’t use human games as input and now exceeds the strength of the AlphaGo that beat Ke Jie.

SanDiego · October 18, 2017, 6:10pm

OMG. 3 days left to study all those games before the Cotsen Open.

mark5000 · October 18, 2017, 6:58pm

Find the AlphaGo Zero supplementary games here:

https://www.nature.com/nature/journal/v550/n7676/extref/nature24270-s2.zip

20 games: AlphaGo Zero 20 block vs. AlphaGo Lee
20 games: AlphaGo Zero 20 block self-play games
20 games: AlphaGo Zero 40 block self-play games
20 games: AlphaGo Zero 40 block vs. AlphaGo Master

“Block” means neural network residual blocks. See the Nature paper for details:

For online sgf viewing:

http://www.alphago-games.com/

Misjamig · October 18, 2017, 7:45pm

Great. Now they don’t even need our input anymore. We have become obsolete as a species.

Eugene · October 18, 2017, 9:24pm

I was hoping they’d do this.

I was kind of hoping it might decide that opening with tengen is good.

Conrad_Melville · October 18, 2017, 9:27pm

Lol

SanDiego · October 18, 2017, 9:38pm

Me too. I was excited when I opened the first AlphaGoZero vs. AlphaGoZero game, but quickly realized that they were showing us AlphaGoZero’s baby steps.

One thing I was happy about - because I don’t understand that move - is that AlphaGoZero doesn’t respond to AlphaGoMaster’s 4-7 shoulder hit.

SanDiego · October 19, 2017, 7:56pm

Two game reviews by Haylee from the newly published set:

I especially like the first 5 minutes of the second review, when almost all the sequences played by the world’s top player go against conventional wisdom.

Eugene · October 19, 2017, 9:55pm

Disclaimer. I’m DDK. What do I know?

But I do find it fascinating.

I like how you said “conventional wisdom”.

One thing I’ve noted before (though it didn’t get any uptake/response at the time) is that the concept of a “good move” is a very relative thing: relative to the player’s skill and ability.

With the arrival of AlphaGo, I think this aspect of “conventional wisdom” is heightened.

Haylee says at one point “[pros would play here but…] I think that we can be confident that this is a better move because AlphaGo played it”.

But obviously it’s only a better move if you know why and what to do next.

So this becomes a topic for the pros to figure out, and they might do that by analysis to work it out, or by simply copying that variant and seeing if it works out better and learning that way.

To slowly get to the point: this doesn’t mean that it’s better for the rest of us.

Now, even more than before, our “conventional wisdom” needs to be layered with “for whom?”

Before, there was a certain element of this. You might say “DDKs should do this, but Dans do this” (*).

Now, though, I think there’s an extra “layer” of conventional wisdom that we need to be conscious of.

It doesn’t mean that “conventional wisdom for DDK/SDKs goes out the window”. It’s more like “if you are strong enough, there’s this extra layer to be aware of”.

Some moves might flow straight down, like playing the high small knight in response to the low small knight approach to 3-4. But other things might remain a mystery only to be played by players who can see 100s of moves ahead, like tenuki away from a 3 stone stick.

Anyhow, I’ll keep trying to play Dwyrin’s Basics conventional wisdom, and let you gurus flow the knowledge down

GaJ

(*) There are even cases I’ve seen where “DDKs do this, SDKs do this other thing because they know blah, but Dans do the original thing because they know even more than SDKs”. But that is an amusing exception to pre-AlphaGo conventional wisdom.

C002 · October 19, 2017, 10:01pm

I like the graph (https://deepmind.com/blog/alphago-zero-learning-scratch/)

How long does it take a human to reach 9p, a minimum of 10 years? Can we extrapolate?
21 days = 60+ years
40 days = 120+ years

As with so many arts, something close to perfection is unattainable within one lifetime. Maybe this shows us that the path is more important than the destination.

Musash1 · October 20, 2017, 7:04pm

I missed your original “note” about this, but now I can reply to it here

And my view is that a “good move” is a good move – always. If anything, you might say, “relative to my opponent’s skill and ability” because it is my opponent who must find the correct continuation (at least initially).

So to say (as you did):

is not (in my opinion) necessarily correct. The actual burden is on my opponent. I know that when I play master games and try to guess the next move, I am (unfortunately) very often totally surprised by the (winner’s) next move. But then, when thinking about that move and looking at the complete board, I (fortunately) quite often can see/feel that this move actually is very good – it is just that I did not even consider making it!

Now in my future games (and this has happened) I sometimes am in a situation where such a move can be tried … and sure enough: my opponent is almost always caught between a rock and a hard place.

So, as you can imagine, I am against those who say, “DDKs should do this, but Dans do this” because, as I said above, a good move is a good move – not because of the kyu/dan level of the player making it, but just because it is good in the context of the game.

Best regards,

– Musash1

trohde · October 20, 2017, 8:37pm

But the problem is that the “context” means something different to the Kyu than it means to the Dan, no? If I cannot play the appropriate follow-up moves then maybe I should stick with something I understand?

I’m not decided on this, just asking myself …

BHydden · October 20, 2017, 9:48pm

I’m inclined to disagree. Let me offer some examples.
The 3-3 invasion is often a “good move” but if you don’t know the correct follow up, it can still die.
Making a 2 space base on the third line is considered “safe” but if you don’t know when and how you need to come back and make eyes when pressured, it can still die.

In both these circumstances my opponent might not know any better either, but the onus really is on the one trying to live to know the “right continuation” rather than just the first move or two.

Eugene · October 20, 2017, 10:17pm

In fact, I’m struggling with this exact topic right now

Russjass · October 20, 2017, 11:12pm

I have had a very brief look through these games. They love the 3-3! What I don’t understand is that if it considers the 3-3 invasion of the 4-4 stone such a powerful move, why play the 4-4? I guess they must play it inviting the invasion and welcoming the outside strength. So, considered even by AlphaGo as early as move 6. Weird. Or even for white if black has an enclosure on the other corner?

Eugene · October 21, 2017, 1:25am

Dwyrin’s most recent video illustrates exactly this point. The move (3-3 invasion in this case) is only good if you know how to utilize it…

Faeriestorm · October 23, 2017, 8:31pm

Idk I’m… 1d KGS?

And I strongly disagree…

For instance, suppose you have the option of taking 5 points in the corner in sente, but then some random pro drops by and plays some stone that seems to you to make the opponent completely alive while wiping that potential… You won’t be able to see the followup to finish the group off, so you’ll have just simply lost those 5 points.

So there is context in terms of your own strength.

There’s also context in terms of your opponent’s strength… If your opponent is a lot stronger than you, then you don’t want to start complicated fights (particularly if you are taking a handicap)…

There’s also context in terms of who is ahead… If you are ahead, similarly you don’t want to start complicated fights…

There’s always context.

Musash1 · October 24, 2017, 7:57am

Thank you for the reply, @Faeriestorm, but I also notice that you make the qualification that the pro’s move “… seems to you to make the opponent completely alive …” (emphasis added by me). Now I agree that I probably will not be able to see the correct follow up to finish the groups off, but that is my weakness, not the move’s weakness. If we wish to discuss good moves versus bad moves in the context of the player’s strengths, then when playing against any opponent stronger than myself, one can make a strong argument that my best move in almost any stage of the game would be to simply resign.

So, in the above example, the pro’s move is “good” because it fools the opponent (or the opponent cannot find the correct reply), but in truth the move is actually not the best (as you stated: it only seems to me to be good).

This discussion is more of a philosophical one similar perhaps to moral relativism. The decision concerning the move (is it good or bad) becomes less algorithmic when approached from your viewpoint. And carried to its extreme this argument resembles the so-called meta-ethical moral relativism viewpoint (which would argue that no move is objectively good or bad, but must be considered and judged in the context of the strength of the players involved). But I really doubt that strong players (e.g. pros) would observe DDK players’ games and judge very many of the moves as “good” moves – some moves yes, but most probably not.

My argument is more closely positioned to the arguments of moral objectivism. I would say that in most (and perhaps all) human vs human games, your argument is very strong. In games of AlphaGo Zero vs AGZ, then I think that my argument is stronger. Humans play very often (always?) with the strength of their opponent in mind, and frequently employ so-called “trick” moves, whereas AGZ plays as objectively as possible, driven entirely by algorithmic principles and the statistical experience gathered in its vast collection of played games.

So who wins? AlphaGo Zero, of course!

– Musash1

omote · October 24, 2017, 8:43am

Interesting discussion. As often (always) on the topic of AI, there is a natural trend towards trying to find arguments along the lines of “yes, of course, the machine now performs better than humans in this task (game or whatever else), but it’s only a machine, and it can’t compare to the way humans do it, because humans behave in context with feelings bla bla bla”. But each of those arguments looks to me like a purely defensive move. Such argumentation is always gote, so to speak. We grab the shrinking territory of what makes us unique, different, and implicitly better than machines, as we did over the centuries with the rest of the living world. We behave like animals, but wait, there is more. Machines can mimic us, but wait, there is more.
A lot has been written about the inscrutability of deep learning algorithms. No one knows, and from now on and forever no one will be able to know, what really happens in the machine, and how it makes decisions. Saying those algorithms do not take context into account is pure wishful thinking, all the more so since no one can really define what this context is and how it is used by humans for better or worse.

Eugene · October 24, 2017, 10:24am

I think the key point (somewhat restating what you said) is that it all depends what we mean by a “good move”.

I think that your meaning (moral objectivism) is that there is some sort of scale upon which a good move is a good move no matter who is playing. I can relate to that.

So I agree it is not true to say “a good move can only be considered in the context of the players”.

The circumstance under which this is not true is when you are thinking of “if it is the ultimate player playing” - at least, I think this is mostly equivalent to the moral objectivism scenario.

So then all that remains is to clarify that when I am looking for “a good move” I do specifically mean one that is good for me.

Once you understand that this is the context, there’s no longer any debate.

GaJ