Thoughts following AlphaGo Zero

BHydden · October 20, 2017, 1:05am

Just looking to start a discussion.

Following the release of AlphaGo Zero and the revelation of how quickly it improved…

(yes I am aware humans and AI are different)

… Could an argument be made that purely playing and reviewing games with opponents of the same strength as you (the method AlphaGo Zero learnt with) is a more efficient way to improve than studying pro games, puzzles, joseki, etc. (the kinds of input AlphaGo Master received, and which human teachers often teach)?

trohde · October 20, 2017, 1:34am

Well, perhaps … IF you’re able to play with blazing speed, AND IF you don’t need to do anything else (like … eat, drink, work, visit the toilet, move, sleep), AND IF you have the awesome memory of a computer so that you can remember the millions of games you played?

BHydden · October 20, 2017, 1:38am

I feel like these only affect the total time taken and not the efficiency of the study method.

This feels like the crux of the matter haha

Eugene · October 20, 2017, 1:40am

I would go so far as to say “no, it is not more efficient”

The reason it is not more efficient is that we need help in ways that AlphaGo does not.

When AlphaGo discovers something, the discovery is fed back directly into the neural net, and that’s “it”. It now knows.

We don’t have that luxury - our learning process is more convoluted and less direct.

That means that it is more efficient for us to leverage the knowledge of people who have gone before us, rather than rediscovering the wheel ourselves.

In fact, you could go so far as to say that when you look at us playing a game vs an opponent of similar strength, this is just like AlphaGo Zero playing one game against itself.

It goes on and does millions of those, and learns from each one.

The only way we can learn from millions of games ourselves is by leveraging the games others have played … via puzzles, joseki and systems of wisdom like Dwyrin’s Basics.

BHydden · October 20, 2017, 1:43am

“If I have seen further, it is by standing on the shoulders of giants.” - Isaac Newton (1675)

Eugene · October 20, 2017, 1:50am

(parenthetically:

note: technically it is not this simple or direct for AlphaGo either, but simplistically the principle holds

)

MrCplaysgo · October 20, 2017, 12:51pm

Take into account the fact that AlphaGo’s moves in the opening were less than optimal. This is not because of “new joseki” that it “discovered”; they were merely mistakes. Now that may seem like a bold claim, but when you think about it, it makes sense.
Things computers are good at:
Storing vast amounts of data
Doing calculations

Both of these are great for reading, so AlphaGo does tend to excel in the middlegame/endgame. However, storing large numbers of variations is not much help in the opening, where we typically rely on strategy and rules to guide us.

Things computers are not good at:
Judging the value of positions.

This presents a major problem in the opening, where judging the value of positions is key. However, in the middlegame it is not such a big problem, because AlphaGo can count/see when a position leads to a large moyo, territory, or kill.

Just my thoughts, and why I think you really shouldn’t study AlphaGo or copy its moves.
-MrC

DVbS78rkR7NVe · October 20, 2017, 1:09pm

As I understand it, AlphaGo Master was praised for winning games by move 60. So it looks to me as if AlphaGo is very good at openings. Perhaps that’s because humans are even worse at judging positions.

SanDiego · October 20, 2017, 4:03pm

AFAIK the DeepMind team didn’t claim that, for AlphaGo, self-teaching was more efficient than relying on prior experience. There were other changes in the latest version that can explain the faster learning curve.

The main claim is that no human input is needed to reach the top level.

SanDiego · October 20, 2017, 4:37pm

When I think about it… it doesn’t make sense.

As @S_Alexander said, the usual pattern for AlphaGo Master against humans is to take an early lead and then cruise to victory. And its 60+ game winning streak shows that its level is way above humans.

I am afraid pro players are not following your advice. They have studied the new openings and reused them in official matches.

Musash1 · October 20, 2017, 6:40pm

I think a parallel (to some extent at least) can be made with, for example, learning mathematics. Is it more efficient to improve by working with someone at your own level of mathematical proficiency/knowledge, or by studying books by, and listening to lectures from, professors of mathematics? I think that you will quickly conclude that working with a mathematical “peer” is by far the less efficient method.

Regards,

– Musash1

BHydden · October 20, 2017, 9:53pm

I think this is a highly arrogant conclusion for humans to have come to. If anything, we’re stuck in our notion of ‘equal local exchange’, while AlphaGo simply finds the best move on the board regardless of whether it is joseki.

BHydden · October 20, 2017, 9:55pm

No, they didn’t “claim” it; that’s why I thought it would be an interesting discussion, especially since Zero got to Lee Sedol’s level from just 3 days of self-play.

BHydden · October 20, 2017, 9:59pm

That is an interesting point, but keep in mind that unlike maths and chess, which are purely logical thought, Go is arguably in equal part pattern recognition, which is very different and worth considering.

Musash1 · October 21, 2017, 5:42am

But let’s not forget that neural networks, such as those used by AlphaGo and the ones whose function I have researched and published articles on, do all of this using principles of mathematics and statistics. The “patterns” are essentially reduced to mathematical functions in n-dimensional spaces and things such as minimizing distances between points, maximizing area functions, etc.

And where are AlphaGo’s eyes? Right! AlphaGo is blind and plays using the coordinates of the moves (i.e. the points in the game-space). Humans operate with patterns in completely different ways than neural networks.

So a human learning these patterns learns them much more efficiently when learning through master games rather than by playing with peers (where bad patterns abound and good patterns are often rare birds until the humans have progressed far enough to recognize that they are good).

– Musash1