Alpha Go Zero team; a challenge to finish what you started

I just woke up and saw the bot timed out. It was still thinking, with over a million playouts! Some sort of unusual race conditions caused it to get stuck in a loop. But other than that, it played all the other games successfully overnight. :slight_smile:

1 Like

Links to the games?

http://gokgs.com/gameArchives.jsp?user=YourRank&oldAccounts=y

The “YourRank” adaptive playing bot is a cool idea; if you have it on OGS I’d love to play it.

I proposed a different twist: a bot that does max punishment and basically chases the weaker user all across the board. It dynamically detects the strength of the user, and the weaker the user is assessed to be, the bolder and more outrageous the moves the bot plays, with the idea that for a sufficiently weak user, the bot will leave them with almost nothing on the board by the end of the game…

It feels like this would demoralize the weaker player. I’d rather have something that allowed the weaker player to make eyes, in fact forced them to by playing around the space. See some of the DDK games that Dwyrn plays, where he is kind to his opponent and forces them to play good moves by manipulation and peeping.

2 Likes

Your idea of playing forcing moves to indirectly teach is good for learning. The overplay bot is basically a use case for showcasing, in a visceral way, the power of AI to a beginning casual player who won’t be able to appreciate the true strength of a top bot anyway.

1 Like

Is the bot calculating a 50% win-rate assuming very strong follow-up play from the opponent? If so, won’t its moves still be consistently very strong against many players, since the bot is overestimating the strength of the responses?
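To make the question concrete, here is a minimal sketch of what "aiming for a 50% win-rate" could look like. Everything in it is an assumption for illustration: I don't know how YourRank actually picks moves, and `pick_balanced_move`, the candidate list, and the win rates are all hypothetical.

```python
def pick_balanced_move(candidates, target=0.5):
    """Return the move whose estimated win rate is closest to `target`.

    `candidates` is a list of (move, winrate) pairs, where winrate is the
    bot's own estimate of its chance to win after playing that move.
    Hypothetical helper, not taken from any real bot.
    """
    return min(candidates, key=lambda mw: abs(mw[1] - target))[0]


# Made-up evaluations for four candidate moves.
candidates = [("D4", 0.93), ("Q16", 0.71), ("K10", 0.52), ("A1", 0.18)]
print(pick_balanced_move(candidates))  # prints K10
```

If those win rates come from a search that assumes a very strong opponent, a move rated 52% here could still be far better than even against a weaker human, which is the concern above.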

1 Like

I’m having trouble with the “I know you want to save human lives and all, but I challenge you to finish what you started: entertaining me!” There’s an insensitivity (human lives are worth less than entertainment) and a hubris (your goal in all of this was to entertain me) implicit that are hard for me to see past.

You say, “Make it the most fun computer program to play for any human player. Then you will have done something useful for all humans.” Like all humans are Go players. Like saving human lives would NOT be useful for all humans.

I’m cringing imagining the AlphaGo folks reading this.

4 Likes

To be fair, the OP may have been slightly tongue in cheek.

As for Deepmind, they are by no means above reproach:

https://groups.google.com/forum/#!topic/fishcooking/ExSnY8xy7sY

4 Likes

That’s an interesting perspective. I’ll expand a bit further before letting you finish your thoughts.

First, realize that they started their AI experimentation with a game and they will stop when they stop. Theoretically, what they learned from the first version that beat Lee Sedol was probably enough. They could have stopped there, but they refined and continued. Why? There was more to explore and discover by optimizing the game client. They created not one more iteration but two more. Each time, the AI was well above even expert-level players, so only a small percentage of people were served by optimizing those last two iterations; the AI was basically only trying to beat itself.

That means they could have jumped away from Go at any point after the first set of games and started looking at other applications. How many street accidents will not be avoided, or cancers will be misdiagnosed, because the AI software wasn’t ready weeks sooner is a moral question I don’t think can be answered. However, it was clear that they really wanted to finish with Go. They did not want to leave it hanging before they moved on. This is called follow-through and is very important.

Knowing when you are actually doing a follow through correctly and when you are wasting time is also important.

Which brings me to the second point. I don’t think they finished. They created a wonderful AI that can play and defeat any human opponent, but the world doesn’t need that. The world needs help. People need help. And they have not, IMHO, finished what they started until they have created an AI that can help humans. Not beat humans. Humans are already scared enough that AI are going to eventually take over everything. Leaving Alpha Zero as a program that simply beats all humans is not a service to humans, it’s a service to the creators of Alpha Zero who got a lot of publicity from it.

That brings us to the third and final point, which is that to finish with the game of Go, the AI needs to be able to be played as if it were a teacher or friend to humans. Why? Because the ultimate goal of AI is to help human society. Every AI should have that as its starting concept. If it’s not aiding humans, then you have a moral dilemma… an automation that serves itself or, worse, only its creators.

This is where I’m admittedly asserting that the final step for Alpha Go Zero is to become something any human could use. A tool that benefits society as we slip into the next stage of human evolution and societal development. A role model for how AI is created. AI can be a tool for amazing destruction and perhaps that will happen and create a dystopia. Or it may end up being something amazing and help bring about a human utopia. It’s my strongest desire to see a “humans first” mentality and that this AI be finished to the point where someone who has an interest in Go can play it and it will play with them at their level.

I hope that makes my views clearer.

3 Likes

Gotta know when to tenuki.

3 Likes

You are probably right about the teaching humans part: that should definitely be a long term goal of AI.

However, I believe one of the main goals of Deepmind (correct me if I’m wrong) is an AI that learns to diagnose and treat medical patients with more accuracy than any human doctor.

Go served as a testing ground much more complex than chess, teaching a machine to decide which output to pick from an incredibly large output space, where each action has long-lasting consequences. After AlphaGo Lee, Deepmind decided that Go could still be used to test more efficient and versatile algorithms like AlphaGo Master and AlphaZero, especially AlphaZero, which learned three different games from scratch.

After Go, they have been testing with mazes, and I believe they announced that they will be heading into StarCraft. It is a strategy game that not only has an incredibly large output space and requires one to control many units in disparate ways, but also has a massive input space and introduces a “fog of war” where you have to guess at what the opponent is doing. That creates a dynamic that Deepmind’s AIs have yet to experience and that is essential to functioning in real life: dealing with unknown and dynamic values.

In short, teaching is a must, but Deepmind seems to have other and equally important goals.

3 Likes

Yes, I think you’ve summed up the situation quite well from the available information we have.

StarCraft would be an incredible feat, particularly because of the hidden information, what you referred to as the “fog of war”. Carnegie Mellon had a tournament of heads-up no-limit Texas Hold’em. This was an incredible challenge and the AI did remarkably well, adapting to the various strategies the humans attempted. Here again we see the recurring theme of man versus machine…

My suggestion, my plea, is that we change the narrative. It’s incredible to create an AI that can beat professional poker players, where full knowledge is not possible and the AI adapts to each player’s style of play to win. The techniques learned and tested from this experiment pushed the boundaries again. But it brought the idea of computer-aided cheating to no-limit poker. In the end we don’t want computers used to hurt society. We don’t want AI that humans use to cheat. The narrative needs to be one of hope and prosperity. What good is an AI that can beat every human in the world at Go? That story’s narrative has an ending that is negative for society.

We, as humans, need to reframe the narrative and create things that increase happiness and reduce despair in society. If people feel the game of Go is pointless because there’s an AI that can always win, that is not a good thing. However, with one more step you can create an AI that finishes what was started and moves the narrative to one where AI are the teachers of Go, perhaps shepherds helping us humans who understand it less well to play enjoyable games.

In short, I don’t think the story should end here, because this story is going to lead to dystopian stories. We deserve better. The last chapter of this story is not written yet.

3 Likes

I agree. This absolutely needs to happen.

But it won’t be Deepmind that does it.

Their focus is, and should be, on creating AIs that are more skilled than humans in saving lives.

It is the job of the rest of us to pick up in the teaching of humans to improve.

1 Like

Let’s agree to disagree on this point.

I don’t think they finished and they should finish because it will help them later when dealing with other moral dilemmas that come up with life and death questions.

Well they announced a “teaching tool” and all we got was this half-assed set of opening moves.

Deepmind did not deliver.

3 Likes

I’ll agree to disagree, but I do think it’s a bit odd that you favor the continuance of a research project over the intended focus on things such as medical advances.

Anyway, I believe that the community creating teaching AIs (like YourRank) is a wonderful thing. Or maybe if a business feels it can make money from making an AI to help people get better at Go (or teach people to get better at teaching people to get better at Go), that might be nice, too.

In the end, as long as what needs to get done gets done, I don’t care who does it.

Not only that, smurph: recall that Deepmind initially stated the Teaching Tool would accompany a partnership with Ke Jie, as the first in a series using the Tool to do analysis. Unless I missed something, I don’t think they made good on that part either. The whole “we have a series of gifts for the Go community” was basically 50 self-play games that they wanted to slow-drip on us by stating at first that only a set of 10 would be released each day for five days… And the Deepmind CEO stated that the match with Ke Jie was basically an exploration of the deeper truths of Go, man and machine (the future of Go together, etc.), and that the reason for holding the match was to see and know for sure how strong AG had gotten: they needed to play the strongest human (Ke Jie) in order to know that, because even though they had internal tests, they still didn’t know for sure against a real top pro, etc. To me all that rhetoric was hogwash. They already knew for sure, since it was like 1000 Elo above Ke Jie, and in addition, that was also what the “magic and master 60 games series” was all about. Plus, if they didn’t know “for sure”, how in the world would a THREE-game match (after which AG ‘retires’, never to be seen again) be statistically meaningful enough to settle all doubt or be some sort of scientific benchmark anyway?

2 Likes

It’s PR, plain and simple (maybe trying to attract donors and investors?)

I think deepmind should just release the network weights. Then the community can build great tools around it for teaching and learning. The community has now gone “well f**$ y$% then, we’ll just make it ourselves” with Leela Zero, which is great and quite cool, but should be unnecessary. Or if there is some really super-secret juice that makes it impossible, they could help the Leela Zero team with some tips and maybe some hours in the cloud to speed up the training.

For us mortal kyus and dans, it doesn’t really matter that there is an AI that is stronger than us. So what? There are a ton of professionals (and amateurs) that are stronger than us anyway. For the pros, I think it is super exciting. They love go and always strive for a deeper understanding of go. AIs are a new tool that can help with this and bring new ideas and insight into the go world.

1 Like