DeepMind takes on StarCraft


#1

Livestreaming now:


#2

Oh look, a bot that micros perfectly. Shocking. :slight_smile:

But yea, who expected anything else?


#3

They limit its actions per minute, so it can’t exactly do what happens in your link, but I guess it still manages to click exactly where it wants, when it wants. It would be interesting to introduce some uncertainty there: make it click less precisely, or add some jitter to when the click registers.

That would force it to play even more strategically.
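
A rough sketch of what that kind of uncertainty could look like, in Python. Everything here is made up for illustration (the `noisy_click` name, the Gaussian jitter on position, the exponential delay, the default values); none of it is something DeepMind has described.

```python
import random

def noisy_click(x, y, t_ms, rng, sigma_px=3.0, mean_delay_ms=50.0):
    """Return a perturbed (x, y, t_ms) for an intended click.

    Hypothetical noise model (an assumption, not AlphaStar's interface):
    - position gets Gaussian jitter with standard deviation sigma_px
    - the click registers after a random exponential delay
    """
    jitter_x = rng.gauss(0.0, sigma_px) if sigma_px > 0 else 0.0
    jitter_y = rng.gauss(0.0, sigma_px) if sigma_px > 0 else 0.0
    delay = rng.expovariate(1.0 / mean_delay_ms) if mean_delay_ms > 0 else 0.0
    return x + jitter_x, y + jitter_y, t_ms + delay
```

With `sigma_px=0` and `mean_delay_ms=0` the click comes back unchanged, so the same wrapper can be dialed from "perfect bot" up to something closer to a shaky human hand.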


#4

In other news, Artosis has actually become the pylon.

New challenge to the DeepMind team: AlphaStar Zero. No specialized inputs other than the pixels on the screen (which it is still required to move around on its own). :sunglasses:


#5

Let’s also give it complete control over mouse and keyboard output :slight_smile:

And then film the screen through another camera. You know what, let’s put that camera on a slightly moving track. How about giving it 200ms lag?

Humans are pretty amazing, still…


#6

Input in this case means “data to be processed”. Mouse and keyboard do not provide data to be processed.


#7

there, fixed it :slight_smile:


#8

Well, the bot already issues commands. Human actions need to be digitized in order to establish communication with a virtual system. The bot is already a virtual system in its own right, so as long as it interacts with the target system, there’s no issue here. However, we can still impose limitations on that communication, of course, as was done by limiting the APM, requiring it to use the camera etc.

It makes sense to tune those limits according to what the bot is supposed to accomplish. If it’s merely “superhuman speed”, that is rather trivial (see automaton2000). If it’s decision making, it seems sensible to restrict the bot’s “perception” and “mechanics” to, say, the upper limit of human capabilities.


#9

Especially since those kinds of limitations are part of non-virtual things, so if we ever want to have a robot butler, we need these kinds of things.

Robot butlers of course being the ultimate goal of AI.


#10

Myea but that stuff is Boston Dynamics’ métier.


#11

Well, I don’t really know anything about high-level StarCraft play, so it’s not really my place to judge.

Nevertheless, despite whatever inherent advantages were given to the bot, it seems that the pros and commentators were quite amazed by the results.

Maybe a Go forum isn’t really the best place for this. There seems to be better informed discussion in these reddit threads:


#12

As long as these butlers are also willing to have sex, that is…


#13

The approach, and play style, of the SCII AI reminds me a lot of the Chess/Go AIs. Namely, it doesn’t rely on “traditional” computer advantages, such as absurd APM, but instead relies more on good whole-map awareness, a good balance between micro and macro, and never letting up in terms of pressure and activity on any front. Right now, it seems like they’re at the level of “AlphaGo Master”, but for StarCraft, and I look forward to seeing where they can go from there. I have no doubt that, in a couple of months’ time, they’ll be at or past the level of top pros, if limited to PvP and a single map. What I’m looking forward to, rather than an AI that’s hobbled in some way, is the development of an AI that can play a random race on an unfamiliar map, and still play at a pro level.

The most interesting thing for me, though, was how they selected and propagated their top networks, in that they trained different NN lineages to focus on different strategies, and then pitted them against one another to learn from each other, without actually having crossover in terms of weights. That’s something I’ve wondered about for a while with the various Leela projects, namely LeelaZero and LCZ. Rather than having a single “best” bot, and basing the next generation off of variations on that bot, why not have bots emphasizing different play styles? The naive, human-based way to do this would be to explicitly have tracks that favor territory, fighting, and influence, for example, as well as a vanilla AI one with no such preferences. The more interesting way might be to select for AIs whose positional evaluations differ by as much as possible, while still maintaining high win rates against the entire AI population, in order to drive a wider “front” of exploration across the neural net design space. Different populations could cross over, split, and go extinct, so long as the front remained broad and the individual populations grew in strength.
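
To make the "select for evaluations that differ as much as possible" idea a bit more concrete, here is a toy sketch in Python. It is only my own illustration: the `select_diverse` name, the win-rate cutoff, and the greedy farthest-point rule are all assumptions, and it is not how DeepMind or the Leela projects actually pick networks.

```python
def select_diverse(agents, win_rate, evals, k, min_win_rate=0.5):
    """Greedily pick up to k strong agents whose evaluations disagree the most.

    agents:   list of agent names
    win_rate: dict name -> win rate against the whole population
    evals:    dict name -> list of evaluations on a shared set of positions
    All of these inputs are hypothetical; they stand in for real match data.
    """
    def distance(a, b):
        # Mean absolute difference between two agents' position evaluations.
        return sum(abs(p - q) for p, q in zip(evals[a], evals[b])) / len(evals[a])

    # Keep only agents that are strong enough against the population.
    strong = [a for a in agents if win_rate[a] >= min_win_rate]
    if not strong:
        return []

    # Start from the strongest agent, then repeatedly add the agent whose
    # nearest already-chosen neighbour is farthest away.
    chosen = [max(strong, key=lambda a: win_rate[a])]
    while len(chosen) < min(k, len(strong)):
        rest = [a for a in strong if a not in chosen]
        chosen.append(max(rest, key=lambda a: min(distance(a, c) for c in chosen)))
    return chosen
```

The win-rate filter keeps the "front" from filling up with weak but weird nets, and the farthest-point step is what pushes the survivors apart in evaluation space.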


#14

In fifty years’ time, when the next generation asks me what the crowning achievement of computing was before the robots seceded from world society to form their own republic on the Moon, I’m going to tell them: we made a purpose-built AI to control a horde of virtual raptor-aliens in order to assail equally virtual lines of defensive tanks. Just look at them dispersing and regrouping, dodging the explosions with speed and grace. It’s beautiful.


#15

The APM argument is relatively bs imo.

For AlphaStar, 200 APM is 200 actual actions per minute, while for human players, if you look at their mouse clicks, probably 99% of that 500 APM is mindlessly left-clicking empty ground or spamming 12121212.

But still, it’s astounding how fast AI evolves these days. Probably marrying a real woman was a hasty mistake after all (please don’t tell my wife)


#16

That’s why the eAPM metric was developed.
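
Replay tools each define eAPM a bit differently, so take this only as a toy version in Python. The rule used here (an action is "ineffective" if the same command was already issued within the last second) and the `apm_and_eapm` name are my own assumptions, just to show how spam gets filtered out.

```python
def apm_and_eapm(actions, duration_s, repeat_window_s=1.0):
    """Return (apm, eapm) for a list of (time_s, command) sorted by time.

    Assumed rule, not any tool's official definition: an action is counted
    as effective only if the same command was not issued within the
    previous repeat_window_s seconds.
    """
    last_seen = {}   # command -> time it was last issued
    effective = 0
    for t, cmd in actions:
        prev = last_seen.get(cmd)
        if prev is None or t - prev > repeat_window_s:
            effective += 1
        last_seen[cmd] = t   # spam keeps refreshing the window
    minutes = duration_s / 60.0
    return len(actions) / minutes, effective / minutes
```

Feed it a second or two of 1212 spam followed by a couple of real commands and the raw APM counts everything, while the eAPM only counts the first tap of each key plus the real commands.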


#17

I don’t think we know this, yet. For all we know, the AlphaStar nets could be generating 30% random action noise.


#18

That would kind of be like AlphaGo placing random stones 30% of the time… I don’t believe that after 200 human years of playing StarCraft, the computer is wasting its actions at times when it could put them to good use.