Which programming language should I use for developing an AI that plays Go?

Zuko_1 · May 26, 2021, 7:50pm

Hello, I love to play Go and I’m interested in developing an AI program that plays the game. I’m of course aware that there are already such programs, such as AlphaGo and AlphaGo Zero, but I want to experience building it from scratch. I want to develop a program like AlphaGo Zero in the sense that it will be self-taught and won’t rely on learning from games played by humans.

The thing is, I have zero skills in programming. I want to start learning it, but I don’t even know which languages are best suited for this task. I’ve read that Python is generally better than Java for machine learning, but that’s all I know.

Any kind of help will be appreciated. Thanks

Edit: Wow, thanks a lot for all the comprehensive and professional answers! That’s plenty of advice and resources, and I’ll start checking them out immediately.

Vsotvep · May 26, 2021, 8:00pm

Python is generally the language that’s used because it has a lot of great libraries for machine learning, and it’s definitely more friendly for beginners than Java.

I once followed a couple of machine learning tutorials on YouTube about creating a handwriting recogniser, which is basically the standard ‘first project’ in machine learning. I’d recommend starting with something like that before creating a Go AI, since I think handwriting is a bit more forgiving.

okonomichiyaki · May 26, 2021, 8:01pm

This is my recommendation for a beginner, partly because the book uses Python and will guide you through exactly what you want.

In fact, they use a handwriting example too, as @Vsotvep suggests.

Vsotvep · May 26, 2021, 8:09pm

Here’s the guy that made the tutorials I watched, it seems he made a new series that’s a bit more up to date:

Here’s the old series that I followed:

And he has a very in-depth series (perhaps a bit too in-depth; I don’t recommend this unless you’re very interested in machine learning) that I was planning to follow before I decided to continue with studying maths instead:

square_fuseki · May 26, 2021, 8:21pm

beginning to learn programming with creating AlphaGo is like beginning to play Go against AlphaGo

Vsotvep · May 26, 2021, 8:26pm

Well, I’m quite convinced you can get pretty deep into machine learning without really being that good at programming. I’ve seen enough examples among the AI students at my uni.

okonomichiyaki · May 26, 2021, 8:39pm

on the other hand it’s good to have an interesting project to motivate you! more interesting than Yet Another Web App To Share Photos or whatever

bugcat · May 26, 2021, 9:10pm

I know Python but I’d hate to program anything serious in it, because f*** semantic whitespace.

( see also My favourite (programming) language is clearly the best and all others are horrible because )

For that reason, I’d recommend at least checking out other scripting languages like Ruby and Perl before really committing to Python.

Python also has some odd (an uncharitable commentator might say “awkward” and “unintuitive”) features, especially in the way that it treats looping and arrays.

yebellz · May 26, 2021, 9:24pm

I’ll write a longer post later, but I just want to say that this is really bad advice:

The clearly stated objective and motivation of the original poster, @Zuko_1, is to get into machine learning, starting from scratch with zero programming experience.

Starting with Python and checking out PyTorch tutorials is a good idea. Looking at Ruby and Perl would largely be an unproductive tangent. Even if the argument is to get a broader perspective on different languages, I would rather recommend looking at some C/C++, in parallel with learning Python.

bugcat · May 26, 2021, 9:27pm

Starting with Python … is a good idea.

You’re welcome to state your opinion, but – for the reasons I stated (semantic whitespace, which I consider a negative feature; and strange treatment of loops and arrays) – it is not mine.

I do not recommend starting with Python, but that’s also only my own opinion.

okonomichiyaki · May 26, 2021, 9:43pm

Not looking to get into a programming language debate here, but I think yebellz’s point about libraries should be stressed. (Probably yebellz will elaborate.)

When I started reading the Deep Learning and the Game of Go book a while back, I tried to start messing around using JVM languages instead, as I’m more familiar with that platform. But I quickly realized that machine learning is not like other programming tasks, and the libraries do an enormous amount of heavy lifting (compared to databases, for instance, where you can basically plug any language into most databases without much trouble). The ML community seems to use Python more, so the experience will be much easier there. Regardless of significant whitespace, the more challenging thing is understanding the machine learning techniques…

bugcat · May 26, 2021, 9:46pm

The ML community seems to use Python more

Not ML? ^^

okonomichiyaki · May 26, 2021, 9:55pm

Hah, in fact I spent many years dabbling with Standard ML. But I think you’d have an even harder time reimplementing AlphaGo with one of those… although who knows, doesn’t the author of KataGo work for Jane Street (a heavy user of OCaml)?

shinuito · May 26, 2021, 10:20pm

I remember that DeepMind also had a GitHub repository for something called OpenSpiel, which seems to be a bunch of code for different games. I’ve never looked into it properly, but there are probably some things in there just for getting the different board games working.

Vsotvep · May 26, 2021, 11:04pm

If it’s about learning programming, fair point.

If it’s about learning machine learning, Python is without a doubt the way to go. Especially if you have no programming experience.

square_fuseki · May 26, 2021, 11:16pm

I’m used to C, and Java looks similar to C.
But Python is alien: when I tried to create a Go Text Protocol interface with it, I understood nothing and finally stopped.

yebellz · May 27, 2021, 1:34am

To answer the title of the thread (“Which programming language should I use for developing an AI that plays Go?”) succinctly, given that you currently have “zero skills in programming”, I highly recommend:

1. Start with learning Python.
2. At some point later (or possibly in parallel with learning Python), learn some C++.

But, now to say a few more words…

Python is the essential language for studying and applying machine learning these days. It has the best support for libraries and tools, and crucially, so much open-source AI work is done in Python. Thus, it is essential for learning from public code examples.

C++ is also very useful, since it is often the language of choice for applications (like Go AI) with demanding computational performance requirements. The strong, open-source Go AIs mainly use C++, with only some parts in Python.

Of course, in addition to learning how to program, one needs to learn how to use a machine learning library. For starting from scratch and just general usefulness for applying deep learning, I highly recommend PyTorch, since it has become the dominant deep learning platform. There are various tutorials online (such as the official ones here) as well as various YouTube videos and public code examples on GitHub.

Another useful platform is TensorFlow, which was dominant a few years ago, before PyTorch took over. While it is no longer the most popular choice for new projects, I mention it since the Leela Zero AI uses this platform, and there are lots of other older ML projects using it. However, I would prioritize learning PyTorch instead of TensorFlow.

Building a superhuman (or at least very strong) Go AI from scratch really entails two steps:

1. Implementing the software to train and execute the Go AI (this is where you learn to program and write the code).
2. Training the Go engine (this is where the program from step 1 teaches itself to play Go).

Step 1 is feasible for a self-taught individual working alone. There are a lot of resources to learn how to program and implement Go AI. The fundamental algorithms and concepts behind the modern superhuman AI are not too complicated, and the whole idea is to avoid having to manually program in Go knowledge, but the tricky part lies in implementing things to operate efficiently and at the requisite computational scale.

Step 2 is where things become difficult, not from the conceptual perspective, but purely from the computational angle. Training a superhuman Go AI from scratch simply takes an enormous amount of computational resources. Previous successful efforts required either corporate funding to provide the necessary compute power or leveraged large-scale, distributed computation provided by a community (e.g., Leela Zero).

Other learning resources

For learning about the fundamentals of deep learning and reinforcement learning, both key concepts behind modern superhuman Go AI, here are two free e-books that I highly recommend:

Goodfellow, Bengio, and Courville’s Deep Learning book
Sutton & Barto Book: Reinforcement Learning: An Introduction (even briefly discusses AlphaGo and AlphaGo Zero)

Another book that specifically teaches how to program Go AI was recommended by @okonomichiyaki above, but I can’t say much about that one, since I have only skimmed a small part of it in the preview. It seems that it might be helpful.

It is also possible to learn a lot from other open-source Go AI software:

KataGo: state of the art among the superhuman and open-source Go engines, with a lot of cool features and fundamental advances in improving training efficiency.
Leela Zero: the first open-source Go AI to achieve superhuman performance with a distributed computation effort from the Go community.
Pachi: reasonably strong (amateur dan level) Go AI project that began in the pre-AlphaGo era, originally using just Monte Carlo Tree Search (MCTS), but now incorporates deep learning enhancements.
Michi: a relatively weak bot that does not incorporate any deep learning, but I’m noting it here, since it is a minimalist implementation employing MCTS and entirely written in Python. Looking at this code would be a good learning experience to get some familiarity with Python and the basics of how a Go engine works.
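To give a flavour of what the MCTS mentioned above involves, here is a minimal sketch of the UCB1 selection rule that MCTS implementations commonly use to decide which move to explore next. It is not taken from Pachi, Michi, or any other engine listed here; the `Node` class, the function names, and the exploration constant `c = 1.4` are assumptions made up for illustration, and a real engine would also track whose turn it is when counting wins.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One position in the search tree (hypothetical structure)."""
    visits: int = 0
    wins: float = 0.0
    children: list = field(default_factory=list)


def ucb1_score(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Average result plus an exploration bonus that shrinks with visits."""
    if child.visits == 0:
        return math.inf  # always try unvisited moves first
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1_score(ch, parent.visits))


root = Node(visits=30, children=[
    Node(visits=20, wins=12),  # well explored, 60% wins
    Node(visits=10, wins=7),   # less explored, 70% wins
    Node(),                    # never tried
])
print(select_child(root) is root.children[2])  # the untried move goes first
```

The full algorithm repeats this selection down the tree, plays out a game from the leaf, and then updates `visits` and `wins` on the way back up.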

Responses to other people

I think this is definitely true, since there is a significant distinction between using modern machine learning techniques as a tool versus becoming a good programmer. A lot of the work is taken care of by various libraries and tools, and for implementing a lot of basic things, it is often possible to start with some existing example code, mostly follow their pattern, and just modify parts to apply to the specific data and task at hand. Of course, being good at programming does help immensely with accomplishing this, but one does not need to do too much from scratch.

First, it is always great to have such ambitions that motivate one to learn. When someone expresses such an ambition, we should not put them down with such overly discouraging and pessimistic analogies.

Second, it is entirely feasible for one to learn how to program and actually implement something like AlphaGo. The software is not conceptually too difficult. The only really challenging part is finding the computational resources to train it up to superhuman levels starting from scratch.

To be honest, when I first learned this about Python a long time ago, I thought of it as a negative feature as well. However, later on, as I started to use Python more and more, I eventually came to view it as a very positive feature. Using whitespace to delineate blocks removes the clutter of brackets, but what I like most about it is that it forces one to be organized and tidy with whitespace, so the nesting of your blocks matches their visual formatting, rather than the placement (or potential misplacement) of brackets.

I’m not entirely sure what you mean by this. I actually think that Python has some of the best built-in looping and array semantics, since it natively provides the iterator design pattern. The syntax might seem a bit unfamiliar to those coming from experience with traditional C-style for loops, but I much prefer the Pythonic way of doing things to the traditional styles of looping from other languages. For example,

Python

for item in list_of_things:
    apply_operation(item)

for _ in range(number_of_iterations):
    do_something()

C-like

for (int i = 0; i < length_of_list; i++) {
    apply_operation(list_of_things[i]);
}

for (int i = 0; i < number_of_iterations; i++) {
    do_something();
}
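To make the point about the iterator pattern a bit more concrete, here is a small sketch. The `Board` class and the move lists are hypothetical names invented for this example; `enumerate` and `zip` are Python built-ins.

```python
class Board:
    """A toy square board; any object with __iter__ works in a for loop."""

    def __init__(self, size):
        self.size = size

    def __iter__(self):
        # Yield every (row, column) coordinate, one at a time.
        for row in range(self.size):
            for col in range(self.size):
                yield (row, col)


# The same for-loop syntax works on a custom object ...
points = [point for point in Board(2)]
print(points)  # [(0, 0), (0, 1), (1, 0), (1, 1)]

# ... and the built-ins cover the cases where C-style code needs an index.
moves = ["D4", "Q16", "C3"]
for number, move in enumerate(moves, start=1):
    print(number, move)

for move, colour in zip(moves, ["black", "white", "black"]):
    print(colour, "plays", move)
```

The loop body never touches an index variable, so there is no off-by-one bound to get wrong.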

MrEntropy · May 27, 2021, 3:14am

Patience Grasshopper! You must learn to walk before you can run.

PDgo · May 29, 2021, 5:26am

OP claimed “zero” programming skills so far. As someone else mentioned, one “has to be able to walk before they can run.” On a similar path, I programmed a neural network to play Backgammon while I was taking a graduate course on AI. I think it worked pretty well, all things considered. I took some pride in how quickly it learned to “hit/attack”!

As a stepping stone, consider writing ANY program that has a graphical user interface (GUI). Personally, I like Java for this (aren’t we all biased?), but it comes with its own hurdles. Learning an object-oriented language could easily occupy you for several months, and you would still be a relative beginner (which, for one thing, means it takes considerably longer to find your errors!)

As a second lesson, without using any programming language, describe in plain words (i.e. provide an algorithm) how you would determine the score of a game in its finished state. How do you locate a “dame” point? I’m just trying to illustrate for you what a “world class” problem this is. This problem intimidates me quite a bit, and I have a great deal more experience than the OP.
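For comparison once you have tried the exercise in words, here is one heavily simplified way the idea can be sketched in Python. It assumes area scoring, assumes all dead stones have already been removed, ignores komi and seki, and treats a “dame” point simply as an empty point in a region that touches both colours. The function name and the board encoding (`X`, `O`, `.`) are invented for this example. The genuinely hard parts, such as deciding which stones are dead, are exactly what it leaves out.

```python
def score_finished_board(board):
    """Area-score a finished position given as a list of equal-length strings.

    'X' is black, 'O' is white, '.' is empty. Assumes dead stones have
    already been removed. Returns (black_points, white_points, neutral_coords).
    """
    rows, cols = len(board), len(board[0])
    seen = set()
    black = sum(row.count("X") for row in board)
    white = sum(row.count("O") for row in board)
    neutral = []

    for r in range(rows):
        for c in range(cols):
            if board[r][c] != "." or (r, c) in seen:
                continue
            # Flood-fill one connected empty region and note which
            # stone colours touch it.
            region, borders, stack = [], set(), [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if board[ny][nx] == ".":
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                    else:
                        borders.add(board[ny][nx])
            if borders == {"X"}:
                black += len(region)
            elif borders == {"O"}:
                white += len(region)
            else:
                neutral.extend(region)  # touches both colours (or none)
    return black, white, sorted(neutral)


position = [
    ".X.O.",
    ".X.O.",
    ".XXO.",
    ".X.O.",
    ".X.O.",
]
print(score_finished_board(position))
# (11, 10, [(0, 2), (1, 2), (3, 2), (4, 2)])
```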

As a first problem, I suggest 2 player tic-tac-toe (either with all text, or with a GUI), preferably modeled using an array. I think this would be plenty challenging, and there IS a reasonable chance for success. You should do it first, WITHOUT using a neural network, and see how that goes! Then you can add a neural network if you wish after that!
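As a rough idea of what “modeled using an array” could look like, here is a minimal text-only sketch in Python with no AI in it at all. The function names and the flat nine-cell list are just one possible design, not the only way to do it.

```python
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def new_board():
    """The board is a flat list of nine cells, each ' ', 'X' or 'O'."""
    return [" "] * 9


def place(board, index, player):
    """Put player's mark on an empty cell; return False if the cell is taken."""
    if board[index] != " ":
        return False
    board[index] = player
    return True


def winner(board):
    """Return 'X' or 'O' if someone has three in a row, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def render(board):
    """Format the board as three lines of text."""
    rows = [" | ".join(board[i:i + 3]) for i in (0, 3, 6)]
    return "\n---------\n".join(rows)


board = new_board()
for index, player in [(4, "X"), (0, "O"), (2, "X"), (1, "O"), (6, "X")]:
    place(board, index, player)
print(render(board))
print("Winner:", winner(board))  # Winner: X
```

A natural next step would be replacing the hard-coded move list with moves typed in by two players.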

So, I guess to start, one should get a book or two on programming languages, and write a few programs. Learn especially about multi-dimensional arrays (matrices in math). When I was in junior high, I wrote a program in BASIC to play Blackjack. Later in life, I created an object-oriented version, which was a lot fancier, mainly for fun and to give me practice at creating GUIs in Java.

Good Luck & Have Fun!
-from someone with more than 15 years’ experience teaching programming and software engineering

Valhall · May 29, 2021, 7:54am

Python is very popular because it is a beginner friendly interpreted scripting language. Therefore it is also extremely slow. The interpreter first has to parse the source files. “Beginner friendly” in Python means: “it is easy to copy and paste code from search engines and Q&A sites (such as SO) without understanding what the code does,” which is not how you should go about programming. In order to understand how to write a Python program, you have to know as much programming as you need in order to write a Java or C program. You actually need to understand everything, and when you do, Python is just a useless toy language (which is slow and ineffective because they removed useful syntactic sugar like the postfix and prefix integral operators).

Java is compiled to byte code (.class files) which is then JIT-compiled into machine code, so it will always be faster than Python. However, Java has always had a reputation for being slow and bloated. Java runs in a virtual machine which you must install on your computer (or container). Although Java pretends to be “portable” by using byte code, it is actually less portable than C because it is an extremely large task to implement Java on a new platform. Statements such as this one made by Oracle are just a pack of enterprise lies / advertising:

This means that the class files (which are compiled from Java source code) are further compiled at runtime, and they can be turned into very highly optimized machine code. This optimized code runs extremely fast—usually as fast as (and, in certain cases, faster than) compiled C/C++ code.

I would probably write it in C. C is compiled into machine code and you have much control over memory usage and performance. The main difference between C and languages with garbage collection (such as Python and Java) is that in C you have to do memory allocation / deallocation and handle pointers explicitly, which is error prone if you don’t understand what you are doing.

But you need to understand these things anyway. You want this thing to be fast, so write it in C.

C++ is like an extremely complex version of C (C is very simple) which doesn’t really add anything useful to the language.

Writing it in assembly is probably not very fun and will not be portable, so I don’t consider that to be a serious alternative. Newer versions of Fortran may be, though. Fortran has always been faster than everything else (at least F77 and F90, maybe they made the newer versions bloated).

I’d also start with writing a simple program that can read input from the console and write output. Allocate some arrays. Connect to a database and learn some SQL (your AI will need to store data somewhere); I would recommend SQLite or PostgreSQL.
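As a rough sketch of that kind of starter exercise, here is a small program that reads moves from a text stream and stores them in SQLite. It is written in Python only to match the other snippets in this thread, using the `sqlite3` module from the standard library. The table layout and the function names are assumptions made up for illustration.

```python
import io
import sqlite3


def read_moves(stream):
    """Read one move per line from a text stream, skipping blank lines."""
    return [line.strip() for line in stream if line.strip()]


def save_game(conn, moves):
    """Store the moves of one game and return the new game's id."""
    conn.execute("CREATE TABLE IF NOT EXISTS games (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS moves ("
        "game_id INTEGER, move_number INTEGER, move TEXT)"
    )
    game_id = conn.execute("INSERT INTO games DEFAULT VALUES").lastrowid
    conn.executemany(
        "INSERT INTO moves VALUES (?, ?, ?)",
        [(game_id, n, move) for n, move in enumerate(moves, start=1)],
    )
    conn.commit()
    return game_id


# Pass sys.stdin instead of StringIO to read from the console.
typed = io.StringIO("D4\nQ16\n\nC3\n")
conn = sqlite3.connect(":memory:")  # use a file path to keep the data
game_id = save_game(conn, read_moves(typed))

rows = conn.execute(
    "SELECT move_number, move FROM moves WHERE game_id = ? ORDER BY move_number",
    (game_id,),
).fetchall()
print(rows)  # [(1, 'D4'), (2, 'Q16'), (3, 'C3')]
```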