Interested in measuring your Go Strength?

I recently started a project, go-strength-analyser, which aims to measure a player's Go strength by using an AI engine to analyse the games they have played.

Some help I need:

Ideas for improving the formula
Analysing more games

Any ideas, thoughts or feedback?

7 Likes

Is there a way you can scrape all ranked 19x19 OGS games of a player, rather than asking the player to enter their sgf one by one?
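A rough sketch of how such a scrape could be structured. The endpoint paths and the JSON field names (`ranked`, `width`, `height`) are assumptions about the OGS REST API and should be checked against its documentation; the helper names are made up for illustration, and the actual HTTP fetching is left out.

```python
from typing import Iterable, List

# Assumed base URL of the OGS REST API (verify against the official docs).
OGS_API = "https://online-go.com/api/v1"


def player_games_url(player_id: int, page: int = 1) -> str:
    """URL of one page of a player's game list (assumed endpoint)."""
    return f"{OGS_API}/players/{player_id}/games/?page={page}"


def sgf_url(game_id: int) -> str:
    """URL of the SGF download for a single game (assumed endpoint)."""
    return f"{OGS_API}/games/{game_id}/sgf"


def ranked_19x19(games: Iterable[dict]) -> List[dict]:
    """Keep only ranked games played on a 19x19 board.

    Each game is a dict decoded from the game-list JSON; the keys
    used here are assumptions about that JSON.
    """
    return [
        g for g in games
        if g.get("ranked") and g.get("width") == 19 and g.get("height") == 19
    ]
```

With something like this, the player would only need to enter a user id; the tool would page through the game list, filter it, and download each remaining SGF.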

2 Likes

Yes, thanks for your suggestion, I updated the issue. Actually, I need to find players with different ranks.
Then I can filter for the games I need.

2 Likes

You have some negative scores in your data. Does that mean these players on average played moves worse than passing?

Also note that games from go servers are usually public.

Most notably, the first version of AlphaGo was trained using games from the KGS archive. DeepMind did not contact each player personally to ask whether they were okay with their games being used this way.

There is no legal precedent in go that I am aware of, but there are legal precedents in chess. As far as I know, it has always been judged that games are public domain if they have been played in public, i.e., during a public tournament or on an internet server with a public database.

So, you could just use the OGS database of games directly.

2 Likes

I originally designed the range to be 0-10000, but found that most scores are larger than 5000.
So I changed it to range from -10000 to 10000.
In addition, even if you play worse than a pass or better than the AI, the score still stays between -10000 and 10000.

Actually my difficulties are:

  1. finding players with different ranks.
  2. speeding up the analysis (I can only do around 10 games every)

Since you mentioned the OGS database, is there any documentation on how to access or query it, so that I can find players by rank?

Thanks for clarifying. This gave me an idea: I’ve heard people claim that 30 kyus basically play randomly. Personally I doubt that a lot, but I can’t back up my belief with any evidence. Once you have a somewhat stable relationship between your ratings and OGS ranks, could you run it against a few games of this random player: Random Bot (Happy to play a few games against it if no reasonable games can be found.)

1 Like

Not the most clever method, but this tournament, Through the Years: Long Correspondence, has more than 2000 players of various ranks, all listed on the tournament page.

1 Like

Interesting idea to see what Random Bot’s game will score.

Nice idea!

I hope an AI finds this easier than random human Go players do, since we have enough evidence that humans are rather bad at guessing the rank of players:

2 Likes

I think a few reasons why it won’t be very precise can be found quickly. I have these:

  1. Games with heavy fighting will give worse scores than calm ones, because in heavy fighting huge mistakes are common, while in calm games it’s hard to make mistakes greater than 10 or 12 points. Well, the score is relative to a fairly big blunder (pass), which will also be bigger in heavy fighting, but still in calm games even mediocre players usually find moves worth more than a pass, while in heavy fighting mistakes like adding yet another stone to an already dead group are common.

  2. One will probably get better scores against stronger players. Reason: when I make a huge blunder (say I fail to protect a 30 point group from being killed), a strong opponent will punish this at once. A weak opponent will make 2 point endgame moves instead, so I can repeat my blunder on the next move. And again. And again. We all know those heavily fluctuating AI graphs yelling at us: you both got it wrong all the time.

I still like the idea of having such a score and relating it to ranks. Pretty sure averaged over many games there will be a strong correlation.
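To check that correlation once enough games are analysed, something like the following could work. It is only a sketch: the numeric rank scale (1k = -1, 1d = 0) and both function names are assumptions made for this example, not part of the project.

```python
import math
from typing import Sequence


def rank_to_number(rank: str) -> int:
    """Map a rank string to a gap-free integer scale (assumed convention).

    '30k' -> -30, '1k' -> -1, '1d' -> 0, '9d' -> 8
    """
    value, kind = int(rank[:-1]), rank[-1].lower()
    if kind == "k":
        return -value
    if kind == "d":
        return value - 1
    raise ValueError(f"unrecognised rank: {rank!r}")


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Feeding it one (numeric rank, average score) pair per player would give a single number to track while the formula is being tuned; a value near 1 would support the idea that the score follows rank.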

2 Likes

Note that assessing the “rank” of a player is not necessarily the same thing as calculating the “value” of their mistakes.

Neural networks are good at putting objects into classes. Here, the classes are ranks of human players. “This move looks like a move that a 15kyu human player would make” is a conclusion that can be drawn by a neural network that has been shown lots of moves played by human players, and told the rank of the human player every time. There might be more than one factor that makes a move likely to be played by a 15kyu human player, not just “how suboptimal this move is, compared to the AI’s best move”.

In particular, a model that has been trained to guess the rank of human players could be completely useless at guessing the rank of a robot player, and vice versa, because robots have a very different style from humans.

1 Like

But the OP is not building a classifier, just using KataGo to find the value of the mistakes.

Oh. Well then please disregard my previous message.

1 Like

You can download 27 million games in one shot. Look in the forum for directions.
From those games you could get not only full SGF but also OGS rank for both players.
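If each game is available as SGF text (an assumption about the download format), the ranks can be read from the standard SGF root properties `BR` and `WR`. A minimal sketch using a regular expression rather than a full SGF parser; `sgf_header` is a name made up for this example:

```python
import re

# One SGF property with its first value, e.g. BR[5k]; handles escaped "\]".
_PROP = re.compile(r"(?<![A-Z])([A-Z]{1,2})\[((?:\\.|[^\]\\])*)\]")

# Standard SGF properties: player names, ranks, board size, result.
_WANTED = {"PB", "PW", "BR", "WR", "SZ", "RE"}


def sgf_header(sgf: str) -> dict:
    """Extract a few root properties from SGF text.

    Only the first occurrence of each property is kept, which is
    where the game info normally lives.
    """
    header = {}
    for key, value in _PROP.findall(sgf):
        if key in _WANTED and key not in header:
            header[key] = value
    return header
```

Filtering on `SZ` and grouping by `BR`/`WR` would then give a pool of 19x19 games for each rank without asking any player for files.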

3 Likes

Probably not accurate: black's GSS is 8115.76 (close to the score for 1d, 3d, 6d) and white's GSS is 6267.26 (close to 5k-7k).

But if the purpose is to measure go strength overall, it seems like a classifier as @ArsenLapin1 describes would work better than finding the value of mistakes.

Definitely.

Actually, what I’m trying to achieve is to propose an absolute measurement of Go strength, based on where each move falls between the worst (pass) and the best (the AI's move). I compare the score with the rank because I want to see how my implementation performs, e.g. I expect a higher rank to get a higher score.
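The idea of placing each move between pass and the AI's best move can be sketched roughly as below. This is not the project's actual formula, only a linear interpolation under stated assumptions: the three inputs are the engine's evaluations (for example the score lead from the mover's point of view) after the played move, after the AI's best move, and after a pass, and the result is clamped to the -10000 to 10000 range mentioned earlier. All names are hypothetical.

```python
from statistics import mean
from typing import Iterable, Optional, Tuple

SCALE = 10000.0  # pass maps to 0, the AI's best move maps to SCALE


def move_score(played: float, best: float, passed: float) -> Optional[float]:
    """Score one move relative to pass (0) and the AI's best move (SCALE).

    Returns None when pass is at least as good as the best move,
    because there is then no range to measure against.
    """
    span = best - passed
    if span <= 0:
        return None
    raw = SCALE * (played - passed) / span
    # Clamp so that moves worse than pass or better than the AI stay in range.
    return max(-SCALE, min(SCALE, raw))


def game_score(moves: Iterable[Tuple[float, float, float]]) -> float:
    """Average the per-move scores of one player, skipping unmeasurable moves."""
    scores = [move_score(p, b, q) for p, b, q in moves]
    return mean(s for s in scores if s is not None)
```

A plain average treats every move equally, so a single large blunder in a fight weighs no more than a small endgame slip; weighting by the size of `best - passed` would be one alternative to try.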

Of course, implementing a good measurement is not easy, but it would be useful if we had one. Some applications I can think of:

  • Comparing rankings between different systems, like OGS vs Tygem.
  • Supplementing a ranking system, e.g. detecting people who use fake accounts to play dummy games to bump their rank.
  • Comparing the strength of famous Go players in history, or more generally comparing two players' Go strength without them actually playing a game.
  • Measuring how time settings affect player performance. We believe players play worse with less time, but by how much?
2 Likes