Strength of OGS players

In the past we had unique ranks for all combinations of board size and time control, but we got complaints about that too. Being strong in one and then playing in a different category was perceived as a form of sandbagging.

2 Likes

The OP’s question is directly related to OGS ranks and ranks being weird is the explanation. I don’t see how it is derailing from the topic.

Checking the OP’s games is not enough to see why he had hard games. I don’t think anyone has claimed that him playing bots is the reason for his rank. As I mentioned, it is him running into certain types of opponents (people who play bots, people with curvy graphs, blitz throwers, or even people who play those people). There is just a big gap between the strength of blitzers and correspondence players who don’t use support. A 4 kyu 3-second blitzer is roughly equal to a 10 kyu who plays mostly correspondence and doesn’t use support. That is not providing even matchups, not consistently anyway. And if you don’t know how to avoid the smooth-sandbagging crowd, you will be ranked lower than your actual strength.

OGS allows cowards to lower their rank and proud people to airbag theirs, as easily as they like. Sure, joining them if you can’t beat them is an easy option, but I personally like that some of the many people who notice the weirdness in ranks are speaking out.

Well, look at OP’s opponents, tell us what you think about them.

Haha, you’re giving me hope here.

The root cause of the difference in ranks for the same player on different servers is the result-based ranking system. Flattened ranks occur when you play on a server with many stronger opponents. This problem can be solved with a new ranking system called ‘performance’ (%). Your performance is 100% if you are as strong as AlphaGo Zero (AGZ), and 50% if your performance is half of AGZ’s. With this move quality-based system, you can compare the performance of Go players across countries and tournaments, or throughout the history of Go.
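The metric is not fully specified, but on one literal reading it is just the player’s move strength expressed as a percentage of AGZ’s. A minimal sketch of that reading, with entirely hypothetical strength scores as inputs:

```python
def performance_pct(player_strength: float, agz_strength: float) -> float:
    """Hypothetical 'performance' metric from the post above: the player's
    move strength as a percentage of the AGZ reference strength."""
    if agz_strength <= 0:
        raise ValueError("reference strength must be positive")
    return 100.0 * player_strength / agz_strength

# A player half as strong as the reference scores 50%.
print(performance_pct(1.0, 2.0))  # → 50.0
```

How `player_strength` itself would be measured is exactly the open question debated later in the thread.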

1 Like

Who is this AlphaGo Zero and why can he still play when torn in half?


More seriously, AlphaGo Zero was never released, so we cannot let users play against it. Also how do I determine the relative performance?

4 Likes

SPA requires a paradigm shift in Go ranking. Although AlphaGo Zero (AGZ) has never been released, the game records of AGZ are widely available. Using SPA, you can assess the quality of the moves played by AGZ and calculate AGZ’s performance (p-value) based on the strength of its moves. Then take the p-value for AGZ as 100% and determine your performance in the best game you played. SPA stands for ‘Superhuman Performance Analysis’, a new method for ranking Go players and determining who is actually the strongest player in the history of Go. (He is Cho Chikun; neither Shusaku nor Dosaku is the strongest human player in the history of Go.)

4 Questions

  1. is there a paper about SPA
  2. if I understand you right, only 1 of my games is used to calculate my strength
  3. if I understand you right, this is a test about how similar my playing style is to AlphaGo Zero
  4. do you mean this p-value? https://en.m.wikipedia.org/wiki/P-value
4 Likes
  1. There is a book, not a paper.
  2. One best game is more than enough, because SPA deals with the moves in the peak game, not the game results. If you want to assess your current performance, use the last game you played.
  3. No. Not at all. It is a test of how strong your moves are. You can play your own style and get the same performance as AGZ if your moves and AGZ’s moves are equally strong. You can get a greater performance value if you played differently but better than AGZ.
  4. No. I mean p-value in a new domain, not in statistical one. Here p stands for performance (%), not probability (Dmnl).

I’m not convinced.

The best™ game is a statistical outlier and not representative of the player’s strength. Additionally, I don’t see how using the moves of one game, without comparing them to a reference (AGZ), leads to any estimate.
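The outlier point can be illustrated with a toy simulation (all numbers invented): if per-game performance fluctuates around a true mean, the single best game out of many sits well above that mean, so scoring only the peak game systematically overstates typical strength.

```python
import random

random.seed(0)

# Simulate a player whose true per-game performance averages 60 (arbitrary
# units) with game-to-game noise. The maximum of many noisy games is an
# extreme-value statistic, not an estimate of typical strength.
true_mean, noise = 60.0, 8.0
games = [random.gauss(true_mean, noise) for _ in range(100)]

average = sum(games) / len(games)
best = max(games)
print(f"average over 100 games: {average:.1f}")
print(f"single best game:       {best:.1f}")
```

The gap between the two printed numbers is the bias a best-game-only method would build in.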

As an additional thought, we want to compare the strength of different AIs as well. What happens to the performance value when we feed the algorithm a game by a player stronger than the reference player (AGZ)?

4 Likes

I don’t want to convince you either. You need to understand the fundamental philosophy of SPA before being able to think about it. I gained many insights from SPA. For example, the Shusaku fuseki is almost, but not yet, perfect. Controlled sample games had plausible p-values: the p-value for AGZ is a little greater than the p-value for AGM and far greater than the p-value for Michael Redmond.

I’m struggling with how to perform superhuman performance analysis. In your self-authored 37-page book, superhuman performance analysis "can be conducted in five steps:

  1. Top Professionals and Benchmarks Selection
  2. Performance Analysis
  3. Strongest Go Players in History Listing, and
  4. Multiple Tests for Building Confidence." (these are four steps, but I digress)

Since choosing the players (step 1), listing the results (step 3), and repeating the tests (step 4) are trivial, I’m interested in step 2. But the free sample of the book ends one page before you describe the process.

Your book summary is more helpful: “Two simple equations of Go skill, in terms of performance (%) and move strength, were created and the well-validated superhuman Go bot was employed to assess the top professionals’ performance.” Have you got any more details than this? What superhuman Go bot do you use? How many visits do you give it? How do you calculate the performance and move strength variables? Why is AlphaGo Zero the benchmark and not the superhuman Go bot itself? Without seeing the details that I assume are on page 11 (and only page 11) of your book, your analysis appears vulnerable to @flovo’s criticism that this is a test about how similar the players’ playing styles are to the superhuman Go bot.

P.S.: The ELF OpenGo team also used AI to rate professional Go players. But they were much more cautious and did not make sweeping claims like “Player 1 is stronger than Player 2.”

P.P.S.: You know what, nevermind. I see that you’ve decided to conceal your process. But unless the process is known, it’s of no use as a ranking system.

11 Likes

I will just pop in to say that the people on Sensei’s Library have a thread going right now about this, and they are very sceptical. Also, thank you Mark for taking a sharp knife to this.

4 Likes

It’s a shame that the go community has to waste time being distracted by this ridiculous pseudoscientific nonsense. Perhaps this is all just some elaborate trolling?

Besides, I think there are far better ways to measure the relative strengths between different servers and communities of players:

  1. Thorough and plentiful polling.
  2. Quality and volume of memes generated.
  3. Proliferation of games involving werewolves, vampires and raptors.
  4. The general degree of tomfoolery in their public discourse.
13 Likes

Read the full book and you will have a clear understanding of SPA. Note that all steps of SPA are important and interesting.

Step 1 allows you to have a set of strong (sample) players and reference (control) players adequate for answering our research questions (Who is the strongest human player in the history of Go? and more).

Step 3 shows the results which answer the research questions. This step makes it clear who is stronger than who and serves as a basis for the next round of analysis.

Step 4 is not repeating the tests. Multiple tests are different tests; if they are passed, the results in Step 3 are scientifically plausible. This is good practice in scientific analysis before declaring the name of the strongest player in the history of Go.

Step 2 is the key intellectual property and copyrighted material, so the free sample rightly excludes it. Any more details deserve your payment. It was not easy to come up with this step. The superhuman bot’s name is clearly mentioned in the book. It is not vulnerable to any criticism or skepticism, because the answers to your questions are in the book. Again, SPA is not a test of human-bot similarity in style of play. It is a test of move strength. It can detect your divine moves and gives you higher scores if your moves are stronger.

Note also that skeptical comments about a new scientific method are common. When Prof. Jay W. Forrester of MIT published his book “World Dynamics,” many traditional scientists were skeptical and considered his work pseudoscience, because they had different modeling mindsets. Who is right? It depends partly on the purpose of your analysis and the fundamental philosophy of your methodology.

You can try other methods and see whether the results are similar. It is clear to me that using the list of top international title holders is a misleading method. Dan rankings and Elo ratings are also inadequate and have limited applications.

If I can give you one tip on how to appear believable, it is to explain your methods instead of your results. Hiding them behind intellectual property and copyright is like a magician asking you to close your eyes during his trick.

But it’s mainly your loss. This could have been an interesting discussion if you hadn’t shut it down.

7 Likes

I’m guessing SPA is this:

You take a game and the AI evaluates each move against the AI’s best move for that board position. Your move will result in a win rate percentage gain or loss (usually a loss) according to the AI. Add up your moves and divide by the number of moves to get your average performance per move, and that would be your score.
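The guessed procedure above can be written down concretely. Assuming hypothetical per-move winrate figures from some analysis engine (nothing here comes from the book), the score would be the mean winrate lost versus the engine’s top choice:

```python
def average_move_loss(winrates_best, winrates_played):
    """Guessed SPA-style scoring: mean winrate lost per move versus the
    engine's preferred move. Both lists are hypothetical engine outputs,
    aligned move by move, with winrates in [0, 1]."""
    losses = [best - played
              for best, played in zip(winrates_best, winrates_played)]
    return sum(losses) / len(losses)

# Toy game of four moves: two engine matches, two small slips.
print(average_move_loss([0.52, 0.55, 0.60, 0.58],
                        [0.52, 0.50, 0.60, 0.54]))  # ≈ 0.0225 lost per move
```

A lower average loss would mean stronger play; a perfect match with the engine scores 0.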

One problem might be that if you play a game with a lot of easy end game moves, it would increase your score.

Did I get it? :smiley:

2 Likes

Buy the book to find out!
:grinning:

3 Likes

This would be my guess as well until further evidence shows it isn’t. Further evidence for which I don’t have to pay 30 dollars on a site that is built (for free) with Wix, that is.

2 Likes

The original SPA can address the endgame problem you mentioned, which is a negligible issue. As noted in the book, SPA is not sensitive to easy endgame moves.

The results surprised me a lot regarding the strength of the top professional players, and they can settle some historical questions, such as who is stronger between Dosaku and Shusaku, Shusaku and his teacher, and Cho Hunhyun and Lee Changho.

@9x9Meijin, suppose I patent a new herbal oil that I say cures viral infections. I tell everyone that it cures HIV, influenza, coronaviruses, and so on, and they ask me for proof. I tell them that it’s my intellectual property and copyright material, and any more information deserves their payment. Instead, I go on about how effective my oil is at fighting various infectious diseases. People will remain skeptical, and I’ll dismiss it as expected in the face of such groundbreaking science. But then the Bureau of Chemistry will find my oil to be drastically overpriced and of limited value, and I’ll face criminal charges for fraud.

So too here. If you hide your details behind a paywall, nobody will find your results credible or interesting.

(Sidebar: I’d be shocked if Shin Jinseo wasn’t the strongest player of all time.)

6 Likes