In the past we had unique ranks for all combinations of board size and time control, but we got complaints about that too. Being strong in one and then playing in a different category was perceived as a form of sandbagging.
The OP's question is directly related to OGS ranks, and ranks being weird is the explanation. I don't see how it is derailing from the topic.
Checking the OP's games is not enough to see why he had hard games - I don't think it's been claimed that him playing bots is the reason for his rank. As I mentioned, it is him running into certain types of opponents (people who play bots, with curvy graphs, blitz throwers - or even people who play those people). There is just a big gap between the strength of blitzers and correspondence players who don't use support. A 4 kyu 3-second blitzer is roughly equal to a 10 kyu who plays mostly correspondence and doesn't use support. That does not provide even matchups, not consistently anyway. And if you don't know how to avoid the smooth-sandbagging crowd, you will be rated lower than your actual rank.
OGS allows cowards to lower their rank and proud people to airbag theirs, as easily as they like. Sure, joining them if you can't beat them is an easy option, but I personally like that some of the many people who notice the weirdness in ranks are speaking out.
Well, look at OP's opponents and tell us what you think about them.
Haha, you're giving me hope here.
The root cause of the difference in ranks for the same player on different servers is the result-based ranking system. Flattened ranks occur when you play on a server with many stronger opponents. This problem can be solved by a new ranking system called "performance" (%). Your performance is 100% if you are as strong as AlphaGo Zero (AGZ), and 50% if your performance is half of AGZ's. With this move-quality-based system, you can compare the performance of Go players across countries or tournaments, or throughout the history of Go.
Who is this AlphaGo Zero and why can he still play when torn in half?
More seriously, AlphaGo Zero was never released, so we cannot let users play against it. Also how do I determine the relative performance?
SPA requires a paradigm shift in Go ranking. Although AlphaGo Zero (AGZ) has never been released, the game records of AGZ are widely available. Using SPA, you can assess the quality of the moves played by AGZ and calculate AGZ's performance (p-value) based on the strength of AGZ's moves. Then take the p-value for AGZ as 100% and determine your performance in the best game you played. SPA stands for "Superhuman Performance Analysis", a new method for ranking Go players and determining who is actually the strongest player in the history of Go. (He is Cho Chikun. Neither Shusaku nor Dosaku is the strongest human player in the history of Go.)
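The normalization step described above (separate from the undisclosed move-scoring step) appears to be just a ratio against AGZ's score. A toy sketch in Python, with every strength number invented for illustration, since the book's actual scoring method is not public:

```python
# Hypothetical illustration of the performance (p-value) normalization:
# take AGZ's move-strength score as the 100% benchmark and express a
# player's score as a percentage of it. All scores below are made up.

def performance_pct(player_score: float, agz_score: float) -> float:
    """Performance relative to AlphaGo Zero, as a percentage."""
    return 100.0 * player_score / agz_score

agz_score = 0.98      # invented strength score for AGZ's moves
player_score = 0.49   # invented score for a human player's best game
print(performance_pct(player_score, agz_score))  # 50.0, i.e. half of AGZ
```

How `player_score` and `agz_score` would actually be computed is exactly the step the book keeps behind the paywall.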
4 Questions
- is there a paper about SPA
- if I understand you right, only 1 of my games is used to calculate my strength
- if I understand you right, this is a test about how similar my playing style is to AlphaGo Zero
- do you mean this p-value? https://en.m.wikipedia.org/wiki/P-value
- There is a book, not a paper.
- One best game is more than enough, because SPA deals with the moves in the peak game, not the game results. If you want to assess your current performance, use the last game you played.
- No. Not at all. It is a test of how strong your moves are. You can play your own style and get the same performance as AGZ if your moves and AGZ's moves are equally strong. You can get a greater performance value if you played differently but better than AGZ.
- No. I mean the p-value in a new domain, not the statistical one. Here p stands for performance (%), not probability (Dmnl).
I'm not convinced.
The best™ game is a statistical outlier and not representative of the player's strength. Additionally, I don't see how using the moves of one game, without comparing them against a reference (AGZ), leads to any estimate.
As an additional thought, we want to compare the strength of different AIs as well. What happens to the performance value when we feed the algorithm a game by a player stronger than the reference player (AGZ)?
I don't want to convince you either. You need to understand the fundamental philosophy of SPA before being able to think about it. I gained many insights from SPA. For example, the Shusaku fuseki is almost perfect but not yet perfect. Controlled sample games had plausible p-values. For example, the p-value for AGZ is a little greater than the p-value for AGM and far greater than the p-value for Michael Redmond.
I'm struggling with how to perform superhuman performance analysis. In your self-authored 37-page book, superhuman performance analysis "can be conducted in five steps:
- Top Professionals and Benchmarks Selection
- Performance Analysis
- Strongest Go Players in History Listing, and
- Multiple Tests for Building Confidence." (these are four steps, but I digress)
Since choosing the players (step 1), listing the results (step 3), and repeating the tests (step 4) are trivial, I'm interested in step 2. But the free sample of the book ends one page before you describe the process.
Your book summary is more helpful: "Two simple equations of Go skill, in terms of performance (%) and move strength, were created and the well-validated superhuman Go bot was employed to assess the top professionals' performance." Have you got any more details than this? What superhuman Go bot do you use? How many visits do you give it? How do you calculate the performance and move strength variables? Why is AlphaGo Zero the benchmark and not the superhuman Go bot itself? Without seeing the details that I assume are on page 11 (and only page 11) of your book, your analysis appears vulnerable to @flovo's criticism that this is a test of how similar the players' playing styles are to the superhuman Go bot.
P.S.: The ELF OpenGo team also used AI to rate professional Go players. But they were much more cautious in their conclusions and did not make sweeping claims like "Player 1 is stronger than Player 2."
P.P.S.: You know what, never mind. I see that you've decided to conceal your process. But unless the process is known, it's of no use as a ranking system.
I will just pop in to say that the people on Sensei's Library have a thread going right now about this, and they are very sceptical. Also, thank you Mark for taking a sharp knife to this.
It's a shame that the go community has to waste time being distracted by this ridiculous pseudoscientific nonsense. Perhaps this is all just some elaborate trolling?
Besides, I think there are far better ways to measure the relative strengths between different servers and communities of players:
- Thorough and plentiful polling.
- Quality and volume of memes generated.
- Proliferation of games involving werewolves, vampires and raptors.
- The general degree of tomfoolery in their public discourse.
Read the full book and you will have a clear understanding of SPA. Note that all steps of SPA are important and interesting.
Step 1 gives you a set of strong (sample) players and reference (control) players adequate for answering our research questions (Who is the strongest human player in the history of Go? and more).
Step 3 shows the results that answer the research questions. This step makes it clear who is stronger than whom and serves as a basis for the next round of analysis.
Step 4 is not repeating the tests. Multiple tests are different tests that, if passed, make the results in Step 3 scientifically plausible. This is good practice in scientific analysis before declaring the name of the strongest player in the history of Go.
Step 2 is the key intellectual property and copyright material, so the free sample rightly excludes it. Any more details deserve your payment. It was not easy to come up with this step. The superhuman bot's name is clearly mentioned in the book. It is not vulnerable to any criticism or skepticism, because the answers to your questions are in the book. Again, SPA is not a test of human-bot similarity in style of play. It is a test of move strength. It can detect your divine moves and gives you higher scores if your moves are stronger.
Note also that skeptical comments about a new scientific method are a common event. When Prof. Jay W. Forrester of MIT published his work in his book "World Dynamics," many traditional scientists were skeptical and considered his work pseudoscience, because they held different modeling mindsets. Who is right? It depends partly on the purpose of your analysis and the fundamental philosophy of your methodology.
You can try other methods and see if the results are similar or not. It is clear to me that using the list of top international title holders is a misleading method. Using dan rankings and Elo ratings is also inadequate and has limited applications.
If I can give you one tip on how to appear believable, it is to actually explain your methods instead of your results. Hiding them behind intellectual property and copyright is like a magician asking you to close your eyes during his trick.
But it's your loss, mainly. This could have been an interesting discussion, if you hadn't shut it down.
I'm guessing SPA is this:
You take a game and the AI evaluates each move against the AIâs best move for that board position. Your move will result in a win rate percentage gain or loss (usually a loss) according to the AI. Add up your moves and divide by the number of moves to get your average performance per move, and that would be your score.
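If that guess is right, the scoring could be sketched like this (the per-move win rates are invented numbers, not output from any real engine, and the real SPA method remains undisclosed):

```python
# Guessed scoring: for each move, compare the win rate after your move
# with the win rate after the engine's preferred move, then average
# the differences over the game. All evaluations below are invented.

def average_performance(move_evals):
    """move_evals: list of (winrate_after_your_move, winrate_after_best_move)."""
    deltas = [yours - best for yours, best in move_evals]
    return sum(deltas) / len(deltas)

game = [
    (0.52, 0.52),  # matched the engine's choice: delta 0.00
    (0.48, 0.51),  # small inaccuracy: delta -0.03
    (0.40, 0.50),  # blunder: delta -0.10
]
print(round(average_performance(game), 4))  # -0.0433 per move on average
```

A score of 0 would mean every move matched the engine's evaluation of its own best move; more negative means weaker play.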
One problem might be that if you play a game with a lot of easy endgame moves, it would increase your score.
Did I get it?
Buy the book to find out!
This would be my guess as well, until further evidence shows it isn't. Further evidence for which I don't have to pay 30 dollars on a site that is built (for free) with Wix, that is.
The original SPA can address the endgame problem you mentioned, which is a negligible issue. SPA, as noted in the book, is not sensitive to easy endgame moves.
The results surprised me a lot regarding the strength of the top professional players, and they can settle some historical questions, such as who is stronger between Dosaku and Shusaku; Shusaku and his teacher; Cho Hunhyun and Lee Changho; etc.
@9x9Meijin, suppose I patent a new herbal oil that I say cures viral infections. I tell everyone that it cures HIV, influenza, coronaviruses, and so on, and they ask me for proof. I tell them that it's my intellectual property and copyright material, and any more information deserves their payment. Instead, I go on about how effective my oil is at fighting various infectious diseases. People will remain skeptical, and I'll dismiss it as expected in the face of such groundbreaking science. But then the Bureau of Chemistry will find my oil to be drastically overpriced and of limited value, and I'll face criminal charges for fraud.
So too here. If you hide your details behind a paywall, nobody will find your results credible or interesting.
(Sidebar: I'd be shocked if Shin Jinseo wasn't the strongest player of all time.)