I think beginner level is too inconsistent for this; it isn’t a precise reference point like perfect play. Beginner level may even drift over time (with cultural changes over decades, etc).

I’d say “perfect play” also drifts with time. It’s just that we are so close to the first instances of perfect play that it’s hard to tell.

I don’t understand this. If it could drift even by 0.01 points, it was not perfect to begin with.

True, but perfect play cannot actually be determined so it cannot be used to define ranks.

We could say “X stones from Katago” though.

On the other hand, I agree “beginner” is too inconsistent, but we could use “random play” as a basis.

Do we have some conversion of OGS ranks to X stones from a particular Katago bot? Does anybody have the data?

I mean for a specific instance of katago one could do something like

https://online-go.com/api/v1/players/902691/games?handicap__gt=0&white_lost=true

to filter games where White lost and the handicap was >0, with the assumption that Katago will be white. One might have to grab them page by page though I’m not sure.
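In case it helps, here is a rough sketch of the page-by-page grab. It assumes the endpoint returns the usual paginated JSON with a `results` list and a `next` URL that is empty on the last page; I have not checked that against this exact endpoint, and the helper names (`games_url`, `iter_games`) are made up for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://online-go.com/api/v1/players/{player_id}/games"


def games_url(player_id, **filters):
    """Build the first-page URL from filters such as handicap__gt=0."""
    return BASE.format(player_id=player_id) + "?" + urlencode(filters)


def fetch_json(url):
    """Download one page and decode it (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_games(first_url, fetch=fetch_json):
    """Yield every game, following the 'next' link until it is empty."""
    url = first_url
    while url:
        page = fetch(url)
        # Assumed response shape: {"results": [...], "next": url-or-None}
        yield from page.get("results", [])
        url = page.get("next")
```

Calling `iter_games(games_url(902691, handicap__gt=0, white_lost="true"))` would walk the same query as the link above. For a bot with tens of thousands of games it is probably kinder to the server to pause between pages, or to use the data dump instead.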

On the other hand…

One could ask here for someone to do this with the big data download

Yes, you can subtract your rank from Katago’s rank.

This of course implies a bot that mostly plays handicap games and a rating system that handles those games correctly.

IIRC you don’t need an actual perfect player to be able to estimate your distance to it. This is because as strength increases, so does consistency. So you cannot directly measure the strength of a player (only compare them to others), but you can still get an idea of the consistency of their play.

I can’t prove it of course, but I don’t think a perfect player can give KataGo running at 10 million playouts/move 9 stones handicap. My gut feeling is that the proper handicap would be close to 2 stones.

I’d be happy to help, but I don’t understand what I should do.

Sorry, I may be dumb but I still don’t get how you would estimate your distance from a perfect player without knowing what perfect play is. How can you evaluate someone being “two stones from perfect play”?

I agree with @gennan that you could go by a gut feeling that KataGo is somewhere around 2 stones from perfect play, but that’s just a guess.

By measuring consistency (instead of strength) and its changes.

For example, you can compare how consistent is the play of 1d, 3d and 5d players, and how fast that consistency changes in this range. If it changes, say, 10% between 1d and 5d, this means more distance to perfect play than if it changes 20% (just random numbers for example).

How do you define “consistency” here? How is it measured?

It should probably be **in**consistency, and not easy to quantify. One measure could be how a player performs vs 1 stone weaker opponents (the lower his inconsistency the higher his winrate). Or ask a strong bot.

I don’t think you can use consistency as synonym with strength though.

Even assuming they are indeed *statistically* correlated in a meaningful way, knowing the consistency of a given player will not be enough to deduce their strength. You can make a guess based on the general trends you observe (*people with this level of consistency usually tend to be around this level*), but an individual will always deviate more or less significantly from the statistical norm.

Also, since again the notion of perfect play is purely theoretical, I don’t see how you can measure the gap in strength between, say, Katago and perfection. You could measure a gap in consistency but how much stronger in stones would you need to be to bridge this gap?

In my view a ranking built on Go consistency would be just that, but cannot be used as equivalent with a ranking built on Go strength.

One question that is interesting to ask: does Katago with 2 handicap stones sometimes lose against itself? If the answer is yes, then Katago is more than 2 stones away from perfect play.

Basically: what is the win rate as a function of player strength, at a specific handicap, when playing against (for example) katago-micro?

I guess one would need to filter out some games with premature resignations or maybe filter for ranked games.

So we could deduce what handicap a player with rank x needs for an even game, and see if it is indeed (9 - x).
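Something like the following could do that tabulation once the games are extracted. It is only a sketch: it assumes each game has already been reduced to a tuple `(opponent_rank, handicap, bot_won)`, which is my own made-up format, and it simply picks the handicap where the bot’s win rate is closest to 50%.

```python
from collections import defaultdict


def win_rates(games):
    """Map (opponent_rank, handicap) -> (bot win rate, number of games).

    Each game is a tuple (opponent_rank, handicap, bot_won).
    """
    tally = defaultdict(lambda: [0, 0])  # [bot wins, games played]
    for rank, handicap, bot_won in games:
        tally[(rank, handicap)][0] += int(bot_won)
        tally[(rank, handicap)][1] += 1
    return {key: (wins / total, total) for key, (wins, total) in tally.items()}


def even_handicap(games, min_games=1):
    """For each rank, the handicap whose bot win rate is closest to 50%."""
    best = {}
    for (rank, handicap), (rate, total) in sorted(win_rates(games).items()):
        if total < min_games:
            continue  # too few games in this cell to say anything
        gap = abs(rate - 0.5)
        if rank not in best or gap < best[rank][1]:
            best[rank] = (handicap, gap)
    return {rank: handicap for rank, (handicap, _) in best.items()}
```

Raising `min_games` would drop thinly populated cells, and the resulting handicap per rank can then be compared with the (9 - x) guess.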

You could define consistency at different levels of play by the Elo width of ranks (where ranks are assumed to be determined by handicap, as in **n** ranks difference can be compensated for by black getting to play **n** moves before the game starts with black’s turn and white getting komi, up to about **n** = 15).

Determined from EGF historical winning statistics, ranks around 15k are about 50 Elo wide, ranks around 1d EGF are about 100 Elo wide and ranks around 7d EGF are about 250 Elo wide. Going into the pro range, you get ranks (not pro ranks, but handicap ranks as defined above) of about 300+ Elo wide.
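To make those widths a bit more tangible, here is a small conversion between an Elo gap and an expected win rate. It uses the standard logistic Elo formula, which is an assumption on my part; the EGF system’s own formula may differ in detail.

```python
import math


def expected_score(diff):
    """Standard Elo expectation for the side that is `diff` Elo stronger."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def elo_gap(win_rate):
    """Inverse: Elo gap implied by a win rate strictly between 0 and 1."""
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# Rank widths quoted above, read as the gap to an opponent one rank weaker.
for label, width in [("15k", 50), ("1d", 100), ("7d", 250)]:
    print(f"{label}: {expected_score(width):.0%} expected win rate in an even game")
```

Under this formula the three widths correspond to roughly 57%, 64% and 81% against an opponent one rank weaker, which is one way to read “consistency grows with strength”.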

At perfect play, the Elo width per rank would approach infinity (or perhaps a finite, but very large value, due to the fact that score is an integer value instead of a continuous value, so komi handicap increments of less than 0.5 points are meaningless).

So you can fit an asymptotic curve through those Elo widths derived from winning statistics at different levels of play, to get an estimate for the highest rank possible = perfect play.
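As a toy version of such a fit, the sketch below assumes a simple hyperbola, width = a / (r_max - rank), so that 1/width is a straight line in rank and the pole r_max is where that line crosses zero. The hyperbolic form is my assumption for illustration, not necessarily the curve actually used, and the three data points are just the rough figures quoted above on the scale where 0 = 30k and 30 = 1d.

```python
def fit_hyperbola(points):
    """Least-squares line through (rank, 1/width); returns (a, r_max).

    Assumed model: width = a / (r_max - rank), so 1/width is linear in rank.
    """
    xs = [rank for rank, _ in points]
    ys = [1.0 / width for _, width in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # 1/width = (r_max - rank) / a  =>  slope = -1/a, intercept = r_max / a
    a = -1.0 / slope
    r_max = intercept * a
    return a, r_max


# 15k -> 50 Elo, 1d -> 100 Elo, 7d -> 250 Elo (rough values from the post)
points = [(15, 50), (30, 100), (36, 250)]
a, r_max = fit_hyperbola(points)
print(f"estimated ceiling on the 0 = 30k scale: {r_max:.1f}")
```

With only these three rough points the pole comes out a little above 42, so in the same neighbourhood as the 13d estimate, but a fit on three rounded numbers should not be taken as more than a sanity check.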

Using this method on EGF historical data, I arrived at an estimate of 13d EGF for perfect play, from the blue curve in this Elo width per rank graph, which has been used by the EGF rating system since April 4th 2021 (the red curve is what OGS uses):

Vertical axis is the Elo width per rank.

Horizontal axis is EGF rank expressed like internal OGS rank scale:

- 0 = 30k
- 10 = 20k
- 20 = 10k
- 30 = 1d
- 39 = 10d = more or less highest level achieved by humans like Go Seigen and Shin Jinseo,
- 42 = 13d = max rank in EGF rating system ~ perfect play?
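For anyone converting by hand, the mapping in this list is simple enough to write down as a helper. This is just a sketch of the scale as listed here (whole numbers only, hypothetical function name), and it does not clamp very low or very high values the way a site display might.

```python
def rank_label(rank):
    """Convert the 0 = 30k scale to a kyu/dan label (30 = 1d)."""
    rank = int(rank)
    return f"{30 - rank}k" if rank < 30 else f"{rank - 29}d"


for value in (0, 10, 20, 30, 39, 42):
    print(value, "=", rank_label(value))
```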

I looked for katago-micro in the 27M games dump.

On 19x19, we have only 86132 finished games.

51 of them are ranked and they are all against other bots.

opponent | games |
---|---|
12bTurboSai | 17 |
20bTurboLz-Elf-v1 | 2 |
60b Katago 1 playout | 3 |
doge_bot_2 | 2 |
IntuitiveSAI | 12 |
Spectral-1k | 14 |
Spectral-4k | 1 |
Total | 51 |

Katago-micro’s rank is mostly 38 (9d): 59038 games out of 86132.

Out of 86081 unranked games, 21709 are even games.

Katago-micro won 20522 of them.

Strongest opponents are:

- hjkl123, won 207 games out of 227 (91.6 %)
- kkxxcake, won 86 games out of 409 (21%)

Out of 64372 handicap games, here is the distribution of handicaps:

Here is the breakdown of opponents’ ranks, which span from -8 (25k) to 44 (9d+).

I’m starting to feel disoriented.

What’s the next step?

Are there any games (ranked or unranked) of it against known professional players? (like players with green names). If there are, maybe they can be used as anchors.

And if there are not enough, maybe expand to the players who play with these pros the most, use them as secondary anchors, and find how many games they also played against katago-micro.

I was thinking of some kind of scatter plot of handicap given vs rank (of katago’s opponents) in each of the games, maybe colouring wins one colour and losses another (red and blue? or maybe #d55e00 and #0072b2, which I think are the colourblind/accessibility theme colours from the main site). I would hope to see an approximately straight line, or more likely a curve, through the data that divides wins from losses (apart from some abnormal data points), which one could imagine is an appropriate relation between rank and handicap received by katago.

That is, some kind of curve f where f(rank) = handicap. With discrete data points there would be many such curves (if one exists), but it doesn’t really matter how it interpolates between the non-integer values.

That’s an intuitive guess which could be way off, but still might be fun to visualise.
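One crude way to get at that dividing curve without plotting anything: for each opponent rank, pick the handicap threshold that misclassifies the fewest games, assuming the bot tends to win below the threshold and lose at or above it. This is only a sketch under that assumption, with games again reduced to made-up `(opponent_rank, handicap, bot_won)` tuples.

```python
def boundary(games):
    """For each rank, the handicap threshold that best separates wins from losses.

    Each game is (opponent_rank, handicap, bot_won). The bot is expected to win
    below the threshold and lose at or above it; the threshold with the fewest
    exceptions is kept (the lowest one on ties).
    """
    by_rank = {}
    for rank, handicap, bot_won in games:
        by_rank.setdefault(rank, []).append((handicap, bool(bot_won)))

    curve = {}
    for rank, rows in sorted(by_rank.items()):
        handicaps = {h for h, _ in rows}
        # max + 1 allows "the bot wins at every handicap seen so far"
        candidates = sorted(handicaps | {max(handicaps) + 1})

        def errors(threshold):
            # A game is an exception when the outcome disagrees with the side
            # of the threshold it falls on.
            return sum((h < threshold) != won for h, won in rows)

        curve[rank] = min(candidates, key=errors)
    return curve
```

The resulting `{rank: handicap}` dictionary is one candidate for f, and could be drawn on top of the scatter plot to see how many games fall on the wrong side.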