Yet another ratings thread

I guess it’s inconsistent no matter what you do. Translating handicaps to komi, assuming an extra stone is worth about 13 points:

Even game: W 6 komi
2 stones: B 13 komi
3 stones: B 26 komi
etc.

So what’s between an even game and 2 stones? B 3.5 komi? We can’t consistently make 13 point jumps between the handicaps. So they went with the least inconsistent looking option?

It seems that you are conflating how many handicap stones the auto-handicap system should give someone with how much the expected win rate should be adjusted in glicko2.

But I don’t know for sure if that’s what you mean. It would help if you answered agree or disagree to the statements. You don’t have to but it would help make it clear.

I don’t know why; that’s a question for the historians (@Counting_Zenist perhaps?). Although from what I’ve read so far, this isn’t an age-old tradition.

Other systems have been proposed, such as “proper handicap” (simply one stone extra for every handicap).

If you just use different komi instead of additional starting moves and one stone is worth 13 points, then to even the game for a certain rank difference:

  • Same rank: 6 komi
  • One rank: 6 - 13 = -7 komi (so 7 for black)
  • Two ranks: -7 - 13 = -20 komi
  • Three ranks: -20 - 13 = -33 komi
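
For what it’s worth, the arithmetic in the list above can be sketched in a few lines of Python. This is only an illustration under the assumptions used in this thread (6 komi for an even game, 13 points per stone); the function name `komi_for_rank_difference` and its parameters are made up for the example.

```python
def komi_for_rank_difference(rank_diff, even_komi=6, stone_value=13):
    """Komi given to White; a negative result means komi for Black.

    Assumptions (from the thread, not an official rule):
    an even game uses `even_komi`, and each rank of difference
    is worth `stone_value` points.
    """
    return even_komi - stone_value * rank_diff


# Print the komi for rank differences 0 through 3.
for diff in range(4):
    komi = komi_for_rank_difference(diff)
    side = "White" if komi >= 0 else "Black"
    print(f"{diff} rank(s): {abs(komi)} komi for {side}")
```

With a different estimate for the value of a stone, only `stone_value` needs to change.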

Supplementary question for the pollsters. Considering all Go players worldwide over the last two thousand years, would we expect the mean rating to:

  • Increase
  • Decrease
  • Remain constant
  • Unknowable

Over time the lowest skill level stays the same (not knowing the rules), while the highest skill level fluctuates but generally goes up (especially with AI, it really only increases, right?). Resources only get better over time. The median is probably roughly the same, if I had to guess: I think there is a point where casual players just don’t care enough anymore, even if resources get better. I can’t say how you would factor in changes in popularity, literacy, etc., though, or the player population shifting from a smaller type of group toward the general population. I’d expect the skew to increase over time (just a wild guess). I’m not really basing this on a deep knowledge of the history, though; I just have a basic idea.

Right, you could do that, but using stones does seem more appropriate. And then it gets awkward if you want to stay consistent.

Reposting with the right reply.

I certainly am conflating those two factors. The reason is that if standard handicap is applied (e.g. handicap = rank difference), there are no handicap bands in which the game is supposed to be 50-50. In other words, handicap 1 is typically played at a rank difference in the range [1, 2), handicap 2 at a rank difference in the range [2, 3), and so on, while handicap compensation puts the rank equivalents at 0.46, 1.46, and so on.

In other words, handicap 1 is supposed to be a 50-50 game (at least according to handicap compensation) when a 9.46k plays a 9k, a rank difference at which they would typically be playing without handicap.

Would that be awkward?

  • No rank difference: Black starts with one stone, white gets 6.5 komi
  • One rank difference: Black starts with two stones, white gets 6.5 komi
  • Two rank difference: Black starts with three stones, white gets 6.5 komi
  • etc.

I think that’s basically what people had to do to play handicap with bots like LeelaZero IIRC because it wasn’t trained for zero komi.

Awkward in that you still need komi in handicap games, whereas currently you don’t. But I agree that would be an improvement regardless.

If you use the same formula for rating compensation to handle unbalanced games, then changing the handicap rules doesn’t do anything really as far as I can see.

You can kind of think of handicaps as a series starting from even = 0. The present handicap system hits the points 0, 1, 3, 5, 7, … and handicap with perfect komi to White hits the points 0, 2, 4, 6, 8, … You could also do a mixture hitting the points 0, 1, 2, 3, 4, 5, …
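
A rough sketch of that series in Python, in case it helps. It assumes that a full komi is worth half a stone, so one unit on the scale is "one komi"; the function name `handicap_position` and the convention of counting Black’s first move as stone number 1 are my own choices for the illustration.

```python
def handicap_position(stones, white_gets_komi):
    """Position on the handicap series, in units of one komi (half a stone).

    `stones` counts Black's first move as stone 1, so an even game is
    stones=1 with komi. Assumption: full komi is worth half a stone.
    """
    if stones < 1:
        raise ValueError("stones must be at least 1")
    return 2 * (stones - 1) + (0 if white_gets_komi else 1)


# Present system: even game with komi, then no-komi games with 1, 2, 3, 4 stones.
present = [handicap_position(1, True)] + [
    handicap_position(n, False) for n in range(1, 5)
]

# Handicap stones with full komi to White in every game.
with_komi = [handicap_position(n, True) for n in range(1, 6)]

# Mixture of both kinds of game.
mixed = sorted(set(present) | set(with_komi))

print(present, with_komi, mixed)
```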

The origins of the handicap system come from the original definition of the Japanese dan rank system 400 years ago. Basically, the idea is that players of the same rank alternate taking black (no komi) across a multi-game match. If one player won 60% or 80% of the games in a ten-game match (depending on the period), they’d adjust the handicap in the next match, which was tantamount to accepting a rank difference between the players. For a two-dan rank gap, the weaker player takes black every game; for a four-dan rank gap, the weaker player takes two handicap stones every game; and so on for all even gaps. The odd gaps alternate between the handicaps for the nearest even gaps (e.g. a three-stone handicap and a four-stone handicap on alternating games). For a one-rank gap, this meant the weaker player would take black in 2/3 of the games, iiuc.

I think this historical point is interesting because, while the modern amateur dan system omits the intermediate ranks with alternating levels of handicap, it shows the transitive properties of the handicap system have been in place for nearly 400 years. If A can beat B 50% of the time taking white no komi (or equivalently 60%-80% of the time in even games), and B can beat C 50% of the time taking white no komi (etc.), then A can beat C about 50% of the time in a two-handicap game. And this transitive property holds pretty strongly even across long chains of players. That’s what they found in Tokugawa Japan, that’s what KGS found decades ago, and I believe that’s essentially what OGS concluded (slight deviation from perfect transitivity) after empirical analysis of all the active ranked accounts a few years ago.

The reason the current handicap system is good is that, empirically, it hits this transitivity: an N-stone gap in rank between two players is equivalent to the ability to win an N-stone handicap game or, equivalently (remember, this is just an empirical invariance), to the existence of an N-player chain from the stronger to the weaker, where each stronger player can win 50% of the time with white and no komi, or ~2/3+ of the time in an even game, against the next player in the chain. The problem with your proposed “perfect komi” system is that it fails this empirical invariance, so the transitivity property of the kyu/dan ranking system would be broken.

One handicap stone is equivalent to two perfect komis, so this doesn’t break transitivity. For instance, suppose that perfect komi is 7. If A is exactly 3 stones stronger than B, then a game without handicap stones but with reverse komi 14×3-7=35 should give both players a 50% chance of winning.
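
The same calculation as a small Python sketch, under the assumption stated above that one stone equals two perfect komis. The function name `reverse_komi` is hypothetical, and 7 is just the example value for perfect komi.

```python
def reverse_komi(stones_stronger, perfect_komi=7):
    """Komi given to Black so a no-handicap game should be 50-50.

    Assumption: one handicap stone is worth two perfect komis.
    A negative result means the komi goes to White instead.
    """
    return 2 * perfect_komi * stones_stronger - perfect_komi


print(reverse_komi(3))  # 14*3 - 7 = 35
```

For a rank difference of zero this gives -7, i.e. the usual 7 komi for White in an even game.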

Is this empirically true? (Sorry if I missed it above.) If so, it would be astounding, as one might think that slightly different skills and strengths would be activated in playing added stones, versus high reverse komi.

Maybe reverse komi doesn’t occur enough to have a lot of data on the question.

A tournament on OGS has been organized:

I didn’t participate or follow closely but it seems that the conversion of handicap stones into reverse komi worked well.

The handicap game system (teai 手合 in Japanese terminology) has existed for over a millennium. It can be found in 棋經十三篇 (likely compiled in the 10th to 11th century, with a lot of content we know came from even earlier sources), in the collection 忘憂清樂集 (earliest printed version from as early as the 12th century).

It reads “凡棊有敵手,有半先,有先兩,有桃花五,有北斗七”, and there is even a commentary called “論棋訣要雜說” that adds footnotes on this paragraph:

敵手 : “強弱均而爭先,謂之敵手” (strong and weak are even, and fight for the first play)
半先 : “強者饒弱者兩局先,弱者復饒強者一局先,所饒之子強者在中,以三局為一周” (basically the definition of sen-ai-sen: the weaker player plays first, then the stronger player plays first, then the weaker player plays first again, with 3 games in a cycle), etc. This paragraph was likely also compiled no later than the 12th to 13th century.

These contents are basically retranscribed in the more famous 玄玄棋經, which was compiled and first printed in the 14th to 15th century, and widely spread and reprinted in the 16th century (we have lots of versions of it that still survive). They were highly regarded in the Japanese Go community for hundreds of years, and were likely sources for their adaptations of the system. The customary names of the “rankings” came from these as well.

Before the dan ranking system (and concurrent with it), there were already names for “player strength”, from the weakest to the strongest: “拙手”, “巧手”, “高段/高手”, “上手”. They were likely inspired by the paragraph on ranking (品格篇) in the 棋經十三篇, with 9 pin (品) rankings: “守拙”, “若愚”, “鬥力”, “小巧”, “智”, “通幽”, “具體”, “坐照”, “入神”. It is no coincidence that the dan rankings were also mapped into 9 rankings.

The interesting thing is that the natural rankings were likely about a one-stone gap apart: from 拙手 to 巧手, from 巧手 to 高段, and from 高段 to 上手 (名人, Meijin, we know came later and was a special placeholder for the strongest players, not necessarily a “ranking”). The Great Houses and the Godokoro (碁所) mostly just took the “準” (qualified/semi) rankings that already existed, like 準上手, 準高段, and 準巧手, as the finer ranking gaps, and gave numerical dan rankings within the great houses (巧手, ranks lower than 5 dan), so that players could compete and be measured across the great houses (most of the time, the “strength of the player pool” was used as a political tool to gain favors and candidacy for the castle games).

The basic math of winning two out of three, and three in a set, had been around at least in the 1st millennium, but it is likely a bit alien to us, and it is still not quite clear how it was implemented. They were obsessed with this three-in-one and one-in-three ratio as far back as records go.

As far as I read this system, it implies that a 1-rank difference gives an expected 50% White win rate at handicap 1. As a result, I still think the rating compensation of 0.46 or 0.54 of a rank is too small.

It’s possible, of course, to adjust the default handicap (though this is bucking tradition), to adjust the rating compensation system, or to adjust the expectation that 1 rank of separation is 1 handicap. But I think that, with appropriate rating compensation, it makes most sense to maintain the other traditions as is. Adding komi + handicap games between the komi + no handicap might also appeal, I guess.

The other point I take from this is that the 1 rank = 1 stone part is far from a fixed scale. As we see from this the meaning of a handicap stone can change when imbalanced matches are involved as you are trying to balance an overall match system, not the expected result of a single game.

I have a question on how rating points are awarded or deducted after a match. Is it based on the ratings when the match starts or when the match finishes?

So imagine I start a correspondence game with someone who has the same rating as me. Then late one evening I go on tilt playing blitz and I lose 5 stones in rank.
When the correspondence game finishes which rating does glicko use?

thanks in advance

When the match finishes.
