Huge disparity in the probability of a 1dan defeating a 1kyu between the OGS and AGA ranking systems

John_C · August 21, 2021, 10:28pm

It’s interesting that 1kyus have 6 wins out of 25 games against you, a 24% win probability. You’re right it’s anecdotal, but it’s still interesting.
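To put a number on how anecdotal 6 wins out of 25 is, here is a minimal sketch that computes a Wilson score interval for that win rate. The function name `wilson_interval` and the choice of a roughly 95% interval (z = 1.96) are my own assumptions for illustration; nothing here comes from any of the rating systems.

```python
import math


def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


# 6 wins for the 1kyus out of 25 games, as quoted above.
low, high = wilson_interval(6, 25)
print(f"observed {6 / 25:.0%}, plausible range roughly {low:.0%} to {high:.0%}")
```

With only 25 games the interval is wide, running from a little over 10% to a little over 40%, so this sample alone cannot tell the OGS or EGF figures apart from a considerably higher win rate.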

John_C · August 21, 2021, 10:28pm

I wasn’t really looking at the EGF system when I made this post, but it is interesting that the EGF rating win probabilities align closer to OGS than to the AGA. The OGS and EGF probabilities of 14.8% and 11.8% seem too high to me while the AGA win prob of 0.008% seems too low.

What I find interesting is how all three systems seem to work reasonably well in practice despite totally disagreeing with each other.

gennan · August 21, 2021, 10:59pm

Most tournament games would be between players 1 rank or maybe 2 ranks apart. So I think it won’t matter much if predictions for large rank differences are off by a few %, because such games are uncommon and won’t strongly affect the rating system overall. By design the MacMahon system tries to pair players that might have about 50% chance against each other (challenging for both), avoiding lopsided games.

BHydden · August 22, 2021, 12:29am

The EGF and OGS both recently overhauled their rating algorithms, and to my knowledge were in communication with each other during this process. I don’t believe it was an accident at all that they are very similar.

TMK the AGA algorithm has not been updated recently, which probably explains why it is further adrift from the others.

gennan · August 22, 2021, 8:14am

The OGS used the EGF system in the past. Then at some point OGS changed to Glicko-2. That has an uncertainty parameter in each player’s rating (perhaps somewhat similar to the AGA rating system).

The EGF changed their prediction curve in early 2021 (from the green curve to the blue curve in one of my posts above), based on analysis of their historical data. That was part of a larger update of the rating system. I was part of the commission that made the recommendations for the update.

OGS also changed their prediction curve in early 2021, based on analysis of their historical data.
There was some contact between me and @anoek at the time, but we both did our own analysis on our own data sets. So I suppose the reason that both prediction curves are similar is that our statistics are similar. At the time OGS also received the historical data of the EGF to analyze, but I don’t think it affected the OGS rating system update.

The resulting prediction curves of OGS and EGF are not completely the same BTW. They diverge significantly for higher dan ranks.

martin3141 · August 22, 2021, 9:36am

Good luck trying to figure it out. I can’t imagine this is possible though, unless you specify which ranking system to consider. As far as I know there is no universal definition of 1 dan or 1 kyu etc. These terms only make sense in the context of a ranking system.

gennan · August 22, 2021, 10:20am

It is true that there are no universal absolute ranks, but rank differences are somewhat universal, because they are defined by the handicap required to even the winning probability. And 1d does not vary so much across the world (I don’t think the range is much more than about 4 ranks) that approximate statistics are totally invalid.

BHydden · August 22, 2021, 10:34am

Not including Japan… Right?

jlt · August 22, 2021, 10:43am

The question is whether two people n ranks apart are really n stones apart or not.

gennan · August 22, 2021, 10:44am

I think 1d EGF is about 4d in Japan (at least it was when I was there in 1990), so about 3 ranks difference.

gennan · August 22, 2021, 10:48am

Handicap used to be the way to determine relative ranks within a local group. If there is no longer a relation between handicap and ranks nowadays, ranks have little meaning IMO and they might as well be replaced by pure Elo ratings based on even-game winning statistics alone.

But I don’t think things are that bad. Checking EGD statistics of handicap games, handicap according to rank difference does seem to work to even out winning probability:

             Statistics of Handicap Games - strong side all (wins for weak side)
			 
 Gr.         H 1              H 2              H 3              H 4              H 5              H 6              H 7              H 8              H 9     
Diff  Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  %    Wins   Tot  % 
---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---  ----- ----- ---
  1   2296  5463  42                       1     1 100                                                                                                        
  2   1967  5977  33   1584  4080  39                                                                                                                         
  3   1038  3972  26   1053  3095  34   1091  2829  39                                                                                                        
  4    351  1434  24    642  2473  26    708  2234  32    910  2352  39                                                                                       
  5     80   445  18    213   949  22    484  1798  27    500  1640  30    678  1964  35                                                                      
  6     19   150  13     53   231  23    138   644  21    360  1388  26    380  1202  32    567  1575  36                                                     
  7      4    51   8     12    83  14     36   203  18    102   495  21    295  1097  27    248   891  28    410  1190  34                       1     1 100  
  8      1    18   6      5    38  13     10    63  16     52   181  29     69   398  17    219   807  27    212   720  29    338  1007  34                   
  9      1     7  14      2    18  11      4    35  11     10    72  14     23   110  21     55   333  17    143   575  25    167   509  33    351  1088  32  
 10      3    10  30      2    10  20      0    17   0      5    34  15     16    92  17     16   124  13     31   240  13    110   475  23    307  1231  25  
 11      0     8   0      0     6   0      2    10  20      1    11   9      3    35   9      9    59  15      5    34  15     20   152  13    235  1175  20  
 12      0     1   0      0     1   0      0     4   0      1     5  20      4    24  17     11    55  20      4    24  17      9    38  24    168  1079  16  
 13      0     3   0      0     3   0      1     4  25      0     1   0      3    12  25      2    38   5      0    10   0      3    16  19     92   769  12  
 14      0     2   0      0     2   0                       0     3   0      3     7  43      0    13   0      0    15   0      0     7   0     64   617  10  
 15      1     3  33      0     1   0      0     1   0      0     1   0      2     7  29      0     7   0      0    14   0      0     9   0     32   525   6  
 16      0     3   0      0     1   0      0     2   0      0     1   0      0     1   0      0    11   0      0     2   0      2     5  40     23   380   6  
 17      0     1   0                                                                          0     3   0      0     2   0      0     3   0      9   317   3  
 18                                                                          0     2   0      0    10   0      0     2   0                       5   240   2  
 19                                                                                                            0     3   0      0     1   0      4   184   2  
 20                       0     1   0                                                                                                            6   147   4  
 21                                                                                           0     1   0                                        1    51   2  
 22                                                                                                                                              2    43   5  
 23                                        0     1   0                                                                                           1    13   8  
 24                                                                                                                                              0    13   0  
 25                                                                                                                                              0     4   0  


 N.B.: Games with handicap greater than the difference of grade between opponents have been ignored
 
 Legend: Gr diff.    : difference of grade between opponents
         H <n>       : games with <n> handicap stones
         Wins        : number of wins for the weak side (black)
         Tot         : number of games
         %           : percentage of wins

 records: 737934

Overall, black wins about 45% of properly handicapped games, which is a bit lower than 50%, but that is expected because traditional handicap falls half a stone short of full compensation.

I think OGS posted similar statistics of their handicap games.

Edit: taking data from a longer time period (from 2006), black’s winrate is a bit lower (more like 34-42% instead of 45%), suggesting that the handicap falls a bit short of compensating for the skill difference implied by the rank difference, especially for larger handicaps.
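For anyone who wants to check the "properly handicapped" winrate against the table as shown above, here is a small sketch that aggregates its diagonal. I am assuming "properly handicapped" means the number of handicap stones equals the grade difference (H 1 at difference 1 up to H 9 at difference 9). The wins and totals are copied from the table; the variable names are just for illustration.

```python
# (wins for black, total games) where handicap stones == grade difference,
# copied from the diagonal of the table above (assumption: this is what
# "properly handicapped" refers to).
diagonal = {
    1: (2296, 5463),
    2: (1584, 4080),
    3: (1091, 2829),
    4: (910, 2352),
    5: (678, 1964),
    6: (567, 1575),
    7: (410, 1190),
    8: (338, 1007),
    9: (351, 1088),
}

# Per-handicap winrate for black.
for stones, (wins, total) in diagonal.items():
    print(f"H{stones}: {wins}/{total} = {wins / total:.0%}")

# Pooled winrate over all diagonal cells.
total_wins = sum(wins for wins, _ in diagonal.values())
total_games = sum(total for _, total in diagonal.values())
overall = total_wins / total_games
print(f"overall: {total_wins}/{total_games} = {overall:.1%}")
```

Pooled over the diagonal this gives 8225 wins in 21548 games, about 38% for black, with the per-handicap rate drifting down from 42% at one stone to 32% at nine stones.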