I wasn’t really looking at the EGF system when I made this post, but it is interesting that the EGF win probabilities align more closely with OGS than with the AGA. The OGS and EGF probabilities of 14.8% and 11.8% seem too high to me, while the AGA win probability of 0.008% seems too low.
What I find interesting is how all three systems seem to work reasonably well in practice despite totally disagreeing with each other.
Most tournament games would be between players 1 rank or maybe 2 ranks apart. So I think it won’t matter much if predictions for large rank differences are off by a few %, because such games are uncommon and won’t strongly affect the rating system overall. By design the McMahon system tries to pair players who have about a 50% chance against each other (challenging for both), avoiding lopsided games.
The EGF and OGS both recently overhauled their rating algorithms, and to my knowledge were in communication with each other during this process. I don’t believe it was an accident at all that they are very similar.
TMK the AGA algorithm has not been updated recently, which probably explains why it is further adrift from the others.
The OGS used the EGF system in the past. Then at some point OGS changed to Glicko-2. That has an uncertainty parameter in each player’s rating (perhaps somewhat similar to the AGA rating system).
The EGF changed their prediction curve in early 2021 (from the green curve to the blue curve in one of my posts above), based on analysis of their historical data. That was part of a larger update of the rating system. I was part of the commission that made the recommendations for the update.
OGS also changed their prediction curve in early 2021, based on analysis of their historical data.
There was some contact between me and @anoek at the time, but we both did our own analysis on our own data sets. So I suppose the reason that both prediction curves are similar is that our statistics are similar. At the time OGS also received the historical data of the EGF to analyze, but I don’t think it affected the OGS rating system update.
The resulting prediction curves of OGS and EGF are not completely the same BTW. They diverge significantly for higher dan ranks:
Good luck trying to figure it out. I can’t imagine this is possible though, unless you specify which ranking system to consider. As far as I know there is no universal definition of 1 dan or 1 kyu etc. These terms only make sense in the context of a ranking system.
It is true that there are no universal absolute ranks, but rank differences are somewhat universal, because they are defined by the handicap required to even out the winning probability. And 1d does not vary so much across the world (I don’t think the range is much more than about 4 ranks) that approximate statistics are totally invalid.
Handicap used to be the way to determine relative ranks within a local group. If there is no longer a relation between handicap and ranks nowadays, ranks have little meaning IMO and they might as well be replaced by pure Elo ratings based on even game winning statistics alone.
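To make "pure Elo" concrete, here is a minimal sketch of the classic textbook Elo curve and update step. The 400-point scale and K = 32 are the conventional defaults and are assumptions here; this is not the prediction curve used by the EGF, OGS or the AGA.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the classic logistic Elo curve.

    `scale` = 400 is the conventional choice (an assumption, not a value
    taken from any Go rating system).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """New rating after one even game; `actual` is 1.0 for a win, 0.0 for a loss.

    `k` = 32 is a common default, again only an assumption.
    """
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # Equal ratings: each player is expected to win half the time.
    print(elo_expected_score(1500, 1500))
    # A 400-point gap under this curve.
    print(elo_expected_score(1900, 1500))
```

Note that nothing in this scheme refers to handicap stones: the ratings are fitted to even-game results only, which is exactly why ranks would lose their traditional meaning under it.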
But I don’t think things are that bad. Checking EGD statistics of handicap games, handicap according to rank difference does seem to work to even out winning probability:
Overall, black wins about 45% of properly handicapped games, which is a bit lower than 50%, but that is expected because traditional handicap falls half a stone short of full compensation.
I think OGS posted similar statistics of their handicap games.
Edit: Taking data from a longer time period (from 2006), black’s winrate is a bit lower (more like 34-42% instead of 45%), suggesting that the handicap falls a bit short of compensating for the skill difference implied by the rank difference, especially for larger handicaps.
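For anyone who wants to repeat this kind of check on their own game records, a tally along these lines should do. This is only a sketch: the function name and the `(handicap, black_won)` record layout are hypothetical, and the sample records are made up purely to show the call. They are not EGD data.

```python
from collections import defaultdict


def black_winrate_by_handicap(games):
    """Return {handicap: black's winrate} for (handicap_stones, black_won) pairs."""
    totals = defaultdict(lambda: [0, 0])  # handicap -> [black wins, games played]
    for handicap, black_won in games:
        totals[handicap][0] += int(black_won)
        totals[handicap][1] += 1
    return {h: wins / n for h, (wins, n) in sorted(totals.items())}


if __name__ == "__main__":
    # Made-up records, only to demonstrate the call.
    sample = [(2, True), (2, False), (2, True), (3, False), (3, False), (3, True)]
    print(black_winrate_by_handicap(sample))
```

If handicap fully compensated for the rank difference, every bucket would sit near 50%; buckets that drift lower as handicap grows would match the pattern described above.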