Wow, so much discussion since I was last here a couple of days ago! Thanks shinuito for the link to 2020 Rating and rank tweaks and analysis. I haven’t read all 249 posts in that thread. But just looking at the tables near the top – blitz ranks predicting blitz results, overall ranks predicting blitz results, etc. – I can’t help wondering if there’s a conceptual error in the analysis method.
Consider:
- Player A is better at blitz than live, so their blitz rank is higher than their live rank, and their overall rank (which blends the two) overstates their live strength. Say they win only 30% of their live games due to being overranked, while the prediction based on their overall rank is 50%.
- Player B is better at live than blitz, so the overall rank understates their live strength: they win 70% of their live games compared to the predicted 50%.
If type A and type B players play equal numbers of games, then when you aggregate over all games the errors cancel out, and on average it looks like the system is working perfectly. Even if the two groups aren’t the same size, as long as they together account for only about 5% of all games, the residual error gets diluted to almost nothing in the average – you end up with a bunch of unhappy people who are invisible in the aggregate statistics.
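To make the cancellation concrete, here’s a minimal sketch with made-up numbers (the 5%/5%/90% split and the win rates are assumptions for illustration, not taken from the OGS data):

```python
# Hypothetical example: aggregate accuracy can hide per-group miscalibration.
# All numbers below are illustrative, not measured from OGS.

groups = {
    # group: (share of all live games, predicted win rate, actual win rate)
    "type A": (0.05, 0.50, 0.30),   # overall rank inflated by blitz strength
    "type B": (0.05, 0.50, 0.70),   # overall rank deflated by blitz weakness
    "normal": (0.90, 0.50, 0.50),   # overall rank matches live strength
}

predicted = sum(share * p for share, p, _ in groups.values())
actual = sum(share * a for share, _, a in groups.values())

print(f"aggregate predicted win rate: {predicted:.3f}")  # 0.500
print(f"aggregate actual win rate:    {actual:.3f}")     # 0.500 -- looks perfect
for name, (share, p, a) in groups.items():
    print(f"{name}: predicted {p:.2f}, actual {a:.2f}, error {a - p:+.2f}")
```

The aggregate numbers match exactly, even though type A players are off by −20 percentage points and type B by +20.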
So what’s needed to resolve this is:
- Classify players into type A (blitz rank significantly higher than live rank – maybe more than 2 ranks’ difference?), type B (the other way round), and “normal”.
- Count what proportion of players fall into each group, and decide whether groups A and B are big enough to care about.
- If we do care, check the prediction accuracy separately per group (a rough sketch follows this list).
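As a rough illustration of that per-group check, here’s a sketch in Python/pandas. Everything in it is an assumption for illustration – the column names (`blitz_rank`, `live_rank`, `predicted_win`, `won`), the 2-rank threshold, and the one-row-per-game layout – none of it reflects an actual OGS data export:

```python
import pandas as pd

# Assumed input (hypothetical, not an actual OGS export): one row per live
# game from one player's perspective, with columns
#   blitz_rank, live_rank  - the player's ranks (numeric, higher = stronger)
#   predicted_win          - win probability predicted from the overall rank
#   won                    - 1 if the player won the game, 0 otherwise

def classify(row, threshold=2):
    """Bucket a player by the gap between their blitz and live ranks."""
    gap = row["blitz_rank"] - row["live_rank"]
    if gap > threshold:
        return "type A"   # blitz rank significantly higher than live rank
    if gap < -threshold:
        return "type B"   # live rank significantly higher than blitz rank
    return "normal"

def per_group_report(games: pd.DataFrame) -> pd.DataFrame:
    """Compare predicted vs. actual live win rates within each group."""
    games = games.assign(group=games.apply(classify, axis=1))
    report = games.groupby("group").agg(
        n_games=("won", "size"),
        predicted=("predicted_win", "mean"),
        actual=("won", "mean"),
    )
    return report.assign(error=report["actual"] - report["predicted"])
```

The `n_games` column answers the second bullet (how big are groups A and B?), and the `error` column answers the third (is the prediction off within each group, even when the aggregate looks fine?).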
I respect any admin decisions regarding the effort and risk of changing things here versus the benefit, and fully understand if it’s just not a big enough problem to be worth addressing. But as a starting point, it would be nice to know whether we’re talking about five users in total or hundreds of people who think they need to create multiple accounts to keep their blitz and live ranks separate.