Testing the Volatility: Summary

I fully agree with everything you say above the horizontal bar.

I don’t believe volatility is inherently good or bad; it depends on what you want your ranking system to achieve. Maybe that is what is creating this impression:

(in general it seems you have found that users want a stable rank, developers lean toward a rank that frequently updates to describe new data)

One of the main points I’m trying to get across is as follows:

Based on your past posts, my understanding is that you want to test whether a ranking system with lower volatility can produce better predictions about game results. That is a great goal, and probably achievable within reason.

However, past posts here and in other threads have led me to believe that you will judge such a model by how well it describes outcomes on some fixed set of OGS games. For the reasons I have detailed before (the elusiveness of a stable “true” rank, the need to keep updating as new players join and new matches are played, and the bias-variance tradeoff), one should be careful about judging a model/ranking system by how well it fits existing data: push that fit too far and you will likely end up increasing volatility, which I believe is the opposite of what you want.
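As a toy illustration of this tradeoff (this is not OGS’s actual rating system; the alternating strength schedule, the K values, and the fixed 1500-rated opponent are all made-up assumptions), here is a minimal Elo simulation. A larger K-factor describes the observed results better (lower log loss) precisely because it chases each new outcome harder, and that same chasing makes the rating itself more volatile:

```python
import math
import random

def simulate(k, n_games=2000, seed=0):
    """Hypothetical Elo run: one player whose true strength alternates
    between 1300 and 1700 every 250 games, always facing a 1500-rated
    opponent.  Returns (mean log loss, rating volatility)."""
    rng = random.Random(seed)
    r = 1500.0                 # the model's rating estimate
    ratings, log_loss = [], 0.0
    for g in range(n_games):
        true_rating = 1700 if (g // 250) % 2 else 1300
        e = 1 / (1 + 10 ** ((1500 - r) / 400))            # model's win prob
        p = 1 / (1 + 10 ** ((1500 - true_rating) / 400))  # actual win prob
        s = 1.0 if rng.random() < p else 0.0              # game outcome
        # in-sample fit: how surprised the model was by this result
        log_loss += -(s * math.log(e) + (1 - s) * math.log(1 - e))
        r += k * (s - e)       # Elo update: bigger k chases each result harder
        ratings.append(r)
    mean = sum(ratings) / len(ratings)
    vol = (sum((x - mean) ** 2 for x in ratings) / len(ratings)) ** 0.5
    return log_loss / n_games, vol

low_ll, low_vol = simulate(k=1)     # sluggish: lags the true strength
high_ll, high_vol = simulate(k=32)  # responsive: tracks it, but swings
```

On this contrived schedule the responsive model wins on fit and loses on stability, which is the balance I keep pointing at: the knob that improves how well you describe the data is the same knob that raises volatility.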

Edit: I found the root cause of this disagreement. It is the same one behind the comment you did not like:

This is not a slippery slope or overly extreme hypothetical, it is simply how models like this work. There is of course a balance (and some trivial counterexamples that don’t apply to most real-world scenarios), but in general, you cannot have a model that is 100% accurate without it being more volatile than a model which is less accurate but also less volatile.
