Why do bots have a different announced level vs. in-game level?

This appears to follow logically, but it leaves a corollary challenge: some indication of strength is still needed for practical purposes (i.e., choosing which model to play against), hence something like your "weak" vs. "strong" labeling idea.

Unless these particular bots are dynamic LLM or AI models, the constant changes in their reported strength remain an artifact of the ranking algorithm, not a genuine change in playing capability. Unlike humans, who learn, they are static over time. So "weak" will remain weak regardless of how many victories it achieves, and the same goes for "strong" despite a losing streak.
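
To make that concrete, here is a rough sketch of a bot whose real strength never changes while its displayed rating keeps moving. I'm assuming a plain Elo update here purely for illustration; the actual ranking algorithm on the server may well be different, and every name and number in it (`simulate_static_bot`, `true_strength`, the K-factor of 32, the opponent range) is made up for the example.

```python
import random


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_static_bot(true_strength=1500.0, start_rating=1500.0,
                        k=32.0, games=500, seed=0):
    """Track the displayed rating of a bot whose real strength is fixed.

    Hypothetical setup: opponents are drawn uniformly within 200 points
    of the bot's true strength, and each game is decided by the Elo win
    probability computed from the TRUE strength, which never changes.
    """
    rng = random.Random(seed)
    rating = start_rating
    history = []
    for _ in range(games):
        opponent = rng.uniform(true_strength - 200.0, true_strength + 200.0)
        # The result depends only on the fixed true strength.
        win_prob = expected_score(true_strength, opponent)
        score = 1.0 if rng.random() < win_prob else 0.0
        # The displayed rating is updated from the result anyway.
        rating += k * (score - expected_score(rating, opponent))
        history.append(rating)
    return history


history = simulate_static_bot()
print(f"lowest displayed rating:  {min(history):.0f}")
print(f"highest displayed rating: {max(history):.0f}")
```

The gap between the lowest and highest displayed rating comes entirely from win/loss streaks and the update rule, since `true_strength` is never modified inside the loop.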
