What are the factors that affect the wide ranking fluctuation on amybot-ddk?

Hi, I’m new to OGS and to Go in general. I have been playing a lot of games against amybot-ddk on 9x9 and have observed a fairly broad fluctuation in the bot’s ranking, even within a single day.

I’m curious to understand this a little better (cc: @siimphh in case you can chime in):

  • Does this ranking fluctuation really mean the bot strength changes during the day? Is that bounded by the compute resources available to the bot at a given time?
  • Is this due to new players that are not well calibrated to their actual rank affecting the bot rank?

I would also be interested in learning about this particular bot’s implementation (I have some background in ML). I looked for a GitHub repo or some documentation online but haven’t found anything yet.

Thank you!
-Javier

4 Likes

Your second bullet is probably a factor. Also, some sandbaggers probably lose to it deliberately to lower their rank. Similarly, some airbaggers may use it to get easy wins.

1 Like

I think the fluctuations are expected for players that play lots of games. Just today, amybot-ddk has played > 1000 games. When I look at my opponents’ ratings graph (bots and humans), I notice that players that have > 5000 rated games also have lots of fluctuations in their ratings graph.

As for the range, I’m guessing it’s because amybot isn’t too picky about its opponents. In its active games page, I can see games against 20kyus up to 2dan.

You might also be interested in taking a look at Budgie 9x9.

2 Likes

The ranking system has trouble placing a player correctly if they play mostly against opponents with a huge rating difference. These games increase the volatility of the rating, and this in turn increases the effect of each game result on the rating.
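
To illustrate the last point, here is a rough sketch of a single-game update in the original Glicko system (Glicko-1). It is only an approximation of the idea: I am assuming a Glicko-style rating, and Glicko-1 has no separate volatility term, so the rating deviation `rd` stands in for "how uncertain the system is about this player". The function names and the example numbers are made up for illustration and are not taken from the OGS code.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Dampening factor: results against uncertain opponents count for less."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected score (0..1) of a player rated r against the opponent."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko1_update(r, rd, r_opp, rd_opp, score):
    """Glicko-1 update for one game. score is 1 for a win, 0 for a loss."""
    e = expected_score(r, r_opp, rd_opp)
    d2 = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd**2 + 1 / d2
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd


# Same loss against the same opponent, with low vs. high uncertainty.
# All numbers here are illustrative only.
settled_r, _ = glicko1_update(1500, 60, 1500, 50, score=0)
unsettled_r, _ = glicko1_update(1500, 120, 1500, 50, score=0)
print(f"rd=60:  rating change {settled_r - 1500:+.1f}")
print(f"rd=120: rating change {unsettled_r - 1500:+.1f}")
```

The same loss moves the rating further when the deviation is larger, which is the "each game result has more effect" part of the explanation above.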

4 Likes

Typically amybot should play at ~the same strength from game to game, but I do have a regular process that checks the ratio of recent wins/losses against players in the right rank range and slightly adjusts search parameters up/down if that ratio is off target.

So I would +1 the other interpretations up the thread.
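
For anyone curious, the kind of periodic check described above could be sketched roughly like this. This is not amybot's actual code: `GameResult`, `adjust_playouts`, the playout count as the search parameter, the target win-rate band, and the 10% step are all hypothetical choices made just to show the idea.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    opponent_rank: float  # hypothetical numeric rank scale
    bot_won: bool


def adjust_playouts(playouts, recent, rank_lo, rank_hi,
                    target_lo=0.45, target_hi=0.55,
                    step=0.1, min_games=20):
    """Nudge a search budget up or down based on recent results.

    Only games against opponents inside [rank_lo, rank_hi] are counted.
    All thresholds are illustrative assumptions, not real settings.
    """
    in_range = [g for g in recent if rank_lo <= g.opponent_rank <= rank_hi]
    if len(in_range) < min_games:
        return playouts  # too little data, leave the bot alone

    win_rate = sum(g.bot_won for g in in_range) / len(in_range)
    if win_rate > target_hi:
        # Winning too often: search a little less.
        return max(1, round(playouts * (1 - step)))
    if win_rate < target_lo:
        # Losing too often: search a little more.
        return round(playouts * (1 + step))
    return playouts


# Example: the bot won 24 of 30 recent in-range games, so it gets weakened.
recent = [GameResult(12.0, i < 24) for i in range(30)]
print(adjust_playouts(100, recent, rank_lo=10, rank_hi=15))
```

Because the adjustment is small and only happens between checks, it would not explain swings within a single day, which is consistent with the other explanations in the thread.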

2 Likes