It’s because the bot isn’t really 23k; it’s a weird combination of much better and much worse. Depending on whether opponents know how to take advantage of the much worse moves, the bot either wins easily or loses regularly, so big rating swings in either direction are more likely than with a more evenly balanced human player.
Sorry to revive an old thread, but I’ve basically stopped playing against bots because of this issue, and I wonder if there are improvements that could be made. I understand the bot ranks are all over the place because the bots are very inconsistent, but none of that is presented on the initial “Play” screen: it just shows a bot with a single rank, and it’s reasonable to assume the bot will play at approximately that rank.
If a range was shown instead, it would make it clearer that the game you get might land anywhere between those values, and it would also make it clearer how inconsistent each bot is (so you could pick one with a narrower range, or a range that sits on the side of your rank that you’d prefer).
Starting a game with a bot that says 18k and then getting a 23k game is rather annoying. If it had instead said “18k - 25k”, maybe I’d have picked a different bot.
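For what it’s worth, a range like that could probably be derived straight from the rating deviation the server already tracks. A minimal sketch, assuming OGS-style Glicko-2 ratings and, if I remember it right, OGS’s published rating-to-rank curve (treat the constants as an assumption; `rank_label` and `displayed_range` are made-up helpers):

```python
import math

def rating_to_rank(rating):
    # OGS's documented Glicko-2 rating-to-rank curve (constants assumed).
    return math.log(rating / 850.0) * 31.25

def rank_label(rank):
    # Ranks below 30 are kyu ranks, 30 and above are dan ranks.
    return f"{30 - int(rank)}k" if rank < 30 else f"{int(rank) - 29}d"

def displayed_range(rating, deviation):
    # rating +/- 2 deviations covers roughly 95% of where the bot's
    # strength might show up in any given game.
    lo = rating_to_rank(rating - 2 * deviation)
    hi = rating_to_rank(rating + 2 * deviation)
    return f"{rank_label(lo)} - {rank_label(hi)}"

# A nominally ~18k bot with a big deviation:
print(displayed_range(1250, 150))  # -> "27k - 12k"
```

Even a crude version of this would at least warn you that the single number on the Play screen is the middle of a wide band.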
I’m currently messing around with the amybot model and exploration parameters, forcing it to play until boards are settled, and I have semi-accidentally produced quite big swings in strength doing so. So maybe that’s what they mean.
If it is, then I should hopefully be able to stabilise their ranks soon, unless I accidentally cause more breakage, of course.
Some of the weaker bots would be more consistent if they made sure not to blunder very basic life and death at the end of a game.
If you’ve written a bot that is intentionally weaker by choosing the second- or third-best move at random, or just generally picking a random move at certain points, then it can throw won games for no reason against almost any rank of player.
Let’s say it’s only throwing about 3 in 50 games completely at random, so p = 0.06 that it loses to anyone. It will still lose a lot of rating to 25 kyus, maybe 50-60 rating points a game, while only gaining about 6 rating points when it wins. I think that works out to a very slight rating gain on average.
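Spelling out that arithmetic with the rough numbers above (the exact point swings depend on the rating system):

```python
# Expected rating change per game, using the numbers from the post:
# the bot throws ~3 in 50 games at random, loses ~55 rating points
# when a 25 kyu beats it, and gains ~6 points when it wins.
p_throw = 3 / 50          # 0.06
loss_points = -55         # midpoint of the 50-60 range
win_points = +6

expected = p_throw * loss_points + (1 - p_throw) * win_points
print(expected)  # ~ +2.3 per game: a very slight gain on average
```

So the bot’s average rating can sit roughly still while individual games swing it wildly, which is exactly the volatility being described.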
Works for easy positions, but may be weird again for very complicated ones, a big fight for example, or life and death on a human group that’s very hard to spot. And the bot is like: “How can this stupid human make a -15 move, they’re 5k stronger than me. God, please fix them so they don’t ignore this ‘obvious’ life and death” xp
Have you looked at the examples I showed above? I’ve given screenshots.
I think even if the bot looked one move ahead and saw that it was about to lose 17 stones because they’re in atari, it could reasonably guess it was about to lose 30 points.
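A one-move-lookahead check like that doesn’t need any deep reading. Here’s a rough sketch, assuming a hypothetical Board API with `groups()` and `play()` helpers (none of this is amybot’s actual code):

```python
def atari_loss(board, color, points_per_stone=2):
    # A group with exactly one liberty is in atari and can be captured
    # on the opponent's next move. Losing n stones costs roughly 2*n
    # points (the stones themselves plus the territory they turn into),
    # so 17 stones lands in the ballpark of the ~30 points above.
    loss = 0
    for group in board.groups(color):         # hypothetical helper
        if len(group.liberties) == 1:
            loss += points_per_stone * len(group.stones)
    return loss

def veto_blunders(board, color, candidate_moves, threshold=10):
    # Drop candidate moves that leave a big group capturable in atari.
    safe = []
    for move in candidate_moves:
        after = board.play(move, color)        # hypothetical helper
        if atari_loss(after, color) < threshold:
            safe.append(move)
    return safe or candidate_moves             # never return nothing
```

It’s only a heuristic (it ignores snapbacks, ko, and whether the capture is actually profitable), but it would catch exactly the late-game atari blunders being discussed.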
Yea, but does a bot know what an easy position and a hard one is? I’m not too deep into the topic, but I’d assume that, for the bot, a situation that’s super complicated for humans can be as obvious as a simple atari.
I don’t even know anything about this specific bot, but when a bot can play well enough to reach like a high ddk rank, it must be doing something correctly.
At that stage, it shouldn’t randomly be ignoring ataris, especially in the late stages of games, unless you’re ok with the bot being very volatile. But the whole point is to question why the bots have large swings in ranking.
If you imagine there are two ways to program a weak bot:

1. Your bot just is that strength naturally: you coded up a bunch of rules, tree search, whatever, and it just plays at about that level, maybe stronger or weaker depending on how far it can look ahead.
2. You actually have a strong bot but you want to make it weaker, so you introduce conditions that make it lose points intentionally.
In case 1, you can maybe manually patch flaws to stop it losing to certain tricks.
If it’s case 2 and you’re wondering why your bot isn’t at a stable rank, maybe you need to adjust when, and by how much, it decides to intentionally play a bad move (see the sketch below).
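In case 2, the usual failure mode is that “play a deliberately worse move” is applied uniformly, with no cap on how much the move actually gives up. A minimal sketch of a score-aware version (the function, parameter names, and the candidates format are all made up for illustration):

```python
import random

def pick_weakened_move(candidates, blunder_rate=0.1, n_alt=2, max_loss=5.0):
    # candidates: (move, score) pairs sorted best-first, as a strong
    # engine would rank them.
    best_move, best_score = candidates[0]
    if random.random() < blunder_rate:
        # Deviate to one of the next n_alt moves, but only if it gives
        # up at most max_loss points; without this cap an "intentional"
        # bad move can occasionally throw a completely won game.
        pool = [(m, s) for m, s in candidates[1:1 + n_alt]
                if best_score - s <= max_loss]
        if pool:
            return random.choice(pool)[0]
    return best_move
```

Tightening `max_loss` in the endgame, where a single ignored atari is decisive, would address exactly the blunders described above.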
My feeling, based on playing amybot-ddk, is that it’s based on a stronger bot.
It plays quite well overall in the opening and middle game, and it’s fine if it can only read 5 or 6 moves ahead to capture stones; it’s supposed to be ddk.
But then it ignores random ataris in the endgame (even though it responds to ataris otherwise), as in the examples above.
The 19x19 amybot-ddk is currently playing just about as well as it can so all the blunders are in fact the neural network getting it wrong. I looked at what it was thinking for the first game you pointed out:
The engine does realize that F13 is the best move after a little bit of exploration of the MCTS tree, but it still also thinks E7 is a fine move, and it hasn’t explored black A3 as a response at all!
And in general, I’m super grateful to @shinuito for looking through games and pointing out these blunders! I could/should have done the same myself, but you really do highlight a major problem, and I think your wider explanation of it is also spot on.
Fixing this does require training a better network, though. I think I have better training data available now than when I first trained the amybot model, and the first test networks seem to be less prone to big blunders.
It’s interesting, I was curious how the bot could play so well in the opening and middle game and then make random blunders like that.
I suppose that can happen with neural nets alright if they have some kind of blind spot.
I wonder if it’s similar to, but much simpler than, the issue where KataGo was struggling with the liberties of groups that form circle/ring shapes, like when the liberties are too far away from where the move is being played.
I wonder whether making the liberties of groups explicit, so they factor directly into the net’s/AI’s decision, would be an improvement, like an extra input feature for the algorithm.
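Concretely, that could look like one extra input plane fed to the net alongside the usual stone-position planes. A minimal sketch of the idea (just an illustration, not how amybot or KataGo actually encode their inputs):

```python
import numpy as np
from collections import deque

def liberty_plane(board, clip=8):
    # board: 2D array, 0 = empty, 1 = black, 2 = white.
    # Returns a plane where each stone's cell holds its group's liberty
    # count (clipped), which the net could consume directly instead of
    # having to infer liberties from stone positions alone.
    h, w = board.shape
    plane = np.zeros((h, w), dtype=np.float32)
    seen = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            if board[y, x] == 0 or seen[y, x]:
                continue
            color = board[y, x]
            group, liberties = [], set()
            queue = deque([(y, x)])
            seen[y, x] = True
            while queue:                      # flood-fill the group
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        if board[ny, nx] == 0:
                            liberties.add((ny, nx))
                        elif board[ny, nx] == color and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            for gy, gx in group:
                plane[gy, gx] = min(len(liberties), clip)
    return plane
```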
But sorry, I don’t mean to sound negative about the bot, I just wanted to point out one factor I think contributes a lot to rank instability.
The amybots are very cool overall
The same would be true for Agapanthus, Bergamot, etc., where they might randomly play a worse move mixed in with otherwise KataGo-like play. They’ll be playing well and then just break, play dumpling-looking moves, and lose the game.
Technically this kind of move is bad, and it might throw the game, but not when the opponent goes on to play moves that let stones be captured, and the bot definitely knows how to capture.
So this might be a different kind of volatility: not from a net failing to predict the right move, but from initially picking a bad move, and then your opponents (it’s meant to be like a 25-kyu bot) not being able to capitalise on it, so you go back to destroying them.
There are games where it will properly throw the game with bad moves.