Great find. Obviously, I’ve been oblivious to that.
In which case, during the process of this change (or simply as an effect of it) the ranking formula is clearly broken. @anoek
I’ve seen some suggestions in other threads that OGS still seems to be miscounting handicap by ½ stone (a “1-stone handicap” should be treated as a ½-stone handicap by all parts of the system that actually matter, since it amounts only to removing komi). Would that be a large enough effect to cause this?
Doesn’t seem like it: here we see a person leaping up through Dan ranks by defeating a 25k playing at an effective rank of 16k (9 handicap) … a Dan should still only get a minute increment from such a victory.
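To put a rough number on “minute increment”, here is a minimal sketch using the plain Elo expected-score formula, not the Glicko variant OGS actually runs. The ratings and the K-factor are made-up illustration values, not anything taken from OGS.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def gain_for_win(rating_a: float, rating_b: float, k: float = 32.0) -> float:
    """Points A gains for beating B under plain Elo (k is an assumed K-factor)."""
    return k * (1.0 - expected_score(rating_a, rating_b))


if __name__ == "__main__":
    # Hypothetical ratings: the stronger player is still 800 points ahead
    # of the opponent's handicap-adjusted rating.
    print(gain_for_win(2000, 1200))
    # Even game for comparison: the winner takes half of k.
    print(gain_for_win(1500, 1500))
```

The bigger the remaining gap after handicap, the closer the expected score is to 1 and the smaller the gain, which is why a large jump from such a win looks suspicious.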
When we say “remove ±9 rank restrictions”, we don’t mean “allow greater than ±9 handicap”, right? Because I trust the rating system to handle the former, but not the latter.
Well, on 9x9, isn’t our current estimate that handicap stones are 6x stronger? So 16k to 9d isn’t 24 ungiven handicap stones apart, it’s 4.
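The arithmetic there can be spelled out as a small sketch. The 6x factor is just the estimate quoted in this thread, and the function names are hypothetical, not OGS code.

```python
# Assumed: one handicap stone on 9x9 is worth about 6 ranks (estimate from this thread).
RANKS_PER_STONE_9X9 = 6


def rank_to_steps(rank: str) -> int:
    """Map '25k'..'1k' to -25..-1 and '1d'..'9d' to 0..8, so 1k and 1d are one step apart."""
    number, kind = int(rank[:-1]), rank[-1].lower()
    if kind == "k":
        return -number
    if kind == "d":
        return number - 1
    raise ValueError(f"unrecognised rank: {rank!r}")


def rank_gap(weaker: str, stronger: str) -> int:
    """Number of rank steps between two players."""
    return rank_to_steps(stronger) - rank_to_steps(weaker)


def equivalent_9x9_stones(weaker: str, stronger: str) -> float:
    """Rank gap expressed in 9x9 handicap stones under the assumed factor."""
    return rank_gap(weaker, stronger) / RANKS_PER_STONE_9X9


if __name__ == "__main__":
    print(rank_gap("16k", "9d"))
    print(equivalent_9x9_stones("16k", "9d"))
```

With these assumptions the 16k to 9d gap is 24 rank steps, or 4 stones on 9x9.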
I’m pretty sure we mean the former.
Hah, well I’m totally out of my depth now, especially if that is true.
Maybe it is actually OK! Maybe if a 3d defeats a 25k with 9 handi on a 9x9 board, that really is worth a dan of rank!
This doesn’t match my intuition, but I’m totally aware my intuition is tuned to 19x19.
If it’s true, then there is nothing broken. Just some people abusing bots that need their games annulled!
I don’t know, that sounds like a massive problem, especially for the rating system combining ranks on 9x9 and 19x19.
It sounds quite a large extreme, on what would otherwise be maybe an ok approximation.
Wouldn’t it only be a problem if it were even a possible outcome?
My statement was based on the assumption that “it is so hard to win at 9x9 with 9 handicap that in effect this huge theoretical rank boost would never happen”.
(Or if it did, then the winner would have demonstrated their prowess suitably).
I dunno - it’s probably beyond speculation now: the maths needs to be debugged.
my favourite is looking at royalleela’s losses.. idk at what level a bot is considered superhuman in strength, so i guess when it loses it can be argued to lose to a strong human.. (example; 친선 대국)
As for ranks being broken or not: I think it’s very difficult to tell whether ranked bots have a negative or positive impact on people’s ranks, since ranked bots were introduced on the same day the ranking system was fundamentally changed from Elo to Glicko. And since the vast majority of OGS games are not live 19x19 games, I have no idea what any given OGS rank is worth without inspecting the user’s history in detail (and also branching out to the opponents they played).
That’s the problem. It’s absolutely not that difficult when the bot is so dumb
Yes - but abusing dumb bots is a different problem with an easy solution: report it and it gets fixed.
The remaining discussion is around whether outside of that case, there is any problem.
It gets fixed case by case, not globally. The trick is still available, and you don’t need to be very strong to work out a strategy for using it.
Maybe it won’t happen for normal matchups between players that know how to play Go.
But is that because of the players’ skill at Go, or is it purely a limitation of the board size? Playing with 9 stones on a 9x9 board might be more like playing with 25+ stones on a 19x19 board. Not to mention that the Chinese rules the bots play under don’t have a standardised placement for those 9 stones, so you can’t really guarantee consistent handicap stone placement.
Combining 9x9, 13x13 and 19x19 ranks into one makes sense if, regardless of what board you’re playing on, your general skill at Go correlates well enough to be useful to predict wins on other board sizes.
I think reading, life and death, endgame, tesuji and so on don’t care too much (except for some special cases) what board size they’re on. So training these and playing on a 9x9 or on a 19x19 should in theory still be a way to improve overall.
We’re in a situation though where
did the winner really demonstrate anything? Or was the loser (the 25 kyu bot) actually the one that really demonstrated something about their ability or knowledge?
I don’t think the skill of winning against 9 stones on a 9x9 board says anything about the strong player, but says something about what the weaker player still needs to learn about Go.
Did you look at the games?
I feel like “hmm… whatever… you make a persuasive argument, and yet … still really we need the maths investigated in detail: intuition and speculation can’t take us further”
It’s earlier in the thread. They were demonstrated to be bot-abuse … IIUC!
Bot abuse: how good is a 25k bot supposed to be?
I don’t want to encourage any bot abuse. But it seems quite debatable when a relatively weak player manages to engineer wins while giving 9 stones on 9x9 against bots that are too dumb. Of course, he might well have some doubts on finding himself ranked as a very high dan player soon after.
Yes I agree.
I don’t even mean to make a persuasive argument just for the sake of it.
What I mean is suppose something like
Let’s suppose we’ve done some extrapolation of what handicap should look like on 9x9 (it did get revamped at one point): should it have a cutoff on handicap stones similar to the one 19x19 has?
We have a cutoff of 9 stones on 19x19, partly because that’s where the standard placements end.
Ranked games should probably be cut off after about 5 handicap stones on 9x9. I don’t mean whatever the number 5 translates to (maybe it’s 2 stones and 4.5 komi), I mean 5 physical stones on the board.
There’s a good chance that a player who knows how to play Go can beat strong enough bots with 5 handicap stones on the board on 9x9: just connect your stones together and there’s not enough room for White to live.
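A cutoff like that could be sketched as a simple lookup. The 9-stone limit on 19x19 is the existing one mentioned above, the 5-stone limit on 9x9 is only the proposal from this post, and the names are hypothetical rather than anything from the OGS codebase.

```python
# Maximum physical handicap stones for a game to count as ranked.
# 19: existing cutoff mentioned in this thread; 9: the proposal above.
# Board sizes with no entry (e.g. 13) are left uncapped in this sketch.
MAX_RANKED_HANDICAP = {19: 9, 9: 5}


def is_ranked_eligible(board_size: int, handicap_stones: int) -> bool:
    """Return True if the handicap is within the assumed ranked cutoff."""
    if handicap_stones < 0:
        raise ValueError("handicap_stones must be non-negative")
    limit = MAX_RANKED_HANDICAP.get(board_size)
    if limit is None:
        return True  # no cutoff proposed for this size
    return handicap_stones <= limit


if __name__ == "__main__":
    print(is_ranked_eligible(9, 9))   # the games discussed in this thread
    print(is_ranked_eligible(9, 5))
    print(is_ranked_eligible(19, 9))
```

Under this rule the 9-stone 9x9 games in question would simply be played unranked.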
With more than 9 stones on 19x19, you could argue that wins increasingly reflect the weaker player’s general lack of knowledge rather than any skill demonstrated by the stronger player. If you never learned to close the borders of your territory, is that going to tell us the difference between a 1d and a 2d beating someone after giving 40 stones handicap? Probably not.
If you had 25 or 30 handicap stones on 19x19 and lost, it could’ve been because you didn’t count the board and the strong player did. Should that be worth some sizeable amount of rating points?
Common sense, such as no more than 9 stones on 19x19, should be enough to set a limit. We don’t have to explore deeper. There are other conditions a game must meet to count for the rating system, like standard size, komi… So a simple first step is to follow common practice here too.
For 9x9 and 13x13, I dunno if there is any common practice.
But is it bot abuse if the bot just doesn’t close its borders?
At what point does it transition from playing the bot normally to abuse?
The fact that it’s ranked at all is probably the real issue, and the real thing that’s transitioning it from ordinary playing to seeming like abuse.
Example game:
Normal-ish game
Except the bot doesn’t close its borders and fills in its territory or adds some dead stones.
But this is what I’m talking about. The games “exploited” were typically 9x9 games with 9 stones from what I can see.
I think something like 5 physical stones is roughly where people will not want to give more handicap stones on 9x9, except against complete beginners.
And if we look at what they are exploiting: it wasn’t a ladder or anything like that, the bot is just bad. But playing a bad bot isn’t an issue, is it? The real issue seems to be that they’re playing ranked games with the bot at high handicap.
So it is worth looking at whether we need more sensible cutoffs on, say, 9x9 or other boards, and whether there’s some bug in the code, rather than just chalking it up to exploiting the bot.