Ranking system is BROKEN

The rating calculator only considers 19x19 games (or even games on other board sizes). When White gives 9 stones in a 9x9 game, this is considered as a much higher handicap by the rating system.

Is there a way to switch board sizes in the rating calculator or is that a topic for another day?

I don’t think it’s currently possible.

Note: the rating system considers that 1 stone is worth 6 handicaps on 9x9 and 3 handicaps on 13x13. So giving 9 stones on 9x9 to a 25k would be like playing an even game against a 30d (rating ~6900).

Doesn’t it go the other way? If you give 9 stones to someone, it makes them stronger, so it’s like an even game vs 18k … except that Samraku asserted that one stone is not one rank in 9x9 :woman_shrugging:

It’s connected though. One “exploit” victory wouldn’t do much in any system. This thread is about a player who did the same thing over and over hundreds of times. How much rating should hundreds of consecutive victories over the same opponent deliver to anyone?

I’d say it should cap at maybe 3x or 4x the normal amount, not just keep giving them the same boost over and over and over and over.
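To make the cap idea concrete, here is a rough sketch. It is not how OGS or any Glicko-style system actually computes rating changes: the names `capped_total_gain`, `per_win_gain` and `cap_multiple` are made up for illustration, and it pretends every win over the same opponent is worth the same fixed amount.

```python
def capped_total_gain(per_win_gain: float, wins: int, cap_multiple: float = 3.0) -> float:
    """Total rating credited for repeated wins over one opponent.

    Hypothetical model: each win is worth `per_win_gain`, but the total
    credited against the same opponent never exceeds `cap_multiple` wins' worth.
    """
    uncapped = wins * per_win_gain
    cap = cap_multiple * per_win_gain
    return min(uncapped, cap)


# Two wins are credited in full; a hundred wins are credited as three.
few = capped_total_gain(per_win_gain=2.0, wins=2)
many = capped_total_gain(per_win_gain=2.0, wins=100)
```

With a cap like this, the hundredth win over the same player adds nothing beyond what the third already gave.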

I can see what you mean, but I think that it’s not “the way it works”.

“The way it works” is that any victory gives a teeny amount of rating. I guess this could be rounded down to zero - that would fix the “hundreds of consecutive”.

I still want to exclude the question of exploits. A player can legitimately play hundreds of these games, as the “teacher” scenario shows.

So let’s not get distracted by exploits. The ranking system needs to work for normal cases - of which the “teacher scenario” is one.

We’re still ducking and weaving around the actual blatant problem, which is not one of “hundreds of consecutive teeny wins accumulating”, it is “a person rocketing up this way”.

I agree. And I would even go further: I don’t see any reason at all for having ranked games against bots. The only reason I can think of is to get rid of that “?-rank”, if you are a new user on OGS. But other than that, all games against bots should be unranked.

Effectively 0 on a 9x9 board because it shouldn’t be ranked.

There’s no standard placement for the stones is there?

Free placement handicap depends a huge amount on whether you place those stones well or not.

I think even 5 stones on a 9x9 should be a win for a 25kyu ish player. Maybe you need to be 24 or 23kyu player, I’m not exactly sure.

Katago-micro just gives up on 6 stones.

18kyu beats 9d bot on 4 stones

Like you can give up one of the four stones and still win.

25kyu beats 9d bot on 5 stones twice

(I think Katago unlike a human, might not be trying to trick the 25kyu, you could definitely try a bit harder to invade and live on both sides, but I think it becomes very difficult anyway once whatever first stone you place is more or less immediately capturable. Maybe you can see it as a player skill, but it feels like exploiting the 25kyu which I’m not sure is worth ranking in the way we discussed, I’m not sure it differentiates a 1d from a 3 or 4d)

Yeah, I would think so. In which case the screenshot in the OP shows a bug, eh.

But you’re confusing me here, because you go on to make an argument that sounds like “actually, 25ks can beat dans with a handful of stones on 9x9”.

I think the argument is bad though, and out of place in this thread, because once again it brings in bots. (hair getting torn out at this point :woman_facepalming: )

IF a 25k can beat a human dan in the way that you illustrated, then the ranking outcomes we’re seeing might be valid eh? This would contradict your first statement.

BUT the fact that bots can’t cope with this weird case of go setup really means NOTHING in this context. Bots just are not designed for this, and how they respond to “unexpected inputs” is not really helpful to explore while we’re discussing what the case should be with humans.

Why should it be ranked though?

If it should be, why don’t we rank 30 or 40 stone handicaps on 19x19?

Beginners should be able to beat dans with for example

I recall being told by a friend that a Chinese Go teacher might use this handicap with beginners until they win.

You can see more extreme handicaps here.

There’s a point where you’re just tricking a beginner, and not really demonstrating skill in winning the same way vs a 20kyu, 10kyu etc.

It’s very much an extreme that should probably lie outside the ranking system.

For other extremes you could ask why we don’t rank 5x5 games and 7x7 games where a beginner might be able to beat a dan player?

I don’t know why it should be ranked. It makes no sense to me.

I didn’t even know it was possible - this thread uncovered the new information (for me) that we comparatively recently made it possible.

That’s why I said

But there are counter-examples too, 3k beating 21k with 5 stones handicap:

I’m not sure if katago knows how to best take advantage of kyu players’ weaknesses necessarily.

I think this scaling factor for small-board handicaps is too much, particularly when in the DDKs. As a 4d I consider myself having decent chances against a total beginner on 9h on 9x9. The Cambridge 13x13 tournament uses a scaling factor of 2.5. I have won games giving 13 handicaps on 13x13 to kids who are 30kyu which according to scaling factor 3 is 39 stones difference so I would need to be 10d to have a 50/50 chance which is evidently not the case. 13*2.5 = 32.5 which is indeed about our rank gap.
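A quick sanity check of that arithmetic, as a sketch. The helper names are made up for illustration, and it assumes ranks are evenly spaced with 1k sitting directly below 1d.

```python
def rank_to_index(rank: str) -> int:
    """Map a rank string to an integer scale: '30k' -> -29, '1k' -> 0, '1d' -> 1."""
    number, kind = int(rank[:-1]), rank[-1]
    return number if kind == "d" else 1 - number


def rank_gap(stronger: str, weaker: str) -> int:
    """Number of ranks separating two players."""
    return rank_to_index(stronger) - rank_to_index(weaker)


actual_gap = rank_gap("4d", "30k")  # 33 ranks between a 4d and a 30k
stones = 13                         # handicap given on 13x13

implied_by_factor_3 = stones * 3      # 39 ranks
implied_by_factor_2_5 = stones * 2.5  # 32.5 ranks

print(actual_gap, implied_by_factor_3, implied_by_factor_2_5)  # 33 39 32.5
```

So with a factor of 3 the handicap overshoots the real gap by 6 ranks, while 2.5 lands within half a rank of it.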

Sure and I’ve added

There are only so many things that are worth ranking.

We also have the idea that 1 stone should be approximately equal to 1 rank difference that comes from 19x19.

The point is that that breaks down at some point. It’ll stop being transitive with high enough handicap, in the sense that if I can beat person X with 20 stones, and person Y is 2 ranks higher, then I should be able to beat them with 22 stones. At some point you’ll beat everybody as a 25 kyu with enough stones on a given board size - there has to be a cutoff, and I think ordinary humans won’t really want to play with more than 5 stones on 9x9, unless you’re genuinely teaching someone how to play.

But if we’re ranking how well you can beat beginners with high handicap specifically what are we aiming for? Are we ranking learning the rules?

It would best be set as a separate ranking system, that you might use in an IRL go club. It might work perfectly fine there to have a dan player beat various levels of beginners, and maybe it works well to differentiate those beginners. I think gennan has some experience with that

It’s not impossible, but again I think there are cutoffs.

My feeling says 5 on a 9x9 is a reasonable cutoff, but maybe like gennan you could push it to 7/8 stones on 9x9 and still have it roughly measure some progress (if that’s the aim).

1 stone on 9x9 is considered by OGS to be equal to 6 ranks.
So 9 stones on 9x9 is considered by OGS to be equal to 54 ranks.
54 ranks stronger than 25 kyu is 30 dan.
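For anyone who wants to check the numbers, here is a small sketch of that calculation. The function names are made up, the 9x9 and 13x13 factors are the ones quoted in this thread, the 19x19 factor of 1 is the usual "one stone is about one rank" assumption, and none of this is taken from the actual OGS code.

```python
# Ranks per handicap stone by board size (values quoted in this thread;
# the 19x19 entry is an assumption).
RANKS_PER_STONE = {9: 6, 13: 3, 19: 1}


def rank_to_index(rank: str) -> int:
    """Map a rank string to an integer scale: '25k' -> -24, '1k' -> 0, '1d' -> 1."""
    number, kind = int(rank[:-1]), rank[-1]
    return number if kind == "d" else 1 - number


def index_to_rank(index: int) -> str:
    """Inverse of rank_to_index."""
    return f"{index}d" if index >= 1 else f"{1 - index}k"


def implied_rank(receiver: str, stones: int, board_size: int) -> str:
    """Rank the system treats the handicap receiver as holding in an even game."""
    boost = stones * RANKS_PER_STONE[board_size]
    return index_to_rank(rank_to_index(receiver) + boost)


print(implied_rank("25k", 9, 9))   # 30d
print(implied_rank("25k", 9, 19))  # 16k
```

The same 9 stones move a 25k up 54 ranks on 9x9 but only 9 ranks on 19x19, which is the whole gap this thread is arguing about.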

Oh, misread on my part, I read 30k :squinting_face_with_tongue: :woman_facepalming:

Well, that explains the massive bump in rank then. The low-dan person achieved a victory that should only have been achieved by a 30dan. No wonder their rank shot up.

The only problem: we don’t believe it.

It beggars belief to think that the 3 dan achieving this did it through 30dan skill.

Rather, we understand (intuitively) that the result was entirely due to the lack of skill of the 25k in question, and that this configuration (9 stones on 9x9) simply isn’t good enough to discriminate anything about player skill.

We used to have this shared understanding embedded in a +/- 9 handicap range (for 19x19 at least). We lost that along the way. Hence the nonsensical result we see in the top post.

But then again … is it perhaps the case that a true 25k simply would not lose with 9 stones? :face_with_monocle:

A “true” 25k knows the rules, probably doesn’t play self-atari etc.

The problem we face is that our 25ks are not actually true 25ks an awful lot of the time.

That’s because they are typically beginners who haven’t even learned the rules, and/or whose rank has not stabilised.

It might turn out to be the case that their rank uncertainty simply isn’t high enough (if it were high enough, the dan would not get any rank from the game). :face_with_monocle:

I agree this is another relevant point (assuming, say, that the system just treats all players below 25kyu as 25kyu).

I believe the reason we don’t go below 25kyu officially is because handicap stops working below a certain point.

As in it stops being a relevant factor and hence a useful predictor of who is going to or should win a game, far below 25kyu.

That’s at least what I recall reading:

Now since then we have the more fine grained handicap on 9x9.

One could imagine that if ranks beyond 25kyu were introduced, one could just make an exception like two 25kyu+ players don’t give or receive handicap on 19x19, but still follow the new 9x9 handicap for those games.

It’s not a very clean system, but it could work better if aligning handicap to rank is the main impediment.


I linked the discussion thread earlier but this was the announcement thread where allowing players to play without rank restriction was announced, for completeness and because searching can be a pain on discourse:

I don’t know of any universal criteria to determine what a “true” 25k is, but in my youth club they are not novices anymore. I give 4-5 handicap on 9x9 to 25k in my club. And those players give 3 handicap on 9x9 to novices (who I rank at 40k).

I think it’s possible, but only because 25k (can be) so bad. Kind of like how someone can lose a tic-tac-toe game even though most people know how to force the tie.

Winning at tic-tac-toe says only that I have a rough sense of the game and my opponent is very weak. I believe this is the case with the 9x9 9h as well, so servers would do well to exclude this type of game from the rank system.


Edit: I read on to see discussion of “true” 25k vs OGS 25k. When I wrote this comment, I was thinking of the latter.
