2021 Rating and rank adjustments

Luckily, tsumego has a tradition of mismarking ranks on its puzzles, so it’ll be business as usual :stuck_out_tongue:

11 Likes

Awesome! Always kind of wondered what was up with the volatility, especially at lower ranks.

As a fellow dev, this explains a lot… ^^ Great job finally stamping this issue out! Excellent job @flovo & @anoek ! :slight_smile:

12 Likes

The OGS 25k lowest rank is not a rating floor. Ratings can go lower, but they are just displayed as 25k. The same goes for the highest rank of OGS 9d. It is not a rating ceiling.
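To make the difference between a display limit and a rating floor concrete, here is a minimal sketch. The continuous rank scale in it is hypothetical (values below 30 are kyu, 30 and above are dan) and is only an assumption for illustration, not the actual OGS implementation.

```python
import math

# Hypothetical continuous rank scale (an assumption for illustration):
# r < 30 means (30 - floor(r)) kyu, r >= 30 means (floor(r) - 29) dan.
LOWEST_DISPLAYED = 5.0    # corresponds to 25k on this scale
HIGHEST_DISPLAYED = 38.0  # corresponds to 9d on this scale


def display_rank(r: float) -> str:
    """Clamp only the displayed rank; the stored value r is left untouched."""
    clamped = min(max(r, LOWEST_DISPLAYED), HIGHEST_DISPLAYED)
    n = math.floor(clamped)
    if n >= 30:
        return f"{n - 29}d"
    return f"{30 - n}k"


# A player far below 25k still shows as 25k, but the underlying number is kept.
true_rank = -3.0
shown = display_rank(true_rank)
```

The stored value keeps moving with results; only the label is capped at both ends.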

AFAIK the OGS rating system has no clearly defined rating anchor. The thing that comes closest to a rating anchor is the initial OGS 13k / 1150 rating assigned to new members. I don’t know where that number comes from exactly, but for OGS it seems to work fairly well as an anchor.

The EGF rating system has a different rating anchor: 7d EGF signifies a boundary between amateurs and professionals.
This boundary is a bit fuzzy, because some amateurs are stronger than this level and some pros are weaker than this level. But overall, it seems to hold up fairly well.

Going by the traditional pro handicaps used in the (now abolished) Japanese Oteai pro ranking competition, the gap between pro ranks is assumed to be 1/3 of an amateur rank. So the gap between 9p and 1p/7d EGF is assumed to be about 2.5 amateur ranks, corresponding to 3 stones handicap, which was the traditional handicap in the Edo period for games between a Meijin (~World Champion) and a 1p/7d EGF.

This means that a World Champion would have a level of about 9.5d EGF. Some rare go geniuses may even be a bit stronger than that. Go geniuses like Honinbo Dosaku, Honinbo Jowa, Go Seigen, Lee Changho and perhaps Shin Jinseo may have peaked as high as 10d EGF.

So you could say that the EGF rating system is more or less anchored to world champion level being about 9.5-10d EGF.

At the bottom, the EGF rating system has a rating floor. It used to be 20k, but it will soon be lowered to 30k EGF to accommodate weaker players (especially children) participating in beginners/children tournaments, which are becoming more common.

In my experience, 30k is a ballpark estimate of the level of an average adult novice who has just finished a beginners course in a club. They know the rules, they know about life and death, ladder, net, snap-back and seki, and they rarely need help finishing and scoring their games. There will be a large individual variation of course, but I estimate that 90% of adult novices will have a level between 35k and 25k EGF. In my club, I (3d EGF) give 30k players 6 stones handicap on 9x9.

When these adult novices continue playing and get some tuition, I think their level will usually go up to 20k-15k EGF after playing some 100 games. I give such players 4-3 stones handicap on 9x9.

1 Like

So you make some fair points in the rest of this post, and I don’t feel like I can contribute much more to the discussion of the main topic.

But the secret to the 13k / 1150 “anchor” is that it’s not the real anchor behind the scenes. That’s 1500 (somewhere mid-6k by the current conversion). The 1150 comes from a compromise reached when Glicko-2 was first being implemented: people rated at 1500 would often complain about all the games against beginners, as we could no longer choose ranks to begin with. So for provisionally rated players, the system displays and matchmakes as if your rating were your_rating - your_rating_deviation, while otherwise doing the math as normal for rating updates. Someone with no games would, behind the scenes, have a rating of 1500 and an RD of 350, which makes the displayed rating 1500 - 350 = 1150. This compromise was called “humble rank”, and the complaints seem to have disappeared since its implementation.
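A minimal sketch of the “humble rank” idea as described above. The class, the function name and the `provisional` flag are hypothetical; how OGS actually decides that a player is provisional is not stated here, so the flag is simply passed in.

```python
from dataclasses import dataclass


@dataclass
class PlayerRating:
    # Glicko-2 style defaults for a player with no games, as quoted in the post.
    rating: float = 1500.0
    deviation: float = 350.0
    provisional: bool = True  # assumption: set by some rule not covered here


def displayed_rating(p: PlayerRating) -> float:
    """Rating used for display and matchmaking.

    Provisional players are shown at rating - deviation ("humble rank").
    Rating updates would still use p.rating itself.
    """
    if p.provisional:
        return p.rating - p.deviation
    return p.rating


newcomer = PlayerRating()
established = PlayerRating(rating=1620.0, deviation=70.0, provisional=False)
```

Here `displayed_rating(newcomer)` gives 1150.0, while the stored rating stays at 1500 for the update math.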

9 Likes

I am curious as to the reason for the floor and the ceiling. Why bother?

Why not let the rating system do its job? My preschooler may soon be playing 9x9 games online. If there is no floor (as I suggest), I see these advantages:

  1. He will be able to find someone of similar strength to play, making a win 50/50 or so.
  2. He will be able to track his improvement.
  3. He won’t have to lose dozens of times with no chance to enjoy much winning and to no benefit. Winning is fun. Why do you make winning an experience you won’t let him enjoy?
  4. If he hates being 45 kyu, I can comfort him and say that what matters is not where you are now, it is instead where you end up.

Similarly, why won’t you allow a 10 dan rank if a player—whether amateur, pro, or bot—earns it by being 1 stone stronger than the average 9 dan here?

Why pretend that no player is a stone stronger than our average 9 dans, and why pretend that no player is two stones or more weaker than our average 24 kyu? Let the rating system do its job to show what handicaps are likely to result in a 50/50 game.

Most people temporarily rated below 25 kyu assume they are likely to improve, and they’re generally right. Let them see their improvement over time.

I would make the lower limit 99 kyu, if you have to choose a number. You can announce that. Almost everyone will be better than 99 kyu. Maybe exactly everyone. I see no downside.

I would leave the upper limit undefined. You can announce that. Let the best players come here to demonstrate their excellence. I see no downside.

5 Likes

The problem with very low ranks is that handicap is not meaningful past a certain point. If both players are blundering 10-15 stones without compensation semi-frequently, a 9 stone handicap won’t have any effect on the result. The player making the fewest massive mistakes will win with any reasonable handicap.

1 Like

Thanks for replying, Zbingu. Has your claim been tested? I don’t think so, but in fairness, yours may well be a majority view.

My response is that the rating system is designed to handle this issue, provided there are no artificial rating floors. There’s nothing special about a 9-stone handicap on a 19x19 board. More than 9 can be allowed. Make the handicap large enough on a 19x19 or 9x9 board, and you can make the game 50-50. That’s obvious, I would think.

I would rather give players an approximately 50-50 shot than to force new players to get destroyed over and over to no purpose or benefit or reason.

So what if an estimate of 42 kyu is inexact? It is surely more accurate than to pretend he’s 25 kyu.

3 Likes

The reason for the floor is handicap: anoek reported that, based on his data, it isn’t meaningful below that floor. While I question the decision, I don’t have any particular complaints (except for the previous 25k hell).

As for the ceiling, the answer is tradition. 9th heaven is the highest heaven, 9 dan is reserved for the best players, 9 is a special number for the things that are the most of something. Now, normally what I would do is adjust the ranks to make sure that it’s reserved exactly for those within 1 stone of the best human, but it’s much easier to just cap it off, and I haven’t seen complaints about it from 9ds.

Are there plans to update deviation between games? It would be somewhat helpful to see if the ranks for certain board sizes are out of date. If it’s at all possible, of course.

2 Likes

I don’t have any data on that, just anecdotal experience from live clubs.

If you are going to have a ranking system with a step of one, then, by induction, one stone has to be a meaningful difference between ranks, such that the win rate between 2 players one rank apart goes to 50-50 with the proper handicap.

Well, my way would obviously get rid of the 25k hell. No need for any 30k or 35k hell either, etc. No artificially caused hell/purgatory of any kind.

I’m thinking of trying shogi. If I’m 72k at first, terrific. I can try to improve and measure my progress.

If you make the floor for go 25k, then some 25k’s will be much much stronger than others. Let’s use the rating system to let all 25k-or-lower players have interesting, relatively even-probability games.

3 Likes

Thanks. That’s fine to me as a first approximation. There is an issue of transitivity to larger gaps in rank and larger handicaps. But I’m sure that equal players giving each other 9-stone handicaps will find their probabilities of winning will depend on who has black.

Well, I’d like to see the quote from anoek, but I can state some general principles: It is now becoming more and more clear in stochastic inference that classical null-hypothesis significance testing (NHST) is questionable as decent science. NHST is likely not what anoek engaged in, and even if it was, it seems not relevant here.

I would let anoek’s good model play out both above and below 25k. If all 25 kyu players win half the time against other 25 kyu players, there is no reason to change anoek’s floor of 25k. All of these players will remain 25k.

On the other hand, if some 25k players win 1/4 of their games against other “25k” players, they should be maybe 27k or similar. If some “25k” players win 1/10 of their games against the top 25k players, they are probably 29k or lower. I don’t know anoek’s exact general algorithm for handling this, but 25k is too high for the moment for players that generally lose against other 25k’s. I want all “25k” players to find players at a similar level, or an approximately equalizing handicap against stronger players.
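As a rough illustration of how a win rate maps to a rating gap, here is a sketch using the plain Elo logistic curve. This is an assumption for illustration only: OGS uses Glicko-2, which also accounts for rating deviation, and converting rating points into ranks needs a separate conversion that is not covered here.

```python
import math


def elo_gap_for_win_rate(p: float) -> float:
    """Rating points by which the weaker player trails, given win rate p.

    Inverts the standard Elo expectation E = 1 / (1 + 10 ** (gap / 400)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("win rate must be strictly between 0 and 1")
    return 400.0 * math.log10((1.0 - p) / p)


gap_quarter = elo_gap_for_win_rate(0.25)  # wins 1 game in 4
gap_tenth = elo_gap_for_win_rate(0.10)    # wins 1 game in 10
```

Under this curve, winning 1 in 4 corresponds to trailing by roughly 191 points and winning 1 in 10 to roughly 382 points, so the second group sits about twice as far below the first on the rating scale.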

In other words, there is no reason for anoek or anyone else to worry about formal statistical significance. What he already has in place for 15 kyus is great when compared to assuming that all “25k”'s have equal playing strength.

3 Likes

Do you have 25k friends or something? How do you know how it is down there?

[duplicated post removed]

S_Alexander asked,

“Do you have 25k friends or something? How do you know how it is down there?”

I’m trying to benefit all current 25k players. That’s my primary goal here. If you have a category of player that you think will lose out in any way in my proposal, please let us know.

2 Likes

Probably could be its own thread, but I also do not understand the 25k “floor” heh. I doubt it matters though since anyone who plays enough to get an accurate rank will inevitably be stronger than 25k…

1 Like

I’m not completely following all this but doesn’t this mean that 25k’s are also generally winning against 25k’s?

But also discussion here:

And here:

Although I suppose these have been overtaken by events somewhat

5 Likes

Thanks a lot!
Nice game!
Maybe it’s time for me to give money.
I’m in debt for all those beautiful things. I’ve had a long and great time on OGS.
Long life to OGS

5 Likes

25 posts were split to a new topic: A separate ranking pool for children?