I wonder if self-selection would actually cause the system to drift toward whatever is most intuitive!
What I mean is that, even if OGS chooses bad initial values for beginner/intermediate/expert, after enough newcomer self-selections the pool will eventually stabilize around the best fit for those self-selections.
I think the most likely scenario is that they seed those at arbitrary points and just wait for it all to even out. They do not care about matching it up to USCF or FIDE, so drift is actually a non-issue to them. Frankly, I'm not certain why it would be an issue to us, really. Those entry points are the anchors, and all that matters is that they're relatively close to the ideal distance away from each other; it doesn't have to be 100% accurate to create a good experience.
Though I have toyed with the idea of tracking players during the provisional period, seeing what ratings they leave it with, and using that data to re-update the starting points.
I'm not super familiar with how Glicko-2 works tbh, but I wonder if we could ask new users "are you new to the game?"
If they are, they could start with a 25k account. If they are not, then they could start from the current starting point for new users.
Would that cause the ratings to drift in the long run server wide?
From what I've seen, I feel like the main problem isn't that beginners get crushed in their first few games. That is very much the normal thing that happens with brand new go players; it's pretty much expected for new players to lose a lot of games.
The problem arises when a brand new beginner starts a game and their opponent resigns after realising their skill level. Those early "false wins" just push the beginner's rank upwards, so it might take a while before their rank finally drops to the 25k level.
Another problem is that our DDK bots are ranked so much lower than brand new accounts that new accounts cannot challenge them to ranked games, and many beginners seem to play some games with bots before they feel comfortable challenging other humans.
They can challenge up to (or down to) 12k amybot-ddk, but not any weaker ones like 20k amaranthus or anything below that, like 25k amybot-beginner for example.
I think amybot-ddk is strong enough to totally obliterate brand new beginners in a normal game. It might make more sense if they could play against the very weakest bots immediately if they want; I've seen a lot of absolute beginners asking why they cannot start with the weakest ones.
(Tho I should add: whenever I've seen beginners asking that, I have urged them to create open challenges and play against other humans, as the weakest bots play horrible garbage moves >___>)
If new OGS accounts have the option of registering as new/beginner/intermediate/advanced, I'd expect this to be more accurate on average than setting all new accounts to 6k (the average OGS rank).
I'd expect that the actual rank distribution of new OGS accounts is similar to the rank distribution of existing OGS accounts. So if new accounts get this option, I don't expect this to cause a significant rating drift of existing OGS accounts.
The Glicko rating system initialises new accounts with a high uncertainty, so the results of their initial games mostly affect the rating of the new account itself, not so much the ratings of their opponents. So as long as the inflow of new accounts is small compared to the size of the existing population, I don't expect it to matter much at which rating new accounts are initialised.
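To make the uncertainty point concrete, here is a minimal sketch of a single-game update using the published Glicko-1 formulas. This is a simplification: OGS uses Glicko-2, which adds a volatility term, and the ratings and deviations below (1500/350 for the newcomer, 1500/60 for the regular) are illustrative assumptions, not OGS values.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Damping factor: the less certain the opponent's rating, the less a result counts."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_opp, rd_opp):
    """Expected score against an opponent with rating r_opp and deviation rd_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def update(r, rd, r_opp, rd_opp, score):
    """One-game Glicko-1 update; returns (new_rating, new_rd)."""
    e = expected(r, r_opp, rd_opp)
    d2 = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd**2 + 1 / d2
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    return new_r, math.sqrt(1 / denom)


# Illustrative values only (assumptions, not OGS settings).
newcomer = (1500.0, 350.0)  # brand new account, maximum uncertainty
regular = (1500.0, 60.0)    # established account, low uncertainty

# The newcomer wins one game against the regular.
new_after, _ = update(*newcomer, *regular, score=1.0)
reg_after, _ = update(*regular, *newcomer, score=0.0)

newcomer_shift = new_after - newcomer[0]
regular_shift = reg_after - regular[0]
print(f"newcomer: {newcomer_shift:+.1f}, regular: {regular_shift:+.1f}")
```

With these numbers the newcomer moves by well over a hundred points while the regular moves by only a few, which is the asymmetry described above.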
As for the issue of potential abuse by sandbaggers/airbaggers (intentional or not): as it is now, basically 50% of new accounts (are forced to) sandbag and 50% of new accounts (are forced to) airbag. I still have some faith in humanity, so I expect that a majority of new accounts can/will give a better estimate of their level than always selecting 6k, if given the choice.
As for alt-sandbagging: it would be easier to do that, but I think it would also be easier to spot them, especially once weaker players on OGS get more accustomed to actually playing opponents of their own level.
I think server drift is a symptom that something is off. But if server drift seems to occur, the system could add or subtract a tiny amount of rating points per game to counter it.
AFAIK OGS knows the AGA/EGF ratings of many OGS players (who were willing to share their AGA/EGF rating IDs), so OGS could use that to detect rating drift compared to external rating systems. Either way, I don't think this is very important if OGS cares more about internal rating consistency than external rating consistency.
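A rough sketch of how that comparison could work: take the players with linked external ratings, measure the average gap between their OGS and external ratings at two points in time, and treat the change in that gap as drift. Everything here is hypothetical (the function names, the sample numbers, and the assumption that both ratings have already been converted to a common scale); it is not how OGS does it.

```python
from statistics import mean


def mean_offset(pairs):
    """Average of (OGS rating - external rating) over linked accounts."""
    return mean(ogs - ext for ogs, ext in pairs)


def estimate_drift(earlier, later):
    """Change in the average offset between two snapshots of the same players."""
    return mean_offset(later) - mean_offset(earlier)


def per_game_correction(drift, games_per_player):
    """Tiny per-game adjustment that would cancel the drift over that many games."""
    return -drift / games_per_player


# Made-up snapshots of (OGS rating, external rating) for three linked players.
earlier = [(1800, 1750), (1500, 1480), (2100, 2050)]
later = [(1830, 1750), (1520, 1480), (2140, 2050)]

drift = estimate_drift(earlier, later)
correction = per_game_correction(drift, games_per_player=60)
print(f"drift: {drift:+.1f} points, correction: {correction:+.2f} per game")
```

This only says something about drift relative to the external system, and only if the linked players' external ratings are themselves reasonably current.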
I think it is a good idea to increase the rating uncertainty over time, but perhaps the velocity of this increase should depend on the rating? I expect the uncertainty of a 25k rating to grow more in 6 months of not playing than the uncertainty of a 5d rating. IME higher ratings are more stable over time.
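For reference, Glicko-1 already grows the deviation of an idle player as sqrt(RD^2 + c^2 * t), capped at 350, with a single constant c for everyone (Glicko-2 does something similar through each player's volatility). A minimal sketch of the rating-dependent variant suggested above, where c shrinks as the rating goes up, could look like this. The anchor points and c values are made-up assumptions for illustration only.

```python
import math

MAX_RD = 350.0  # Glicko-1 cap: the deviation of an unrated player


def growth_constant(rating, weak=(900.0, 40.0), strong=(2100.0, 15.0)):
    """Hypothetical c(rating): linear between two (rating, c) anchor points, clamped outside."""
    (r_lo, c_lo), (r_hi, c_hi) = weak, strong
    t = min(1.0, max(0.0, (rating - r_lo) / (r_hi - r_lo)))
    return c_lo + t * (c_hi - c_lo)


def inflate_rd(rd, rating, idle_periods):
    """Glicko-1 style deviation growth, with c depending on the rating."""
    c = growth_constant(rating)
    return min(math.sqrt(rd**2 + c**2 * idle_periods), MAX_RD)


# Two players with the same deviation who both stop playing for 6 periods (say, months).
weak_rd = inflate_rd(80.0, rating=900.0, idle_periods=6)
strong_rd = inflate_rd(80.0, rating=2100.0, idle_periods=6)
print(f"weak player RD: {weak_rd:.1f}, strong player RD: {strong_rd:.1f}")
```

Whether the relationship should be linear, and what the constants should be, would have to come from actual data on how ratings change after breaks.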
Some beginners I know IRL practiced against igowin (9x9). I think this bot plays decent moves for beginners, but that means it's clearly stronger than a raw novice. To overcome that issue, you can just have the bot give (say) 4 handicap (on 9x9). I think it won't take too long before a raw novice can beat igowin at that handicap, and beginners may enjoy progressing to lower handicaps against it (as stepping stones) and improve as a player in the process.
So perhaps some weak but decent bots can be made available on OGS at different handicap levels to accommodate a wider range of beginners?
I'd expect it to be weaker, as it's the people totally new to the game who are more likely to quit. There is a big 30k population who saw go in the news/media and googled it, came to OGS, had a try, and didn't like it, got confused, or got sad at getting smashed, and gave up before getting a solid rank on the OGS rank histogram.
I don't think all new OGS accounts are raw novices, but I have no idea how large a fraction of daily new OGS accounts are raw novices.
I suppose it also depends on what one considers an "existing" account. Are those accounts created in the past year? Or accounts that played x number of games in the past 3 months? Or something else?
Either way, if they are 30k, create an account and abandon it after a few games, I don't think it affects OGS's rating system much. If anything, it would affect OGS's rating system less if they were allowed to register as "beginner" (and that might also increase the probability they enjoy the experience and stay around).
I don't have any stats either, but my gut feeling is that quite a large percentage of new accounts that actually play any ranked games are beginners who give up after a few games and are never seen again. These accounts usually bring new rating points into the overall pool by losing some games.
But some beginners will stay for longer and keep playing, so their skill increases quickly and thus they take rating points away from the pool. The same happens with more experienced players who make a new account here and play any amount of ranked games.
I think these do balance out in the long run, if not perfectly, at least "good enough" to keep the overall ratings quite stable for a prolonged amount of time, so that the rating system doesn't need to be tweaked annually.
ps
That's really just my own gut feeling about it; I don't have any actual data to back up that claim.
anoek is probably the only one who knows how much OGS ratings have drifted since the last rating system tweak, which happened in early 2021. Maybe it would make sense to ask him?
For the @admins and @developers here
(@GreenAsJade and others) it doesn't matter what the math says. Forcing new players to get wrecked or strong players to play boring games is a horrible user experience.
If the math is a problem, fix the math or find a solution around it. But telling people to just suck it up and deal with it is not how you run a successful website.
It's not really a developer thing, it's definitely an admin-and-community thing.
Personally, I don't like the current solution, that's why I always leap into these discussions and try to prod them into finding a better solution.
There are always competing awful experiences that require compromise. That's what we're here discussing: making sure we understand the reason why things are the way that they are, and what the compromises would be in changing that.
OGS is a successful website. I don't recall anyone saying suck it up. We're having the discussion, aren't we? Track record tells us that when a compelling case is made - one that comprehends all the factors - then change happens.
But the math is what computes the ranks, so if you try to fix horrible user experience by changing the math and the math causes equally horrible user experience in a different aspect, you haven't fixed anything for the better.
Currently, there's a problem of new players getting wrecked by strong players, including accidental sandbagging where strong players give new players wins and they accidentally rank up far too much.
But if allowing players to select their rank is going to cause a lot of drift in a short amount of time, we can land other players in a lot of trouble. Hypothetically, some (real world and OGS) 5k who doesn't play a game for three months may suddenly be ranked similarly to a (real world) 15k (but now OGS 5k) due to drift. Worse yet, the 5k has an established rank, so they'd have to "wreck" lots of DDKs before finding themselves at the correct strength again.
You'd "fix" the problem, but end up with basically a similar problem again due to not thinking about the math.
Regarding whether I expect drift to be large enough to cause the above: I don't, at least not on a relatively short time scale. But I think it should be tested and thought about first.
To recap: this is what we're discussing. At least, if you mean "ask new users if they are new, and assign them a suitable Glicko rating". Or do you have a different solution in mind for what we do with self-identified new people?
The problem is purported rank pool drift.
As Vsotvep says, until someone "does the numbers" (actual maths with actual realistic numbers) I don't think this proposal has a chance. Proving that rank-pool-drift will not be a problem requires an actual proof, not supposition or theoretical arguments. Similarly, assuming it will be a problem (because anoek did numbers and asserts that it will) any solution will need rigour in explaining how it avoids that problem.
I wish I could. It's only "my recollection that @anoek describes that this is definitely a problem". I do feel reasonably confident that if he says it is, then that's well founded.
What I noticed is a fear linked to some old experience in which many wrong declarations of level were made, like 1 dan being misunderstood as a beginner's level.
But today's suggestion is different: it's not up to the player to give a rank, and it doesn't concern stronger players' declarations either.
It's asking if you are a beginner, and then OGS fixes the starting rank. No need to know what a dan is.
We ask new players to pick between some options, e.g. new player, beginner, intermediate, advanced, I'm a dan player, but this is just a cosmetic thing; it doesn't do anything with respect to the rating system.
Perhaps we can make this choice meaningful by then showing a message suggesting what to do (e.g. explain that 25k is weaker than 1k is weaker than 1d is weaker than 8d, etc).
We track the new accounts for a couple of weeks, look at those that have established a rank, and see what option leads to what average rank. E.g., on average "new players" tend to be ranked 22k, beginners tend to be ranked 17k, intermediate players tend to be ranked 9k, etc. We can also track which people seem to be most problematic with regard to drift (e.g. dan players choosing "beginner", or something). The more data the better.
Then we make an informed decision about what rank people can expect to land on, on average, when they choose those options. We can then actually implement that the account types start with a mean rank corresponding to the average given by the data.
If I'm not mistaken, this should mitigate rank drift completely under the assumption that people will keep choosing the same buttons in the cosmetic and noncosmetic situations (although it's hard to predict how many malicious air- and sandbaggers will abuse the system after we turn on the switch; the most troublesome are accounts that play a handful of games in the wrong strength bracket and then abandon the account).
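The calibration step in this proposal is easy to sketch. The record layout (choice, established rating, ranked games played), the `min_games` cutoff for "has established a rank", and the sample numbers are all assumptions for illustration; nothing here reflects actual OGS data or code.

```python
from collections import defaultdict
from statistics import mean


def calibrate(accounts, min_games=10):
    """Average established rating per self-selected option.

    accounts: iterable of (choice, rating, games_played) tuples.
    Accounts with fewer than min_games ranked games are ignored,
    since their rating is not considered established yet.
    """
    by_choice = defaultdict(list)
    for choice, rating, games in accounts:
        if games >= min_games:
            by_choice[choice].append(rating)
    return {choice: mean(ratings) for choice, ratings in by_choice.items()}


# Made-up tracking data from the "cosmetic" phase.
tracked = [
    ("new", 600, 12),
    ("new", 700, 15),
    ("new", 1500, 3),  # too few games, ignored
    ("beginner", 900, 20),
    ("intermediate", 1400, 11),
    ("intermediate", 1500, 30),
]

seeds = calibrate(tracked)
print(seeds)
```

The resulting averages would then become the starting ratings for each option once the choice stops being cosmetic. Looking at the spread per option, not just the mean, would also show which options attract the most mislabelled accounts.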
I am not sure it's necessary to go so far.
The other extreme, the strongest players, may be less annoyed to play some easy games to get their real ranks.
At least it seems reasonable to have to prove your strength so...
The system we have doesn't have to change for all levels; I feel like it's interesting and works well apart from beginners.
I'm mostly concerned with the lowest levels.