First of all I love OGS. Great web app, nice community and the place where I play Go the most!
I would like to use this thread to talk about my biggest issue with OGS: the matchmaking / rating system. There have been threads about this before, both on this forum and on Reddit. I’m not sure if it is OK for me to reopen an argument that has been shut down before, but I still do not see why the logic has not been reworked.
The issue I see with the OGS overall rank is that it is not just shown as a summary for roughly comparing players’ overall strength; it is also what matchmaking uses in every subcategory.
This means that my 7.1k ranking as a correspondence 13x13 player is conflated with my 12.8k ranking in normal time 19x19 games. That is a skill difference of almost 6 stones (12.8 - 7.1 = 5.7) between these play styles, which is very significant. Because of this, when I play normal time 19x19 games I am continuously matched with opponents that are too strong for me.
There is an easy but not great workaround for this: creating an account for each play style. So far I’ve created eight OGS accounts for this reason. Thanks to password managers and containerized tabs the hassle is not too big these days, but I’d still very much prefer to use one single account.
Regarding comments made in the linked thread
[…] keep saying our ranks are meaningless without offering a shred of evidence.
There is no need to be overly aggressive here. The OGS ranks are definitely not “meaningless”. But they mix play styles that differ significantly from each other. You could also factor my StarCraft or basketball skills into the rank calculation, but the more games are mixed together, the less meaningful the rank becomes. The evidence for this can be seen by looking at my (most used) OGS account. The system just does not work for people like me. I’ve been significantly better on the smaller boards ever since I started playing Go two years ago. This might change some day, but two years is a pretty long time for a ranking system to not catch up. The purpose of the system is to match me with the most suitable opponent in all of my games, and the current system does not achieve that.
Meanwhile, someone actually did measurements and found that using combined ranks predicts match outcomes better than separate. This is the ultimate evidence for a ranking system working.
Someone in the very same thread pointed out that the data for the analysis was apparently very flawed. I’d like to see the data and methodology myself, because I do not believe the conclusion is true.
the game of go remains the game of go, no matter the size of the board
That’s like saying the sport of running remains the sport of running, no matter the distance.
The testing was done by @anoek himself and, I think, reported in this thread.
there are dozens of threads and opening again the subject without bringing some novelties may be unproductive. Maybe first check everything that has been written on this?
Maybe post this critique in the thread with the data. That would be much more productive.
Seems you found a good way to manage your matchmaking with multiple accounts. Even if in some cases using the global ranking may create some distortion, remodelling the system is no small job and would have to go on a priority list (where it seems to rank low right now).
there are dozens of threads and opening again the subject without bringing some novelties may be unproductive
It seems like many players have an issue with the current matchmaking logic. The novelty in my post is the proof that the system does not work for me. One of the developers mentioned that there is not a “shred of evidence” that the current system does not work. This person could have a look at my OGS account. As everyone can see, I am matched as a 9k player, which means my 13x13 opponents tend to be a little too weak while my 19x19 opponents tend to be way too strong.
Seems you found a good way to manage your matchmaking with multiple accounts.
What I found is a workaround.
Even if in some cases using the global ranking may create some distortion, remodelling the system is no small job and would have to go on a priority list (where it seems to rank low right now).
The reason I opened this thread is first and foremost to come to an agreement that the current system is not ideal. This work will not be prioritized or implemented at all if the general consensus is that the current system is better than or equal to the proposed alternative.
The thing is, I work in data myself. I could do an analysis on this to show you that I’m right, but I don’t really have the time for that alongside my other projects and work right now. So if the obvious but anecdotal evidence I provided is enough to convince everyone, that would be great. If not, I will post my own analysis sometime in the future to make everyone happy.
Yeah, the rating tables do look a bit confusing, especially when there isn’t all that much data behind those numbers.
That 6 stone difference does look weird at first glance, but you have played only 9 ranked live 19x19 games and won just one of those, against a 10k, so there isn’t much data for the rating system to work with.
The table starts making more sense when there are more games in each category, but if you almost never play blitz or live, then there’s no need to look at those numbers. Numbers that are grayed out or have a large ± value indicate that there aren’t many of those games in your history.
(imo the current " ± 2.5" is not a very effective way to emphasise how tiny the sample of those games is…
and gray " ± 4.9" is a very weird way of saying “no frikking idea” xD )
That 6 stone difference is not weird. I’m just way more used to playing on 13x13 boards. I played 10 games (on 19x19) and the ranking system did not adjust one bit because it conflates all categories. That, in my humble opinion, is not how a ranking system should work.
The table itself is great. That is not my point. My point is that the table is not actually used for matchmaking.
Of course. Good point. So this is another account I use solely for normal time 19x19 games. As you can see, it took the system only 6 games to assign me the appropriate rank of low 10k / high 11k. This rank makes sense for my 19x19 games. I had a great experience on that account with 8 wins and 5 losses. That is peak matchmaking. So the algorithm itself works. It is just not implemented correctly: even though my rank for these types of games is 10k-11k, I get matched with players that are appropriate for 7k-8k strength when I use my main account.
I encountered this issue repeatedly when talking with players who had learned Go on 9x9 (or sometimes 13x13) and practiced until they had decent reading and tactical knowledge, but were inevitably run over when trying 19x19, since they had no notion of strategy at all for this size (so their 19x19 rank should have been much lower).
So, the difference in rating when comparing 13x13 correspondence vs 19x19 live is 2.1 stones. It doesn’t seem that big, but I do understand it can feel uncomfortable while playing. It changes the mindset.
So I wanted to conduct my own analysis this evening, but it seems that I can only get a player’s overall rating using the public API. Can anyone help me with this? I’m not sure which call gives me all ratings of a player (i.e. the whole rating table plus the overall rating, so 10 ratings total) for a specific match.
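For anyone who wants to try the same comparison once the per-category ratings can be pulled: here is a minimal sketch of how the "which rating predicts results better" question could be measured. It is not the method @anoek used and it makes no API calls. Everything in it is an assumption for illustration: the function names `expected_score` and `mean_log_loss`, the record fields `overall`, `category` and `black_won`, and the plain Elo-style logistic curve (scale 400) standing in for whatever OGS actually computes. The two games in `toy_games` are made up and prove nothing about OGS.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo-style probability that player A beats player B.

    Assumption: a simple logistic curve with scale 400, used here only
    as a stand-in for the real OGS rating model.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def mean_log_loss(games: list, key: str) -> float:
    """Average log loss when predicting results from the rating named by `key`.

    Lower is better: the rating that assigns higher probability to what
    actually happened gets the smaller number.
    """
    total = 0.0
    for game in games:
        p = expected_score(game["black"][key], game["white"][key])
        # Clamp so an extreme prediction never produces log(0).
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        total += -math.log(p) if game["black_won"] else -math.log(1.0 - p)
    return total / len(games)


# Made-up records: each player carries an overall rating and the rating
# for the category (board size + speed) the game was played in.
toy_games = [
    {
        "black": {"overall": 1500, "category": 1700},
        "white": {"overall": 1500, "category": 1300},
        "black_won": True,
    },
    {
        "black": {"overall": 1600, "category": 1400},
        "white": {"overall": 1500, "category": 1500},
        "black_won": False,
    },
]

for key in ("overall", "category"):
    print(key, round(mean_log_loss(toy_games, key), 3))
```

With real data, the ratings in each record would have to be the ones both players had before that game started, otherwise the result leaks into the prediction. Running the comparison separately for players whose category ratings differ a lot from their overall rating would also show whether the system fails for that group in particular.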