New Proposed Methodology For Go Server Rank Comparison

With the rise of a multitude of Go platforms available for online play and the lack of a universal ranking system, the question of how ranks on different Go servers and associations compare in terms of ‘player strength’ becomes more and more apparent. Several attempts have been made in the past, but the results are widely questioned by the userbase. The methodology proposed here hopes to address these concerns.

Why are the current studies questioned / flawed?

Past studies rely on either surveys or a naïve username-based matching system to establish the relationship between ranks on different servers.

Drawbacks of the survey based methodology:

  • Surveys often suffer from a small number of responses. This is especially troublesome for harder-to-attain ranks or less common server combinations (e.g. KGS / Tygem pairs), where there are too few data points for reliable rank-relation estimates.
  • Surveys take a long time to collect and rely heavily on people answering honestly.

Drawbacks of naïve username based matching:

  • Presence of false positives in the matching sets. This is especially true for game handles that contain common English words, e.g. Dream, Master, Winner.
  • While it is common for people to use the same game handle on multiple servers, many players don’t, so a lot of true matches are missed at the same time.

More importantly, besides the issues addressed above, the current methodologies rely solely on the rank a participant holds on a server at a given time, not on the player’s performance on that respective server. This is especially a concern on servers where people can select their initial rank. For example, a 5 kyu OGS player with thousands of games on OGS decides to try the Fox Go Server, registers as 5 kyu and plays 10 games. In this scenario it is questionable to treat the player as a 5 kyu Fox player alongside being a 5 kyu OGS player.

New Methodology:

Basic idea: calculate win percentages from game results between qualified (cross-server / association) players. If the win percentage between cross-server ranks and/or real-world association ranks is in balance (approx. 48-52%), those ranks will be deemed roughly equal.
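To make the balance check concrete, here is a minimal sketch in Python (pandas). The table of games between qualified players and its column names are made up for illustration; what counts as “qualified” is defined just below.

```python
import pandas as pd

# Hypothetical table: one row per game between two qualified players,
# with the qualified rank of each side and whether side A won.
games = pd.DataFrame({
    "rank_a": ["OGS 5k", "OGS 5k", "OGS 1d", "OGS 1d"],
    "rank_b": ["EGF 4k", "EGF 4k", "EGF 1d", "EGF 1d"],
    "a_won":  [1, 0, 1, 1],
})

summary = games.groupby(["rank_a", "rank_b"]).agg(
    games=("a_won", "size"),
    win_pct=("a_won", lambda s: 100 * s.mean()),
).reset_index()

# Deem a rank pair roughly equal when side A's win percentage is in balance.
summary["roughly_equal"] = summary["win_pct"].between(48, 52)
print(summary)
```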

Qualified server player: a player who has played more than 25 games in the last 3 months on the respective server at a specific rank (no rank changes up or down).

Qualified association player: a player who has played more than 10 OTB games in the past 12 months at a certain rank.

A player stays qualified for at least 3 months, and the qualification can be extended.
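A rough sketch of how these qualification rules could be checked mechanically (the record format is only an assumption for illustration, not a fixed design):

```python
from datetime import date, timedelta

def qualified_server_player(games, today):
    """games: list of (date_played, rank_at_time) for one player on one server.
    Qualified: more than 25 games in the last 3 months, all at the same rank."""
    recent = [(d, r) for d, r in games if d >= today - timedelta(days=90)]
    return len(recent) > 25 and len({r for _, r in recent}) == 1

def qualified_association_player(otb_games, today):
    """otb_games: list of (date_played, rank_at_time) from association records.
    Qualified: more than 10 OTB games in the past 12 months at a single rank."""
    recent = [(d, r) for d, r in otb_games if d >= today - timedelta(days=365)]
    return len(recent) > 10 and len({r for _, r in recent}) == 1

# Example: 30 games at a stable 5k in the last month qualifies the player.
print(qualified_server_player([(date(2019, 5, 1), "5k")] * 30, today=date(2019, 6, 1)))
```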

An example of the output I’m aiming for (fictional data):

With this methodology I place more weight on the actual games and their results than on rank labels. I’m still debating whether to limit the number of games per person, to prevent the results from being biased towards one or two players in the set.
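If that cap turns out to be necessary, it could be something as simple as the sketch below, continuing from the earlier example and assuming the game table also carries a player_id column; the limit of 50 is just a placeholder, not a decided parameter.

```python
MAX_GAMES_PER_PLAYER = 50  # placeholder value, not a decided parameter

# Shuffle first so the cap does not systematically favour a player's earliest games,
# then keep at most MAX_GAMES_PER_PLAYER games per player.
capped = (
    games.sample(frac=1, random_state=0)
         .groupby("player_id")
         .head(MAX_GAMES_PER_PLAYER)
)
```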

Prerequisites:

  • A substantial number of cross-server player game handles / association pins (approximately 3,000 collected so far from various sources). I’ve put in a lot of time to collect all of these and to make sure the game handles etc. belong to the same person.
  • Game results for the largest servers and associations (currently collecting Fox, IGS, OGS, KGS, Tygem, AGA and EGF results)
  • Time :blush:

I will use the old methodologies as a reference point.

I’d like to hear everyone’s thoughts on the methodology!

10 Likes

The strategy of “use better data” often yields better results; the hard part is getting enough good data points to be statistically significant :stuck_out_tongue_winking_eye:
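As a rough illustration of the sample-size point (just a sketch, not part of the proposal): a plain binomial test gives a feel for how many games it takes before a win percentage is distinguishable from an even 50% match-up.

```python
from scipy.stats import binomtest

# A 52% win rate over 250 games between two supposedly equal rank groups:
result = binomtest(k=130, n=250, p=0.5)
print(result.pvalue)  # well above 0.05, so not distinguishable from an even match yet
```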

It sounds like you’ve put in the time to collect these data points (3000??) so I look forward to seeing the results!

6 Likes

I feel like this is similar to some of the aims of another ongoing project. Maybe you could help each other?

3 Likes

Thanks. I’ve replied in that thread, would be happy to collaborate.

4 Likes

You write an argument against surveys, and an argument against name matching.

Then instead of proposing an alternative method, you jump to your results:

So… What’s the method? Magic? How did you collect this data, if not by survey or name matching?

1 Like

The data collection in itself is not the methodology. The main concern I address is that both existing approaches look only at the rank label and not at the game results on the respective server. Sorry, I should have made that more explicit.

The names are mostly derived from online tournaments that have taken place on either KGS, OGS or Fox and whose participants are also registered on the official rating websites of the European / American / Russian / etc. federations. If you know the actual names of several people in those online tournaments, others can be found by looking at the results in combination with the dates the matches were played on. There are also often hints in people’s game handles.

Pandanet also hosts the European Team Championship and the AGA City League, which yielded a lot of game handles.
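To make that cross-referencing step a bit more concrete, here is a sketch of the inference (all field names are made up for illustration; in practice much of this still needs manual verification):

```python
def infer_handles(online_games, federation_games, known):
    """known: dict mapping game handle -> real name (the already identified players).
    Both game lists hold dicts with a date, the two participants and the winning colour."""
    inferred = {}
    for og in online_games:
        for fg in federation_games:
            # Candidate for the same round: played on the same date with the same result.
            if og["date"] != fg["date"] or og["winner"] != fg["winner"]:
                continue
            # If the white handle is already identified and matches the federation
            # record, the black handle most likely belongs to the black player's name.
            if known.get(og["white_handle"]) == fg["white_name"]:
                inferred.setdefault(og["black_handle"], fg["black_name"])
    return inferred
```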

2 Likes

I’ll be testing out the methodology with some AGA/EGF/OGS data. I also talked with Anoek about it and got some help with the matching! Will post some first results soon.

5 Likes

Looking forward to it! The idea of cleaning the data and comparing only stable ranks really should bring down the deviation of the data.

2 Likes

First analysis this weekend. Slightly disappointed by the number of games that made it into my analysis after the various cleaning measures: about 40k games from about 1,000 EGF members with an OGS game handle. I don’t want to post intermediate results that are still susceptible to change, so it will take a little longer.

There’s especially some interesting stuff happening around 1 dan OGS, with some impressive win percentages.

3 Likes

I would 100% vote for intermediate results. We are here to help polish the rank analysis approach, not to buy a product. Looking forward to the first publication of results here!

3 Likes