Human-like KataGo v1.15.x models real strength?

epanizon · November 27, 2025, 11:09am

Hi!

I have recently been playing against the b18c384nbt-humanv0 models at different strengths (say 20k-10k). I was wondering if someone has tried to verify their nominal ranks.

I am asking because I started playing Go quite recently (a few months ago), and based on my recent games I estimate my rank to be somewhere around 20k-18k, yet it seems that I can routinely beat the human-like model even at considerably stronger settings (up to 10k).
I guarantee you I would not win against a 10k player.

(Or maybe I am messing up some config?)

epanizon · November 28, 2025, 12:41pm

(Just as examples to better illustrate my point: these are the first 50 moves of a game that I went on to win (and forgot to save): Human (Normal Game) vs. AI (Human-like). Or this one: Human (Normal Game) vs. AI (Human-like), where I think I played out a couple of branches at the end, but up to move 200 it was a real game.)

rebornz · November 29, 2025, 3:29am

Yeah, I think something is off with the human-like AI.

richyfourtytwo · December 1, 2025, 5:03pm

Always hard to estimate playing strength from just looking at a few games, but your moves do not look 20 kyu-ish to me.

epanizon · December 2, 2025, 9:07am

Thanks!

I hope I am getting better than my current rank shows (I don’t play many ranked games on OGS), but my game-losing mistakes definitely still look 20 kyu.

The point is that, even so, the AI does not look 10 kyu to me.

Counting_Zenist · December 2, 2025, 11:11am

The policy output of a supervised-trained human-like model is a statistical distribution learned from the training data. Since the dataset peaks around 5k, data becomes scarce as the ranks go down, and the model has to “extrapolate” wherever data points are lacking. Because DDK players have wildly differing opening and fuseki patterns, the statistical distribution of opening moves ends up almost random or very slow, whereas the follow-ups to the opening are “easier” to “emulate” by extrapolation. That is, the supervised-trained models tend to have very weak openings and early games but regress to the norm in the middle game and endgame (very little can be extrapolated for middle games or endgames). Hence the very uneven quality of moves below SDK level. They also tend to be disjointed in their direction of play but somewhat decent in main-line joseki (since those are the most common patterns they share).
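
To illustrate the data-scarcity point only, here is a toy sketch. It is not how KataGo or the humanv0 model works internally; the sample counts per rank and the move frequency are made-up numbers, and the function name is hypothetical. It uses the ordinary binomial standard error to show how much noisier an estimated move frequency becomes when a rank bucket has few samples.

```python
import math


def move_frequency_std_error(p: float, n_samples: int) -> float:
    """Binomial standard error of a move frequency p estimated from n_samples positions."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return math.sqrt(p * (1.0 - p) / n_samples)


# Hypothetical sample counts per rank bucket (invented for illustration,
# not the real composition of any training set).
samples_per_rank = {"5k": 40_000, "10k": 4_000, "20k": 40}

# Assumed true frequency of some opening move at every rank.
true_frequency = 0.3

for rank, n in samples_per_rank.items():
    err = move_frequency_std_error(true_frequency, n)
    print(f"{rank}: n={n}, standard error = {err:.4f}")
```

With these assumed counts, the estimate for the sparsest bucket is far less certain than for the densest one, which is the kind of gap a model would have to bridge by extrapolating from neighbouring ranks.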

epanizon · December 3, 2025, 9:50am

Many thanks for the detailed answer! It makes a lot of sense now.