Well, yes, that’s why I called it an extremely rough rule of thumb.
It might be true for beginners who are just barely able to score a game themselves and who almost always play through to scoring, without the ability to estimate the board well enough to resign (or without even being aware of resignation). The trouble would be finding beginners of the same level to play near-random games against each other and generate a very wide range of game outcomes. In practice, however, they almost always play against stronger players, which skews the scoring tendency.
In the post you are replying to, I addressed this “equal resources” situation. Your explanation here is a very good illustration of the reasoning I was referring to. (And I wonder if it continues to hold for the rare cases where the difference would be more than just 1 stone.)
However, as I said, this assumes that winning the ability to play the extra stone isn’t a matter of skill and that the sequences that lead to that extra stone shouldn’t be miai with the ones that don’t.
In my previous post, I also explained the closest analog with area scoring: filling the board in until there is nowhere in your controlled area where your opponent could legally play. If we start from your “always about territory” example and keep going, including rearranging the board to give each player 1 continuous block (since this does not change the area score if you are careful), then you eventually end up with a blob of stones and two single point eyes. So, in that sense, the game is “only” about the stones, and we just count those 2 eyes as points for accounting convenience and to avoid having to actually rearrange the board or track the 2 points manually.
Is this not the same sort of reasoning leading to a different conclusion?
Am I missing something critical about your argument?
Also, if the AI analysis reported above is to be believed, area with a komi of 7 is theoretically correct in that it is the version of the rules that creates a drawn game with equal winning chances. So, at least in theory, that extra stone is a matter of skill.
With territory, by contrast, setting the komi to produce a drawn game does not result in fair chances for both sides, and you have to give an extra half point to get that. (Though, as I pointed out, for human players this is actually a strong argument for using territory, since it gives a game with equal chances despite having fewer draws. And I’m unsure how to interpret this in light of game theory, since solving the game should still result in either a known winner or a draw.)
As for the AI positional analysis, of course it will be less granular, and I thought I already addressed this. Let me know if I’m being unclear again.
If we take “area” as being correct, then of course the evaluation, which is estimating the final score, will give more moves equal value. Using territory rules adds an arbitrary preference for one set of otherwise miai plays over another set. So it is creating that artificial xor.
With regard to the area estimation still being in 1 point increments, I think I discussed that too: the argument is that it is “too hard” to know the difference with that level of precision until later in the game, so the algorithm estimates an average.
If this imprecision continues through the midgame and into the endgame, then if the evaluation is meaningfully more accurate, the uncertainty of the AI’s evaluation should be less under territory than under area, and the estimate it gives should be a better estimate of the final score. I.e., if you treat the evaluation as a prediction of the final score, that prediction should be an unbiased, lower-variance estimate, and it should converge to the correct final score sooner than it does under area. Put another way, if these things are true, territory would make optimal play and evaluation easier and thus lead to sharper play for the entire game.
Otherwise, it is giving the estimate as 1 point because that’s the best you can get and “it could go either way from a tied position” is all the bot can manage (which is what you expect from a tied game with limited drawing chances).
Does this make sense?
That’s all testable in theory, though I don’t know if anyone has done it. And given that the training for the bots using different scoring methods is often shared to some extent, it isn’t clear to me that the current AI can be properly used for this with an appropriate level of precision. (Maybe someone more expert can comment.)
And, FWIW, since “territory” bots use a modified version of Ikeda territory I, it might not be unreasonable to claim that the approximation isn’t accurate enough to correctly capture what is intended, since it eliminates the “hypothetical play considering starts with either player” aspect. I don’t think this should matter outside of positions that happen maybe in 1/500 games, but since we are already dealing with differences that matter in 1/50 games, it could actually impact things. Hard to say without trying it.
The ultimate board shows why it isn’t: if you currently don’t have the ability to play the extra stone, then you can only win that by ANTI-skill - by making a one point mistake. Also:
It doesn’t matter if you call your units territory or future/potential stone places - what matters is to distinguish between truly conquered points (worth 1 in all circumstances) from the illusory point (the 0x01 mask) that may or may not get added to B’s score after competitive play stops (worth 0.5 at most).
Think about W’s 9x9 opening moves example again. Why do you think the territory analysis makes a better evaluation for the true value of those moves than the area analysis - EVEN if we play under area scoring? By making a 1 point territory mistake, the minimax area result has not changed yet (as shown by the area analysis), yet we have already lost 1 point of - practically measurable - expected/average AREA value?
The reason for this is that - as far as non-perfect players, ie. the whole distribution is concerned - a position from where B needs the extra stone to reach B+7 area is truly inferior to one from where he doesn’t, even for area scoring. Another way to realize this is by playing on several boards in parallel (placing 1 stone on 1 chosen board each move).
This is because komi scales with skill, with absolute beginners needing about 50% of the value that perfect players need (since they make less use of the half-move advantage - you can find discussion of this on L19 with Bill Spight and others). The komi for perfect territory play on 19x19 is likely 7, which naturally means that for humans a slightly smaller value will be closer to 50/50.
There is a discussion about rulesets and scoring systems every few months here, but I haven’t read this one before. That’s actually a valid point. The thing is that it could simply be fixed by awarding W an extra point if he played one stone fewer than B. What cannot simply be fixed is that L&D decisions, which obviously are essential to the game of Baduk, cannot (always) be determined during the regular game when using territory scoring. That is not easy to fix and is a completely game-breaking issue.
Therefore area scoring is the answer.
I don’t understand your argument that this is “antiskill” and a 1 point mistake. Nor do I see how that follows from your linked example.
There’s some assumption we aren’t sharing here.
Again, I don’t understand why the extra stone black sometimes gets to place isn’t “truly conquered”. It’s exclusively his at the end. And I don’t see why the route it took to get there ought to matter. Or why, in a game that scores based on area, you would think that the “mask” is odd at all. That’s the granularity at which the game is played.
No one asks why soccer does not track fractional goals because that would be strange.
If you look at the reported numbers on the page you linked, territory has higher uncertainty both in win % and score. Compare that with what I said in my last post that we would expect to see if territory had the properties you claim.
It does not seem to actually do that. And I don’t think I understand your argument that it gives a better value for the moves with the territory evaluation. It says the moves have less certainty. Doesn’t that make them worse?
Is there some reasoning step I’m missing?
Review what was said above about how under area fair and perfect komi are the same, but under territory they are not. I think you misunderstood what I was saying. This isn’t about playing strength but about a theoretically correct number that does all the things game theory says it should if that number is “right”.
I don’t have a problem with them being different in territory, but it does indicate that something is “wrong”.
I don’t follow this argument but am open to hearing more.
Because that point is still subject to further conditions (such as the parity of both players’ mistakes until we get to the dame stage, or that we do not play on multiple boards in parallel - see below).
I think so. The territory analysis shows a few W moves as perfect (~0 loss), and another few as ~1 point loss. The area analysis shows both of these groups as perfect (~0 loss) - since we are in a parity situation where a W mistake (losing 1 pt of territory) will get canceled by area rounding. This is true in a minimax sense, so for a near-perfect player the area analysis is correct.
However, if humans played a lot of AREA scored games starting from here with a territorially perfect W move, and a lot starting with a territorially 1pt-mistake W move, the distribution of area scores would still show that 1 pt loss or difference in average area value (and a worse win%). Those W moves in question that the area analysis showed as perfect (no point loss) DID lose 1 AREA point in a distributed sense - the area score prediction of the area analysis just couldn’t express this fact numerically.
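To make the “distributed sense” argument concrete, here is a minimal Python sketch. It is a toy model, not anything taken from the linked analysis: `area_from_territory` assumes the common 9x9 pattern where the area lead is the territory lead rounded up to an odd number (no seki or post-dame ko), and `NOISE` is an invented distribution for the net territory swing caused by later mistakes from both players.

```python
def area_from_territory(territory_lead: int) -> int:
    """Toy rounding: the area lead is the territory lead rounded up to odd.

    Matches the common 9x9 pattern (6 -> 7, 7 -> 7); ignores seki and ko.
    """
    return territory_lead if territory_lead % 2 else territory_lead + 1


# Hypothetical error model: net territory swing from later mistakes,
# as {swing: probability}. These numbers are invented for illustration.
NOISE = {-1: 0.25, 0: 0.5, 1: 0.25}


def expected_area(start_territory: int) -> float:
    """Average area lead when play continues with the noisy swings above."""
    return sum(
        prob * area_from_territory(start_territory + swing)
        for swing, prob in NOISE.items()
    )


# With perfect play the two starting positions give the same area result...
print(area_from_territory(7), area_from_territory(6))
# ...but on average the position that is 1 point worse in territory is also worse in area.
print(expected_area(7), expected_area(6))
```

In this toy model the minimax area results are identical while the averages differ. How large the gap is depends entirely on the assumed noise; with no noise at all it vanishes, which is the near-perfect-player case.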
On 7x7, 8x8, 9x9, where bot play is very close to optimal, the tree of likely-optimal openings in area scoring in each case does appear to be much bushier than the tree of likely-optimal openings in territory scoring.
It also tends to be the case that the tree of optimal area scoring openings for both players is a near-superset of the tree of optimal territory scoring openings, consistent with the idea that territory scoring is “sharper” than area scoring. For example, on 9x9, the vast majority of openings fall somewhere into the following series of buckets if both players play perfectly after that opening:
…
- I - Before komi, Black will be ahead by 4 points territory, and 5 points area
- J - Before komi, Black will be ahead by 5 points territory, and 5 points area
- K - Before komi, Black will be ahead by 6 points territory, and 7 points area
- L - Before komi, Black will be ahead by 7 points territory, and 7 points area
- M - Before komi, Black will be ahead by 8 points territory, and 9 points area
- N - Before komi, Black will be ahead by 9 points territory, and 9 points area
…
(with more extreme buckets in the same series extending further in both directions)
The starting 9x9 empty position itself as best we can tell, falls into bucket K, leading to a likely optimal-player komi of 6 under territory rules and 7 under area rules. With those komis, every opening line that stays in bucket K continuously is optimal under territory rules, but opening lines that traverse back and forth between and end in either of K and L are optimal under area rules, hence the area scoring optimal tree being both much bushier and nearly a superset of the territory tree.
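As a small illustration of why the area tree is a near-superset, the bucket list can be written down and checked directly. This is only a sketch of the common case described here: the `BUCKETS` table copies the six listed buckets, the komi values are the likely optimal-player ones mentioned in this post, and the function names are my own.

```python
# Bucket -> (Black's territory lead, Black's area lead) before komi,
# copied from the 9x9 list in this post.
BUCKETS = {
    "I": (4, 5),
    "J": (5, 5),
    "K": (6, 7),
    "L": (7, 7),
    "M": (8, 9),
    "N": (9, 9),
}

TERRITORY_KOMI = 6  # likely optimal-player komi under territory rules
AREA_KOMI = 7       # likely optimal-player komi under area rules


def result(lead: int, komi: int) -> str:
    """Classify the game from Black's lead before komi."""
    margin = lead - komi
    if margin > 0:
        return "B+%d" % margin
    if margin < 0:
        return "W+%d" % -margin
    return "draw"


def drawn_buckets(rule: str) -> set:
    """Buckets that end in a draw under 'territory' or 'area' scoring."""
    index, komi = (0, TERRITORY_KOMI) if rule == "territory" else (1, AREA_KOMI)
    return {
        name for name, leads in BUCKETS.items()
        if result(leads[index], komi) == "draw"
    }


for name, (terr, area) in BUCKETS.items():
    print(name, "territory:", result(terr, TERRITORY_KOMI),
          "| area:", result(area, AREA_KOMI))
```

Only K is drawn under territory, while both K and L are drawn under area, so any line that wanders between K and L stays optimal under area scoring but not under territory scoring.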
A small fraction of opening lines break the above pattern, resulting in a few lines optimal under territory scoring but not under area scoring. This could happen with unusual sekis, but empirically on small boards 7x7 through 9x9 where dame points can run out rather quickly, the more common reason instead appears to be winning a ko post-dame (link to a detailed example on 7x7: KataGo Opening Books - 7x7 Highlights and Discoveries).
As a direct consequence of territory scoring being “sharper” as discussed above, one would tend to expect the opposite to be true for a decent number of positions when a bot is near optimal but not entirely optimal. For example, there are plenty of openings where the bot may be very sure the position belongs to either K or L and not any more extreme bucket, but has some uncertainty as to which of K or L exactly. For such positions, the bot will be very confident about the final score and winner under area scoring, but much less confident about the final score and winner under territory scoring, because K and L have the same outcome under area scoring but differ in territory scoring.
Yep! The win% uncertainty being higher is expected: territory scoring discriminates more finely between outcomes and has fewer optimal moves, which one still has to find consistently in order to maintain a draw, so misjudgments and blunders are much easier. The score uncertainty being higher isn’t as automatically expected. It might depend on which buckets the uncertainty mostly lies between across all the future branches that need to be read out (e.g. uncertainty mostly on K/L means territory score uncertainty is higher, uncertainty mostly on L/M would mean area score uncertainty is higher).
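The K/L versus L/M example can be checked with a quick calculation. This sketch assumes, purely for illustration, that the bot splits its belief 50/50 between two adjacent buckets, and it uses the standard deviation of the final score as a stand-in for “score uncertainty”; a real bot's beliefs will of course be messier.

```python
from statistics import pstdev

# Bucket -> (Black's territory lead, Black's area lead) before komi.
BUCKETS = {"K": (6, 7), "L": (7, 7), "M": (8, 9)}


def score_spread(bucket_a: str, bucket_b: str) -> tuple:
    """Return (territory_std, area_std) for a 50/50 belief over two buckets."""
    territory_scores = [BUCKETS[bucket_a][0], BUCKETS[bucket_b][0]]
    area_scores = [BUCKETS[bucket_a][1], BUCKETS[bucket_b][1]]
    return pstdev(territory_scores), pstdev(area_scores)


print("K/L:", score_spread("K", "L"))  # territory differs, area does not
print("L/M:", score_spread("L", "M"))  # area differs by 2, territory by 1
```

Under this assumption, doubt between K and L shows up only in the territory score, while doubt between L and M shows up in both and is larger in the area score.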
Just out of curiosity, which model do you use to test the 9x9 positions? There is a refined 9x9 network that is trained specifically on the 9x9 board size. Do you test with it? IIRC, it doesn’t output the same winrate as the general model, and its score prediction is more refined and very sensitive to the komi setting.
This makes sense now. Thank you!
Do you know whether something similar holds under area scoring with a parity correction or similar dynamic komi rules?
The concern I have is that this precision is effectively arbitrary and that the uncertainty is being generated by virtue of favoring one, otherwise equivalent, territory parity outcome over the other. And in a vacuum, moving points around based on parity of a portion of the (area) score seems inherently “off” and uncertainty-generating.
Is there a way to use this data to rule out that concern?
Because it seems like you would have a similar result if you injected some random noise that would arbitrarily favor some fraction of the moves at each step.
Do you know if these properties hold under area with a parity correction for 6.5 / 7.5? If so, that would resolve the discussion in this thread quite cleanly.
Because the lack of match between perfect and fair enough komi under territory is a large part of why I’m skeptical of the extra “precision”.
Alternatively, is there a good explanation for what it means that fair != perfect under territory?
I’m interpreting that, in light of knowledge that a perfect information game is either drawn or solved for one side, to conclude that territory is doing something “wrong” and we just don’t know what. Is that reasonable? Or is there some better explanation?
I’m not sure I understand how you are reaching that conclusion on the basis of the two pages you linked. When I requoted them, did I go to the right place? If so, can you walk me through how you reached this conclusion?
Yes, I trained that 9x9 net specifically for the purpose of generating a better 9x9 book, since an earlier attempt at generating a 9x9 book with general networks had difficulty in some variations where the nets were misjudging things.
What do you mean by “parity correction”?
I had meant to ask this earlier. What is your opinion on two button go? AGA rules (pass stones included), territory scoring, except the game ends in two passes, and if the same player passed both first and last, they get an extra point?
The issue with the Taiwan rule is that you have to define “last competitive play”, and that is complicated because of teire and probably other situations I’m not thinking of.
Does this variation of button go fix your issues with normal button / first pass rules? Or would we need to directly adjust based on the territory parity after fill-in to get the results you want? (I am also concerned about potential corner cases where for whatever reason, the parity is off because one player needed to make extra plays. E.g., perhaps there is some situation where resolving a dispute about life and death forces black to make an extra play they otherwise would not have made, thus changing the final parity.)
So, I’m unsure how that should be handled without complication if you are directly looking at the final parity instead of the parity after the first stoppage.
Button go. Or “first player to pass gets a point”. Or “player who makes last competitive move deducts 1 point”. All make area scoring as fine grained as territory. By distinguishing the cases that area scoring normally does not.
I’m wondering if those rules maintain fair = perfect.
That just shifts the score difference before komi, barring seki, from a guaranteed odd number to a guaranteed even number.
I don’t think so. See (Rules and Area and Territory Scoring at Sensei's Library) and other pages linked from there.
That page has several rules; to which were you referring? They generally seem to avoid draws, which makes Fair = Perfect Komi impossible, as no Fair Komi exists.
If your definition of a one point button is used, my objection holds. If you use a half point button, then neither color is favored by it, and thus Perfect Komi must remain the same (7.0), but all scores will end with .5, which means that Fair Komi must also end in .5 to allow draws, and that can’t equal 7.0.
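The parity claims here can be checked by brute force. The sketch below assumes no seki (every point on the board counts for one side) and uses a 9x9 board; `area_margin` and its parameters are names made up for this illustration, not part of any ruleset text.

```python
from fractions import Fraction

BOARD_POINTS = 81  # 9x9; 19x19 has 361, which is also odd


def area_margin(black_area: int,
                button: Fraction = Fraction(0),
                black_takes_button: bool = True) -> Fraction:
    """Black's area margin before komi, assuming no seki (all points scored)."""
    white_area = BOARD_POINTS - black_area
    margin = Fraction(black_area - white_area)
    return margin + button if black_takes_button else margin - button


all_areas = range(BOARD_POINTS + 1)

# Plain area scoring: the margin is always odd.
print(all(area_margin(b) % 2 == 1 for b in all_areas))
# One point button: the margin is always even.
print(all(area_margin(b, Fraction(1)) % 2 == 0 for b in all_areas))
# Half point button: the margin always ends in .5, so komi 7 can never draw.
print(all(area_margin(b, Fraction(1, 2)).denominator == 2 for b in all_areas))
```

This only confirms the arithmetic of the objection. Whether a pair of effective komi values can still be called “fair” and “perfect” in some extended sense is a separate question that the sketch does not address.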
Focusing on these two buckets that positions can fall into given optimal play:
- K - Before komi, Black will be ahead by 6 points territory, and 7 points area
- L - Before komi, Black will be ahead by 7 points territory, and 7 points area
The positions that fall into K are generally the ones where “Black can achieve 7 points area, but will finish the game in gote” (including finishing the dame, because dame are gainful moves worth points in area scoring), whereas the positions that fall into L are generally the ones where “Black can achieve 7 points area, but will finish the game in sente”.
(Of course, post-dame ko resolution, unusual sekis, and other beastly positions still mean there are exceptional differences between rulesets that don’t fit into the above pattern. This is just talking about the “common” case that the bulk of games fall into.)
So there’s a reasonable sense in which the territory precision isn’t an entirely arbitrary or random thing. If black can achieve the same area result but do so more efficiently, in one fewer total move in the game, then black will do better under territory scoring but area scoring won’t recognize the difference. That’s where the extra precision comes from. I do think it is more often the case under area scoring than under territory scoring that one can play a locally “slack” move, one that is one move less efficient locally in getting all the points you would have gotten, and still achieve the same whole-board result: if you would have ended the whole game in sente before, you can now still end the whole game in gote, using one extra move to fill the dame you could have claimed more efficiently initially. (And in the common case, the so-called “slackness” of the move will also be identifiable by noting that it is one point worse in territory.)
Yes, Button Go and other variations on it (Button Go at Sensei's Library) are an attempt at discriminating “achieved the same area score ending in gote” and “achieved the same area score ending in sente” directly rather than via the concept of territory, by using area scoring and awarding a half-point to the player who can finish in sente. This achieves the extra scoring precision in the same way that territory scoring does, but as usual a variety of exceptional positions and rules beasts mean that, as with any other ruleset, this one also isn’t exactly equivalent to the others and occasionally has its own distinct cases.
Territory scoring or button Go is also not the final endpoint in terms of this kind of “precision”, although either of these or area scoring are sensible stopping points because going further starts to get pretty “unnatural”-feeling. But if one were to introduce a stack of buttons with values 15, 14.75, 14.5, 14.25, … (values spaced every quarter point), such that on each turn one could either make a move on the board or take a button, then I believe the way the math/game theory works out, the game would tend to become sharper still, with some moves that are “on average” slightly point-losing but not always punished under territory scoring now being further discriminated. (In particular, moves that differ in CGT-style value on the order of a half to a quarter point would more often be discriminated rather than sometimes leading to the same final score.) As there is no limit to how fractional such values can get, one can keep making the game more precise by having finer and finer button increments.
Does that help? (Also feel free to correct me if I made a mistake. Getting into this level of detail, I might have overlooked something.)
There are multiple, mostly equivalent options: Taiwan rule, button go (worth half a point), two button go (first and last pass both compensated), first pass, etc.
The result is that in half the games, komi is 7.5 and in half it is 6.5. So, there are two komi. And I’m wondering whether, for that rule set, that pair of komi is both perfect and fair. Yes, there will be even fewer draws, but within the draws allowed by those rules, these ought to be the komi that maximize them, given that there is now a move that is worth half a point. And because it is dynamic, they should keep the game fair as well.
But this is theory, I’m wondering if anyone knows whether this has been AI tested.
Ah. This is insightful. Thank you for the explanation.
Is there some consensus on the “least bad” version of this? Has anyone done extensive AI examinations of any of them?
Intuitively, it seems like 2 button comes into play less often, and so would introduce fewer oddities by virtue of that. But it’s also hard to think of situations where it comes up at all, so maybe there’s something where it is grossly wrong.
And I’m aware of a version of Ikeda Territory using 3 passes to end each phase, which seems to avoid most of the rules beasts those types of rule sets run into (it’s the 5th commentary on Jasiek’s website). Perhaps that’s what we should be using, since it seems pretty close to what territory mode in KataGo actually does. But I’ve never really been a fan of these multiphase rules, because then you have to learn what things need doing in which phase. Though if it’s fairly close to actual territory rules except in cases that are already complex for territory rules, then maybe that’s not the end of the world.
I think that it’s worth reminding that the original post was asking some pretty basic questions about how to score a game, such as whether a dead stone in territory counted as one or two points, and how life and death disputes are resolved.
In the context of helping beginners to understand the game, I think that the scoring granularity advantages of territory scoring are largely irrelevant and vastly outweighed by the complexities of Japanese (or Korean) rules. Ultimately, beginners who are supposedly taught to play under the Japanese rules are instead given simplified approximations of those rules, which often leads to confusion later on. Look at how many players expect Japanese rules to allow simply playing out any life/death disputes, while being unaware that there are special ko rules that apply for life/death resolution.
One often claimed advantage of teaching under the Japanese rules is the supposed emphasis on territory (minus prisoners) as being the important deciding factor. However, this objective is not essentially changed by whether one uses area or territory scoring. It’s also possible to teach how the area score basically boils down to territory minus prisoners, when one considers how both players usually play almost the same number of stones on the board.
It’s also not a simple dichotomy between territory and area scoring. As already discussed, button go is one way to have finer scoring granularity while keeping the simpler life/death resolution rules of area scoring. Another way is to adopt Lasker-Maas rules.