Diplomatic Go

I think these are interesting ideas, and I definitely want us to explore these questions further, especially for potential adjustments in future games.

I would like to comment in more detail about incentives and strategic options, but I don’t want to personally influence the decision making in the current game. So, I’m purposely holding back on some thoughts until the game is over.

Instead, I would mainly like to focus on some general game-theoretic discussion and comment on the proposed variations, which are quite different from the current game.


However, first let me just remark on this:

That specific comment quoted from me in the rules is more of a strategic clarification/interpretation. Ultimately, however, the players can try to shape the outcome of the game however they wish, and their personal preferences might be more fine-grained (and based on many things that transpire during the game) than the simple distinctions between win, loss, and draw as defined in the rules. I do not intend to enforce any sort of prohibition against “playing for a strong second” or “throwing a game”, and such enforcement could not be done objectively anyway.


The distinction between playing for as large a score as possible and playing to have the largest score significantly changes the nature of the game. Analogously, standard Go is also drastically changed by adjusting the objective from maximizing the probability of victory to maximizing score. Of course, human players do ultimately seem to have some preference toward maximizing their score (in standard Go), but this is typically a secondary and unofficial objective. If one actually changes the objective in standard Go, then considerations about risk and strategy are profoundly changed.

For example, under the standard Go objective, if one is uncertain about the security of a local position but confident of holding a substantial lead, one might play a slack but safer move, ceding part of the lead in exchange for more certainty of preserving the rest. In another example, if one is trailing in expected score, standard Go incentivizes bold, even risky, plays that aim to take the lead, even if such moves come with an increased risk of falling further behind. Changing the objective of a two-player game to score maximization greatly alters the strategic considerations in situations like these.
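To make the contrast concrete, here is a minimal sketch with made-up numbers (not from any actual position): each candidate move is treated as a simple distribution over final score margins, and the two objectives rank the moves differently.

```python
# Hypothetical distributions over final score margins (positive = win by that much).
safe_move  = {+2: 0.9, -1: 0.1}     # small but near-certain lead
risky_move = {+20: 0.6, -5: 0.4}    # big swing, real chance of losing

def win_probability(dist):
    """Probability that the final margin is positive."""
    return sum(p for margin, p in dist.items() if margin > 0)

def expected_margin(dist):
    """Expected final score margin."""
    return sum(margin * p for margin, p in dist.items())

for name, dist in [("safe", safe_move), ("risky", risky_move)]:
    print(name, win_probability(dist), expected_margin(dist))

# safe  -> P(win) = 0.90, expected margin = +1.7
# risky -> P(win) = 0.60, expected margin = +10.0
# Maximizing win probability prefers the safe move; maximizing expected score
# prefers the risky one.
```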

Stone scoring introduces yet another wrinkle with the concept of group tax, which adds an incentive to create fewer and larger groups, and to fill in as many eyes as possible. However, with the multiplayer angle of Diplomatic Go, where life and death is affected by the possibility of multiple eyes being filled simultaneously, the decision of how many eyes to leave in a group (which affects the stone-scoring count) also becomes a diplomatic one. For example, a player with a large group with X single-point eyes might wish to fill in more of those eyes to gain more points, but has to consider whether doing so changes the life and death status of that group, which ultimately depends on the other players’ willingness to attack a potentially vulnerable group. If each player’s preference is simply to maximize their own score (without caring about anyone else’s), then there may be no incentive to help kill a particular group (one not adjacent to any of their own), and such a hostile action might even be diplomatically discouraged by the threat of damaging retaliation.
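A toy illustration of that tradeoff, using my own simplification rather than the actual Diplomatic Go rules: I assume each filled eye converts one empty point into one scoring stone, and that a group is only “unconditionally” safe while it retains at least two eyes (an assumption that, as noted above, may not even hold when multiple opponents can fill eyes simultaneously).

```python
def stone_score(stones, eyes, eyes_to_fill):
    """Stones on the board after filling some of the group's one-point eyes."""
    assert 0 <= eyes_to_fill <= eyes
    return stones + eyes_to_fill   # each filled eye becomes one more scoring stone

def unconditionally_alive(eyes, eyes_to_fill):
    """Simplifying assumption: the group is safe only with two or more eyes left."""
    return (eyes - eyes_to_fill) >= 2

# A hypothetical 30-stone group with 5 one-point eyes:
for fill in range(6):
    print(fill, stone_score(30, 5, fill), unconditionally_alive(5, fill))

# Filling up to 3 eyes gains 3 points while keeping two eyes.
# Filling a 4th eye gains one more point but leaves the group killable,
# so its value hinges on whether any opponent is willing to attack.
```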


Regarding payout design and general game-theoretic considerations: a game might assign numerical payouts to various outcomes, but it is important to note that such payouts do not necessarily correspond to the utility functions implicitly defined by player preferences (even if those preferences generally align with the objective of maximizing one’s payout).

The expected-utility perspective on rational behavior requires knowledge of the utility function in order to understand or predict behavior. That behavior depends on the individual’s risk tolerance or aversion, which is not just a function of expected payout but also of higher-order moments (like variance) and, in general, the entire distribution over payouts (when dealing with the uncertainties present at an intermediate stage of the game).

A beautiful foundational result in game theory establishes that (under mild assumptions that a player’s preferences are self-consistent and non-pathological) a utility function does indeed exist such that maximizing expected utility with respect to that function captures all of the nuances (including risk) inherent in the player’s preferences. However, explicitly determining this utility function from a player’s individual preferences is not at all obvious, and is generally impossible without a deep understanding of those preferences. Such preferences might also depend on various externalities and psychological factors arising from interaction with the other players. Thus, while a game can establish payouts for various outcomes, which will influence players’ preferences and their implicit utility functions, it cannot hope to fully dictate those utility functions.
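Here is a small sketch of that last point, with hypothetical payouts and a made-up concave (risk-averse) utility function; the actual utility implied by a real player’s preferences is, as argued above, generally unknowable from the payout table alone.

```python
import math

# Two lotteries over final payouts (hypothetical numbers):
sure_thing = {50: 1.0}              # guaranteed moderate payout
gamble     = {120: 0.5, 0: 0.5}     # higher expected payout, high variance

def expected_value(dist, f=lambda x: x):
    """Expected value of f(payout) under the given distribution."""
    return sum(p * f(x) for x, p in dist.items())

# One possible risk-averse utility function (an illustrative assumption).
u = lambda x: math.sqrt(x)

for name, dist in [("sure", sure_thing), ("gamble", gamble)]:
    print(name, expected_value(dist), expected_value(dist, u))

# sure   -> E[payout] = 50, E[u] ~ 7.07
# gamble -> E[payout] = 60, E[u] ~ 5.48
# The gamble maximizes expected payout, yet this risk-averse player prefers
# the sure thing: the payout table alone does not pin down behavior.
```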
