Depth and Design: Contrasting AI and Human Understandings
This talk was an invited lecture at the AAAI-18 Workshop on Knowledge Extraction from Games.
Slides
Many of the slides have captions; hover over them in the embedded window to see them (if you click on the slides to view them larger, the captions won’t show up).
Video
Original essay notes
These notes were what eventually became the presentation. They include some dead ends, some stuff that I ended up leaving out altogether, and also have at the very end my first notes on a BNF grammar for games.
Suppose you take a simple game like flipping a coin. There is a rich set of mathematical stuff behind it, but it boils down to a binary outcome. Most games have rich stuff behind them, but boil down to binary outcomes.
Victories are generally measured at the end of an iteration process. One common victory condition is to halt the iteration process itself; this can take the form of blocking all of the opponent’s moves.
If the iteration process is of fixed length and cannot be stopped, the endpoint is arrived at by iterating completely through whatever is used as the counter. For example, each player may have a fixed number of turns; “best of three matches of rock paper scissors” is an example of this.
Or there may be a fixed number of spaces on which to place a counter, as in the game of Reversi. The game always proceeds through precisely the same number of turns.
Or each player may have a fixed number of tokens all of which may be placed.
Let’s term all of those bounded games, and call that number of turns the bounds of the game.
Now, let’s generalize outwards. There may be some invisible or visible counting system, which we can call “victory points” or vp for convenience, and the game is bounded by reaching this arbitrary figure. Many games may bound at merely reaching one – as in Connect Four. In fact, the case of the bounded game is simply one where the number of victory points is determined in tandem with the bounds.
This then implies that games can have either binary or ternary outcomes, with the ternary choice being a tie.
In a two player game, the basic scenario works this way. Both iterated rock-paper-scissors and Othello can be expressed as having the win condition vp > bound/2. If the boundary is an even number, ties are possible, with vpa = vpb = bound/2. If it is an odd number, there is no even split, and there are only two possible outcomes to the overall game process.
In the case of a three player game, these change to vp > bound / 3 and vpa = vpb = vpc = bound/3. We can generalize this to vp > bound / numplayers for a game with binary outcome; for each player, the result is either true or false. A three way tie in a game now demands that the boundary be a multiple of three.
We can then determine whether a game is binary or ternary simply: if the boundary is perfectly divisible by the number of players, the game can reach a tie state and therefore return an indeterminate value (though many of its possible outcomes will still have some player achieving vp > bound/numplayers). If the boundary is not perfectly divisible, a tie is impossible, and the game always outputs a binary result.
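The divisibility test can be sketched in a few lines of Python (a minimal sketch; the function name is my own, not from the talk):

```python
def outcome_type(num_players: int, boundary: int) -> str:
    """Classify a game by whether its victory-point boundary
    splits evenly among the players.

    If boundary is perfectly divisible by num_players, an even
    split (an all-way tie) is reachable, so the game is ternary;
    otherwise no tie is possible and the outcome is binary.
    """
    return "ternary" if boundary % num_players == 0 else "binary"

# Matches the table below:
# outcome_type(2, 64) -> "ternary" (even split at 32, ties possible)
# outcome_type(3, 64) -> "binary"  (21.333... is not reachable)
```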
| number of players | boundary | even split | outcome | ties |
| --- | --- | --- | --- | --- |
| 2 | 64 | 32 | ternary | ties possible |
| 3 | 64 | 21.33333 | binary | |
| 3 | 69 | 23 | ternary | ties possible |
| 4 | 64 | 16 | ternary | ties possible |
| 2 | 65 | 32.5 | binary | |
For a binary game to support varied numbers of players, its bounds must not be perfectly divisible by any of those player counts; either the bounds change based on the number of players, or a common set of bounds is used regardless of the number of players, and it is a figure that is indivisible by every possible number of players. For a binary game to work for both head-to-head and three-player play, the bounds must be divisible by neither 2 nor 3, or else it will be binary in one case and ternary in the other and admit of ties.
| number of players | boundary | even split | outcome | ties |
| --- | --- | --- | --- | --- |
| 2 | 1 | 0.5 | binary | |
| 3 | 1 | 0.333333 | binary | |
| 2 | 2 | 1 | ternary | ties possible |
| 3 | 2 | 0.666667 | binary | |
| 2 | 3 | 1.5 | binary | |
| 3 | 3 | 1 | ternary | ties possible |
| 2 | 4 | 2 | ternary | ties possible |
| 3 | 4 | 1.333333 | binary | |
| 2 | 5 | 2.5 | binary | |
| 3 | 5 | 1.666667 | binary | |
| 2 | 6 | 3 | ternary | ties possible |
| 3 | 6 | 2 | ternary | ties possible |
| 2 | 7 | 3.5 | binary | |
| 3 | 7 | 2.333333 | binary | |
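The bounds that stay binary for both two and three players can be enumerated directly: they are exactly the numbers divisible by neither 2 nor 3, i.e. those congruent to 1 or 5 modulo 6 (a quick sketch; the variable name is my own):

```python
# Bounds that yield a binary (tie-free) game for BOTH 2 and 3 players
# are those divisible by neither 2 nor 3 -- equivalently, the numbers
# that are 1 or 5 modulo 6.
binary_for_both = [b for b in range(1, 30)
                   if b % 2 != 0 and b % 3 != 0]
print(binary_for_both)  # [1, 5, 7, 11, 13, 17, 19, 23, 25, 29]
```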
In some sense, games are generators of outcomes, distillers of processes back down to yes and no and (sometimes) maybe.
Since games are made out of games, it follows that each given subsystem in a game can be broken down into a binary or ternary outcome. And indeed, the idea of “solving” games via full searches of the entire possibility tree works this way. You can “unroll” all the choices that are made, by atomizing the game, and see each decision made as a step along a branching tree.
With some foreknowledge of the tree, a player (human or artificial) can make choices that are steps on the path towards a particular outcome, and can prune away whole swaths of the tree.
A given binary choice has implications not just in the current choice, but in affecting the odds of a given outcome later on.
If two players both always choose optimally on their turns because they have full knowledge of the tree, they have mapped the entirety of the possibility space. If the game is binary, it’s a guaranteed win for one player enforced by the structure of the game. If the game is ternary, it’s either a guaranteed win or a guaranteed tie.
In an unsolved game, players have limited look-ahead. Each will always choose the branch that maximizes the odds of arriving at a win state for that player.
Being forced down the wrong path can therefore be thought of as a parity problem… As players march down the tree, they gain or possibly lose victory points. Any choice after which their victory point total exceeds the other player’s puts them in a positive state; any choice which puts them below or at the other player’s total leaves them in a negative state. In a binary game, when the bounds are reached, if you are at 1, you win, and if you are at 0, you lose. A ternary game permits both players to lose.
A win state isn’t necessarily the last leaf on the tree. If what is left below a leaf cannot alter the final parity outcome, then all subsequent choices are irrelevant.
In some games, given a zero-sum victory condition, each player playing optimally at every step will lead to a loss that is inbuilt in the structure of the system. This result can then be “rolled up” to the earliest point in the tree where we can see that outcome become determined. With perfect knowledge, we can “early out” at that point and the player who has been forced down the wrong path can concede. A game like tic-tac-toe is considered solved because the outcome for optimal players is determined.
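The “roll up” described here is just minimax over outcome labels. A toy sketch (the tree encoding and function name are my own illustration, not from the talk):

```python
# Each internal node is a list of children; each leaf is an outcome
# from the perspective of the player to move at the root:
# 1 = win, 0 = tie, -1 = loss.
def solve(node, maximizing=True):
    """Roll a game tree up to its root value under optimal play.

    The player to move picks the child that maximizes (or, for the
    opponent, minimizes) the eventual outcome. A "solved" game is one
    where this root value is known before any move is made.
    """
    if isinstance(node, int):   # leaf: outcome already determined
        return node
    values = [solve(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A tiny ternary game: optimal play by both sides forces a tie,
# because the opponent punishes the branch containing the win.
tree = [[1, -1], [0, 0]]
print(solve(tree))  # 0
```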
In a game where victory points aren’t distributed evenly, a node in the tree may have branches below it where one branch has many positive parity nodes and another has fewer, because gaining a lead in vps can mean an unassailable positive parity.
A node is a win, lose, tie, or indeterminate. Are tie and indeterminate really the same thing? If so, that means that indeterminacy at the node level is the same thing as a ternary game. Are there tactics or techniques for dealing with ternary situations? The odds of a tie fall with how the scoring system is constructed – the more VPs, the lower the odds of a tie, right? Though depending on how they are allocated (bell curve?), it might flatten out.
Evaluating stuff lower down based on assessing odds of one branch leading to parity. Creating internal victory points, call them “position” as in good or bad position.
Multiple victory conditions effectively mean navigating multiple choice trees at once. Cognitive load caps your lookahead. This is why orthogonal rules are so valuable.
In some games, moves exist solely in order to swap parity positions with the other player. Picture a game where whoever is player 1 will win, because of a first-mover advantage or some other determined setup. If a player can pass in such a way that they now take the place of player 1, shifting the win scenario onto themselves, that can be thought of as a parity shift.
Consider what happens if the move made by an opponent is unpredictable (random, let’s say, with equal odds of choosing left or right). Model it out.
Pruning the tree based on topological symmetry, as in Othello or tic tac toe.
Players look at choices and select based on odds to win, not degree of victory. They therefore weight each node with percentage chances that are an aggregate of what is beyond.
Think about Set. There are a series of rules for building a set… all the same, all different, on three axes. Humans find certain types of congruency easier to see. All the same on any axis is easiest to match. Next easiest is sequence (1, 2, 3). Next is probably a shading set. A shape set is hardest. Human thinking groups, rather than running brute force across a 12-card spread with all the possible algorithms, and most players preferentially start with that order. A player who focuses on shape-difference sets first may well disrupt the strategies of other players by destroying the viable sets they have remaining in the deck.
If a computer were searching and had image recognition capability, it would have equal capability to group things. It would have no preference in algorithm other than whatever is hardcoded in its if statements, and it would simply match every card against every card by brute force. When competing with other computers or players of equal computational and perceptual ability, the choice of which pattern to match against first is the differentiator. The strategy would lie in selecting an algorithm based on what would be more likely to reduce the types of sets that opponents search first, so that their computation time on average tends to get longer. So if all computers search color first, the computer that searches for patterns that tend to reduce color-commonality sets would gradually come out ahead. In general, a player who searches noncongruity first benefits from a noncongruent (high entropy) spread, and they can work to maximize noncongruent spreads via which sets they choose to call from those available.
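The brute-force search described here is small enough to write out. A sketch, assuming cards are encoded as tuples of four attribute values (the encoding and function names are my own, not standard notation):

```python
from itertools import combinations

def is_set(a, b, c):
    """Three cards form a set iff, on every attribute, the three
    values are either all the same or all different (never two-and-one)."""
    return all(len({x, y, z}) != 2 for x, y, z in zip(a, b, c))

def find_sets(spread):
    """Brute force: test every 3-card combination in the spread."""
    return [trio for trio in combinations(spread, 3) if is_set(*trio)]

# Cards as (count, color, shading, shape), each attribute in {0, 1, 2}.
spread = [(0, 0, 0, 0), (1, 1, 1, 1), (2, 2, 2, 2), (0, 1, 2, 0)]
print(find_sets(spread))  # the first three cards form an all-different set
```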
If different computer opponents kept doing this, then to execute this strategy they would need to ascertain which algorithms an opponent favors. This involves building a mental model of the opponent’s strategy – which requires greater computation, but may pay off overall.
An opponent who selects pattern preference randomly won’t be predictable in this way.
In practice, Set is played on a timer, so you don’t get to consciously engage in that strategy. Instead, the game likely simply favors those who practice perceiving noncongruity. But one can envision a turn-based Set without the timer element where strategy would arise from card-counting and odds calculating what spread you are leaving the opponent with.
We could think of depth as the size of the tree, but that’s false, because any large bounds create that. We could think of depth as meaning keeping the game indeterminate until the last possible moment, but it’s easy to think of very shallow games that do that – plain old random outcomes at every point do. Instead, since depth is a characteristic of choice, the metric has to be consequential choices. What makes a choice consequential? One that entails risk or tradeoffs; in other words, the number of indeterminate or tie nodes in the lookahead space?
Lantz et al. suggest that depth is a quality of the number of intermediate strategies, or heuristics short of optimality, that a player can discover as they proceed toward complete mastery. A game you can play optimally or near-optimally with a computationally cheap strategy lacks depth. A game in which you can’t discover new strategies will also lack depth, because depth is based on skill chaining, on learning. In a sense, depth by their criteria is based on lemmas, on plateaus. But they don’t arrive at a way to actually measure strategy strength, much less how strategy is unveiled over time in the proper order. They rely on computational resources as an axis, but this doesn’t map onto human cognition in a regular, linear way.
They make the point that heuristics rely on regularity within the tree structure. “Control the center” in chess is shorthand for “stay in this area of the state space.” A game with too much regularity, however, falls prey to degenerate strategy; a game with no regularity cannot have any heuristics. Therefore there is a sweet spot in the regularity of the nodes. They conclude that in a game with a mixed amount of regularity, players must alternate between heuristics and raw brute-force search, and that this alternation is what conveys the feeling of depth.
Also see Danc’s comments at http://gamedesignadvance.com/?p=3124 and Frank’s note: “Heuristics are ways of compressing and speeding up search. Every heuristic is a shortcut that reduces the amount of raw search you do. Intuitively, in a game with a high degree of depth, you would expect strategies in the middle ranges to have a balance of heuristics and raw search, because the places they do raw search are the places where they can be improved by additional heuristics.” So the ladder would be: start with raw search and little lookahead, then convert to heuristics, freeing you to do more raw search. I think that logic is missing heuristic abandonment, which is crucial – dismantling a prior heuristic in order to build a new one.
Need toy model game with which to show that. Tiny Castle?
So how do you design for mixed regularity? What does mixed regularity look like? If we made a map of state spaces that showed clusters of configurations, and the high level board eval led us to see those as common groupings, we could probably abstract board configs into “higher-level games” or dense areas… places where many paths in the state space converge and come apart… almost like scale free network maps? Maps that fold over one another as signs of depth? Don’t unroll, teleport to congruent layouts?
This relates to this somehow. https://en.wikipedia.org/wiki/Parity_of_a_permutation
Is the stuff at the top wrong? Are games actually ALL ternary logic? https://en.wikipedia.org/wiki/Three-valued_logic After all, the Unknown truth value is critical to the sense of depth. There are a LOT of permutations of the truth table for ternary logic – 3^9 = 19,683 possible binary operators… http://wiki.c2.com/?ThreeValuedLogic
```
game      → game + result
result    → system( action ) = ( ( -1, [ 0,] 1 ) | 1 )
system    → side + side { [, side] }
side      → player + mechanic + statistic { , statistic }
player    → ( mind + experience + body | AI ) + perception
mechanic  → rule { , rule } + statistic { , statistic }
rule      → token { , token } expression { , expression } statistic { , statistic }
action    → verb [ { + action } ]
verb      → intent + input
input     → affordance + action [ + game ]
```