Encoding tic-tac-toe into integers
Practice challenge: design an encoding for tic-tac-toe into a number.
Ok so all data on (non-quantum) computers is stored in binary, which can be converted to and from integers like so:
binary string: 10101 integer: 2 ** 4 * 1 + 2 ** 3 * 0 + 2 ** 2 * 1 + 2 * 0 + 2 ** 1 = 16 + 4 + 1 = 21
A common sort of coding question for an interview is to model a game like tic-tac-toe. With its 3x3 game board, it's common to do a nested sort of 2d array:
XOX OOX XXO
=>
X = 'X' O = 'O' game = [ [X, O, X], [O, O, X], [X, X, O], ]
And while you can always serialize any data structure into a string, or just take a compact string representation like "XOXOOXXXO" as you'd read english text, let's come up with a more compact way of representing this game.
You might think to do it as a binary string with X being 1 and O being 0, but there's another possibility: a square can be empty. For example the starting game board could be represented by 9 blanks, like " ".
For a trivial reference point, let's do as an exercise the encoding using the standard ascii character set, using the characters X, O, and space.
In ascii, characters range from 0 to 127: they are stored in 7 bits. For example we have from https://en.wikipedia.org/wiki/ASCII:
Space is decimal 32: 010000 X is decimal 88: 1011000 O is decimal 79: 1001111
Let's introduce a space into our sample board game for variety. Then we have the following board:
OOX XO XX
"X wins!"
encoded as: 1001111 1001111 1011000 010000 1011000 1001111 1011000 1011000 010000
Seven bytes for each of the 9 spaces, so we have: 7 * 9 = 63 bits, or 8 bytes with 1 bit left over. That's actually extremely compact... on a 64-bit computer such as this, the entire game state would fit in a single [word https://en.wikipedia.org/wiki/Word_(computer_architecture)]. So compacting any further is totally extra... let's do it anyways.
Since we know the game values can either be empty, X, or O, we know a square-by-square representation needs to have at least 2 bits to encode each value: a single bit couldn't distinguish between the 3 values.
2 possible equivalent encodings fall out of this. We can start at 0 with empty, 1 for X (I take X to go first) and 2 for O. In binary then:
EMPTY = 00 X = 01 O = 10
We then have the undefined value 11, not to be used!!!
Equivalently, you could think of this as a 2-bit array, where the first element is the value of O and the second element is the value of X. Then the value 11 could be interpreted as both X and O being filled in a square... again this behavior is undefined but it gives you an idea for the kind of glitches that can fall out of game programming.
Then we have the much-more-compact representation of the "X wins!" game as follows (here it is again for reference):
OOX XO XX
10 10 01 00 01 10 01 01 00
This comes in at a compact 2 * 9 = 18 bits, or 3 bytes with 6 bits unused.
But can we do better? We already saw this encoding allows for the undefined value 11, which means 25% of our possible values are unused.
One thing to consider is symmetry. A 9x9 grid can be rotated 4 ways, so we could theoretically come up with a canonical rotation and then use 2 bits to store which of the 4 orientations this canonical representation is in. For example, the board with a single X:
X__ ___ ___
is similar to
___ ___ __X
and we could say this x-in-the-lower-right-corner is the first one rotated 2 times. This doesn't achieve anything actually, since we cut down on board representations by a factor of 4, but need 2 bits to encode each of the 4 rotations. And if we have an empty board or a board with an X in the middle, we'd still be storing its rotation, though it doesn't matter the rotation, they are all the same board... so that doesn't get us anywhere in density, this kind of isomorphism can be useful for building game logic: since the rotation of the game doesn't affect the strategy, rotating the board into a canonical representation can cut down on duplicity.
Thinking about the game for a second: the first move can be thought of either being center, edge, or corner. Then, with the rotation, as we discussed above, we can rotate the game so the move is always the top left corner, or the top edge, or the center (which doesn't care about rotation). But the player might not like it if they click in the bottom left and their move goes to the top left, even if we argue it's isomorphic it's not what the player wanted...
Another place we might be able to achieve information density is the fact that some board states are unreachable. For example:
XXX XXX XXX
can never happen, as once there is a line vertically, horizontally, or diagonally, the game ends.
Back to the rotational symmetry, we could imagine the "X wins!" game being logged as follows:
GAME START X CENTER O SIDE ROTATION 1 X SIDE ROTATION 0 O SIDE ROTATION 2 X CORNER ROTATION 1 O CORNER ROTATION 0 # bad move! X CORNER ROTATION 3 # winner!
Since the order of play is static, we can leave this out of the log. Also rather than using a 4-bit rotation, let's use more descriptive words. This game is then:
GAME START CENTER RIGHT SIDE TOP SIDE BOTTOM SIDE TOP RIGHT CORNER TOP LEFT CORNER BOTTOM LEFT CORNER
As we can see, the sides only take one descriptor (there are 4 of them) and the corners take 2 descriptors (top or bottom, left or right). So this game is unambiguous (let's also abbreviate GAME START to START)
START CENTER RIGHT TOP BOTTOM TOP RIGHT TOP LEFT BOTTOM LEFT
Now let's take out START from our "language" by allowing games to be "blank": "" meaning no moves played, and take out spaces. This game is then:
CENTER RIGHT TOP BOTTOM TOPRIGHT TOPLEFT BOTTOMLEFT
Now we're getting somewhere. Games can have up to 9 moves, so let's create one which is a draw:
CENTER RIGHT TOP BOTTOM LEFT BOTTOMLEFT BOTTOMRIGHT TOPLEFT TOPRIGHT
I finally feel we are getting to the information-theoretic maximum density. Games can have up to 9 moves, and each move is uniquely determined by its position. Each move is one of 9, so we can represent that in 4 bits. Then a 9-move game of 4 bits each comes out to 36 bits... 5 bytes. Huh we're actually less compact than our 2-bit square state encoding (18 bits).
Ok, the absolute maximum density can be found by taking the base 2 logarithm of the size of the set of board states. So what are the board states?
- empty (0 moves): 1 of these
- 1 move: there are 9 of these
- 2 moves: for each of the 9 1-moves, there are 8 possible next moves. 9*8 = 72.
- 3 moves: similarly, for each of the 2-moves, there are 7 possible next moves. 72 * 7 = 504
- 4 moves: 6 possible moves here. 504 * 6 = 3024
- 5 moves: 5 moves here. 3024 * 5 = 15120
Now the game might be over: if X started making a row and O never stopped it, x wins. For example
X X X O O _ _ _ _
As we know, games end when a line is made horizontally, vertically, or diagonally. There are 3 horizontal lines, 3 vertical, and 2 diagonal. For the top horizontal row, O can take any 2 of the 6 bottom spots (doesn't matter which 2 O takes, they are all losing moves).
Here are the O positions among those 6:
O O _ _ _ _ O _ O _ _ _ O _ _ O _ _ O _ _ _ O _ O _ _ _ _ O _ O O _ _ _ _ O _ O _ _ _ O _ _ O _ _ O _ _ _ O _ _ O O _ _ _ _ O _ O _ _ _ O _ _ O _ _ _ O O _ _ _ _ O _ O _ _ _ _ O O
So we get 5 + 4 + 3 + 2 + 1 = 15 ways to distribute the 2 Os among the 6 losing spaces.
There are 8 lines for X to win on turn 5, and each of them can have 15 ways for O to position itself. But the order matters:
1 2 _ _ _ _
is different from
2 1 _ _ _ _
Similarly, the 8 lines of X can each be made different ways: 1 2 3, 1 3 2, 2 1 3, 2 3 1, 3 1 2, 3 2 1 = 6 ways.
So there are 8 * 15 * 6 * 2 = 1440 ways for the game to end on turn 5. Those are all valid but the game ends there. So for the 6th move, having 4 possibilities, rather than multiplying 4 * 15120 move 5 positions, we subtract the finished positions: (15120 - 1440) * 4 = 13680 * 4 = 54720 move 6 board states. Almost there!
There are again 8 lines O could have finished here, with 6 permutations each, and the 3 X moves that occupy the remaining 6 spaces have 15 ways to be distributed now with 6 permutations (rather than 2 as before).
8 * 6 * 15 * 6 = 4320 game-ending move 6 positions.
Continuing on, 54720 - 4320 = 50400 * 3 move 7s: 151200 positions. 8 lines end the game with a win for X, this time with 4 "extraneous" (non-game-winning-critical) moves. Those moves can be in any of the other 6 positions... I'm getting a bit tired of this but let's figure out what to subtract out. Of the 6 other positions, there will be 1 X and 3 Os. Let's list em out:
(ordering scheme: X, O, _, meaning if you can put an X before O, do it, if you can putvO before _ do it) X O O O _ _ X O O _ O _ X O O _ _ O X O _ O O _ X O _ O _ O X O _ _ O O X _ O O O _ X _ O O _ O X _ O _ O O X _ _ O O O O X O O _ _ O X O _ O _ O X O _ _ O ...
Alternatively since we're going to have to do ordering anyways, consider instead of X O O O, 1 2 3 4, with 2 blanks.
1 2 3 4 _ _ 1 2 4 3 _ _ 1 3 2 4 _ _ 1 3 4 2 _ _
Well whatever. As an upper bound, let's say the game didn't end, so we keep all the 151200 positions. Then there are 2 choices for the next move: 302400, and 1 final move. 302400. The upper bound for the total then is:
1 + 9 + 72 + 504 + 3024 + 15120 + 54720 + 151200 + 302400 + 302400
= 829450
log2 of that is just under 20... whoops we're more than our 2-bit encoding again! Symmetry is actually important as it turns out, the game state representation doesn't distinguish between
X O O _ X _ _ _ _
with the following orderings:
1 2 4 _ 3 _ _ _ _
1 4 2 _ 3 _ _ _ _
3 2 4 _ 1 _ _ _ _
3 4 2 _ 1 _ _ _ _
For the first move there is only 1 permutation. For the second also there is only 1 permutation. But for the 3rd, every board state can be gotten to 2 different ways.
1 9 72 / 2 = 36
Then for the 4th move, it can be achieved 4 different ways as we see above for example.
36 * 7 / 4 = 63. Much less than our original 504 here...
Keeping on going like this, we have as a new upper bound (leaving out winning positions, that's complicated as we saw)
1 9 36 63 63 * 6 / 6 / 2 ?? no longer an integer.
Ok so if we're representing the game this way we can disambiguate between the order the players moved, but the board state will be the same.
Yet another method: the first x position is one of 9. The second move, the first O, is one of 8. So we have the following options:
x0: 9 choices x1: 7 choices x2: 5 choices x3: 3 choices x4: 1 choice
O0: 8 choices O2: 6 choices O4: 4 choices O6: 2 choices
So we can encode the choices they make as follows:
x0: 4 bits x1: 3 bits x2: 3 bits x3: 2 bits x4: 1 bit (all this bit tracks is "did they make the last move" since it's not a choice) O0: 3 bits O2: 3 bits O4: 2 bits O6: 1 bit
Adding it all up... 4 + 3 + 3 + 2 + 1 + 3 + 3 + 2 + 1 = 13 + 9 = 22 bits... huh it's hard to beat the 18 bit naive packing!
What about the fact that our x0 has 9 choices, which we need 4 bits to encode, but then there are 7 "extra" choices we can use elsewhere? Assuming optimal packing, we can add up the choices like so
9 + 7 + 5 + 3 + 1 + 8 + 6 + 4 + 2 = 16 + 8 + 1 + 14 + 6 = 25 + 20 = 45 choices. ceil(log2 45) = 6 bits... interesting. 1 byte! Let's see if we can actually achieve this. For the first move, we use 4 bits to represent where it is on the 9-position grid. So if they take the top left, we have
x0 = 0
versus if they take the top edge: x0 = 1
etc.
0 1 2 3 4 5 6 7 8
0000 0001 0010
0011 0100 0101
0110 0111 1000
Then we can squeeze the second move in the unused positions... right?
1 000 1 001 1 010 1 011 1 100 1 101 1 110 1 111
yep! Though we do have ambiguity here as: 1000 could mean "X moved and it is O turn", or "X moved and O took position 0" (meaning X would have taken the top left corner and O would have taken the top edge). So let's use our free 2 bits of our byte to store what move number we are on.
(Edit: actually nope! 1000 is not an unused position, to store these 9 + 8 choices we need 5 bits. But we have 15 values unused)
GAME START 00 0000 01 1000 10 1000
means X went top left and O went top edge.
So much for the first 2 moves. The next X move has 7 choices, which we can represent with a 3-bit number.
So far then just for choices, we have 5 bits for x turn 1 and o turn 2. x turn 2 adds 7 choices; we can fit that in 5 bits. o turn 4 adds 5 choices, still within 5 bits. x turn 5 adds 4 choices, up to 9 + 8 + 7 + 6 + 5 = 35 bits. So we need a 6th bit. But we're almost there, o turn 6 adds only 4 choices, x turn 7 adds 3, o turn 8 adds 2 options, and x turn 9 has only 1 choice. Let's write out some example game states.
0: 00 00 00 | no moves 1: 00 00 01 | x has moved in the top left 2: 00 00 10 | x has moved in the top edge 3: 00 00 11 | x has moved in the top right ... 9: 00 10 01 | x has moved in the bottom right 10: 00 10 10 | x has moved in the top left and y has moved on the top edge 11: 00 10 11 | x has moved in the top left and y has moved on the top right ... 17 = 9 + 8: 01 00 01 | x has moved in the bottom right and y has moved on the bottom edge 18 = 9 + 8 + 1 | x has moved in the top left, y has moved on the bottom edge, x has moved on the top right ... 44: 10 11 00 | x has moved top edge, y top left, x top right, y left edge, x center, y right edge, x bottom left, y bottom edge, x bottom right 45: 10 11 01 | x has moved top left, y top edge, x top right, y left edge, x center, y right edge, x bottom left, y bottom edge, x bottom right... but not necessarily in that order. In fact I think in this scheme it'd be starting at the bottom right and moving left and up.
Cool! So in 6 bits, 1 byte, we have a tictactoe state?
Now would be a good time to make a number<=45 <-> tictactoe converter. Future project...