Mastering Tic-Tac-Toe with the Minimax Algorithm
Understanding Tic-Tac-Toe
Before we dive into the Minimax algorithm, let’s review the rules of Tic-Tac-Toe:
- The game is played on a 3x3 grid.
- Two players take turns, one using “X” and the other using “O.”
- The objective is to get three of your symbols in a row, either horizontally, vertically, or diagonally.
- If the entire grid is filled without a winner, the game ends in a draw.
Tic-Tac-Toe has a relatively small game tree, making it an ideal candidate for demonstrating the Minimax algorithm. Despite its apparent simplicity, Tic-Tac-Toe offers an opportunity to delve into the world of artificial intelligence and algorithms. We will explore the Minimax algorithm and how it can be used to create an unbeatable Tic-Tac-Toe AI.
The Minimax Algorithm
The Minimax algorithm is a decision-making technique commonly used in two-player games with perfect information, such as Tic-Tac-Toe. It is designed to determine the best move for a player by considering all possible moves and their outcomes.
The algorithm operates on the assumption that both players are playing optimally, i.e., trying to maximize their own chances of winning while minimizing their opponent’s chances. The name “Minimax” comes from the two key components of the algorithm:
-
Minimize: When it’s the opponent’s turn, the algorithm aims to minimize their potential gains. It assumes the opponent will make the best move for them.
-
Maximize: When it’s the AI’s turn, the algorithm aims to maximize its potential gains. It selects the move that leads to the highest possible outcome.
Let’s attempt to mathematically model our problem. We denote by \(t\) the number of actions played since the beginning of the game (so \(t\le 9\)), and we denote \(S_t\) as the state of the game at time \(t\). We can summarize the state of the game as a sequence of actions:
\[S_t = [a_1, \dots a_t]\]The objective is to find the best action \(a_{t+1}\) to play and transition the game into the state \(S_{t+1} = S_t + [a_{t+1}]\). For this, we will need to construct a heuristic that assigns a score to possible actions. We will simply choose the action that produces the highest score.
Thus, denoting \(h\) as the heuristic, we have
\[a_{t+1} = \underset{a}{\mathrm{argmax}} \: h(S_t+[a],\,0,\,\textbf{min})\]The Heuristic
The heuristic is written as \(h(S,d,\textbf{op})\) where \(S\) is the evaluated state of the game, \(d\) is the depth of recursion, and \(\textbf{op}\) is the next score selection policy (\(\textbf{min}\) or \(\textbf{max}\)).
\[h(S,\,d,\,\textbf{min}) = \left\{ \begin{array}{cl} -10 + d & : \text{if} \ S\ \text{is a loosing state} \\ 10 - d & : \text{if} \ S\ \text{is a winning state} \\ 0 & : \text{if} \ S\ \text{is a draw state} \\ \underset{a}{\textbf{min}}\ h(S+[a],\, d+1,\, \textbf{max}) & : \text{otherwise} \end{array} \right.\]The choice of the heuristic is what constitutes the intelligence behind the minimax algorithm. We assign a score of \(0\) when the game results in a draw. A negative score when it’s a loss with a penalty that increases as the loss becomes imminent. And a positive score when it’s a win, with a higher score as the win becomes imminent.
The imminence is determined by the term \(d\), which represents the number of recursions in the calculation of the heuristic.
And that’s all there is to it! The Minimax algorithm is a powerful technique for creating intelligent game-playing agents. By systematically exploring all possible moves and evaluating their outcomes, the AI can make decisions that are difficult to beat. And for the Tic-Tac-Toe, the IA is even unbeatable.
Application to the Tic-Tac-Toe game
To use the minimax algorithm for tic-tac-toe, you need to define the game state space. You can choose a 3x3 character array where:
- ’ ‘ represents an empty cell,
- ‘X’ represents a move made by the computer (AI),
- ‘O’ represents a move made by the player.
First, you’ll want to write an evaluation function that assesses the state of the game.
Then you need to implement the heuristic function
And finally the function that use the heuristic to do the action