# Introduction

This article will summarize the work of different people in trying to solve different types of Texas Hold’em Poker, specifically cash limit games and no-limit tournaments. This paper does not discuss dynamic/adaptive strategies that try to exploit weaknesses in the opponent, but focuses on works that analyze poker through a game-theoretic lens, attempting to find or approximate equilibria. The first part of the article will discuss topics relating to two-player zero-sum games in general, with a focus on linear programming, the next part will go through a poker-related example of using these tools, and the last part will discuss the practical limitations of poker analysis and how to overcome them.

# Basic Poker Background

Texas Hold’em Poker has different variations, but the basics of the game are generally the same. I will not explain it all in detail, but here is the general flow. First, two players have to put up an ante (initial money into the pot), called the small blind and the big blind, the size of which depends on the game (the small blind is half the big blind). Then everyone is dealt a hand of two cards face down, and there is a round of betting (this is the pre-flop stage of the game). Then there is the flop, where 3 cards are drawn face up from the deck, and another round of betting happens. Then there are two more rounds, the turn and the river. In both the turn and the river, one more card is dealt face up and a round of betting happens. At this point, there are five face up “community cards." Everyone who hasn’t folded reveals their cards and the player with the best 5-card poker hand formed from their two-card hand and the 5 community cards wins the pot. In tournament poker, people play until there is only one person who still has chips, and receive prizes based on where they finished. In limit poker, there are restrictions on how much you can bet and how many times you can raise (normally a maximum of 3 raises after the first bet).

# Different Types of Texas Hold’em

A large chunk of the work done on analyzing Texas Hold’em Poker has been focused on the two-player cash limit game. When analyzing cash games, you are optimizing your expected return on playing one hand of poker; you don’t have to worry about stack size (how many chips you have left, normally measured in big blinds), as you play under the assumption that you are playing for stakes that fit your bankroll, so if you lose all your chips you can just buy in again. This is in contrast to tournament games, where you may play differently depending on your stack size.

One reason two-player Texas Hold’em games are a lot easier to analyze than games with larger than those with three or more players, besides the fact that there are less possible betting sequences, is that two-player zero-sum games can be solved exactly (given enough time for computation) with linear programming, which will be discussed later. On the other hand, games with three or more players require other techniques.

# Introduction to Minimax Theorem

Heads up cash games of Texas Hold’em poker are zero-sum, meaning that in every possible outcome, your payoff is the additive inverse of your opponent’s payoff. In such games, the optimal strategy $$s$$ for a player is a “maximin" strategy: the one that maximizes the player’s payoff assuming that the adversary will play the strategy that minimizes the player’s payoff, given that the player plays $$s$$. In other words, you play the strategy that maximizes your payoff in the worst-case scenario. This is an informal way of stating a result that follows from the Minimax Theorem, which says that if player 1 and player 2 both play their “maximin" strategies, the difference between their worst-case payoffs will be 0. In other words, any maximin strategy for player 2 is a best response to any maximin strategy for player 1, and vice-versa, and the only equilibria will be the ones where both players are playing maximin strategies.

Note: This statement (the only equilibria are when both players play maximin strategies) is extremely important, as it means that all equilibria lead to the same expected payoff for player 1. This payoff is often called the “value" of the game. In games with more players, there could be multiple equilibria that give player 1 different expected payoffs. This is another reason why two-player zero-sum games are especially nice, as you do not have this issue. As an exercise, try to prove the statement (Hint: do a proof by contradiction).

# Formal Statement of Minimax Theorem

The formal statement of the Minimax Theorem (the most specific version) is that for compact convex sets $$X \subset \mathbb{R}^m$$ and $$Y \subset \mathbb{R}^n$$, and $$A \in \mathbb{R}^{m \times n}$$, ${\displaystyle \max _{x\in X}\min _{y\in Y}x^{T}Ay=\min _{y\in Y}\max _{x\in X}x^{T}Ay}$ This can be applied to games by representing player 1’s strategy as an $$m$$-dimensional vector $$x \in X \subset \mathbb{R}^m$$ and player 2’s strategy by an $$n$$-dimensional vector $$y \in Y \subset \mathbb{R}^n$$, and choosing a “payoff matrix" $$A$$ such that player 1’s expected payoff is $$x^TAy$$. (The strategies these vectors represent will depend on the game. For example, in rock-paper-scissors you could have a 3-dimensional vector, with each element representing the probability of the player throwing rock, paper, and scissors.)

Usually, $$m$$ is the number of strategies player 1 can play, entry $$i$$ of the vector $$x$$ is the probability player 1 will choose to play strategy $$i$$, and $$A_{ij}$$ is the payoff to player 1 when player 1 plays strategy $$i$$ and player 2 plays strategy $$j$$. This makes it so that $$x^{T}Ay$$ is the expected payoff of player 1 when player 1 plays according to $$x$$ and player 2 plays according to $$y$$. Noting that $$-A_{ij}$$ is the payoff to player 2 when player 1 plays strategy $$i$$ and player 2 plays strategy $$j$$ (since it is a zero-sum game), we see that $$\displaystyle \max _{y\in Y}\min _{x\in X}x^{T}(-A)y$$ is the minimum payoff to player 2 when player 2 plays a maximin strategy, meaning $$\displaystyle \min _{y\in Y}\max _{x\in X}x^{T}(A)y = -\max _{y\in Y}\min _{x\in X}x^{T}(-A)y$$ is the maximum expected payoff to player 1 when player 2 plays a maximin strategy. Since $${\displaystyle \max _{x\in X}\min _{y\in Y}x^{T}Ay}$$ is the minimum expected payoff to player 1 when player 1 plays a maximin strategy, the theorem is saying that the maximum expected payoff to player 1 when player 2 plays a maximin strategy is the same as the minimum expected payoff of player 1 when player 1 plays a maximin strategy. This means that when player 1 plays a maximin strategy, they are guaranteeing themselves an expected payoff of at least $$V$$, and when player 2 plays their maximin strategy, they are guaranteeing that player 1 cannot obtain an expected payoff higher than $$V$$. In other words, if both players play their maximin strategy, there is no change any player can make to their strategy that will give them a higher expected payoff.

(This is actually a specific case of the original Minimax Theorem proved by John Von Neumann in 1928. The more general version can be found here: https://en.wikipedia.org/wiki/Minimax_theorem)

This theorem is deeply connected to the strong duality theorem of linear programming, which in words says that if you have a linear program that tries to maximize a quantity subject to some constraints, there is an alternative (dual) version of the problem where you are trying to minimize a quantity subject to some constraints, so that if either problem has a solution, the solutions are equivalent. It turns out that the left hand side (the maximization part) and the right hand side (the minimization part) are the solution to two linear programs that are duals of each other, so they will evaluate to the same result. We will see one of these linear programs in the following section. If you would like to learn more about duality, look here: https://en.wikipedia.org/wiki/Dual_linear_program

# Connection to Linear Programming

There are many forms of a linear program, but all of them boil down to the following problem statement: “maximize/minimize {linear combination of variables} subject to a list of weak inequalities of the form {linear combination of variables} $$\geq$$/$$\leq$$ {value}." In order to solve for an optimal strategy for player 1 and for player 1’s expected payoff, we want to turn this maximization problem into a linear program: find an $$x \in X$$ that maximizes $${\displaystyle \min _{y\in Y}x^{T}Ay}$$. Unfortunately, the quantity we are trying to maximize is not a linear function yet: it has a “$$\min$$" in it. In order to transform this problem into a linear program, we will add extra constraints.

(Note: We will set $$X$$ and $$Y$$ to be the set of all tuples with non-negative entries that sum to 1. In the example in the next section, this will not be the case, so we will end up with more constraints, but most of the time that is what you would have for $$X$$ and $$Y$$).

Note that $${\displaystyle \mathop{\mathrm{arg\,min}}_{y\in Y}x^{T}Ay}$$ always contains an element $$e_i$$, the $$i$$th unit vector in $$\mathbb{R}^n$$ (the $$i$$th unit vector in $$\mathbb{R}^n$$ is the vector with all 0’s except for the $$i$$th element, which is a 1). This is because $$x^{T}A$$ has dimensions 1 by $$n$$, and the elements of $$y$$ sum to 1, so, letting $$M$$ be the minimum element of $$x^{T}A$$, $$x^{T}Ay \geq M$$. If $$i$$ is the position of an element $$M$$ of $$x^{T}A$$, then $$x^{T}Ae_i = M$$, so $$e_i \in {\displaystyle \mathop{\mathrm{arg\,min}}_{y\in Y}x^{T}Ay}$$. This is a lengthy way of saying that for any strategy player 1 employs, player 2 always has a pure strategy best response.

This means that the statement $$V = {\displaystyle \min _{y\in Y}x^{T}Ay}$$ is equivalent to the following: $$\forall i \in [n]$$, $$V \leq x^{T}Ae_i$$, and $$V$$ is the maximum $$V$$ for which all of these constraints hold. So, the following linear program will work:

Maximize $$V$$ subject to: $x \geq 0$ $\sum_{i=0}^m x_i = 1$ $\forall i \in [n], V \leq x^{T}Ae_i$

$$V$$ will be the expected payoff for player 1, and $$x$$ will be one of player 1’s optimal/maximin strategies. To find player 2’s optimal strategy, one would run a similar linear program.

At this point, you should be wondering how you take a game and come up with $$A$$, $$X$$, and $$Y$$ in the first place. In the following sections we will do a concrete example. You may also be wondering how to solve a linear program. This is a rich topic that is outside the scope of this paper, but the most important thing to know is that they can be solved quickly in practice and in theory (linear in the number of variables and constraints) and there are free open-source resources that you can use to solve linear programs easily.

# Case Study: Preflop Analysis

In 1999, Alex Selby solved an abridged version of Texas Hold’em Poker, where there was only one round of betting (the pre-flop betting). This section is a summary/outline of what he did (or at least something very similar, omitting some details).

The first step is to pre-compute some important probabilities. We will label the 169 possible two-card hands a player can receive (13 pairs, $$13 \choose 2$$ suited hands, $$13 \choose 2$$ unsuited hands) arbitrarily with the numbers from $$1$$ to $$169$$ (suited means both cards in the hand have the same suit). We will pre-compute $$p_{ij}$$, the probability of player 1 (the small blind) being dealt hand $$i$$ and player 2 (the big blind) being dealt hand $$j$$. For example, the probability of player 1 being dealt $$AK$$ unsuited and player 2 being dealt $$KJ$$ suited is $$\frac{4 \cdot 3 \cdot 3 \cdot 1}{{52 \choose 2}{50 \choose 2}}$$ (4 options for player 1’s Ace, 3 options for player 1’s King, 3 options for player 2’s K, and 1 option for player 2’s Jack).

We will also need to compute $$e_{ij}$$, the probability of player 1 winning (assuming neither player folds) when player 1 has hand $$i$$ and player 2 has hand $$j$$. These computations are more nuanced. For example, in the previous example ($$AK$$ unsuited vs $$KJ$$ suited), you need to consider the case when the suit of player 1’s Ace matches the suit of player 2’s cards, as well as the case when they don’t match, as this will have an impact on who wins when the community cards show 4 cards of the same suit as player 2’s cards (in the first case, player 1 and player 2 will both have a flush, so player 1 will win with the Ace, but in the second case only player 2 has a flush, so unless player 1 has a full house player 2 will win).

The next step is to define the strategies for each player, based on the possible betting sequences. The possible betting sequences for preflop play are shown in this decision tree, where $$L$$ represents the small blind and $$B$$ represents the big blind. We also denote $$f$$ as fold, $$r$$ as raise, $$c$$ as call. Paths that end in a octagon represent the big blind winning, and paths that end in an arrow represent the small blind winning, and the numbers in the octagons/arrows are the size of the pot.

(Image taken from Borghetti, Brett. (2008). Opponent Modeling in Interesting Adversarial Environments. 163. https://www.researchgate.net/figure/Pre-flop-Betting-Game-Tree-for-Heads-Up-Limit-
Texas-Holdem-3-raise-limit-The-tree$$\_$$fig3$$\_$$235106672

Note that at each node (besides the small blind’s first move) there is only one action (raising) that does not terminate the betting, so we can define the strategies for each player based on their initial action, and what they would do in response to raises:

Small Blind:

1. Fold

2. Call, then fold

3. Call, then call

4. Call, then re-raise, then fold

5. Call, then re-raise, then call

6. Raise, then fold

7. Raise, then call

8. Raise, then re-raise

For the big blind, we will have two sets of strategies, one in response to the small blind calling, the other in response to the small blind raising:

Big Blind (response to call):

1. Fold

2. Check (AKA call with 0 chips)

3. Raise, then fold

4. Raise, then call

5. Raise, then re-raise

Big Blind (response to raise):

1. Fold

2. Call

3. Raise, then fold

4. Raise, then call

We will define variables $$L_{ij}$$ to be the probability of the small blind employing strategy $$j$$ when dealt hand $$i$$. The goal is to find the best values for $$L_{ij}$$ (best in the maximin sense), as this will determine the optimal strategy for player 1.

We will need a function $$v(i_1,j_1,i_2,j_2)$$ that tells us the expected payoff (measured in big blinds) to player 1 when player 1 is dealt hand $$i_1$$, player 2 is dealt hand $$i_2$$, player 1 employs strategy $$j_1 \in [8]$$, and player 2 employs strategy $$j_2$$ in response (this will either be in $$[5]$$ or $$[4]$$, depending on whether player 1’s strategy started with a call or a raise). This function will either be equal to some constant value if the two strategies result in one person folding (for example, if player 1 chooses strategy 1, folding immediately, then no matter what player 1 gets a payoff of -.5), or will be equal to some multiple of $$e_{{i_1}{i_2}} - (1 - e_{{i_1}{i_2}})$$. For example, if player 1 chooses strategy 3 and player 2 chooses strategies 3,4, or 5, then the action will go call-raise-call, resulting in a pot size of 4, so the expected payoff to player 1 will be $$2 \cdot e_{{i_1}{i_2}} - 2 \cdot (1 - e_{{i_1}{i_2}})$$. This is because player 1 gains $$2$$ big blinds if they win and loses $$2$$ big blinds if they lose, and the probability of them winning is, by definition, $$e_{{i_1}{i_2}}$$.

We will also need to define the variables $$V_{i1}, V_{i2}, V_{i3}$$ below.

Let $$V_{i1}$$ be the expected payoff to player 1 when player 2 is dealt hand $$i$$ and player 1 folds multiplied by the probability that player 2 is dealt hand $$i$$ and player 1 folds.

Let $$V_{i2}$$ be the expected payoff to player 1 when player 2 is dealt hand $$i$$ and player 1 calls multiplied by the probability that player 2 is dealt hand $$i$$ and player 1 calls.

Let $$V_{i3}$$, the expected payoff to player 1 when player 2 is dealt hand $$i$$ and player 1 raises multiplied by the probability that player 2 is dealt hand $$i$$ and player 1 raises.

That is a mouthful, but we are basically splitting up the expected payoff of player 1 into $$169 \cdot 3$$ cases, one for every possible combination of player 2 hand and player 1 starting move (one case for every possible information player 2 could have observed before deciding on a strategy).

Now we are ready to write the linear program. The first step is figuring out what we are trying to maximize (the “objective function"). We are trying to maximize player 1’s payoff, which is $$\sum_{i=1}^{169} V_{i1} + V_{i2} + V_{i3}$$, as mentioned before (this should be intuitive, but the formal justification is that we are using the Law of Total/Iterated Expectation).

Now we need some basic constraints on $$L_{ij}$$. Since $$L_{ij}$$ represent probabilities, we must have $$L_{ij} \geq 0$$. Also, we must have $$\sum_{j=1}^8 L_{ij} = 1$$ for all fixed $$i$$ (recall that $$i$$ is player 1’s hand, $$j$$ is a strategy, and $$L_{ij}$$ is the probability of player 1 choosing strategy $$j$$ when dealt hand $$i$$).

Now we need to figure out the constraints on $$V_{i1}$$, $$V_{i2}$$, and $$V_{i3}$$. Determining $$V_{i1}$$ is the easiest, as we already know that if player 1 immediately folds the payoff to player 1 is always $$-.5$$. Thus, $$V_{i1}$$ is the probability that player 1 folds multiplied by $$-.5$$, which gives us the constraints $V_{i1} = -.5\sum_{h=1}^{169} p_{hi}L_{h1}$ (Recall that $$L_{h1}$$ is the probability that player 1 immediately folds when he has hand $$h$$, and $$p_{hi}$$ is the probability of player 1 being dealt hand $$h$$ and player 2 being dealt hand $$i$$. The $$L_{h1}$$ are variables, and the $$p_{hi}$$ are pre-computed constants, so the right hand side is linear, even though it may look non-linear at first glance.) In other words, this uses the fact that the probability that player 2 is dealt hand $$i$$ and player 1 folds is equal to the sum over all possible hands $$h$$ of the probability that player 1 has hand $$h$$ and player 2 has hand $$i$$ and player 1 folds.

For $$V_{i2}$$, we need to use the fact that player 2 is playing a best response to player 1’s strategy. Remember that $$V_{i2}$$ covers the cases when player 1 calls, so we only need to consider player 2 playing the 5 strategies listed under “Big Blind (response to call)" and player 1 playing strategies in $$[2,5]$$. Thus, we will assume the player 2 is playing a strategy $$j \in [5]$$ that minimizes $$V_{i2}$$: $V_{i2} = {\displaystyle \min_{j \in [5]} \{\sum_{h=1}^{169}p_{hi}\sum_{x=2}^5v(h,x,i,j)L_{hx}\}}$ (This is actually extremely similar to the equation we had with $$V_{i1}$$, which could have been written as $$V_{i1} = \sum_{h=1}^{169}p_{hi}\sum_{x=1}^1v(h,x,i,j)L_{hx}$$. The only differences are that $$v(h,x,i,j)$$ is no longer constant with respect to $$h$$ (before it was always $$-.5$$), so we can’t factor it out, and the right hand side is no longer constant with respect to $$j$$, so we take the minimum over all possible $$j$$.)

Of course, this is not a valid constraint to have in a linear program since it has a “min," but we can use the trick from the previous section to write this as 5 separate constraints for each $$i$$ (one for each $$j \in [5]$$): $V_{i2} \leq \sum_{h=1}^{169}p_{hi}\sum_{x=2}^5v(h,x,i,j)L_{hx}$ (This is valid because in the objective function, $$V_{i2}$$ have positive coefficients, and there are no other constraints on $$V_{i2}$$.)

For $$V_{i3}$$ it is basically the same, except now we are looking at the cases when player 1 raised (so strategies in $$[6,8]$$ for player 1 and the four strategies listed under “Big Blind (response to raise)" for player 2). This gives us the constraints $V_{i3} \leq \sum_{h=1}^{169}p_{hi}\sum_{x=6}^8v(h,x,i,j)L_{hx}$ for every $$i \in [169]$$ and $$j \in [4]$$.

In total, we have $$169 \cdot 3 + 169 \cdot 8 = 1859$$ variables and $$169 \cdot 1 + 169 \cdot 5 + 169 \cdot 4 = 1690$$ constraints. Note that you could also replace the variables $$V_{i1}$$ with $$-.5\sum_{h=1}^{169} p_{hi}L_{h1}$$, which would remove 169 variables and 169 constraints. This is a lot, and Selby’s formulation was probably nicer, but this is definitely doable with the correct software.

I will not go over how you would compute player 2’s optimal strategy, but hopefully this gives you an idea of how to form linear programs to solve games like poker.

Some very interesting results were that, under this model, player 2 never folds, and player 1 rarely folds. Also, there is almost no "mixing" in the strategies for both players. In other words, for many values of $$i$$, $$L_{ij} = 1$$ for exactly one value of $$j$$ (and of course $$0$$ for all other values of $$j$$). This means that there is not a lot of randomization in how you play specific hands. Instead, most of the “bluffing" comes from playing hands of different strength in the same way (this is also something that was observed in the research described in the next section).

# Analysis of Two Player Limit Cash Games

The main struggle with finding optimal strategies for Limit Hold’em is that there are so many possible betting sequences (19 on each of four rounds). It has been verified experimentally that limiting players to 3 bets per round, which reduces this number to 13, does not have a great effect on optimal play. However, this still leaves 7 betting sequences in each round that lead to a future round. By the fourth round of betting, this means that there are 343 possible betting sequences that could have led the players to this point, still 169 possible hands a player could have, and even more combinations of the five community cards. To even describe a strategy, you would need to determine what each player does in each one of these cases given all the information they have. Multiple approaches have been taken to combat this.

The simplest was to use Selby’s pre-flop model as the strategy for pre-flop play, then solve the rest of the game round by round, in each round ignoring the implications of past betting. For example, if the pot has 4 big blinds at the start of the flop, the pre-flop action could have gone call-raise-call or raise-call, but the program does not take into account which player made the first raise. A better approach was to first solve a 3-round model where there is no betting on the river, and use whatever the optimal pre-flop strategy was for that game as the pre-flop strategy for the full game (which was found to give more accurate results than the Selby pre-flop model). Using that information, it is possible to compute the probability of the other player having a specific hand given that they bet in a certain way pre-flop. Given these constants, it is possible to formulate and run linear programs to determine an optimal strategy for the 3 rounds after and including the flop. Basically, this approach estimates the pre-flop strategy in a better way, and then is able to solve the rest of the game assuming both players follow that pre-flop strategy.

The second important hurdle is the number of possible hands a player can have. Many people have attacked this with a technique called bucketing, which groups similar hands and plays those hands in the same way. One method of bucketing is to calculate, for each hand, the probability of winning (“hand strength"), and group hands based on that. However, this overlooks hands that have a high chance of improving on future rounds, like suited hands and connectors that have flush and straight draws. To measure this “hand potential," one can compute the probability of being behind during the current round but then ahead after the next one. One bucketing system that proved to be effective was choosing 5 or 6 buckets based on hand strength, and reserving one bucket for hands with high potential (like suited connectors).

One thing to note with bucketing is that you have to do this for each round. For example, pre-flop you are putting 169 cards into buckets, but after the flop each pair of hole cards and flop is considered a different hand. This may suggest that using more buckets for future rounds may lead to more exact solutions, but in one paper they found that this did not affect the solutions that much. It also means that not only do you need to choose which hands are grouped together, you also need to compute the probabilities of transitioning from each pair of buckets to each pair of buckets in the next round (For example, we need to know the probability of player 1 having a hand in bucket 1 and player 2 having a hand in bucket 2 pre-flop, and then after the flop player 1 having a hand in bucket 2 and player 2 having a hand in bucket 2). This is another limitation on the number of buckets you can use, as using $$n$$ buckets for each round will require $$3n^4$$ probabilities to be calculated: $$(n^2)^2$$ calculations for each of $$3$$ transitions between rounds.

# Analysis of No-Limit Tournament Games

In no-limit games, there are even more options than in limit games. However, it has been shown that, for “short-stacked" (low stack size) no-limit tournaments, “jam-fold" strategies (strategies where you either go all-in or fold) closely approximate optimal strategy (for one specific two-player no-limit tournament, it was found that either player could only add a maximum of .014 to their chances of winning by playing any strategy other than jam-fold.) So, this area, although seemingly a lot harder to analyze, has been explored as well, even in the three-player case.

When analyzing tournament games, the player’s objectives are still to maximize their return, but their return is based on where they place (1st place, 2nd place, etc.) not based on how many chips they accumulate per hand. So, you have to know your probabilities of ending up in 1st place, 2nd place, etc. at all possible vectors of stack sizes for the players (for example, in a two player tournament with big blinds of 500 where each player starts at 2000 chips, the set of vectors of stack sizes (in chips) would be $$\{(0,4000),(250,3750),...(4000,0)\}$$). This makes it so that you also have to consider things like the probabilities of transitioning from one vector of stack sizes to another after one hand, as this will determine your chances of eventually getting 1st place or 2nd place, which will affect your strategy, which will in turn affect the transition probabilities, etc.

Thus, analysis of tournament games sometimes feature a “nested loop" approach, with an outer loop handling the stochastic part of the game (aka what was described in the previous paragraph, estimating the expected return for each player at each state), and an inner loop that updates each player’s optimal jam-fold strategy based on these probabilities. For a two-player game, the inner loop is basically just solving a linear programming problem for each state, while in a 3-player game other methods are required.

One method for the inner loop of the 3-player game is called “fictitious play," a loop where, int the $$t^{th}$$ step of the loop, every player updates their strategy according to the following formula: $\text{New Strategy} =$ $\frac{t - 1}{t} \cdot \text{ Old Strategy } + \frac{1}{t} \cdot \text {Best Response to Opponent's Old Strategy}$ This is continued until no player can gains more than a pre-specified amount $$\epsilon$$ by changing their strategy (in other words, an $$\epsilon$$-equilibrium has been reached). Although this algorithm is not guaranteed to end/converge, for tournament poker it did. So, the overall solution is to alternate between finding the strategies that lead to $$\epsilon$$-equilibrium in each state, then updating the probabilities of each player winning at each state, repeating this until these probabilities change by less than a pre-specified amount $$\delta$$ from one iteration to the next.

One thing to decide when using this approach is how to initialize the probabilities of each player winning at each state. For this, the “Independent Chip Model" (ICM) is used, which is a rule that says your probability of winning the tournament is approximately the proportion of your stack size to the sum of the stack sizes of all the players. For example, if player 1 had $$10$$ big blinds, player 2 had $$20$$, and player 3 had $$30$$, player 1 would have a $$\frac{10}{10+20+30} = \frac{1}{6}$$ chance of getting first and a $$\frac{20}{10+20+30} \cdot \frac{10}{10+30} + \frac{30}{10+20+30} \cdot \frac{10}{10+20} = \frac{1}{4}$$ chance of getting second (the first fraction covers the case where player 2 gets first, and the second one where player 3 gets first).

Some interesting findings were that in the the two-player setting, the strategies for tournament games and just playing a single hand (restricted to jam-fold strategies) were very similar, while in a three-player game they were not. Similarly, the ICM is a very good model in the two-player case, meaning the final probabilities of each player winning at each stack vector did not change much from the initial values, while in the three-player game there are certain stack vectors where it was not a good model. Finally, there is no fixed ranking among poker hands, meaning that depending on your stack size, you may fold one hand and jam another, but do the opposite if your stack size is different.

# Bibliography

1. https://en.wikipedia.org/wiki/Minimax$$\_$$theorem

2. https://en.wikipedia.org/wiki/Dual$$\_$$linear$$\_$$program#: :text=The$$\%$$20strong$$\%$$20duality$$\%$$20theorem
$$\%$$20states,of$$\%$$20duality$$\%$$20theorems$$\%$$20in$$\%$$20optimization.

3. http://www.archduke.org/simplex/art (Selby’s findings)

6. https://www.cs.cmu.edu/ sandholm/3-player%20jam-fold.AAMAS08.pdf

7. https://poker.cs.ualberta.ca/publications/IJCAI03.pdf

8. https://webdocs.cs.ualberta.ca/ duane/publications/pdf/2002aij.pdf