
Rationality

Motivating Example: Competing Restaurant Locations

Two restaurant chains, FreshBite and UrbanEats, are expanding into a new city. Each must choose one of three neighbourhoods for a flagship store: Downtown, Midtown, or Suburb.

The projected profits (in £1000s) for each company depend on both their own choice and their competitor’s. Since they are competing in the same market, outcomes reflect strategic interaction.

Table 1 and Table 2 show the expected profits for each company based on their own decision (row) and their competitor’s choice (column).

Table 1: Projected profits for FreshBite (in thousands of pounds)

            Downtown   Midtown   Suburb
Downtown        4         3        2
Midtown         3         2        3
Suburb          2         1        1

Table 2: Projected profits for UrbanEats (in thousands of pounds)

            Downtown   Midtown   Suburb
Downtown        1         0        2
Midtown         0         2        6
Suburb          2         4        5

These tables are assumed to be publicly available from a consultation commissioned by the city council.

The payoff matrices for the equivalent Normal Form Game are:

M_r=\begin{pmatrix} 4 & 3 & 2\\ 3 & 2 & 3\\ 2 & 1 & 1 \end{pmatrix} \qquad M_c=\begin{pmatrix} 1 & 0 & 2\\ 0 & 2 & 4\\ 2 & 6 & 5 \end{pmatrix}

Let us begin from FreshBite’s perspective: Suburb results in lower profits than Midtown or Downtown, regardless of UrbanEats’ action. A rational FreshBite would therefore never choose Suburb, as illustrated in Figure 1, although it might still choose Midtown or Downtown.

Two payoff matrices with a line crossing out Suburb in the row player matrix

Figure 1: Crossing out Suburb as an action that will never be played by the row player.

With Suburb eliminated for FreshBite, UrbanEats can update their view. Against the remaining options (Downtown and Midtown), UrbanEats consistently receives higher profits by choosing Suburb rather than either alternative. Hence, Downtown and Midtown can be removed from consideration, as shown in Figure 2.

Two payoff matrices with a line crossing out Downtown and Midtown in the column player matrix

Figure 2: Crossing out Midtown and Downtown as actions that will never be played by the column player.

FreshBite, now comparing Downtown and Midtown against only Suburb from UrbanEats, finds that Midtown yields a better outcome. They therefore eliminate Downtown, leaving Midtown as their only rational option, as shown in Figure 3.

Two payoff matrices with a line crossing out Downtown in the row player matrix

Figure 3: Crossing out Downtown as an action that will never be played by the row player.

The step-by-step logic of iterated elimination of strategies, combined with the assumption of full information and mutual rationality, leads to the predicted outcome: FreshBite opens in Midtown and UrbanEats opens in Suburb.
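The elimination argument can also be checked numerically. Below is a minimal sketch, assuming NumPy, that verifies each dominance comparison against the matrices above (the `actions` list is only for labelling):

```python
import numpy as np

# Payoff matrices from the consultation (rows: FreshBite, columns: UrbanEats).
M_r = np.array([[4, 3, 2], [3, 2, 3], [2, 1, 1]])
M_c = np.array([[1, 0, 2], [0, 2, 4], [2, 6, 5]])
actions = ["Downtown", "Midtown", "Suburb"]

# Step 1: FreshBite's Suburb (row 2) is strictly dominated by Downtown (row 0).
assert np.all(M_r[0] > M_r[2])

# Step 2: with row 2 removed, UrbanEats' Suburb (column 2) strictly dominates
# both Downtown (column 0) and Midtown (column 1).
reduced = M_c[:2]
assert np.all(reduced[:, 2] > reduced[:, 0])
assert np.all(reduced[:, 2] > reduced[:, 1])

# Step 3: against Suburb alone, FreshBite's Midtown (row 1) beats Downtown (row 0).
assert M_r[1, 2] > M_r[0, 2]

print(f"Predicted outcome: FreshBite -> {actions[1]}, UrbanEats -> {actions[2]}")
# Predicted outcome: FreshBite -> Midtown, UrbanEats -> Suburb
```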

In the next section, we introduce the theory of dominance and best responses. These concepts provide the formal tools for identifying outcomes like this one, and for understanding how strategic agents simplify complex decisions.

Theory

In some games it is evident that certain strategies should never be used by a rational player. To formalise this we need a couple of definitions.

Definition: Incomplete Strategy Profile


In an $N$-player normal form game, when focusing on player $i$, we denote by $s_{(-i)}$ an incomplete strategy profile representing the strategies chosen by all players except player $i$.


For example, in a 3-player game where $\mathcal{A}_i = \{A, B\}$ for all $i$, a valid strategy profile might be $s = ((1, 0),\ (1, 0),\ (0, 1))$, indicating that players 1 and 2 choose $A$, and player 3 chooses $B$.

The corresponding incomplete strategy profile for player 2 is $s_{(-2)} = ((1, 0),\ (0, 1))$, capturing only the choices of players 1 and 3.

We write $S_{(-i)}$ to denote the set of all such incomplete strategy profiles from the perspective of player $i$.

This notation allows us to define a key concept in strategic reasoning.


Definition: Strictly Dominated Strategy


In an $N$-player normal form game, an action $a_i \in \mathcal{A}_i$ is said to be strictly dominated if there exists a strategy $\sigma_i \in \Delta(\mathcal{A}_i)$ such that:

u_i(\sigma_i, s_{(-i)}) > u_i(a_i, s_{(-i)}) \quad \text{for all } s_{(-i)} \in S_{(-i)}.

That is, no matter what the other players do, player $i$ always receives a strictly higher payoff by playing $\sigma_i$ instead of the action $a_i$.

When modelling rational behaviour, such actions can be systematically excluded from consideration.
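To illustrate the definition, strict dominance by another pure action can be checked by comparing rows of a payoff matrix. This is only a sketch: the definition also allows the dominating strategy $\sigma_i$ to be mixed, which a pairwise row comparison will not detect, and the helper name below is hypothetical:

```python
import numpy as np

def strictly_dominated_actions(M: np.ndarray) -> set:
    """Return row indices strictly dominated by some other *pure* row.

    M[i, j] is the row player's payoff for action i against column j.
    (Dominance by a mixed strategy is not checked here.)
    """
    dominated = set()
    for i in range(M.shape[0]):
        for k in range(M.shape[0]):
            if k != i and np.all(M[k] > M[i]):
                dominated.add(i)
    return dominated

# In the restaurant example, FreshBite's Suburb (row 2) is dominated by Downtown.
M_r = np.array([[4, 3, 2], [3, 2, 3], [2, 1, 1]])
print(strictly_dominated_actions(M_r))  # {2}
```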

Definition: Weakly Dominated Strategy


In an $N$-player normal form game, an action $a_i \in \mathcal{A}_i$ is said to be weakly dominated if there exists a strategy $\sigma_i \in \Delta(\mathcal{A}_i)$ such that:

u_i(\sigma_i, s_{(-i)}) \geq u_i(a_i, s_{(-i)}) \quad \text{for all } s_{(-i)} \in S_{(-i)},

and

u_i(\sigma_i, \bar{s}) > u_i(a_i, \bar{s}) \quad \text{for some } \bar{s} \in S_{(-i)}.

That is, $\sigma_i$ does at least as well as $a_i$ against all possible strategies of the other players, and strictly better against at least one.
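A similar sketch applies to weak dominance: replace the strict inequality with a weak one everywhere and require a strict inequality somewhere. The helper below is an illustration for pure actions only (the function name is hypothetical):

```python
import numpy as np

def weakly_dominates(M: np.ndarray, k: int, i: int) -> bool:
    """True if pure action k (a row of M) weakly dominates action i."""
    at_least_as_good = np.all(M[k] >= M[i])  # holds for every opponent choice
    strictly_better = np.any(M[k] > M[i])    # holds for at least one choice
    return bool(at_least_as_good and strictly_better)

# Example: for a column player with payoffs M_c, work with the transpose so
# that columns become rows; here the second column weakly dominates the third.
M_c = np.array([[0, 2, 1], [3, 1, 1]])
print(weakly_dominates(M_c.T, 1, 2))  # True
```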

As with strictly dominated strategies, we can use weak dominance to guide predictions of rational behaviour by eliminating weakly dominated strategies. Doing so, however, relies on a deeper assumption about what players know about one another.

Common Knowledge of Rationality

So far, we have assumed that players behave rationally. But to justify iterative elimination procedures, we must go further. The key idea is that not only are players rational, but they also know that others are rational, and that this knowledge is shared recursively.

This leads to the concept of Common Knowledge of Rationality (CKR):

  1. Each player is rational.
  2. Each player knows that every other player is rational.
  3. Each player knows that every other player knows that all players are rational.
  4. And so on, ad infinitum.

This infinite chain of mutual knowledge is what constitutes common knowledge.

By assuming CKR, we justify the use of techniques like the iterated elimination of dominated strategies as a model of rational prediction.

Example: Predicted Behaviour through Iterated Elimination of Strategies

Consider the following normal form game, with separate payoff matrices for the row and column players:

M_r = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \qquad M_c = \begin{pmatrix} 0 & 2 & 1 \\ 3 & 1 & 1 \end{pmatrix}

We proceed by examining whether any strategies can be removed based on weak dominance.

Step 1: From the column player’s perspective, column $c_3$ is weakly dominated by column $c_2$. That is, for every row, $c_2$ performs at least as well as $c_3$, and strictly better for at least one row. We remove $c_3$ from consideration.

Step 2: With $c_3$ removed, the row player now compares $r_1$ and $r_2$. Row $r_2$ is weakly dominated by $r_1$, since $r_1$ yields outcomes that are always at least as good and sometimes strictly better. We eliminate $r_2$.

Step 3: Finally, the column player is left with $c_1$ and $c_2$. Column $c_1$ is now weakly dominated by $c_2$, and can be eliminated.

After these steps, we are left with a single strategy profile:

s = (r_1, c_2)

This strategy profile is a predicted outcome based on iterated elimination of weakly dominated strategies, assuming mutual rationality and full knowledge of the payoffs.
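The three elimination steps can be mirrored in code; a minimal sketch assuming NumPy (the `weakly_dominates` helper is illustrative, checking pure-action weak dominance only):

```python
import numpy as np

M_r = np.array([[1, 1, 0], [0, 0, 2]])
M_c = np.array([[0, 2, 1], [3, 1, 1]])

def weakly_dominates(M, k, i):
    """True if row k of M weakly dominates row i."""
    return bool(np.all(M[k] >= M[i]) and np.any(M[k] > M[i]))

# Step 1: c2 weakly dominates c3 (columns of the game are rows of M_c.T).
assert weakly_dominates(M_c.T, 1, 2)
M_r, M_c = M_r[:, :2], M_c[:, :2]  # drop c3

# Step 2: with c3 gone, r1 weakly dominates r2.
assert weakly_dominates(M_r, 0, 1)
M_r, M_c = M_r[:1], M_c[:1]  # drop r2

# Step 3: c2 weakly dominates c1.
assert weakly_dominates(M_c.T, 1, 0)

print("Predicted profile: (r1, c2)")
```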

Example: eliminating strategies does not suffice

In many cases, we can simplify a game by eliminating dominated strategies. However, not all games can be fully resolved using this approach.

Consider the following two games.

M_r = \begin{pmatrix} 1 & 4 & 2 \\ 4 & 0 & 4 \\ 2 & 3 & 5 \end{pmatrix} \qquad M_c = \begin{pmatrix} 3 & 2 & 2 \\ 0 & 3 & 1 \\ 5 & 4 & 6 \end{pmatrix}

Although some weakly dominated strategies may be eliminated here, we still end up with multiple strategy combinations remaining. No single obvious solution emerges from dominance alone. Here is another example:

M_r = \begin{pmatrix} 3 & 0 \\ 1 & 2 \end{pmatrix} \qquad M_c = \begin{pmatrix} 2 & 0 \\ 1 & 3 \end{pmatrix}

In this game, no strategy is strictly or weakly dominated for either player. Elimination techniques do not reduce the strategy space, and further analysis is required to predict behaviour.

These examples show that dominance is a helpful tool, but not always sufficient. We need further concepts, such as best responses, to analyse such games.

Definition: Best Response


In an $N$-player normal form game, a strategy $s^*$ for player $i$ is a best response to some incomplete strategy profile $s_{(-i)}$ if and only if:

u_i(s^*,\ s_{(-i)}) \geq u_i(s,\ s_{(-i)}) \quad \text{for all } s \in \Delta(\mathcal{A}_i).

That is, $s^*$ gives player $i$ the highest possible payoff, given the strategies chosen by the other players.

We can now begin to predict rational behaviour by identifying the best responses to actions of the other players.

Example: Predicted Behaviour through Best Responses in the Action Space

Consider the following game with payoff matrices for the row and column players:

M_r = \begin{pmatrix} 1 & 4 & 2 \\ 4 & 0 & 4 \\ 2 & 3 & 5 \end{pmatrix} \qquad M_c = \begin{pmatrix} 3 & 2 & 2 \\ 0 & 3 & 1 \\ 5 & 4 & 6 \end{pmatrix}

We highlight best responses by underlining the maximum value(s) in each row (for the column player) and in each column (for the row player):

M_r = \begin{pmatrix} 1 & \underline{4} & 2 \\ \underline{4} & 0 & 4 \\ 2 & 3 & \underline{5} \end{pmatrix} \qquad M_c = \begin{pmatrix} \underline{3} & 2 & 2 \\ 0 & \underline{3} & 1 \\ 5 & 4 & \underline{6} \end{pmatrix}

Here, $((0, 0, 1), (0, 0, 1))$ is a pair of best responses: the third row is a best response to the third column, and the third column is a best response to the third row.

This mutual best response suggests a potential stable outcome.
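The underlined entries can be reproduced with argmax operations; a minimal sketch assuming NumPy (note that `np.argmax` returns only the first maximiser, so ties would need extra care):

```python
import numpy as np

M_r = np.array([[1, 4, 2], [4, 0, 4], [2, 3, 5]])
M_c = np.array([[3, 2, 2], [0, 3, 1], [5, 4, 6]])

# Row player: for each column of M_r, the best response is the maximising row.
row_best = M_r.argmax(axis=0)
# Column player: for each row of M_c, the best response is the maximising column.
col_best = M_c.argmax(axis=1)

print(row_best)  # [1 0 2]
print(col_best)  # [0 1 2]

# The third row and third column (0-indexed as 2, 2) are mutual best responses.
assert row_best[2] == 2 and col_best[2] == 2
```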

In the next example we will consider finding best responses in the entire strategy space and not just the action space.

Example: Predicted Behaviour through Best Responses in the Strategy Space

We can also identify best responses when players use strategies. Let us begin by examining the Matching Pennies game.

Example: Matching Pennies

Recalling the payoff matrices for the row and column players:

M_r = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \qquad M_c = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}

Assume that player 2 plays a strategy $\sigma_2 = (x,\ 1 - x)$. Then player 1’s expected utilities are:

u_1(r_1,\ \sigma_2) = 2x - 1 \qquad \text{and} \qquad u_1(r_2,\ \sigma_2) = 1 - 2x

We will use some code to plot these two payoff values:

import matplotlib.pyplot as plt
import numpy as np

x = np.array((0, 1))

plt.figure()
plt.plot(x, 2 * x - 1, label=r"$u_1(r_1, \sigma_2) = 2x - 1$")
plt.plot(x, 1 - 2 * x, label=r"$u_1(r_2, \sigma_2) = 1 - 2x$")
plt.title("Utility of the row player's actions as a function of the strategy played by the column player")
plt.xlabel("$x$")
plt.legend();

From the graph we observe:

  1. If $x < \frac{1}{2}$, then $r_2$ is a best response for player 1.
  2. If $x > \frac{1}{2}$, then $r_1$ is a best response for player 1.
  3. If $x = \frac{1}{2}$, then player 1 is indifferent between $r_1$ and $r_2$.
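These three cases can be spot-checked numerically; a small sketch assuming NumPy (the helper name is illustrative):

```python
import numpy as np

M_r = np.array([[1, -1], [-1, 1]])

def row_utilities(x):
    """Expected utilities of r1 and r2 when the column player plays (x, 1 - x)."""
    return M_r @ np.array([x, 1 - x])

assert row_utilities(0.25)[1] > row_utilities(0.25)[0]  # x < 1/2: r2 is best
assert row_utilities(0.75)[0] > row_utilities(0.75)[1]  # x > 1/2: r1 is best
assert np.isclose(*row_utilities(0.5))                  # x = 1/2: indifferent
```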

Example: Coordination Game

Recalling the payoff matrices for the row and column players:

M_r = \begin{pmatrix} 3 & 0 \\ 1 & 2 \end{pmatrix} \qquad M_c = \begin{pmatrix} 2 & 0 \\ 1 & 3 \end{pmatrix}

Assume that player 1 plays a strategy $\sigma_1 = (x,\ 1 - x)$. Then player 2’s expected utilities are:

u_2(\sigma_1,\ c_1) = 2x + (1 - x) = x + 1 \qquad \text{and} \qquad u_2(\sigma_1,\ c_2) = 3(1 - x) = 3 - 3x

We will use some code to plot these two payoff values:

import matplotlib.pyplot as plt
import numpy as np

x = np.array((0, 1))

plt.figure()
plt.plot(x, x + 1, label=r"$u_2(\sigma_1, c_1) = x + 1$")
plt.plot(x, 3 - 3 * x, label=r"$u_2(\sigma_1, c_2) = 3 - 3x$")
plt.title("Utility of the column player's actions as a function of the strategy played by the row player")
plt.xlabel("$x$")
plt.legend();

We can now determine the best response for player 2 based on the value of $x$:

  1. If $x < \frac{1}{2}$, then $c_2$ is a best response.
  2. If $x > \frac{1}{2}$, then $c_1$ is a best response.
  3. If $x = \frac{1}{2}$, then player 2 is indifferent between $c_1$ and $c_2$.

This example shows how best responses depend continuously on the opponent’s strategy, and that indifference can arise at precise thresholds in strategies.

Theorem: Best Response Condition


In a two-player game $(A,\ B) \in \left(\mathbb{R}^{m \times n}\right)^2$, a strategy $\sigma_r^*$ of the row player is a best response to a strategy $\sigma_c$ of the column player if and only if:

\sigma_r^*(i) > 0 \quad \Rightarrow \quad (A \sigma_c^\mathsf{T})_i = \max_{k \in \mathcal{A}_1}(A \sigma_c^\mathsf{T})_k \quad \text{for all } i \in \mathcal{A}_1

Proof

The term $(A \sigma_c^\mathsf{T})_i$ represents the utility for the row player when playing their $i^{\text{th}}$ action. Thus:

\sigma_r A \sigma_c^\mathsf{T} = \sum_{i=1}^{m} \sigma_{r}(i) \cdot (A \sigma_c^\mathsf{T})_i

Let $u = \max_k (A \sigma_c^\mathsf{T})_k$. Then:

\begin{align} \sigma_r A \sigma_c^\mathsf{T} &= \sum_{i=1}^{m} \sigma_r(i) \left[u - u + (A \sigma_c^\mathsf{T})_i\right] \\ &= \sum_{i=1}^{m} \sigma_r(i) u - \sum_{i=1}^{m} \sigma_r(i) \left(u - (A \sigma_c^\mathsf{T})_i\right) \\ &= u - \sum_{i=1}^{m} \sigma_r(i) \left(u - (A \sigma_c^\mathsf{T})_i\right) \end{align}

Since $u - (A \sigma_c^\mathsf{T})_i \geq 0$ for all $i$, the maximum expected utility for the row player is $u$, and this occurs if and only if:

\sigma_r(i) > 0 \quad \Rightarrow \quad (A \sigma_c^\mathsf{T})_i = u

as required.
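The identity used in the proof can be spot-checked numerically; a minimal sketch assuming NumPy, with a payoff matrix and strategies chosen only for illustration:

```python
import numpy as np

# An arbitrary payoff matrix and strategies, chosen only for illustration.
rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 4)).astype(float)
sigma_r = np.array([0.2, 0.5, 0.3])
sigma_c = np.array([0.1, 0.4, 0.3, 0.2])

action_utilities = A @ sigma_c          # (A sigma_c^T)_i for each action i
u = action_utilities.max()

# The rearrangement used in the proof: expected utility = u minus a
# non-negative penalty for weight placed on non-maximising actions.
lhs = sigma_r @ A @ sigma_c
rhs = u - np.sum(sigma_r * (u - action_utilities))
assert np.isclose(lhs, rhs)

# The expected utility attains u exactly when all weight is on maximising actions.
best = np.isclose(action_utilities, u)
sigma_best = best / best.sum()          # uniform over the maximising actions
assert np.isclose(sigma_best @ A @ sigma_c, u)
```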


Example: Rock Paper Scissors Lizard Spock

The classic Rock Paper Scissors game can be extended to Rock Paper Scissors Lizard Spock, where:

  • Scissors cuts Paper, Paper covers Rock, and Rock crushes Scissors (as in the classic game);
  • Rock crushes Lizard, Lizard poisons Spock, and Spock smashes Scissors;
  • Scissors decapitates Lizard, Lizard eats Paper, and Paper disproves Spock;
  • Spock vaporises Rock.

This results in the following payoff matrices:

M_r = \begin{pmatrix} 0 & -1 & 1 & 1 & -1 \\ 1 & 0 & -1 & -1 & 1 \\ -1 & 1 & 0 & 1 & -1 \\ -1 & 1 & -1 & 0 & 1 \\ 1 & -1 & 1 & -1 & 0 \\ \end{pmatrix} \qquad M_c = -M_r

We can now use the Best Response Condition Theorem to check whether the strategy $\sigma_1 = \left(\frac{1}{3},\ 0,\ \frac{1}{3},\ \frac{1}{3},\ 0\right)$ is a best response to $\sigma_2 = \left(\frac{1}{4},\ 0,\ 0,\ \frac{3}{4},\ 0\right)$.

We begin by computing the expected utilities for each of player 1’s actions:

M_r \sigma_2^\mathsf{T} = \begin{pmatrix} 0.75 \\ -0.5 \\ 0.5 \\ -0.25 \\ -0.5 \end{pmatrix}

From this vector, we observe that:

\max_{k \in \mathcal{A}_1} (M_r \sigma_2^\mathsf{T})_k = 0.75

The support of $\sigma_1$ is $\mathcal{S}(\sigma_1) = \{1,\ 3,\ 4\}$. Among these, only $(M_r \sigma_2^\mathsf{T})_1 = 0.75$ equals the maximum.

Since $(M_r \sigma_2^\mathsf{T})_3 = 0.5$ and $(M_r \sigma_2^\mathsf{T})_4 = -0.25$, the condition for a best response is not satisfied.

Therefore, $\sigma_1$ is not a best response to $\sigma_2$. Indeed, the row player should revise their strategy to place positive weight only on the action that yields the maximum payoff.

Exercises

Exercise: Iterated Elimination of Strategies

Use iterated elimination of dominated strategies to attempt to predict rational behaviour in each of the following games. Represent all steps clearly.

  1. A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}
  2. A = \begin{pmatrix} 2 & 1 & 3 & 17 \\ 27 & 3 & 1 & 1 \\ 4 & 6 & 7 & 18 \end{pmatrix} \qquad B = \begin{pmatrix} 11 & 9 & 10 & 22 \\ 0 & 1 & 1 & 0 \\ 2 & 10 & 12 & 0 \end{pmatrix}
  3. A = \begin{pmatrix} 3 & 3 & 2 \\ 2 & 1 & 3 \end{pmatrix} \qquad B = \begin{pmatrix} 2 & 1 & 3 \\ 2 & 3 & 2 \end{pmatrix}
  4. A = \begin{pmatrix} 3 & -1 \\ 2 & 7 \end{pmatrix} \qquad B = \begin{pmatrix} -3 & 1 \\ 1 & -6 \end{pmatrix}

Exercise: Identifying Best Responses

For each game in Exercise: Iterated Elimination of Strategies:

  1. Identify all best response actions.
  2. Attempt to predict rational outcomes using best response reasoning.
  3. Explain any instances where this approach fails to fully determine the outcome (e.g. games with multiple best responses).

Exercise: Representing Peace and War in Normal Form

Model the following real-world conflict scenario as a normal form game.

Two neighbouring countries possess powerful military forces.

  • If both attack, each suffers 10,000 civilian casualties.
  • If one attacks while the other remains peaceful, the peaceful country suffers 15,000 casualties and retaliates, causing the attacker 13,000 casualties.
  • If both remain peaceful, there are no casualties.

Answer the following:

  1. Clearly state the players and their action sets.
  2. Represent the game in normal form, assuming utilities are the negative of casualties.
  3. Plot the expected utilities for each country assuming it plays a generic strategy while the other plays:
    • Peaceful
    • Aggressive
  4. Identify the best responses of each country.

Exercise: Pricing Strategy in Competing Shops

Two shops on the same street are deciding how to price their product. Each can set the price as Low, Medium, or High.

  • If both choose the same price, they split the market.
  • If one sets a lower price, they attract more customers but earn less per item.
  • If one sets a higher price, they retain fewer customers but may benefit from higher margins.
  1. Define a plausible utility table for both players.
  2. Represent the game in normal form using two matrices.
  3. Identify and eliminate any actions that are never chosen under rational behaviour.
  4. Discuss whether this game has any pairs of actions that are best responses to each other.

Exercise: Task Assignment in a Shared Workspace

Two workers must each choose a task to complete in a shared workspace: Documentation, Coding, or Debugging.

  • If they choose the same task, productivity drops due to coordination overhead.
  • If they choose different tasks, overall productivity improves.
  • Each task has a different utility value depending on the player’s preference and skill.
  1. Construct a utility matrix for both players that reflects this scenario.
  2. Represent the game in normal form.
  3. Are any actions strictly or weakly dominated?
  4. Determine best responses for both players.

Programming

Computing utilities of all actions

Theorem: Best Response Condition relies on computing the utilities of all actions when the other player is playing a specific strategy. This corresponds to the matrix multiplication:

M_r \sigma_c^\mathsf{T}

Similarly to the Linear Algebraic Form of Expected Utility, this enables efficient numerical computation in programming languages that support vectorized matrix operations.

Computing utilities of all actions with Numpy

Python’s numerical library NumPy Harris et al., 2020 provides vectorized operations through its array class. Below, we define the elements from the earlier Rock Paper Scissors Lizard Spock example:

import numpy as np

M_r = np.array(
    (
        (0, -1, 1, 1, -1),
        (1, 0, -1, -1, 1),
        (-1, 1, 0, 1, -1),
        (-1, 1, -1, 0, 1),
        (1, -1, 1, -1, 0),
    )
)

sigma_2 = np.array((1 / 4, 0, 0, 3 / 4, 0))

We can now compute the expected utility for each of the row player’s actions:

M_r @ sigma_2

Finding non-zero entries of an array with NumPy

NumPy gives functionality for finding the non-zero entries of an array:

sigma_1 = np.array((1 / 3, 0, 1 / 3, 1 / 3, 0))
sigma_1.nonzero()

Checking the best response condition with NumPy

To check the best response condition with NumPy we can efficiently compare the location of the non-zero values of a strategy with the maximum utility over the action set.

(M_r @ sigma_2)[sigma_1.nonzero()] == (M_r @ sigma_2).max()

Just as in Example: Rock Paper Scissors Lizard Spock we see that $\sigma_1$ is not a best response, since not all of its non-zero entries give the maximum utility. If we modify the row player’s behaviour to only play the first action then we do get a best response:

sigma_1 = np.array((1, 0, 0, 0, 0))
(M_r @ sigma_2)[sigma_1.nonzero()] == (M_r @ sigma_2).max()

Checking the best response condition with Nashpy

The Python library Nashpy Knight & Campbell, 2018 can be used to directly check the best response condition for two strategies. It checks if either strategy is a best response to each other.

import nashpy as nash

rpsls = nash.Game(M_r, - M_r)
rpsls.is_best_response(sigma_r=sigma_1, sigma_c=sigma_2)

This confirms that $\sigma_1 = (1, 0, 0, 0, 0)$ is a best response to $\sigma_2 = (1/4, 0, 0, 3/4, 0)$ but not vice versa.

Notable Research

In Nash Jr, 1950, John Nash introduced the foundational concept of Nash equilibrium: a strategy profile (i.e., a combination of strategies, one per player) where each strategy is a best response to the others. In such a setting, no player has an incentive to unilaterally deviate from their choice. Nash’s landmark result showed that at least one such equilibrium exists in any finite game, a result that earned him the Nobel Prize in Economics.

More recent research explores how often certain types of equilibria occur. For instance, Wiese & Heinrich, 2022 investigates how frequently best response dynamics (where players iteratively switch to their best response) lead to convergence at a Nash equilibrium. This builds on earlier work such as Dresher, 1970, which studied the likelihood that a game possesses a Nash equilibrium in actions.

Another line of research explores how the structure of a game influences the convergence to equilibrium. For example, in congestion games or routing games, players choose paths through a network, and each player’s cost depends on the congestion caused by others. These games often admit Nash equilibria in actions and allow best response dynamics to converge under mild conditions. This property has led to applications in traffic flow, internet routing, and load balancing. The work of Rosenthal, 1973 formally introduced this class of games and remains a cornerstone of algorithmic game theory.

In an applied context, Knight et al., 2017 models interactions between hospitals under performance-based targets. The authors prove the existence of an equilibrium in actions (as opposed to more general strategies). This finding provides insight into how policy incentives might be structured to align hospital decision-making with public health outcomes.

Conclusion

In this chapter, we introduced fundamental tools for predicting strategic behaviour: dominated strategies, best responses, and the assumption of common knowledge of rationality. These ideas allowed us to rule out actions that a rational player would never choose and to identify strategies that respond optimally to the choices of others.

Using both motivating examples and formal definitions, we saw how iterated elimination and best response analysis can reveal plausible behaviours in one-shot games. We also explored how reasoning about rationality can be extended beyond pure actions to strategies, where probability plays a role in decision-making.

These concepts (summarised in Table 3) lay the foundation for understanding equilibrium: the central concept where players’ strategies mutually reinforce each other. In the next chapter, we formalise this idea, introducing Nash equilibrium and exploring its existence, interpretation, and implications across different types of games.

Table 3: Summary of core concepts introduced in this chapter.

Concept                          Description
Dominated Strategy               An action that is always worse than another; never rational to play
Iterated Elimination             Sequentially removing dominated strategies to simplify the game
Best Response                    The strategy that maximises a player’s utility given others’ choices
Strategy Profile                 A complete specification of strategies for all players
Incomplete Strategy Profile      Strategies for all players except one
Common Knowledge of Rationality  All players are rational, and know that others are, recursively
Best Response Condition          A formal test for optimality in response to a strategy
References
  1. Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. 10.1038/s41586-020-2649-2
  2. Knight, V., & Campbell, J. (2018). Nashpy: A Python library for the computation of Nash equilibria. Journal of Open Source Software, 3(30), 904.
  3. Nash Jr, J. F. (1950). Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1), 48–49.
  4. Wiese, S. C., & Heinrich, T. (2022). The frequency of convergent games under best-response dynamics. Dynamic Games and Applications, 1–12.
  5. Dresher, M. (1970). Probability of a pure equilibrium point in n-person games. Journal of Combinatorial Theory, 8(1), 134–145.
  6. Rosenthal, R. W. (1973). A class of games possessing pure-strategy Nash equilibria. International Journal of Game Theory, 2(1), 65–67.
  7. Knight, V., Komenda, I., & Griffiths, J. (2017). Measuring the price of anarchy in critical care unit interactions. Journal of the Operational Research Society, 68(6), 630–642.