An $N$ player normal form game consists of:

- A finite set of $N$ players
- Strategy spaces for the players: $\{S_1,S_2,S_3,\dots,S_N\}$;
- Payoff functions for the players: $u_i:S_1\times S_2\dots\times S_N\to\mathbb{R}$

[2]

We have:

- $u_r((1, 0), (y, 1-y))= y - 2 (1 - y)=3y-2$ and $u_r((0, 1), (y, 1-y))= -2y + (1 - y)=1-3y$
- $u_c((x, 1 - x), (1, 0))=-2x+(1-x)=1-3x$ and $u_c((x, 1- x), (0, 1))=x-2(1-x)=3x-2$

In [1]:

```
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.figure()
ys = [0, 1]
plt.plot(ys, [3 * y - 2 for y in ys], label="$u_r((1, 0), (y, 1-y))$")
plt.plot(ys, [1 - 3 * y for y in ys], label="$u_r((0, 1), (y, 1-y))$")
plt.legend()
plt.xlabel("$y$")
plt.title("Utility to row player");
```

[1]

In [2]:

```
plt.figure()
xs = [0, 1]
plt.plot(xs, [1 - x * 3 for x in xs], label="$u_c((x, 1 - x), (1, 0))$")
plt.plot(xs, [3 * x - 2 for x in xs], label="$u_c((x, 1 - x), (0, 1))$")
plt.legend()
plt.xlabel("$x$")
plt.title("Utility to column player");
```

[1]

The best responses are then given by:

$$ \sigma_r^* = \begin{cases} (0,1)&\text{ if }y< 1/2\\ (1,0)&\text{ if }y> 1/2\\ \text{indifferent}&\text{ otherwise} \end{cases} $$$$ \sigma_c^* = \begin{cases} (1,0)&\text{ if }x< 1/2\\ (0,1)&\text{ if }x> 1/2\\ \text{indifferent}&\text{ otherwise} \end{cases} $$[1]

$(A\sigma_c^T)_i$ is the utility of the row player when they play their $i$th strategy. Thus:

$$\sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i$$[2]

Let $u=\max_{k}(A\sigma_c^T)_k$. Thus:

$$ \begin{aligned} \sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\ &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\ &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i) \end{aligned}$$[2]

We know that $u - (A\sigma_c^T)_i\geq 0$, thus the largest $\sigma_rA\sigma_c^T$ can be is $u$ which occurs iff ${\sigma_r}_i > 0 \Rightarrow (A\sigma_c^T)_i = u$ as required.

[1]

There are no pairs of **pure** strategies that are best response to each other (visibile from the payoff matrices).
We consider mixed strategies $\sigma_r=(x, 1- x)$ and $\sigma_c=(y, 1-y)$ [1]

For $\sigma_r$ to be a best response to $\sigma_c$, from the theorem we have:

$$u_r((1, 0), \sigma_c)=u_r((1, 0), \sigma_c)\Rightarrow 3y-2=1-3y\Rightarrow y=1/2$$$$u_c(\sigma_r, (1, 0))=u_c(\sigma_r, (0, 1))\Rightarrow 1-3x=3x-2\Rightarrow x=1/2$$[2]

The Nash equilibrium is $((1/2, 1/2), (1/2, 1/2))$.

This is confirmed by the best response findings of question (b): these two strategies are best responses to each other. [1]

- (i): This paper builds a Markov queueing model based on strategic interactions between two hospitals [1]. It measures the Price of Anarchy to see the effect on patients of rational behaviour [1]. It does all this by comparing the Nash equilibrium to the optimal behaviour [1].
- (ii): The main theoretic result is concerning the behaviour of the best response functions. [1] The paper proves that they are increasing functions which ensures there is a single point of intersection. [1]
- (iii): Various possibilities each worth [2]:
- A stationary queueing model whilst in practice a time dependent model would be more appropriate.
- Only considering interaction between two hospitals.
- Just a single cutoff strategy.

- (iv): Depending on the choice of limiting factor [3]:
- Use a non stationary queueing model, potentially simulating the queueing process in a time dependent way.
- Consider a multi player game although this becomes harder to generalise the theoretic result.
- Use multiple cutoff strategies (this would be the simplest approach).

Given a two player game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, referred to as a **stage** game, a $T$-stage repeated game is a game in which players play that stage game for $T > 0$ periods. Players make decisions based on the full history of play over all the periods.

[2]

Given a two player stage game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, repeated to give a $T$-stage repeated game. A strategy is a mapping from the entire history of play to an action of the stage game:

$$ \bigcup_{t=0}^{T-1}H(t)\to a $$where:

- $H(t)$ is the history of player of
**both**players up until stage $t$ ($H(0)=(\emptyset, \emptyset)$) - $a$ is an action (for either player) of the
**stage**game

[2]

For a given integer value of $0\leq t< T$ there are $|S_1|^t$ possible histories for player 1 and $|S_2|^t$ possible histories for player 2. Thus the total number of histories is: [2]

$$\sum_{t=0}^{T - 1}|S_1|^t|S_2|^t$$[1]

which gives:

$$\sum_{t=0}^{T - 1}|S_1|^t|S_2|^t=\sum_{t=0}^{T - 1}(|S_1||S_2|)^t=\frac{1 - (|S_1||S_2|)^T}{1 - |S_1||S_2|}$$[2]

$S_1\{r_1, r_2\}$, $S_2=\{c_1, c_2, c_3\}$ and $T=2$

$$\{(\emptyset, \emptyset), (r_1, c_1), (r_1, c_2), (r_1, c_3), (r_2, c_1), (r_2, c_2), (r_2, c_3)\}$$[3]

Consider the following strategy:

The row/column player should play action $a_{r/c}$ regardless of the play of any previous strategy profiles. [2]

where $(a_{r}, a_{c})$ is a given stage Nash equilibrium.

Using backwards induction, this is a Nash equilibrium for the last stage game. Thus, at the last stage, no player has a reason to deviate. Similarly at the $T-1$th stage. The proof follows.

[2]

The pure Nash equilibria:

$$A = \begin{pmatrix}\underline{3} & \underline{6} & \underline{1}\\1 & 2 & 0\\\end{pmatrix} \qquad B = \begin{pmatrix}0 & \underline{7} & \underline{7}\\\underline{20} & 1 & 0\\\end{pmatrix}$$Thus, for our example we have the four Nash equilibria:

- $(r_1r_1, c_2c_2)$ with utility: (12, 14). [1]
- $(r_1r_1, c_2c_3)$ with utility: (7, 14). [1]
- $(r_1r_1, c_3c_2)$ with utility: (7, 14). [1]
- $(r_1r_1, c_3c_3)$ with utility: (2, 14). [1]

Consider the following two strategies:

For the row player:

$$(\emptyset, \emptyset) \to r_2$$ $$(r_2, c_1) \to r_1$$ $$(r_2, c_2) \to r_1$$ $$(r_2, c_3) \to r_1$$

[2]

For the column player:

$$(\emptyset, \emptyset) \to c_1$$ $$(r_1, c_1) \to c_3$$ $$(r_2, c_1) \to c_2$$

[2]

This is a Nash equilibria because:

- If the row player deviates, they would only be rational to do so in the first stage, if they did they would gain 2 in that stage but lose 5 in the second stage. Thus they have no incentive to deviate.
- If the column player deviates, they would only do so in the first stage and gain no utility.

[2]

For a two player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the row/column player best response polytope $\mathcal{P}$/$\mathcal{Q}$ is defined by:

[1]

$$ \mathcal{P} = \left\{x\in\mathbb{R}^{m}\;|\;x\geq 0; xB\leq 1\right\} $$[1]

$$ \mathcal{Q} = \left\{y\in\mathbb{R}^{n}\;|\; Ay\leq 1; y\geq 0 \right\} $$[1]

The column player has two best responses to the first row thus the game is degenerate. [1]

Directly apply the definition: [1]

Note that any modifications of $A, B$ are accepted (so as to make them $>0$).

For $\mathcal{P}$ :

$$ \begin{aligned} x_1 &\geq 0\\ x_2 &\geq 0\\ 20x_2 & \leq 1\\ 7 x_1 + x_2 & \leq 1\\ 7 x_1& \leq 1\\ \end{aligned} $$[2]

For $\mathcal{Q}$ :

$$ \begin{aligned} 3y_1 +6y_2 + y_3 & \leq 1\\ y_1 +2y_2 & \leq 1\\ y_1 &\geq 0\\ y_2 &\geq 0\\ y_3 &\geq 0\\ \end{aligned} $$[2]

There are an infinite number of possible vertices (based on potential modifications of $A, B$). Thus it makes sense to consider the strategies that correspond to the normalised vertices given (recall the question is not asking to find the vertices):

For $\mathcal{P}$:

- $x=(0, 0)\text{ with labels: }\{0, 1\}$: $x_1=0$ and $x_2=0$
- $x=(1/7, 0)\rightarrow \sigma_r=(1, 0)\text{ with labels: }\{1, 3, 4\}$. Label $1$ because $x_2=0$. Labels $\{3, 4\}$ because the 2nd and 3rd columns are best responses.
- $x=(0, 1/20)\rightarrow \sigma_r=(0, 1)\text{ with labels: }\{0, 2\}$: Label $0$ because $x_1=0$. Label $2$ because the first column is a best response.
- $x=(19/140, 1/20)\rightarrow \sigma_r=(19/26, 7/26)\text{ with labels: }\{2, 3\}$: we have $\sigma_r B = (70/13, 70/13, 133/26)$ thus columns 1 and 2 are best responses.

[2]

For $\mathcal{Q}$:

- $y=(0, 0, 0)\text{ with labels: }\{2, 3, 4\}$: $y_1=0$, $y_2=0$ and $y_3=0$
- $y=(0, 0, 1)\rightarrow \sigma_c=(0, 0, 1)\text{ with labels: }\{0, 2, 3\}$: 2, 3 is immediate, 0 because best response to 3rd column is first row.
- $y=(0, 1/6, 0)\rightarrow \sigma_c=(0, 1, 0)\text{ with labels: }\{0, 2, 4\}$: 2, 4 is immediate, 0 because best response to 2nd column is first row.
- $y=(1/3, 0, 0)\rightarrow \sigma_c=(1, 0, 0)\text{ with labels: }\{0, 3, 4\}$: 3, 4 is immediate, 0 because best response to 1st column is first row.

[2]

In [3]:

```
import nashpy as nash
A = np.array([[3, 6, 1],
[1, 2, 0]])
B = np.array([[0, 7, 7],
[20, 1, 0]])
row_halfspaces = nash.polytope.build_halfspaces(B.transpose())
for vertex, labels in nash.polytope.non_trivial_vertices(row_halfspaces):
print(np.round(vertex / sum(vertex), 4))
```

In [4]:

```
col_halfspaces = nash.polytope.build_halfspaces(A)
for vertex, labels in nash.polytope.non_trivial_vertices(col_halfspaces):
print(np.round(vertex / sum(vertex), 4))
```

For a nondegenerate 2 player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm returns all nash equilibria:

- For all pairs of vertices of the best response polytopes
- Check if the vertices have full labels
- Return the normalised probabilities

[2]

The only vertex pairs with a full set of labels:

- $x=(1/7, 0)\text{ with labels: }\{1, 3, 4\}$ and $y=(0, 0, 1)\text{ with labels: }\{0, 2, 3\}$ [2]
- $x=(1/7, 0)\text{ with labels: }\{1, 3, 4\}$ and $y=(0, 1/6, 0)\text{ with labels: }\{0, 2, 4\}$ [2]

This corresponds to:

$$\{((1, 0), (0, 0, 1)), ((1, 0), (0, 1, 0))\}$$[1]

In [5]:

```
game = nash.Game(A, B)
for eq in game.vertex_enumeration():
print([np.round(s, 2) for s in eq])
```

For a nondegenerate 2 player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm returns a nash equilibria:

- Start at the artificial equilibrium: $(0, 0)$
- Choose a label to drop.
- Remove this label from the corresponding vertex by traversing an edge of the corresponding polytope to another vertex.
- The new vertex will now have a duplicate label in the other polytope. Remove this label from the vertex of the other polytope and traverse an edge of that polytope to another vertex.
- Repeat step 4 until the pair of vertices is fully labelled.

Note that the game is degenerate but that we can still attempt to use the algorithm:

- $((0, 0), (0, 0, 0))$ have labels: $\{0, 1\}, \{2, 3, 4\}$. Drop 0 (arbitrary decision) in $\mathcal{P}$. [1]
- $\to ((1/7, 0), (0, 0, 0))$ have labels: $\{1, 3, 4\}, \{2, 3, 4\}$. In $\mathcal{Q}$ drop 3 (arbitrary decision, could have dropped 4). [1]
- $\to ((1/7, 0), (0, 1/6, 0))$ have labels: $\{1, 3, 4\}, \{0, 2, 4\}$. [1]

We have a fully labelled vertex pair (and one of the same equilibria as before).

There are numerous possible implimentations of this algorithm, here is another:

- $((0, 0), (0, 0, 0))$ have labels: $\{0, 1\}, \{2, 3, 4\}$. Drop 2 (arbitrary decision) in $\mathcal{Q}$.
- $\to ((0, 0), (1/3, 0, 0))$ have labels: $\{0, 1\}, \{0, 3, 4\}$. In $\mathcal{P}$ drop 0.
- $\to ((1/7, 0), (1/3, 0, 0))$ have labels: $\{1, 3, 4\}, \{0, 3, 4\}$. In $\mathcal{Q}$ drop 4 (arbitrary decision, could have dropped 3).
- $\to ((1/7, 0), (0, 0, 1))$ have labels $\{1, 3, 4\}, \{0, 2, 3\}$

We have a fully labelled vertex pair (and one of the same equilibria as before).

Consider a matrix $A\in\mathbb{R}^{m\times n}$ representing a game with two strategies.

$$ A= \begin{pmatrix} a & b\\ c & d \end{pmatrix} $$The Moran process is as follows:

- At a given time step: all individuals play all other individuals.
- Obtain their fitness as given by the game.
- Randomly select an individual proportional to their fitness as an individual to be reproduced
- Uniformly select an individual to be replaced
- Proceed to the next time step.
- The process terminates when there is only one type of individual in the population.

[4]

Given a birth death process, the fixation probability $x_i$ is given by:

$$x_i=\frac{1+\sum_{j=1}^{i-1}\prod_{k=1}^j\gamma_k}{1+\sum_{j=1}^{N-1}\prod_{k=1}^j\gamma_k}$$where:

$$ \gamma_k = \frac{p_{k,k-1}}{p_{k,k+1}} $$[2]

The Moran process is a birth death process where the transition probabilities are then given by:

$$p_{i,i+1}=\frac{if_{1i}}{if_{1i} + (N-i)f_{2i}}\frac{N-i}{N}$$$$p_{i,i-1}=\frac{(N-i)f_{2i}}{if_{1i} + (N-i)f_{2i}}\frac{i}{N}$$[2]

which gives:

$$\gamma_i=\frac{f_{2i}}{f_{1i}}$$[1]

thus (using the general birth death process result):

$$ x_i=\frac{1+\sum_{j=1}^{i-1}\prod_{k=1}^j\frac{f_{2k}}{f_{1k}}}{1+\sum_{j=1}^{N-1}\prod_{k=1}^j\frac{f_{2k}}{f_{1k}}} $$[1]

Assuming $i$ individuals of the first type, for this game we have $N=3$ and $(a, b, c, d)=(0, 2, r, 0)$ the fitness of both types is given respectively by:

$$f_{1i}=\frac{a(i-1)+b(N-i)}{N-1}=\frac{6-2i}{2}=3-i$$$$f_{2i}=\frac{c(i)+d(N-i-1)}{N-1}=\frac{ri}{2}=\frac{ri}{2}$$which gives:

$$\gamma_i=\frac{f_{2i}}{f_{1i}}=\frac{ri}{6-2i}$$thus:

$$ x_1=\frac{1}{1+\sum_{j=1}^{2}\prod_{k=1}^j\frac{rk}{6-2k}}=\frac{1}{1 + \frac{r}{4} + \frac{r}{4}\frac{2r}{2}}=\frac{1}{\frac{r^{2}}{4} + \frac{r}{4} + 1} $$for $r=2$ we get:

$$ x_1 = 2 / 5 $$Some code to verify the result:

In [6]:

```
def theoretic_fixation(N, game, i=1):
"""
Calculate x_i as given by the above formula
"""
f_ones = np.array([(game[0, 0] * (i - 1) + game[0, 1] * (N - i)) / (N - 1) for i in range(1, N)])
f_twos = np.array([(game[1, 0] * i + game[1, 1] * (N - i - 1)) / (N - 1) for i in range(1, N)])
gammas = f_twos / f_ones
return (1 + np.sum(np.cumprod(gammas[:i-1]))) / (1 + np.sum(np.cumprod(gammas)))
```

In [7]:

```
import sympy as sym
import numpy as np
r = sym.symbols("r")
game = np.array([[sym.S(0), sym.S(2)], [sym.S(r), sym.S(0)]])
x_1 = theoretic_fixation(N=3, game=game)
x_1, x_1.subs({r: 2})
```

Out[7]:

Finding $r$ such that $x_1>.9$ corresponds to:

$$\frac{1}{\frac{r^{2}}{4} + \frac{r}{4} + 1}>9/10$$which corresponds to:

$$ \begin{aligned} \left(\frac{r^{2}}{4} + \frac{r}{4} + 1\right)9/10 & < 1 &&\text{[1]}\\ \left(\frac{r^{2}}{4} + \frac{r}{4} + 1\right) & < 10/9 && \text{[1]}\\ \frac{r^{2}}{4} + \frac{r}{4} - \frac{1}{9}& < 0 && \text{[1]} \\ r ^ 2 + r - \frac{4}{9}& < 0 && \text{[1]} \\ \end{aligned}$$The roots of this polynomial are $-4/3, 1/3$ [2], thus: $r<1/3$ ensures $x_1>9/10$. [1]

To ensure a high probability of fixation we need the fitness of an individual of the second type encountering an individual of the first type to be at most $1/3$. [1]

In [8]:

```
sym.solveset(r ** 2 + r - sym.S(4) / 9, r)
```

Out[8]:

Source code: @drvinceknight Powered by: Python Mathjax Github pages Skeleton css