Prisoner's Dilemma - exercises

1. Give the general definition for a Prisoner's Dilemma.

2. Justify whether or not each of the following games is a Prisoner's Dilemma:

1. $A = \begin{pmatrix} 3 & 0\\ 5 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 3 & 5\\ 0 & 1 \end{pmatrix}$

This is a Prisoner's Dilemma: $(R, S, T, P) = (3, 0, 5, 1)$: $5>3>1>0$ and $2\times 3 > 5 + 0$.

2. $A = \begin{pmatrix} 1 & -1\\ 2 & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 2\\ -1 & 0 \end{pmatrix}$

This is a Prisoner's Dilemma: $(R, S, T, P) = (1, -1, 2, 0)$ satisfies $T>R>P>S$ ($2>1>0>-1$) and $2R>T+S$ ($2\times 1 > 2 - 1$).

3. $A = \begin{pmatrix} 1 & -1\\ 2 & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 3 & 5\\ 0 & 1 \end{pmatrix}$

This is not a Prisoner's Dilemma because $A \ne B ^ T$.

4. $A = \begin{pmatrix} 6 & 0\\ 12 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 6 & 12\\ 0 & 0 \end{pmatrix}$

This is not a Prisoner's Dilemma because $A \ne B ^ T$ (the bottom-right entries differ: $A_{22}=1$ but $B_{22}=0$).
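The checks used in the four answers above can be sketched in a few lines of Python. This is a minimal sketch, assuming the definition being applied is $A = B^T$ together with $T>R>P>S$ and $2R>T+S$, where $A = \begin{pmatrix} R & S\\ T & P \end{pmatrix}$; the function name `is_prisoners_dilemma` is hypothetical and not part of the original material.

```python
def is_prisoners_dilemma(A, B):
    """Check the conditions assumed above for a 2x2 game (A, B)."""
    # Symmetry: the column player's matrix must be the transpose of A.
    B_transposed = [list(column) for column in zip(*B)]
    if [list(row) for row in A] != B_transposed:
        return False
    # Read the four payoffs off the row player's matrix.
    (R, S), (T, P) = A
    return T > R > P > S and 2 * R > T + S


# The four games of question 2, in order.
games = [
    ([[3, 0], [5, 1]], [[3, 5], [0, 1]]),
    ([[1, -1], [2, 0]], [[1, 2], [-1, 0]]),
    ([[1, -1], [2, 0]], [[3, 5], [0, 1]]),
    ([[6, 0], [12, 1]], [[6, 12], [0, 0]]),
]
results = [is_prisoners_dilemma(A, B) for A, B in games]
print(results)
```

The first two games pass both checks, while the last two already fail the symmetry check.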

3. Obtain the Markov chain representation for a match between reactive strategies with the following vectors:

Bookwork: this is a substitution exercise using https://vknight.org/gt/chapters/09/#Markov-chain-representation-of-a-Match-between-two-reactive-strategies

1. $p=(1/2, 1/2)\qquad q=(1/2, 1/2)$

$$M = \begin{pmatrix} 1/4&1/4&1/4&1/4\\ 1/4&1/4&1/4&1/4\\ 1/4&1/4&1/4&1/4\\ 1/4&1/4&1/4&1/4 \end{pmatrix}$$

2. $p=(1/4, 1/2)\qquad q=(1/2, 1/4)$

$$M = \begin{pmatrix} 1/8&1/8&3/8&3/8\\ 1/4&1/4&1/4&1/4\\ 1/16&3/16&3/16&9/16\\ 1/8&3/8&1/8&3/8 \end{pmatrix}$$

3. $p=(1/3, 1/3)\qquad q=(2/3, 1/4)$

$$M = \begin{pmatrix} 2/9 & 1/9 & 4/9 & 2/9\\ 2/9 & 1/9 & 4/9 & 2/9\\ 1/12 & 1/4 & 1/6 & 1/2\\ 1/12 & 1/4 & 1/6 & 1/2 \end{pmatrix}$$
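Each of the matrices above is an instance of one general form. As a sketch, assuming $p=(p_1, p_2)$ and $q=(q_1, q_2)$ give each player's probability of cooperating after the opponent cooperated and defected respectively, and assuming the states are ordered $(CC, CD, DC, DD)$ (the ordering used by the `make_matrix` code in this document):

$$M = \begin{pmatrix} p_1q_1 & p_1(1-q_1) & (1-p_1)q_1 & (1-p_1)(1-q_1)\\ p_2q_1 & p_2(1-q_1) & (1-p_2)q_1 & (1-p_2)(1-q_1)\\ p_1q_2 & p_1(1-q_2) & (1-p_1)q_2 & (1-p_1)(1-q_2)\\ p_2q_2 & p_2(1-q_2) & (1-p_2)q_2 & (1-p_2)(1-q_2) \end{pmatrix}$$

For example, with $p=(1/4, 1/2)$ and $q=(1/2, 1/4)$ the third row is $(1/4 \times 1/4, 1/4 \times 3/4, 3/4 \times 1/4, 3/4 \times 3/4) = (1/16, 3/16, 3/16, 9/16)$.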

4. Obtain the utilities for both players for the vectors of question 3.

Bookwork: this is a substitution exercise using https://vknight.org/gt/chapters/09/#Theorem:-steady-state-probabilities-for-match-between-reactive-players

1. $p=(1/2, 1/2)\qquad q=(1/2, 1/2)$ gives utilities: $(9/4, 9/4)$
2. $p=(1/4, 1/2)\qquad q=(1/2, 1/4)$ gives utilities: $(536/289, 621/289)$
3. $p=(1/3, 1/3)\qquad q=(2/3, 1/4)$ gives utilities: $(113/54, 49/27)$
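As a worked sketch of the substitution for the second pair, $p=(1/4, 1/2)$ and $q=(1/2, 1/4)$, using the steady-state formulas that appear in the code of this document and assuming the payoffs $(R, S, T, P) = (3, 0, 5, 1)$ used there:

$$r_1 = p_1 - p_2 = -\frac{1}{4} \qquad r_2 = q_1 - q_2 = \frac{1}{4} \qquad 1 - r_1 r_2 = \frac{17}{16}$$

$$s_1 = \frac{q_2 r_1 + p_2}{1 - r_1 r_2} = \frac{7/16}{17/16} = \frac{7}{17} \qquad s_2 = \frac{p_2 r_2 + q_2}{1 - r_1 r_2} = \frac{6/16}{17/16} = \frac{6}{17}$$

$$\pi = \left(s_1 s_2, s_1(1 - s_2), (1 - s_1)s_2, (1 - s_1)(1 - s_2)\right) = \frac{1}{289}(42, 77, 60, 110)$$

$$\pi \cdot (3, 0, 5, 1) = \frac{126 + 0 + 300 + 110}{289} = \frac{536}{289}$$

Swapping the roles of $p$ and $q$ swaps the middle two entries of $\pi$, so the second player's utility is $(126 + 0 + 385 + 110)/289 = 621/289$.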

Here is some Python code that also carries out these calculations:

In [1]:
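import numpy as np
import itertools

def make_matrix(p, q):
    """
    Code to obtain Markov chain representation of match between two reactive players.
    """
    M = [[ele[0] * ele[1] for ele in itertools.product([player, 1 - player],
                                                        [opponent, 1 - opponent])]
         for opponent in q for player in p]
    return np.array(M)

# The name `theoretic_steady_state` is an assumption: only the body of this
# function and the variable `pi` used below appear in the source.
def theoretic_steady_state(p, q):
    """
    Steady state probabilities of the states (CC, CD, DC, DD).
    """
    r_1 = p[0] - p[1]
    r_2 = q[0] - q[1]
    s_1 = (q[1] * r_1 + p[1]) / (1 - r_1 * r_2)
    s_2 = (p[1] * r_2 + q[1]) / (1 - r_1 * r_2)
    return np.array([s_1 * s_2, s_1 * (1 - s_2), (1 - s_1) * s_2, (1 - s_1) * (1 - s_2)])

def theoretic_utility(p, q, rstp=np.array([3, 0, 5, 1])):
    """
    Utility of the player using p against q: steady state dotted with (R, S, T, P).
    """
    pi = theoretic_steady_state(p, q)
    return np.dot(pi, rstp)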

In [2]:
import sympy as sym
for p, q in [([sym.S(1) / 2, sym.S(1) / 2], [sym.S(1) / 2, sym.S(1) / 2]),
             ([sym.S(1) / 4, sym.S(1) / 2], [sym.S(1) / 2, sym.S(1) / 4]),
             ([sym.S(1) / 3, sym.S(1) / 3], [sym.S(2) / 3, sym.S(1) / 4])]:
    print("=====")
    print(p, q)
    print("gives:")
    print(make_matrix(p, q))
    print("With utility:", theoretic_utility(p, q), theoretic_utility(q, p))

=====
[1/2, 1/2] [1/2, 1/2]
gives:
[[1/4 1/4 1/4 1/4]
[1/4 1/4 1/4 1/4]
[1/4 1/4 1/4 1/4]
[1/4 1/4 1/4 1/4]]
With utility: 9/4 9/4
=====
[1/4, 1/2] [1/2, 1/4]
gives:
[[1/8 1/8 3/8 3/8]
[1/4 1/4 1/4 1/4]
[1/16 3/16 3/16 9/16]
[1/8 3/8 1/8 3/8]]
With utility: 536/289 621/289
=====
[1/3, 1/3] [2/3, 1/4]
gives:
[[2/9 1/9 4/9 2/9]
[2/9 1/9 4/9 2/9]
[1/12 1/4 1/6 1/2]
[1/12 1/4 1/6 1/2]]
With utility: 113/54 49/27
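As an independent check that does not rely on the closed-form steady state, the stationary distribution can also be approximated directly from the transition matrix. The following is a minimal sketch in plain Python for the second pair of vectors; the helper name `stationary_distribution` and the choice of 200 iterations are assumptions made for illustration, and the payoffs $(3, 0, 5, 1)$ are those used in the code above.

```python
from fractions import Fraction
from itertools import product


def make_matrix(p, q):
    """Transition matrix over the states (CC, CD, DC, DD)."""
    return [
        [a * b for a, b in product([player, 1 - player], [opponent, 1 - opponent])]
        for opponent in q
        for player in p
    ]


def stationary_distribution(M, steps=200):
    """Approximate the stationary distribution by repeatedly applying M."""
    size = len(M)
    pi = [1 / size] * size  # start from the uniform distribution
    for _ in range(steps):
        pi = [sum(pi[i] * M[i][j] for i in range(size)) for j in range(size)]
    return pi


p = (Fraction(1, 4), Fraction(1, 2))
q = (Fraction(1, 2), Fraction(1, 4))
exact_matrix = make_matrix(p, q)
M = [[float(entry) for entry in row] for row in exact_matrix]
pi = stationary_distribution(M)
utility = sum(prob * payoff for prob, payoff in zip(pi, (3, 0, 5, 1)))
print(utility, float(Fraction(536, 289)))
```

The two printed numbers should agree to floating point precision, which supports the value $536/289$ obtained from the formula.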


5. Assuming $p=(x, 1/2)$, find the optimal $x$ against the following players:

Part of this question is bookwork: substituting $p=(x, 1/2)$ into https://vknight.org/gt/chapters/09/#Theorem:-steady-state-probabilities-for-match-between-reactive-players gives a formula for the utility of $p$ as a function of $x$.

1. $q=(1, 0)$

$$u(x)=\frac{(-10x + 4(x - 1)^2 + 13)}{(2x - 3)^2}$$

The derivative of this function is given by:

$$\frac{2(6x - 7)}{(2x - 3)^3}$$

This derivative is zero only at $x=7/6$, which is $>1$, so it does not change sign on the interval $[0, 1]$ and the utility is monotonic there. We have (by substitution):

$$u(0)=17/9\qquad u(1)=3$$

Thus $u(x)$ is an increasing function so the optimal value of $x$ is $1$.

Against an unforgiving player (one who always answers a defection with a defection), and given that our player responds to a defection at random, it is best to always respond to cooperation with cooperation.

2. $q=(1/2, 1/2)$

$$u(x)=-3x/4+21/8$$

This is a decreasing function so the optimal value of $x$ is $0$.

Against a random player (who takes no notice of what we do) it is better to defect.
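The two utility formulas above can be derived by hand. As a sketch, using the steady-state formulas from the code in this document with $p=(x, 1/2)$ and assuming payoffs $(R, S, T, P) = (3, 0, 5, 1)$:

For $q=(1, 0)$ we have $r_1 = x - 1/2$ and $r_2 = 1$, so $1 - r_1 r_2 = (3 - 2x)/2$ and

$$s_1 = s_2 = \frac{1/2}{(3 - 2x)/2} = \frac{1}{3 - 2x} = s$$

$$u(x) = 3s^2 + 5(1 - s)s + (1 - s)^2 = 1 + 3s - s^2 = \frac{4x^2 - 18x + 17}{(2x - 3)^2}$$

which agrees with the expression given, since $-10x + 4(x - 1)^2 + 13 = 4x^2 - 18x + 17$.

For $q=(1/2, 1/2)$ we have $r_2 = 0$, so $s_1 = \frac{1}{2}\left(x - \frac{1}{2}\right) + \frac{1}{2} = \frac{2x + 1}{4}$ and $s_2 = \frac{1}{2}$, giving

$$u(x) = \frac{3s_1 + 5(1 - s_1) + (1 - s_1)}{2} = 3 - \frac{3s_1}{2} = \frac{21}{8} - \frac{3x}{4}$$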

Below is some code to verify these calculations.

In [3]:
x = sym.Symbol("x")
for q in [(sym.S(1), sym.S(0)), (sym.S(1) / 2, sym.S(1) / 2)]:
    utility = theoretic_utility((x, sym.S(1) / 2), q)
    print(utility.simplify(), utility.subs({x: 0}), utility.subs({x: 1}), sym.diff(utility, x).simplify(), sym.solveset(sym.diff(utility, x), x))

(-10*x + 4*(x - 1)**2 + 13)/(2*x - 3)**2 17/9 3 2*(6*x - 7)/(2*x - 3)**3 {7/6}
-3*x/4 + 21/8 21/8 15/8 -3/4 EmptySet()
