A lengthy set of support enumeration calculations

In today’s class we worked through the support enumeration algorithm. This involved some discussions about what the algorithm is based on but also a bunch of tedious linear equations.

A recording of the class is available here.

We applied the algorithm described here to the zero sum game defined by:

\[A = \begin{pmatrix} 0 & -1 & 1\\ 1 & 0 & -1\\ -1 & 1 & 0 \end{pmatrix}\]

The basic idea behind the algorithm is as follows:

Assume what actions are played by both players (this is called the supports of the strategies
Identify the strategies that ensure that indeed all actions of the chosen supports will be played: this only happens if all actions in the support itself have the same expected utility.
Check that there is no better outside of the supports chosen.

We considered the supports of size 1: there are no single pairs of actions that are pairs of best responses to each other.

We considered 2 pairs of supports of size 2:

\(I = \{R, P\}\) (the row player only using Rock and Paper) and \(J=\{R, P\}\) (the column player also only using Rock and Paper). This lead to a contradiction (at step 2 of the above general idea) where there would be no probabilities that work.
\(I = \{R, P\}\) (the row player only using Rock and Paper) and \(J=\{P, S\}\) (the column player only using Paper and Scissors). This lead to a contradiction (at step 3 of the above general idea) the row player would benefit from playing Scissors.

That is 2 of the 9 possible pairs of supports of size 2.

We then considered both players playing all 3 strategies which lead to a Nash equilibria of:

\[\sigma_r = (1/3, 1/3, 1/3) \qquad \sigma_c = (1/3, 1/3, 1/3)\]

Indeed in this case we have:

\[\max(A\sigma_c) = \begin{pmatrix}0\\0\\0\end{pmatrix}\]

(the important thing is not that the values are 0 but that they are all the same)

and:

\[\sigma_r A \sigma_c = \max(A\sigma_c)\]

and there are no strategies outside of the support (because they’re all being played) that could do better.

This all corresponds to the best response condition.

With a few more minutes, I wanted to show how to use support enumeration with Nashpy, you can see the video describing this: “Using Python to find Nash Equilibria with Nashpy - YouTube Private”

Here is some code to recreate everything we did today:

import numpy as np
import nashpy as nash

A = np.array(
    (
        (0, -1, 1),
        (1, 0, -1),
        (-1, 1, 0),
    )
)
rps = nash.Game(A)
list(rps.support_enumeration())

A student come up to me after class and made a great point: “this seems incredibly inefficient”: this is absolutely correct. Support enumeration is essentially a “from first principles” algorithm. It is robust but slow. Some alternatives (that are not considered in the class but might be useful to you in your group projects):

Vertex enumeration.
Lemke Howson algorithm: this is essentially state of the art and takes advantage of things computers can do very quickly. BUT it is not guaranteed to obtain all the equilibria.