A new preprint: ‘Reviving, reproducing, and revisiting Axelrod’s second tournament’

Collaborators and I have just put a new preprint up on the arXiv: “Reviving, Reproducing and Revisiting Axelrod’s second tournament”. I’m really proud of this paper.

Background

The Prisoner’s Dilemma is a classic model of direct reciprocity—how cooperation can emerge between individuals who repeatedly interact.

In each interaction, both players face a simple choice: act selfishly for an immediate gain, or selflessly for a shared long-term benefit. If both act selflessly, they each do better than if both act selfishly, but the temptation to act selfishly remains strong.
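To make that trade-off concrete, here is a minimal sketch of a single interaction. The payoff values (3, 0, 5, 1) are the ones commonly used in the literature and are an assumption here, as are the names `PAYOFFS` and `best_reply`; none of this is taken from the preprint.

```python
# Payoffs as (row player, column player); "C" = cooperate, "D" = defect.
# The numbers are the conventional ones, assumed purely for illustration.
PAYOFFS = {
    ("C", "C"): (3, 3),  # both act selflessly
    ("C", "D"): (0, 5),  # row player is exploited
    ("D", "C"): (5, 0),  # row player exploits
    ("D", "D"): (1, 1),  # both act selfishly
}


def best_reply(opponent_action):
    """Return the action that maximises the row player's one-shot payoff."""
    return max("CD", key=lambda action: PAYOFFS[(action, opponent_action)][0])


# Defecting is the best one-shot reply whatever the opponent does...
print(best_reply("C"), best_reply("D"))
# ...yet mutual cooperation pays each player more than mutual defection.
print(PAYOFFS[("C", "C")], PAYOFFS[("D", "D")])
```

Under these assumed payoffs, defection is the best reply in a single interaction, which is exactly why repetition is needed for cooperation to become worthwhile.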

When interactions repeat, cooperation becomes possible through direct reciprocity:

I’ll cooperate with you if you’ve cooperated with me before.

A recent Veritasium video gives a great overview of this topic and includes an interview with Robert Axelrod. Axelrod, a political scientist, ran two computer tournaments in the 1980s to study the evolution of cooperation. He’s also the namesake of the axelrod Python library, which I help maintain.

The key finding from these tournaments was that the strategy Tit For Tat performed best and, more importantly, revealed the traits that make for strong performance:

Be nice (don’t defect first)
Be provocable (retaliate when necessary)
Be forgiving (return to cooperation after conflict)
Be non-envious (don’t aim to outperform others)
Be simple (easy to understand)
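These traits can be seen in a few lines of code. The sketch below is a standalone illustration and does not use the axelrod library: the function names, the ten-turn match length, and the payoff values (3, 0, 5, 1) are all assumptions made for the example.

```python
# Conventional payoff values, assumed for illustration.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opponent_history):
    """Cooperate first (nice), then copy the opponent's last move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history, opponent_history):
    """Defect on every turn."""
    return "D"


def play_match(strategy_a, strategy_b, turns=10):
    """Play a repeated game and return the two total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(turns):
        action_a = strategy_a(history_a, history_b)
        action_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(action_a, action_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(action_a)
        history_b.append(action_b)
    return score_a, score_b


print(play_match(tit_for_tat, always_defect))  # exploited once, then retaliates
print(play_match(tit_for_tat, tit_for_tat))    # cooperation throughout
```

Note that in this sketch Tit For Tat never scores more than its opponent in a single match. It does well in a tournament only by accumulating good scores across many opponents, which is the sense in which it is non-envious.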

This work has been hugely influential. Axelrod’s book The Evolution of Cooperation (1984) has been cited tens of thousands of times across political science, biology, and economics.

Reviving the work

When I started the axelrod Python library in 2015, I exchanged emails with Robert Axelrod to ask whether any of the original source code from his tournaments still existed:

“I don’t have the first round. But the second round is on-line.
Best of success with your library.”

A version of the original Fortran code—annotated in 1993—can indeed be found here.

This is where Owen Campbell, another maintainer of the axelrod library, comes in. He worked to revive the original Fortran code so that it could run within the modern Python framework.

Reproducing the work

Using this revived code, we attempted to reproduce the original results and, in doing so, failed to replicate them perfectly. Owen gave a great talk about this at PyCon UK, titled God Is Real (Unless Declared an Integer).

I say “fail” because one particular strategy performed unexpectedly poorly. However, we did reproduce the main conclusions: Tit For Tat again performed best, and the same core principles of cooperation emerged.

This apparent failure is, in fact, interesting: it highlights broader issues around the reproducibility of computational research. As a Fellow of the Software Sustainability Institute, I hold this topic close to my heart, and it is central to this work.

A key output of our preprint is a fully reproducible and reusable package: the original Fortran code, wrapped and documented so that anyone can rerun the experiments and verify the results for themselves.

Revisiting the work

Amidst all this work on Axelrod’s second tournament, the
axelrod Python library has continued to grow and mature:

Used in more than 40 pieces of research
Over 785 stars, 279 forks, and 76 contributors on GitHub

I genuinely believe this library stands as an example of good scientific practice,
specifically because it is open. Contributions and expertise have come from a
remarkably broad community, including students. One such contributor, Julie Rymer,
shared the following message after working on the library as part of her studies:

“And I really wanted to thank you all. I discovered your project because
of a course where we needed to participate in an open source project,
and I had the occasion to compare the welcome me and my coworkers
received here compared to other people from my class who worked on
different projects. And I’ve got to say you are awesome on that part
and on the help you provide to newbies. I like your project so I’ll try to
continue to contribute now and then!”

This diversity of contributors has become one of the project’s greatest strengths.
There are now more than 250 distinct strategies implemented in the library: some trained using reinforcement learning techniques described in the literature,
others inspired by creative theoretical ideas.

Together, this rich ecosystem allows us, in our preprint, to address a significant question:

How generalisable were the conclusions of Robert Axelrod?

We ran a large tournament including every available strategy: by far the largest
Iterated Prisoner’s Dilemma tournament ever conducted.

Unsurprisingly, Tit For Tat did not win. In fact, the highest performer from
Axelrod’s second tournament ranked only 16th in our results. Almost all of
the top fifteen strategies were sophisticated ones trained using reinforcement
learning techniques, having learned to behave in contextually beneficial ways
(described in this paper).

Our tournament also reinforces findings from more recent research,
which highlights the following properties as key to effective collaboration:

Be a little envious.
Be nice in non-noisy environments or when games are long (if known).
Reciprocate both cooperation and defection appropriately: be provocable in short matches and generous in noisy settings.
It is acceptable to be clever.
Adapt to the environment; adjust to the mean tournament cooperation level.

We also experimented with modified versions of Axelrod’s original tournament,
introducing one, two, three, or four new entrants. This produced perhaps our most
significant finding: Tit For Tat continued to win a large proportion of these
tournaments.

This does not mean that Tit For Tat is a strong strategy in general; in fact, quite
the opposite. The environment of Axelrod’s second tournament was highly specific and
particularly well suited to Tit For Tat.

Generalising from that tournament is a mistake.
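The shape of the entrant experiment described above can be sketched with a toy field. Everything here is an assumption made for illustration: the five hand-written strategies, the ten-turn matches, the conventional payoff values, and the helper names `winners` and `tit_for_tat_win_rate`. It is not Axelrod’s field and not the code behind the preprint, so its numbers say nothing about our actual results.

```python
from itertools import combinations

# Conventional payoff values, assumed for illustration.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own, opp):
    return opp[-1] if opp else "C"


def grudger(own, opp):
    return "D" if "D" in opp else "C"  # never forgives a defection


def defector(own, opp):
    return "D"


def cooperator(own, opp):
    return "C"


def alternator(own, opp):
    return "C" if len(own) % 2 == 0 else "D"


def play_match(strategy_a, strategy_b, turns):
    """Play a repeated game and return the two total scores."""
    history_a, history_b, score_a, score_b = [], [], 0, 0
    for _ in range(turns):
        action_a = strategy_a(history_a, history_b)
        action_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(action_a, action_b)]
        score_a, score_b = score_a + payoff_a, score_b + payoff_b
        history_a.append(action_a)
        history_b.append(action_b)
    return score_a, score_b


def winners(field, turns=10):
    """Round robin (no self-play); return the names with the top total score."""
    totals = dict.fromkeys(field, 0)
    for name_a, name_b in combinations(field, 2):
        score_a, score_b = play_match(field[name_a], field[name_b], turns)
        totals[name_a] += score_a
        totals[name_b] += score_b
    best = max(totals.values())
    return {name for name, total in totals.items() if total == best}


BASE = {"Tit For Tat": tit_for_tat, "Grudger": grudger, "Defector": defector}
ENTRANTS = {"Cooperator": cooperator, "Alternator": alternator}


def tit_for_tat_win_rate(n_entrants):
    """Share of fields (base plus n entrants) in which Tit For Tat ties or wins."""
    fields = [{**BASE, **{name: ENTRANTS[name] for name in added}}
              for added in combinations(ENTRANTS, n_entrants)]
    return sum("Tit For Tat" in winners(field) for field in fields) / len(fields)


for n in range(len(ENTRANTS) + 1):
    print(n, tit_for_tat_win_rate(n))
```

In this toy field Tit For Tat shares first place with no entrants and loses it as soon as either entrant is added. The only point is that a ranking is a property of the field as well as of the strategy, so a win in one environment is weak evidence about another.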

Axelrod’s tournament was an attempt to approximate an unimaginably vast space of
strategies, with a size on the order of $2^{10^{120}}$ (a gazillion). Even the
tournament we conducted here can only ever approximate that immense landscape of
possible behaviours.

I believe this is an important piece of work:

We have reproduced (almost!) an influential study that shaped an entire field,
with applications to war, politics, trade, and environmental cooperation, to name a few.
We extend the work by considering specific new contexts.
We show that the conclusions of that work were generalised perhaps too freely.
And we have done so in a reproducible, sustainable, and open manner.