Profiling and reducing memory consumption in Python
I am one of the core developers of the Axelrod-Python project. This Python library lets you carry out Iterated Prisoner’s Dilemma tournaments. One of the great successes of the library is the number of strategies it contains: at present (thanks to many awesome contributions) it has 139 strategies (149 if you count the cheaters). This is great, but it also created a bit of a challenge: running full tournaments became computationally quite expensive. This is now fixed, mainly thanks to writing to and reading from disk instead of holding everything in memory. This post describes some tools and techniques that can be used to do this.
Running example
As a running example we’re going to consider calculating the average of a large number of numbers.
>>> N = 5
>>> nbrs = list(range(0, 10 ** N))
>>> total = sum(nbrs)
>>> mean = total / len(nbrs)
>>> mean
49999.5
Profiling
First of all, let us take a look at how much memory our code uses. There’s a great tool for this called memory_profiler. To use it, let’s put the following in a script, profile.py:
from memory_profiler import profile

precision = 10
fp = open('memory_profiler_basic_mean.log', 'w+')

@profile(precision=precision, stream=fp)
def basic_mean(N=5):
    nbrs = list(range(0, 10 ** N))
    total = sum(nbrs)
    mean = total / len(nbrs)
    return mean

if __name__ == '__main__':
    basic_mean()
After running that file ($ python profile.py), the memory_profiler_basic_mean.log file now contains the following:
Filename: profile.py

Line #       Mem usage           Increment   Line Contents
================================================
     6   30.6015625000 MiB   0.0000000000 MiB   @profile(precision=precision, stream=fp)
     7                                          def basic_mean(N=5):
     8   34.4492187500 MiB   3.8476562500 MiB       nbrs = list(range(0, 10 ** N))
     9   34.4492187500 MiB   0.0000000000 MiB       total = sum(nbrs)
    10   34.4492187500 MiB   0.0000000000 MiB       mean = total / len(nbrs)
    11   34.4492187500 MiB   0.0000000000 MiB       return mean
The Increment column shows the amount of memory each line adds. In this particular case the largest culprit is creating the list of numbers.
Generators
If you’re using Python 3, then range does not give you an actual list: it returns a lazy range object which, much like a generator, produces its values on demand. To get the list you need to convert it, as I’ve done above. So one of the first things we can do to reduce memory consumption is to not convert nbrs into a list but keep it as a range. There are good explanations of generators elsewhere, but the basic idea is that a generator (or a lazy object like range) will not immediately create all the elements you want. Instead it keeps track of them as you go. In essence this is perfect for memory consumption: by keeping track of where you are and what’s next you don’t have to keep everything in memory.
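In the code this simply means dropping the list() call. The modified function (its body is visible in the profile log below) looks like this:

@profile(precision=precision, stream=fp)
def basic_mean_with_gen(N=5):
    nbrs = range(0, 10 ** N)  # a lazy range object: no list is built
    total = sum(nbrs)
    mean = total / len(nbrs)  # a range knows its length, so len() still works
    return mean

Profiling this version gives: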
Filename: profile.py

Line #       Mem usage           Increment   Line Contents
================================================
    14   34.0820312500 MiB   0.0000000000 MiB   @profile(precision=precision, stream=fp)
    15                                          def basic_mean_with_gen(N=5):
    16   34.0820312500 MiB   0.0000000000 MiB       nbrs = range(0, 10 ** N)
    17   34.0820312500 MiB   0.0000000000 MiB       total = sum(nbrs)
    18   34.0820312500 MiB   0.0000000000 MiB       mean = total / len(nbrs)
    19   34.0820312500 MiB   0.0000000000 MiB       return mean
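You can sanity-check this difference without the profiler: sys.getsizeof reports the (shallow) size of an object in bytes. The exact figures depend on your platform and Python version, but the contrast is clear:

import sys

# The list stores a reference for every element: hundreds of kilobytes here.
print(sys.getsizeof(list(range(0, 10 ** 5))))
# The range object only stores its start, stop and step: a few dozen bytes.
print(sys.getsizeof(range(0, 10 ** 5)))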
Stepping through generators
Now let us assume we didn’t want to calculate the mean of all the numbers but just of the even ones:
>>> N = 5
>>> nbrs = [n for n in range(0, 10 ** N) if n % 2 == 0]
>>> total = sum(nbrs)
>>> mean = total / len(nbrs)
>>> mean  # (This is what we expect)
49999.0
If we profile this in a similar way, swapping the list comprehension for a generator expression, we get an error:
Traceback (most recent call last):
...
TypeError: object of type 'generator' has no len()
Indeed, the whole point of a generator is that it cannot know how many things are in it…
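This is also where range differs from a genuine generator: a range object can compute its length from its start, stop and step (which is why len(nbrs) worked in the previous section), but a generator cannot:

>>> len(range(0, 10 ** 5))
100000
>>> len(n for n in range(0, 10 ** 5))
Traceback (most recent call last):
...
TypeError: object of type 'generator' has no len()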
So let’s modify the generator approach as follows:
@profile(precision=precision, stream=fp)
def basic_mean_with_gen(N=5):
    nbrs = (n for n in range(0, 10 ** N) if n % 2 == 0)
    total = 0
    count = 0
    for n in nbrs:
        total += n
        count += 1
    mean = total / count
    return mean
The above makes use of the generator expression syntax to build the generator. Instead of calculating the sum all in one go, we step through the numbers one at a time; this might take longer, but it costs hardly anything in terms of memory. The profile log now looks like this:
Filename: profile_even.py

Line #       Mem usage           Increment   Line Contents
================================================
     6   30.3281250000 MiB   0.0000000000 MiB   @profile(precision=precision, stream=fp)
     7                                          def basic_mean(N=5):
     8   32.1289062500 MiB   1.8007812500 MiB       nbrs = [n for n in range(0, 10 ** N) if n % 2 == 0]
     9   32.3984375000 MiB   0.2695312500 MiB       total = sum(nbrs)
    10   32.3984375000 MiB   0.0000000000 MiB       mean = total / len(nbrs)
    11   32.3984375000 MiB   0.0000000000 MiB       return mean
Filename: profile_even.py

Line #       Mem usage           Increment   Line Contents
================================================
    13   32.0234375000 MiB   0.0000000000 MiB   @profile(precision=precision, stream=fp)
    14                                          def basic_mean_with_gen(N=5):
    15   32.0234375000 MiB   0.0000000000 MiB       nbrs = (n for n in range(0, 10 ** N) if n % 2 == 0)
    16   32.0234375000 MiB   0.0000000000 MiB       total = 0
    17   32.0234375000 MiB   0.0000000000 MiB       count = 0
    18   32.0234375000 MiB   0.0000000000 MiB       for n in nbrs:
    19   32.0234375000 MiB   0.0000000000 MiB           total += n
    20   32.0234375000 MiB   0.0000000000 MiB           count += 1
    21   32.0234375000 MiB   0.0000000000 MiB       mean = total / count
    22   32.0234375000 MiB   0.0000000000 MiB       return mean
One other thing we could do, though it won’t show up substantially with our little example here, is to write the numbers to a file and calculate the mean as they are read back in. This is in fact how we fixed the high memory consumption on the Axelrod project. If you’re interested in seeing a real example, you can see the PR that fixed it here: https://github.com/Axelrod-Python/Axelrod/pull/672. In essence we did similar things to the above: instead of calculating the mean in one go, we walk through the file and calculate the various results as we go.
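To give a flavour of that approach, here is a minimal sketch (the function and file names are mine, for illustration; this is not the code from that PR). The numbers are written to disk once, and the mean is then computed while reading them back one line at a time, so the full collection never sits in memory:

def write_numbers(filename='nbrs.txt', N=5):
    # Write one number per line without building a list.
    with open(filename, 'w') as f:
        for n in range(0, 10 ** N):
            f.write('{}\n'.format(n))

def mean_from_file(filename='nbrs.txt'):
    # Iterating over a file object reads one line at a time.
    total = 0
    count = 0
    with open(filename) as f:
        for line in f:
            total += int(line)
            count += 1
    return total / count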
TLDR
Stay out of memory: use generators (and/or use files).
Of course, it’s important not to over-optimise on just one dimension. If your memory consumption is not an actual constraint, then perhaps don’t worry about it.