Further information#

What are the differences between recursion and iteration?#

When giving instructions to a computer it is possible to use recursion to directly implement a common mathematical definition. For example consider the following sequence:

\[\begin{split} \left\{\begin{array}{l} a_1 = 1\\ a_{n + 1}= 3a_n, n > 1 \end{array}\right. \end{split}\]

We can define this in Python as follows:

def generate_sequence(n):
    """
    Generate the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using recursion.
    """
    if n == 1:
        return 1
    return 3 * generate_sequence(n - 1)

The first 6 terms:

[generate_sequence(n) for n in range(1, 7)]
[1, 3, 9, 27, 81, 243]

We note that in this case this corresponds to powers of \(3\), and indeed we can prove that: \(a_n = 3 ^ {n - 1}\). We will not carry out the proof here but one approach to doing it would be to use proof by induction which is closely related to recursive functions.

We can write a different python function that uses this formulae. This is called iteration:

def calculate_sequence(n):
    """
    Calculate the nth term of the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using iteration using the direct formula:

    a_n = 3 ^ n
    """
    return 3 ** (n - 1)
[calculate_sequence(n) for n in range(1, 7)]
[1, 3, 9, 27, 81, 243]

We can in fact use a Jupyter magic command to time the run time of a command. It is clear that recursion is slower.

%timeit [generate_sequence(n) for n in range(1, 25)]
28.9 µs ± 35.5 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit [calculate_sequence(n) for n in range(1, 25)]
7.97 µs ± 71.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In practice:

  • Using recursion is powerful as it can be used to directly implement recursive definitions.

  • Using iteration is more computationally efficient but it is not always straightforward to obtain an iterative formula.

What is caching#

One of the reasons that recursion is computationally inefficient is that it always has to recalculate previously calculated values.

For example:

\[\begin{split} \left\{\begin{array}{l} a_1 = 1\\ a_{n + 1}= 3a_n, n > 1 \end{array}\right. \end{split}\]

One way to overcome this is to use caching which means that when a function is called for a value it has already computed it remembers the value.

Python has a caching tool available in the functools library:

import functools


def generate_sequence(n):
    """
    Generate the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using recursion.
    """
    if n == 1:
        return 1
    return 3 * generate_sequence(n - 1)


@functools.lru_cache()
def cached_generate_sequence(n):
    """
    Generate the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using recursion but also includes a cache.
    """
    if n == 1:
        return 1
    return 3 * cached_generate_sequence(n - 1)

Timing both these approaches confirms a substantial increase in time for the cached version.

%timeit [generate_sequence(n) for n in range(1, 25)]
28.6 µs ± 113 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit [cached_generate_sequence(n) for n in range(1, 25)]
1.82 µs ± 19.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)