Further information#

What are the differences between recursion and iteration?#

When giving instructions to a computer it is possible to use recursion to directly implement a common mathematical definition. For example consider the following sequence:

\[\begin{split} \left\{\begin{array}{l} a_1 = 1\\ a_{n + 1}= 3a_n, n > 1 \end{array}\right. \end{split}\]

You can define this in Python as follows:

def generate_sequence(n):
    """
    Generate the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using recursion.
    """
    if n == 1:
        return 1
    return 3 * generate_sequence(n - 1)

The first 6 terms:

[generate_sequence(n) for n in range(1, 7)]
[1, 3, 9, 27, 81, 243]

In this case this corresponds to powers of \(3\), and indeed you can prove that: \(a_n = 3 ^ {n - 1}\). The proof is not given here but one approach to doing it would be to use induction which is closely related to recursive functions.

You can write a different python function that uses this formula. This is called iteration:

def calculate_sequence(n):
    """
    Calculate the nth term of the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using iteration using the direct formula:

    a_n = 3 ^ n
    """
    return 3 ** (n - 1)
[calculate_sequence(n) for n in range(1, 7)]
[1, 3, 9, 27, 81, 243]

You can in fact use a Jupyter magic command to time the run time of a command. It is clear that recursion is slower.

%timeit [generate_sequence(n) for n in range(1, 25)]
31.4 μs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit [calculate_sequence(n) for n in range(1, 25)]
8.34 μs ± 31.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In practice:

  • Using recursion is powerful as it can be used to directly implement recursive definitions.

  • Using iteration is more computationally efficient but it is not always straightforward to obtain an iterative formula.

What is caching?#

One of the reasons that recursion is computationally inefficient is that it always has to recalculate previously calculated values.

For example:

\[\begin{split} \left\{\begin{array}{l} a_1 = 1\\ a_{n + 1}= 3a_n, n > 1 \end{array}\right. \end{split}\]

One way to overcome this is to use caching which means that when a function is called for a value it has already computed it remembers the value. Python has a caching tool available in the functools library:

import functools


def generate_sequence(n):
    """
    Generate the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using recursion.
    """
    if n == 1:
        return 1
    return 3 * generate_sequence(n - 1)


@functools.lru_cache()
def cached_generate_sequence(n):
    """
    Generate the sequence defined by:

    a_1 = 1
    a_n = 3 a_{n - 1}

    This is done using recursion but also includes a cache.
    """
    if n == 1:
        return 1
    return 3 * cached_generate_sequence(n - 1)

Timing both these approaches confirms a substantial increase in time for the cached version.

%timeit [generate_sequence(n) for n in range(1, 25)]
30 μs ± 22.1 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit [cached_generate_sequence(n) for n in range(1, 25)]
1.91 μs ± 26.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)