Un peu de math

An overview of the awesome `zip` built-in in Python.

Iterating over many lists at the same time in Python with `zip`

2018-11-23

When programming, it is often necessary to iterate over two (or more) lists at the same time. This post describes `zip`: a fantastic tool for doing this in a clear and concise way.

As a running example, let us assume we would like to check if 3 numbers $a, b, c$ form a Pythagorean triple, in other words:

$$ a ^ 2 + b ^ 2 = c ^ 2 $$
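As a quick sanity check of this condition, it can be wrapped in a small helper. The function name `is_pythagorean_triple` is an assumption made for this sketch only; it is not used in the rest of the post.

```python
def is_pythagorean_triple(a, b, c):
    """Return True if a^2 + b^2 == c^2 (hypothetical helper, for illustration)."""
    return a ** 2 + b ** 2 == c ** 2


print(is_pythagorean_triple(3, 4, 5))  # True: 9 + 16 == 25
print(is_pythagorean_triple(3, 4, 6))  # False: 9 + 16 != 36
```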

For the purpose of this example let us start by randomly creating 3 lists of numbers:

In [1]:
import random

random.seed(0)

upper_value = 50
number_of_values = 10 ** 5
list_of_a_values = [random.randint(1, upper_value + 1) for _ in range(number_of_values)]
list_of_b_values = [random.randint(1, upper_value + 1) for _ in range(number_of_values)]
list_of_c_values = [random.randint(1, upper_value + 1) for _ in range(number_of_values)]

Here are our first 5 values of $a$:

In [2]:
list_of_a_values[:5]
Out[2]:
[25, 49, 27, 3, 17]
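One detail worth knowing about the cell that builds these lists: `random.randint(a, b)` includes both endpoints, so the values drawn lie between 1 and `upper_value + 1` (that is, 51). This is why 51 appears in some of the triples found later in this post. A minimal check, using a separate generator so the global random state is left alone (the names `rng` and `sample` are only for this sketch):

```python
import random

upper_value = 50

# A private generator, so the module-level random state is not disturbed.
rng = random.Random(0)

# randint(a, b) returns an integer N with a <= N <= b (both ends included).
sample = [rng.randint(1, upper_value + 1) for _ in range(10 ** 5)]

print(min(sample) >= 1, max(sample) <= upper_value + 1)  # True True
```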

A common approach

One common approach is to create an index variable that keeps track of a position running through all 3 lists:

In [3]:
pythagorian_triplets = []
for i in range(number_of_values):
    if list_of_a_values[i] ** 2 + list_of_b_values[i] ** 2 == list_of_c_values[i] ** 2:
        pythagorian_triplets.append(
            (
                list_of_a_values[i], 
                list_of_b_values[i], 
                list_of_c_values[i],
            )
        )
In [4]:
pythagorian_triplets
Out[4]:
[(12, 35, 37),
 (20, 21, 29),
 (45, 24, 51),
 (36, 27, 45),
 (36, 15, 39),
 (9, 12, 15),
 (3, 4, 5),
 (18, 24, 30),
 (10, 24, 26),
 (24, 18, 30),
 (10, 24, 26),
 (35, 12, 37),
 (4, 3, 5),
 (30, 16, 34),
 (4, 3, 5),
 (48, 14, 50),
 (30, 40, 50),
 (8, 6, 10),
 (30, 16, 34),
 (28, 21, 35),
 (45, 24, 51),
 (24, 45, 51),
 (18, 24, 30),
 (18, 24, 30),
 (24, 18, 30),
 (9, 12, 15),
 (35, 12, 37),
 (9, 12, 15)]

Zipping our lists together

Another approach is to use Python's built-in `zip` function, which essentially "zips" the three lists together:

In [5]:
pythagorian_triplets = []
for item in zip(list_of_a_values, list_of_b_values, list_of_c_values):
    if item[0] ** 2 + item[1] ** 2 == item[2] ** 2:
        pythagorian_triplets.append(item)
pythagorian_triplets
Out[5]:
[(12, 35, 37),
 (20, 21, 29),
 (45, 24, 51),
 (36, 27, 45),
 (36, 15, 39),
 (9, 12, 15),
 (3, 4, 5),
 (18, 24, 30),
 (10, 24, 26),
 (24, 18, 30),
 (10, 24, 26),
 (35, 12, 37),
 (4, 3, 5),
 (30, 16, 34),
 (4, 3, 5),
 (48, 14, 50),
 (30, 40, 50),
 (8, 6, 10),
 (30, 16, 34),
 (28, 21, 35),
 (45, 24, 51),
 (24, 45, 51),
 (18, 24, 30),
 (18, 24, 30),
 (24, 18, 30),
 (9, 12, 15),
 (35, 12, 37),
 (9, 12, 15)]

What `zip` is doing here is taking several iterables as inputs, in this case `list_of_a_values`, `list_of_b_values` and `list_of_c_values` (it can take any number of inputs), and returning an iterator. Each step of the iteration yields a tuple holding one element from each of the inputs.
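To see this on a small scale, here is a sketch with two short lists (the names `letters` and `numbers` are made up for illustration). It also shows two behaviours worth knowing: `zip` stops as soon as the shortest input runs out, and the object it returns can only be iterated over once.

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3, 4]  # one element longer than letters

zipped = zip(letters, numbers)

# zip is lazy: nothing is paired up until we iterate.
first_pass = list(zipped)
print(first_pass)  # [('a', 1), ('b', 2), ('c', 3)]

# The iterator is now exhausted, so a second pass yields nothing.
second_pass = list(zipped)
print(second_pass)  # []
```

Note that the trailing `4` is silently dropped, so it is worth checking that the inputs have the same length when that matters.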

We can use Python's tuple unpacking and a list comprehension to write this in a more compact way:

In [6]:
pythagorian_triplets = [(a, b, c) for a, b, c in zip(list_of_a_values, list_of_b_values, list_of_c_values)
                        if a ** 2 + b ** 2 == c ** 2]
pythagorian_triplets
Out[6]:
[(12, 35, 37),
 (20, 21, 29),
 (45, 24, 51),
 (36, 27, 45),
 (36, 15, 39),
 (9, 12, 15),
 (3, 4, 5),
 (18, 24, 30),
 (10, 24, 26),
 (24, 18, 30),
 (10, 24, 26),
 (35, 12, 37),
 (4, 3, 5),
 (30, 16, 34),
 (4, 3, 5),
 (48, 14, 50),
 (30, 40, 50),
 (8, 6, 10),
 (30, 16, 34),
 (28, 21, 35),
 (45, 24, 51),
 (24, 45, 51),
 (18, 24, 30),
 (18, 24, 30),
 (24, 18, 30),
 (9, 12, 15),
 (35, 12, 37),
 (9, 12, 15)]
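The unpacking used in that comprehension is not specific to `zip`: any iterable of fixed-length tuples can be unpacked in the loop header. A minimal sketch on a hand-written list (the name `triples` and its contents are assumptions chosen for illustration):

```python
triples = [(3, 4, 5), (5, 12, 13), (1, 2, 3)]

# Indexing into each tuple ...
by_index = [
    item for item in triples
    if item[0] ** 2 + item[1] ** 2 == item[2] ** 2
]

# ... versus unpacking each tuple into three names in the loop header.
by_unpacking = [
    (a, b, c) for a, b, c in triples
    if a ** 2 + b ** 2 == c ** 2
]

print(by_index == by_unpacking)  # True
print(by_unpacking)  # [(3, 4, 5), (5, 12, 13)]
```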

Unpacking and zipping

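Using numpy we can in fact make this slightly more efficient, as we can generate a single large 3 by N array of random integers in one call. Two things to note: numpy's random number generator does not follow the same seeded path as Python's standard library, and `np.random.randint` excludes its upper bound, so the values below range from 1 to `upper_value - 1`: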

In [7]:
import numpy as np
In [8]:
np.random.seed(0)
random_integers = np.random.randint(1, upper_value, (3, number_of_values))
random_integers
Out[8]:
array([[45, 48,  1, ..., 33, 20, 20],
       [11, 40, 11, ..., 20, 21,  9],
       [31, 21,  4, ...,  2, 49, 30]])

We can now use Python's unpacking with the `*` operator to pass each row of our array directly to `zip` without having to name them:

In [9]:
pythagorian_triplets = [(a, b, c) for a, b, c in zip(*random_integers) 
                        if a ** 2 + b ** 2 == c ** 2]
pythagorian_triplets
Out[9]:
[(8, 15, 17),
 (24, 32, 40),
 (36, 27, 45),
 (6, 8, 10),
 (12, 16, 20),
 (21, 28, 35),
 (20, 15, 25),
 (6, 8, 10),
 (32, 24, 40),
 (7, 24, 25),
 (24, 10, 26),
 (35, 12, 37),
 (9, 12, 15),
 (3, 4, 5),
 (27, 36, 45),
 (40, 9, 41),
 (4, 3, 5),
 (9, 40, 41),
 (20, 15, 25),
 (15, 36, 39),
 (24, 32, 40),
 (15, 20, 25),
 (24, 10, 26),
 (15, 8, 17),
 (12, 9, 15),
 (24, 32, 40),
 (35, 12, 37),
 (32, 24, 40),
 (10, 24, 26),
 (9, 40, 41),
 (12, 9, 15),
 (24, 10, 26),
 (36, 27, 45),
 (24, 7, 25)]
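The same `*` trick works on any sequence of sequences, because iterating over a 2D array (or a list of lists) yields its rows one at a time. A small sketch with plain lists, so that it runs without numpy (the name `rows` and its values are made up for illustration):

```python
rows = [
    [3, 5, 1],   # values of a
    [4, 12, 2],  # values of b
    [5, 13, 3],  # values of c
]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2]).
with_star = list(zip(*rows))
written_out = list(zip(rows[0], rows[1], rows[2]))

print(with_star)  # [(3, 4, 5), (5, 12, 13), (1, 2, 3)]
print(with_star == written_out)  # True
```

In effect `zip(*rows)` transposes the data: three rows of N values become N tuples of three values.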

`zip` is one of those super handy Python built-ins that, once you've gotten the hang of it, you never want to be without. `enumerate` is another such tool.
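As a brief illustration of how the two combine, here is a sketch that records the positions at which a triple is found rather than the triple itself. The short lists and the name `positions` are assumptions made for this example.

```python
a_values = [3, 5, 1]
b_values = [4, 12, 2]
c_values = [5, 13, 3]

# enumerate yields (position, item) pairs; here each item is a zipped tuple,
# which is unpacked into a, b, c by the parentheses in the loop header.
positions = [
    i
    for i, (a, b, c) in enumerate(zip(a_values, b_values, c_values))
    if a ** 2 + b ** 2 == c ** 2
]

print(positions)  # [0, 1]
```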

main.ipynb

A blog about programming (usually scientific python), mathematics (usually game theory) and learning (usually student centred pedagogic approaches).

Source code: drvinceknight
Twitter: @drvinceknight