Further information

What is the difference between an array and other Python iterables?

Other iterables seen in this book so far include Python lists and tuples. As described in What is the difference between a Python list and a Python tuple?, the difference between a list and a tuple is that a tuple is immutable: it cannot be changed in place, whereas a list can.

An array, both in other languages in general and in numpy in particular, is mutable, but one of the main differences is that all of the variables inside the array must be of the same type. This is not the case for lists or tuples. As an example, consider the following collections of two types of variables (a str and an int):

collection_as_tuple = ("dog", 3)
collection_as_tuple
('dog', 3)
collection_as_list = ["dog", 3]
collection_as_list
['dog', 3]
import numpy as np

collection_as_array = np.array(("dog", 3))
collection_as_array
array(['dog', '3'], dtype='<U21')

Numpy has in fact converted the integer 3 to the string '3' so that both elements of the array have the same type.
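The two properties discussed in this section (a single type for all elements, and mutability) can be checked directly. The following is a minimal sketch rather than a definitive recipe; the variable names `mixed`, `numbers` and `tuple_was_changed` are chosen here for illustration only.

```python
import numpy as np

mixed = np.array(("dog", 3))
numbers = np.array((1, 2, 3))

# Every element of an array shares the array's single dtype.
mixed_kind = mixed.dtype.kind  # "U" denotes a unicode string type
numbers_kind = numbers.dtype.kind  # "i" denotes a signed integer type

# Arrays are mutable: an element can be changed in place.
numbers[0] = 10

# Tuples are immutable: the same kind of assignment raises a TypeError.
collection_as_tuple = ("dog", 3)
try:
    collection_as_tuple[0] = "cat"
    tuple_was_changed = True
except TypeError:
    tuple_was_changed = False
```

After running this, `tuple_was_changed` is `False` and the tuple still holds its original values, whereas the first element of `numbers` has been replaced.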

Why is Numpy computationally efficient?

Python is essentially two things:

  • A language;

  • An interpreter that takes the language and carries out the instructions.

That interpreter is written in another programming language: specifically the programming language [C](https://en.wikipedia.org/wiki/C_(programming_language)). C is a compiled language, which means that there is an extra step to running C code: after it has been written, it has to be compiled to something that the computer can understand, which is then run.

Python, however, is not a compiled language: it is an interpreted language. This makes it faster to write and debug but slower to run.

Numpy has a number of vectorized functionalities that essentially let Python speak more directly to C, which is why it is fast.

One example of this is using the numpy.max function instead of the built-in max function.

import numpy as np

np.random.seed(0)
big_array = np.random.random(10 ** 7)
%timeit np.max(big_array)
3.46 ms ± 37 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
import numpy as np

np.random.seed(0)
big_array = np.random.random(10 ** 7)
%timeit max(big_array)
428 ms ± 3.58 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
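The `%timeit` command used above is only available in Jupyter and IPython. As a rough sketch, a similar comparison can be made in a plain Python script with the standard library `timeit` module. The smaller array size and the choice of 10 repetitions are assumptions made here to keep the run short, and the times measured will depend on the machine.

```python
import timeit

import numpy as np

np.random.seed(0)
small_array = np.random.random(10 ** 5)

# Both functions find the same maximum value.
assert np.max(small_array) == max(small_array)

# Time 10 calls of each; the absolute numbers depend on the machine.
numpy_time = timeit.timeit(lambda: np.max(small_array), number=10)
builtin_time = timeit.timeit(lambda: max(small_array), number=10)

print(f"np.max: {numpy_time:.4f} s, max: {builtin_time:.4f} s")
```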

Some further information on this includes:

Where can I find more information on Numpy?

numpy is a fundamental building block of the scientific Python ecosystem. There are a lot of resources available: