Tutorial
Tutorial#
To demonstrate the use case of Numpy we will use it to explore the mathematics of a Galton Board.
Problem
A Galton Board can be represented using a Markov Chain as shown in this diagram:
Write a function to generate all states of the Markov chain with \(N\) rows:
\[\{(i, j) \in \mathbb{Z} ^ 2 |\;0\leq i \leq N, 0<\leq j \leq i\}\]For a given value of \(p\) which corresponds to the probability of falling left, write a function to generate the transition rate matrix:
\[\begin{split} P_{s^{(1)}, s^{(2)}} = \begin{cases} p &; \text{ if }s^{(2)} - s^{(1)} = (1, 0)\text{ and }{s_1^{(2)}} < N \\ 1 - p &; \text{ if }s^{(2)} - s^{(1)} = (1, 1)\text{ and }{s_1^{(2)}} < N \\ 1 &; \text{ if }s^{(2)} = N\\ 0 &; \text{otherwise} \\ \end{cases} \end{split}\]Find the expected distribution of beans falling through the Galton board given by \(\pi P ^N\) where \(\pi = (1, 0, \dots 0)\) for \(N=3\) and for \(p\) varying between 0 and 1.
Experiment with high values of \(N\) to see how numpy performs on large matrices..
The first function is obtained using similar tools to the chapter on
combinatorics. Note however that we will be using numpy
arrays instead of
tuples to represent the state space.
import numpy as np
def get_all_states(N):
"""
Obtain all states for a Galton board with N rows.
"""
return [np.array((i, j)) for i in range(N + 1) for j in range(i + 1)]
get_all_states(N=3)
[array([0, 0]),
array([1, 0]),
array([1, 1]),
array([2, 0]),
array([2, 1]),
array([2, 2]),
array([3, 0]),
array([3, 1]),
array([3, 2]),
array([3, 3])]
Now we will implement a second function that takes two states and the parameters of the problem to find the transition between two states:
def get_transition_probability(in_state, out_state, N, p):
"""
Given two states, an in_state and an out_state this returns the probability
of going from the in_state to the out_state.
"""
if in_state[0] == N and np.array_equal(in_state, out_state):
return 1
if np.array_equal(out_state - in_state, np.array((1, 0))):
return p
if np.array_equal(out_state - in_state, np.array((1, 1))):
return 1 - p
return 0
in_state = np.array((2, 1))
out_state = np.array((3, 2))
get_transition_probability(in_state=in_state, out_state=out_state, N=5, p=0.3)
0.7
Attention
We compare numpy arrays using numpy.array_equal
.
We can then create the matrix \(P\) by iterating over the state space and finding the probability of transition:
def get_transition_matrix(N, p):
"""
Create the transition matrix for a Galton board with N rows and probability
of falling to the left of p.
This is done by creating an empty matrix of the required size and modifying
the values in place.
"""
states = get_all_states(N)
number_of_states = len(states)
P = np.zeros((number_of_states, number_of_states))
for row, in_state in enumerate(states):
for col, out_state in enumerate(states):
P[row, col] = get_transition_probability(in_state, out_state, N, p)
return P
get_transition_matrix(N=3, p=0.3)
array([[0. , 0.3, 0.7, 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0.3, 0.7, 0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0.3, 0.7, 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0.3, 0.7, 0. , 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.3, 0.7, 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.3, 0.7],
[0. , 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. , 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. ]])
Attention
In the above function we are using the base tool enumerate
which is a powerful
Python tool that will iterate over elements in an iterable as well as return the
index of the items.
Now to find the expected distribution.
def get_distribution(N, p):
"""
This returns the distribution of beans over the entire state space after the
all fall.
This is done by computing (pi P ^ N) where pi = (1, 0, \dots 0).
"""
P = get_transition_matrix(N=N, p=p)
number_of_rows = len(P)
pi = np.zeros(number_of_rows)
pi[0] = 1
return pi @ (np.linalg.matrix_power(P, N))
get_distribution(N=3, p=0.5)
array([0. , 0. , 0. , 0. , 0. , 0. , 0.125, 0.375, 0.375,
0.125])
Attention
We calculate the matrix exponentiation using
np.linalg.matrix_power
.
We can finally see how this distribution changes for different values of \(p\):
for p in np.linspace(0, 1, 11):
print(p, get_distribution(N=3, p=p)[-4:])
0.0 [0. 0. 0. 1.]
0.1 [0.001 0.027 0.243 0.729]
0.2 [0.008 0.096 0.384 0.512]
0.30000000000000004 [0.027 0.189 0.441 0.343]
0.4 [0.064 0.288 0.432 0.216]
0.5 [0.125 0.375 0.375 0.125]
0.6000000000000001 [0.216 0.432 0.288 0.064]
0.7000000000000001 [0.343 0.441 0.189 0.027]
0.8 [0.512 0.384 0.096 0.008]
0.9 [0.729 0.243 0.027 0.001]
1.0 [1. 0. 0. 0.]
We see that as \(p\) increases the beans end up more to the left.
We can see that the function runs in a reasonable time for \(N=20\) which corresponds to a matrix of size \(\frac{(N+1)(N+2)}{2}=231\).
%timeit get_distribution(N=20, p=0.5)
453 ms ± 4.52 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Important
In this chapter we have:
Written a function to create a
numpy
array.Used
numpy
to efficiently carry out matrix multiplication.