A delivery company delivers fragile items. If a delivery is on time it is usually because it was rushed. The probability that an item is delivered on time is $0.75$. The probability that an item is broken is $0.3$ given that it arrived on time and $0.2$ given that it arrived late.

What is the probability that an item is late?
Given that an item is broken, what is the probability that it was on time?

We will start by writing some code to simulate a single delivery.

In [1]:
import random

In [2]:
random.seed(0)
random.random()

0.8444218515250481

In [3]:
def is_delivery_late():
    """
    Returns a boolean indicating if a given delivery is late or not.
    """
    return random.random() > .75

In [5]:
is_delivery_late??

Signature: is_delivery_late()
Source:   
def is_delivery_late():
    """
    Returns a boolean indicating if a given delivery is late or not.
    """
    return random.random() > .75
File:      /var/folders/2p/gzkw2x1n7vb7g7p_c4d43vbw0000gp/T/ipykernel_20257/2667376045.py
Type:      function

Now that we have the ability to simulate a delivery, let us create a large number of deliveries:

In [14]:
number_of_repetitions = 10_000
samples = [is_delivery_late() for repetition in range(number_of_repetitions)]

In [15]:
len(samples)

10000

In [16]:
type(samples)

list

We can now estimate the probability of a delivery being late:

In [17]:
sum(samples) / number_of_repetitions

0.253

This can be confirmed theoretically. If the probability of being on time is $.75$ then the probability of being late is $1 - .75=.25$.
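To judge whether a simulated value such as $0.253$ is reasonably close to the exact one, it can help to look at the standard error of the estimate. The following is a minimal sketch, not part of the original notebook: it hard-codes the values quoted in the text and assumes the samples are independent, so that the usual formula $\sqrt{p(1-p)/n}$ applies. The variable names are illustrative.

```python
import math

# Values quoted in the text: P(on time) = 0.75, 10,000 repetitions,
# and a simulated estimate of 0.253 for the probability of being late.
probability_on_time = 0.75
number_of_repetitions = 10_000
simulated_estimate = 0.253

# Complement rule: P(late) = 1 - P(on time)
probability_late = 1 - probability_on_time

# Standard error of a sample proportion, assuming independent samples
standard_error = math.sqrt(
    probability_late * (1 - probability_late) / number_of_repetitions
)

# Number of standard errors between the estimate and the exact value
z_score = (simulated_estimate - probability_late) / standard_error
print(probability_late, standard_error, z_score)
```

The standard error here is about $0.0043$, so the simulated estimate lies within one standard error of the exact value.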

Now to write some code to simulate the entire experiment: not just if it was late but also if it was broken.

In [18]:
def sample_experiment():
    """
    This samples a delivery and depending on whether or not it is late
    selects whether or not the item is broken.
    """
    is_late = is_delivery_late()

    if is_late:
        probability_of_broken = .2
    else:
        probability_of_broken = .3

    is_broken = random.random() < probability_of_broken
    return is_late, is_broken

In [22]:
sample_experiment()

(True, False)

Let us now repeat the experiments:

In [23]:
samples = [sample_experiment() for repetition in range(number_of_repetitions)]

In [43]:
len(samples)

10000

Let us now select the deliveries that were broken from all our deliveries:

In [44]:
deliveries_that_were_broken = [
    (is_late, is_broken) 
    for (is_late, is_broken) in samples
    if is_broken
]

In [45]:
len(deliveries_that_were_broken)

2746
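As a sanity check on this count, the exact probability of a broken item follows from the law of total probability. The sketch below is an addition, not part of the original notebook; it uses only the probabilities stated in the problem and the count of 2746 reported above, with illustrative variable names.

```python
# Probabilities stated in the problem
probability_on_time = 0.75
probability_broken_given_on_time = 0.3
probability_broken_given_late = 0.2

probability_late = 1 - probability_on_time

# Law of total probability:
# P(broken) = P(broken | on time) P(on time) + P(broken | late) P(late)
probability_broken = (
    probability_broken_given_on_time * probability_on_time
    + probability_broken_given_late * probability_late
)

# Fraction of broken deliveries observed in the simulation above
simulated_fraction_broken = 2746 / 10_000

print(probability_broken, simulated_fraction_broken)
```

The exact value is $0.275$, which the simulated fraction of $0.2746$ matches closely.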

Out of those broken deliveries, we count how many were late:

In [48]:
number_of_broken_deliveries_that_were_late = sum(
    is_late 
    for (is_late, is_broken) in deliveries_that_were_broken
)
number_of_broken_deliveries_that_were_late

513

Thus the probability of an item being late **given** that it is broken is:

In [49]:
number_of_broken_deliveries_that_were_late / len(deliveries_that_were_broken)

0.18681718863801894
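The original question asks for the probability of being on time given that the item is broken, which is the complement of the value above. It can also be estimated directly. The sketch below is a self-contained variant, not the notebook's own code: the function name `estimate_on_time_given_broken` is hypothetical, and it uses a local `random.Random` instance so that the result does not depend on the state of the global generator.

```python
import random


def estimate_on_time_given_broken(number_of_repetitions=10_000, seed=0):
    """
    Estimates P(on time | broken) by simulating deliveries and
    keeping only the ones that arrive broken.
    """
    rng = random.Random(seed)  # local generator, independent of global state
    number_broken = 0
    number_broken_and_on_time = 0

    for _ in range(number_of_repetitions):
        is_late = rng.random() > 0.75
        probability_of_broken = 0.2 if is_late else 0.3
        is_broken = rng.random() < probability_of_broken

        if is_broken:
            number_broken += 1
            if not is_late:
                number_broken_and_on_time += 1

    return number_broken_and_on_time / number_broken


estimate = estimate_on_time_given_broken()
print(estimate)
```

Because the estimate is conditioned on roughly a quarter of the samples, it is noisier than the estimate of the probability of being late; increasing `number_of_repetitions` reduces that noise.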

We can confirm the simulated value using Bayes' theorem:

$$
P(\text{On time} |\text{Broken})P(\text{Broken})= P(\text{Broken}|\text{On time})P(\text{On time})
$$

We know:

$$
P(\text{Broken}|\text{On time})=.3
$$

$$
P(\text{On time})=.75
$$

and, by the law of total probability, can calculate:

$$
P(\text{Broken}) = .3\times .75 + .2\times.25
$$

In [50]:
probability_of_on_time_if_broken = .3 * .75 / (.3 * .75 + .2 * .25)
probability_of_on_time_if_broken

0.8181818181818182

Thus, the probability of being late if broken, which agrees with the simulated estimate of roughly $0.187$, is:

In [51]:
1 - probability_of_on_time_if_broken

0.18181818181818177
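The floating point results above hide the fact that the answers are simple fractions. As a sketch, the same calculation can be carried out exactly with the standard library's `fractions.Fraction`. The helper `posterior` and its parameter names are hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_given_complement):
    """
    Returns P(A | B) from P(A), P(B | A) and P(B | not A)
    using Bayes' theorem and the law of total probability.
    """
    evidence = likelihood * prior + likelihood_given_complement * (1 - prior)
    return likelihood * prior / evidence


# A = "on time", B = "broken", with the probabilities from the problem
probability_of_on_time_if_broken = posterior(
    prior=Fraction(3, 4),
    likelihood=Fraction(3, 10),
    likelihood_given_complement=Fraction(1, 5),
)
probability_of_late_if_broken = 1 - probability_of_on_time_if_broken

print(probability_of_on_time_if_broken, probability_of_late_if_broken)  # 9/11 2/11
```

So the exact answers are $9/11 \approx 0.818$ for on time given broken and $2/11 \approx 0.182$ for late given broken.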