Exploring Perturbation Theory for Markov Chains

2026-04-09

This post illustrates the Markov chain perturbation framework of Schweitzer [1] and Meyer [2] on a pair of 4-state Markov chains. It is mainly a way of writing down my understanding of the theory so that Harry and I can use it in one of our research projects.

References:

[1] P. J. Schweitzer, "Perturbation theory and finite Markov chains," J. Appl. Probab. 5 (1968), 401--413.

[2] C. D. Meyer, "The role of the group generalized inverse in the theory of finite Markov chains," SIAM Rev. 17 (1975), 443--464.

First some Python libraries:

import numpy as np
import matplotlib.pyplot as plt
np.set_printoptions(precision=6, suppress=True)

Two ergodic Markov chains

A base chain \(T\) on \(S = \{0, 1, 2, 3\}\) and a perturbed chain \(\tilde{T} = T + E\). The perturbation \(E\) has zero row sums, so \(\tilde{T}\) remains stochastic provided its entries stay nonnegative; \(E\) encodes exactly how \(\tilde{T}\) differs from \(T\).

T = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.3, 0.1, 0.1, 0.5],
])

print("Row sums of T:", T.sum(axis=1))
print()
print("T =")
print(T)
Row sums of T: [1. 1. 1. 1.]

T =
[[0.5 0.3 0.1 0.1]
 [0.2 0.4 0.3 0.1]
 [0.1 0.2 0.5 0.2]
 [0.3 0.1 0.1 0.5]]
E = np.array([
    [ 0.00,  0.05, -0.03, -0.02],
    [-0.05,  0.08, -0.02, -0.01],
    [ 0.03, -0.06,  0.05, -0.02],
    [-0.02,  0.03, -0.04,  0.03],
])

print("Row sums of E:", E.sum(axis=1))

T_tilde = T + E
print("All entries of T~ >= 0:", np.all(T_tilde >= 0))
print()
print("T~ =")
print(T_tilde)
Row sums of E: [ 0. -0.  0. -0.]
All entries of T~ >= 0: True

T~ =
[[0.5  0.35 0.07 0.08]
 [0.15 0.48 0.28 0.09]
 [0.13 0.14 0.55 0.18]
 [0.28 0.13 0.06 0.53]]
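Both matrices have strictly positive entries, which makes each chain irreducible and aperiodic, hence ergodic with a unique stationary distribution. A quick standalone check (T and E are repeated so the snippet runs on its own):

```python
import numpy as np

# T and E redefined here so the snippet runs standalone.
T = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.3, 0.1, 0.1, 0.5],
])
E = np.array([
    [ 0.00,  0.05, -0.03, -0.02],
    [-0.05,  0.08, -0.02, -0.01],
    [ 0.03, -0.06,  0.05, -0.02],
    [-0.02,  0.03, -0.04,  0.03],
])
T_tilde = T + E

# A strictly positive stochastic matrix is irreducible and aperiodic.
print("T  > 0 everywhere:", np.all(T > 0))        # True
print("T~ > 0 everywhere:", np.all(T_tilde > 0))  # True
```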

The stationary distribution \(\pi\) satisfies \(\pi T = \pi\), \(\pi \mathbf{1} = 1\). It is the normalised left eigenvector of \(T\) for eigenvalue \(1\).

def stationary_distribution(M):
    eigenvalues, eigenvectors = np.linalg.eig(M.T)
    idx = np.argmin(np.abs(eigenvalues - 1.0))
    pi = eigenvectors[:, idx].real
    return pi / pi.sum()

pi = stationary_distribution(T)
pi_tilde = stationary_distribution(T_tilde)

print("pi     =", pi)
print("pi~    =", pi_tilde)
print("pi~ - pi =", pi_tilde - pi)
print()
print("Check: pi @ T  =", pi @ T)
print("Check: pi~ @ T~ =", pi_tilde @ T_tilde)
pi     = [0.279412 0.258824 0.252941 0.208824]
pi~    = [0.262317 0.292615 0.249017 0.196051]
pi~ - pi = [-0.017094  0.033792 -0.003924 -0.012773]

Check: pi @ T  = [0.279412 0.258824 0.252941 0.208824]
Check: pi~ @ T~ = [0.262317 0.292615 0.249017 0.196051]
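As a cross-check on the eigendecomposition route, \(\pi\) can also be obtained from a plain linear solve: \(I - T\) has rank 3, so replacing one of the redundant equations in \(\pi(I - T) = 0\) with the normalisation \(\pi\mathbf{1} = 1\) gives a nonsingular square system. A sketch (T repeated so it runs standalone):

```python
import numpy as np

T = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.3, 0.1, 0.1, 0.5],
])

# pi (I - T) = 0 transposes to (I - T).T pi = 0; these four equations have
# rank 3, so overwrite one of them with sum(pi) = 1.
M = (np.eye(4) - T).T
M[-1, :] = 1.0
b = np.array([0.0, 0.0, 0.0, 1.0])
pi = np.linalg.solve(M, b)
print(pi)  # matches the eigenvector computation above
```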

Some important quantities

These three matrices go back to Meyer [2].

Ergodic projector. \(\Pi = \mathbf{1}\pi\) has every row equal to \(\pi\). It is idempotent (\(\Pi^2 = \Pi\)) and is the \(t \to \infty\) limit of \(T^t\).

Fundamental matrix. \(I - T\) is singular (\(\pi(I-T) = 0\)). Adding \(\Pi\) replaces the zero eigenvalue with \(1\), giving the invertible matrix \(Z = (I - T + \Pi)^{-1}\).

Group inverse. \(A^\# = Z - \Pi\) satisfies \(A^\#\mathbf{1} = 0\), \(\pi A^\# = 0\), and the resolvent identity \((I-T)A^\# = I - \Pi\).

I = np.eye(4)
ones = np.ones((4, 1))
Pi = ones @ pi.reshape(1, -1)

print("Pi =")
print(Pi)
print()
print("Idempotent: Pi^2 = Pi?", np.allclose(Pi @ Pi, Pi))
Pi =
[[0.279412 0.258824 0.252941 0.208824]
 [0.279412 0.258824 0.252941 0.208824]
 [0.279412 0.258824 0.252941 0.208824]
 [0.279412 0.258824 0.252941 0.208824]]

Idempotent: Pi^2 = Pi? True
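The limit claim \(T^t \to \Pi\) can also be checked directly with a large matrix power (a standalone numerical check; T repeated):

```python
import numpy as np

T = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.3, 0.1, 0.1, 0.5],
])

# For an ergodic chain, T^t converges to the rank-one matrix 1 pi = Pi.
T50 = np.linalg.matrix_power(T, 50)
print(T50[0])                                       # each row is (numerically) pi
print("All rows equal:", np.allclose(T50, T50[0]))  # True
print("Fixed point:   ", np.allclose(T50 @ T, T50)) # True
```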
print("det(I - T) =", np.linalg.det(I - T), " (singular)")
print("rank(I - T) =", np.linalg.matrix_rank(I - T))
print()

Z = np.linalg.inv(I - T + Pi)
print("Z =")
print(Z)
det(I - T) = 6.605826996519675e-19  (singular)
rank(I - T) = 3

Z =
[[ 1.314879  0.117993 -0.215571 -0.217301]
 [-0.155709  1.176817  0.13737  -0.158478]
 [-0.302768 -0.117301  1.372664  0.047405]
 [ 0.138408 -0.234948 -0.333218  1.429758]]
A_sharp = Z - Pi

print("A# =")
print(A_sharp)
print()
print("A# @ 1 =", A_sharp @ np.ones(4))
print("pi @ A# =", pi @ A_sharp)
print()
print("Resolvent identity (I-T)A# = I - Pi?",
      np.allclose((I - T) @ A_sharp, I - Pi))
A# =
[[ 1.035467 -0.14083  -0.468512 -0.426125]
 [-0.435121  0.917993 -0.115571 -0.367301]
 [-0.58218  -0.376125  1.119723 -0.161419]
 [-0.141003 -0.493772 -0.586159  1.220934]]

A# @ 1 = [-0.  0.  0.  0.]
pi @ A# = [ 0. -0.  0. -0.]

Resolvent identity (I-T)A# = I - Pi? True
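The name comes from \(A^\#\) being the group (index-one Drazin) inverse of \(A = I - T\): it satisfies \(AA^\#A = A\), \(A^\#AA^\# = A^\#\), and \(AA^\# = A^\#A\). A standalone check of these three defining properties, recomputing the quantities above:

```python
import numpy as np

T = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.3, 0.1, 0.1, 0.5],
])

# Recompute pi, Pi and A# as in the text.
w, V = np.linalg.eig(T.T)
pi = V[:, np.argmin(np.abs(w - 1.0))].real
pi = pi / pi.sum()
Pi = np.ones((4, 1)) @ pi.reshape(1, -1)
A = np.eye(4) - T
A_sharp = np.linalg.inv(A + Pi) - Pi

print("A A# A  = A  ?", np.allclose(A @ A_sharp @ A, A))
print("A# A A# = A# ?", np.allclose(A_sharp @ A @ A_sharp, A_sharp))
print("A A#    = A# A?", np.allclose(A @ A_sharp, A_sharp @ A))
```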

The exact perturbation identity (Schweitzer [1]; Meyer [2])

For ergodic \(T\) and \(\tilde{T} = T + E\) with stationary distributions \(\pi\) and \(\tilde{\pi}\):

\[\tilde{\pi} - \pi = \tilde{\pi} \, E \, A^\#\]

This is exact — no remainder term. The proof (three lines):

  1. \(\tilde{\pi}(T+E) = \tilde{\pi}\) gives \(\tilde{\pi}(I-T) = \tilde{\pi}E\)
  2. Multiply on the right by \(A^\#\) and use the resolvent identity: \(\tilde{\pi}(I - \Pi) = \tilde{\pi}EA^\#\)
  3. \(\tilde{\pi}\Pi = \pi\) since \(\tilde{\pi}\mathbf{1} = 1\)

This leads to a first-order approximation \(\tilde{\pi} - \pi \approx \pi E A^\#\) (replacing \(\tilde{\pi}\) with \(\pi\)).

LHS = pi_tilde - pi
RHS = pi_tilde @ E @ A_sharp

print("pi~ - pi  =", LHS)
print("pi~ E A#  =", RHS)
print("Match:", np.allclose(LHS, RHS))
pi~ - pi  = [-0.017094  0.033792 -0.003924 -0.012773]
pi~ E A#  = [-0.017094  0.033792 -0.003924 -0.012773]
Match: True
RHS_approx = pi @ E @ A_sharp

print("Exact (pi~ - pi):", LHS)
print("Approx (pi E A#):", RHS_approx)
print("Error:           ", LHS - RHS_approx)
print()
print(f"||error|| / ||exact|| = {np.abs(LHS - RHS_approx).sum() / np.abs(LHS).sum():.4f}")
Exact (pi~ - pi): [-0.017094  0.033792 -0.003924 -0.012773]
Approx (pi E A#): [-0.014702  0.03192  -0.004801 -0.012418]
Error:            [-0.002393  0.001871  0.000877 -0.000355]

||error|| / ||exact|| = 0.0813
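The ~8% relative error above is exactly the neglected term \((\tilde{\pi} - \pi)EA^\#\), which is second order in \(E\). Shrinking the perturbation should therefore shrink the first-order error roughly quadratically; a standalone sketch:

```python
import numpy as np

T = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.4, 0.3, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.3, 0.1, 0.1, 0.5],
])
E = np.array([
    [ 0.00,  0.05, -0.03, -0.02],
    [-0.05,  0.08, -0.02, -0.01],
    [ 0.03, -0.06,  0.05, -0.02],
    [-0.02,  0.03, -0.04,  0.03],
])

def stationary(M):
    w, V = np.linalg.eig(M.T)
    p = V[:, np.argmin(np.abs(w - 1.0))].real
    return p / p.sum()

pi = stationary(T)
Pi = np.ones((4, 1)) @ pi.reshape(1, -1)
A_sharp = np.linalg.inv(np.eye(4) - T + Pi) - Pi

# Scale E down and watch the error of the first-order approximation fall
# much faster than the perturbation itself (quadratically, to leading order).
errs = []
for t in [1.0, 0.5, 0.25]:
    pi_t = stationary(T + t * E)
    err = np.abs((pi_t - pi) - pi @ (t * E) @ A_sharp).sum()
    errs.append(err)
    print(f"t = {t:4}: first-order error = {err:.2e}")
```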

Some code to confirm the steps of the proof:

print("Step 1: pi~(I-T) = pi~E?",
      np.allclose(pi_tilde @ (I - T), pi_tilde @ E))
print("Step 2: pi~(I-Pi) = pi~EA#?",
      np.allclose(pi_tilde @ (I - Pi), pi_tilde @ E @ A_sharp))
print("Step 3: pi~Pi = pi?",
      np.allclose(pi_tilde @ Pi, pi))
Step 1: pi~(I-T) = pi~E? True
Step 2: pi~(I-Pi) = pi~EA#? True
Step 3: pi~Pi = pi? True

Getting a relationship for an observable quantity

Given an observable \(s\) (a vector of per-state values), we want to find \(\tilde{\pi} \cdot s\). Right-multiplying the exact identity by \(s\) gives:

\[ \tilde{\pi} \cdot s = \pi \cdot s + \tilde{\pi} \, E \, A^\# s. \]

The first-order approximation gives:

\[ \tilde{\pi} \cdot s \;\approx\; \pi \cdot s + \pi \,E\, A^\# s. \]
s = np.array([0.0, 0.3, 0.7, 1.0])

value = pi_tilde @ s
actual_change = (pi_tilde - pi) @ s
approximate_value = pi @ s + pi @ E @ A_sharp @ s
approximate_change = approximate_value - pi @ s

print(f"Exact value of observable:        {value:.6f}")
print(f"Exact change of observable:       {actual_change:.6f}")
print(f"Approximate value of observable:  {approximate_value:.6f}")
print(f"Approximate change of observable: {approximate_change:.6f}")
Exact value of observable:        0.458147
Exact change of observable:       -0.005382
Approximate value of observable:  0.457327
Approximate change of observable: -0.006202

That expression for the perturbed observable might be just what Harry and I need for our current project.