
Appendix: Karush-Kuhn-Tucker Conditions

Motivating Example: Optimising doughnut distribution at a research retreat

At the annual Mathematical Modelling Retreat, a catering mix-up leaves the organisers with a limited supply of 12 gourmet doughnuts and 8 litres of coffee.

To ensure everyone gets a share, the organisers plan to slice the doughnuts into small portions and pour the coffee into individual cups of varying sizes.

The participants are divided into two groups: a theoretical group and an applied group.

Let:

- x_1 be the number of doughnuts allocated to the theoretical group,
- x_2 be the number of litres of coffee allocated to the theoretical group.

The applied group receives the remaining quantities: 12 - x_1 doughnuts and 8 - x_2 litres of coffee.

The total enjoyment of the retreat is modelled as the sum of group satisfactions:

f(x_1, x_2) = \log(1 + x_1) + 2\sqrt{x_2} \log(1 + 12 - x_1) + 2\sqrt{8 - x_2}

subject to the feasibility constraints:

0 \leq x_1 \leq 12, \quad 0 \leq x_2 \leq 8

The problem is now a continuous convex optimisation problem: although doughnuts and coffee are discrete in reality, the organisers can approximate an optimal allocation by dividing portions finely, making fractional allocations meaningful.
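As a quick illustration (a minimal sketch; the function name total_enjoyment and the sample allocations are ours, purely for demonstration), the model can be evaluated at any fractional allocation:

import numpy as np

def total_enjoyment(x1, x2):
    # Total enjoyment when one group receives x1 doughnuts and x2 litres of coffee
    return np.log(1 + x1) + 2 * np.sqrt(x2) * np.log(1 + 12 - x1) + 2 * np.sqrt(8 - x2)

# Compare an even split with a more uneven allocation
print(total_enjoyment(6, 4), total_enjoyment(1.5, 7))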

This is an ideal setting for the Karush-Kuhn-Tucker (KKT) conditions.

In the next section, we will formally introduce these conditions and use them to solve this doughnut-and-coffee dilemma.

Theory

Definition: The Karush-Kuhn-Tucker Conditions

Let x \in \mathbb{R}^n be a vector of decision variables. We consider a general nonlinear programme of the form:

\begin{aligned} \text{Minimise } \quad & f(x) \\ \text{subject to } \quad & g_i(x) \leq 0, \quad i = 1, \dots, m \\ & h_j(x) = 0, \quad j = 1, \dots, p \end{aligned}

We seek a point x^* that satisfies all constraints and minimises f(x).


Suppose f, g_i, and h_j are continuously differentiable, and suppose a constraint qualification holds at a local minimum x^*.

Then there exist Lagrange multipliers \lambda_i \geq 0 and \mu_j \in \mathbb{R} such that the Karush-Kuhn-Tucker (KKT) conditions hold:

\begin{aligned} \text{Stationarity: } \quad & \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) + \sum_{j=1}^{p} \mu_j \nabla h_j(x^*) = 0 \\ \text{Primal feasibility: } \quad & g_i(x^*) \leq 0 \text{ for all } i, \qquad h_j(x^*) = 0 \text{ for all } j \\ \text{Dual feasibility: } \quad & \lambda_i \geq 0 \text{ for all } i \\ \text{Complementary slackness: } \quad & \lambda_i g_i(x^*) = 0 \text{ for all } i \end{aligned}


These conditions are necessary for local optimality, and under convexity assumptions, they are also sufficient.

Each of the conditions has an important role:

- Stationarity: at x^* the gradient of the objective is balanced by the gradients of the active constraints.
- Primal feasibility: x^* satisfies all of the original constraints.
- Dual feasibility: the multipliers associated with the inequality constraints are nonnegative.
- Complementary slackness: a multiplier can only be nonzero when its constraint is active, so inactive constraints have zero multipliers.
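To make these roles concrete, here is a minimal sketch using sympy (assumed to be available; the one-variable problem is purely illustrative and not part of the retreat example). It forms the Lagrangian for minimising (x - 2)^2 subject to x \leq 1 and solves the stationarity and complementary slackness equations:

import sympy as sym

x, lam = sym.symbols("x lambda", real=True)

f = (x - 2) ** 2  # objective
g = x - 1  # constraint in standard form: g(x) <= 0

# Lagrangian with a single inequality multiplier
L = f + lam * g

# Stationarity and complementary slackness give a small system of equations
stationarity = sym.Eq(sym.diff(L, x), 0)
slackness = sym.Eq(lam * g, 0)

print(sym.solve([stationarity, slackness], [x, lam], dict=True))

Of the two candidate solutions, (x, \lambda) = (2, 0) violates primal feasibility since g(2) > 0, so the KKT point is x = 1 with \lambda = 2, which also satisfies dual feasibility.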

Example: Verifying the KKT conditions for the doughnut problem

Let us verify that x_1 \approx 1.243 and x_2 \approx 6.869 satisfy the KKT conditions for the doughnut problem.

The objective function is given by f(x_1, x_2) above. The inequality constraints can be written as:

\begin{align*} g_1(x) & = -x_1 \leq 0 \\ g_2(x) & = -x_2 \leq 0 \\ g_3(x) & = x_1 - 12 \leq 0 \\ g_4(x) & = x_2 - 8 \leq 0 \end{align*}

There are no equality constraints in this problem, so there are no associated Lagrange multipliers \mu.

The Lagrangian is:

\mathcal{L}(x, \lambda) = \log(x_{1} + 1) + 2 \sqrt{x_{2}} \log(13 - x_{1}) + 2 \sqrt{8 - x_{2}} - \lambda_{1} x_{1} - \lambda_{2} x_{2} + \lambda_{3} (x_{1} - 12) + \lambda_{4} (x_{2} - 8)

We now check whether x \approx (1.243, 6.869) satisfies the KKT conditions.

Stationarity:

We compute the partial derivatives of the Lagrangian:

\begin{align*} \frac{\partial \mathcal{L}(x, \lambda)}{\partial x_1} &= \frac{1}{x_{1} + 1} - \frac{2 \sqrt{x_{2}}}{13 - x_{1}} - \lambda_{1} + \lambda_{3} \\ \frac{\partial \mathcal{L}(x, \lambda)}{\partial x_2} &= \frac{\log(13 - x_{1})}{\sqrt{x_{2}}} - \frac{1}{\sqrt{8 - x_{2}}} - \lambda_{2} + \lambda_{4} \end{align*}

Primal feasibility, dual feasibility, and complementary slackness:

Substituting x = (1.243, 6.869) into the constraints gives:

\begin{align*} g_1(x) &= -1.243 < 0 \\ g_2(x) &= -6.869 < 0 \\ g_3(x) &= -10.757 < 0 \\ g_4(x) &= -1.131 < 0 \end{align*}

Since all inequality constraints are strictly satisfied (none is active), complementary slackness implies that \lambda_i = 0 for all i. Dual feasibility (\lambda_i \geq 0) is then trivially satisfied.

Verifying stationarity:

Substituting x = (1.243, 6.869) and \lambda_i = 0 into the stationarity conditions gives:

\begin{align*} \left. \frac{\partial \mathcal{L}}{\partial x_1} \right|_{x, \lambda=0} &= \frac{1}{1.243 + 1} - \frac{2 \sqrt{6.869}}{13 - 1.243} \approx 0 \\ \left. \frac{\partial \mathcal{L}}{\partial x_2} \right|_{x, \lambda=0} &= \frac{\log(13 - 1.243)}{\sqrt{6.869}} - \frac{1}{\sqrt{8 - 6.869}} \approx 0 \end{align*}

Both partial derivatives are approximately zero, confirming that stationarity holds.
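As a quick numerical sanity check (a minimal sketch; the point is the candidate above and all multipliers are set to zero), the two partial derivatives can be evaluated directly:

import numpy as np

x1, x2 = 1.243, 6.869

# Partial derivatives of the Lagrangian with all multipliers equal to zero
dL_dx1 = 1 / (x1 + 1) - 2 * np.sqrt(x2) / (13 - x1)
dL_dx2 = np.log(13 - x1) / np.sqrt(x2) - 1 / np.sqrt(8 - x2)

print(dL_dx1, dL_dx2)  # both values should be close to zero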

As maximising f(x_1, x_2) here is a concave maximisation problem (or equivalently, minimising -f is a convex minimisation problem), the KKT conditions are both necessary and sufficient, and we can conclude that x \approx (1.243, 6.869) is indeed the global optimum.

Exercises

Exercise 1: Understanding KKT components

Consider the following optimisation problem:

\begin{aligned} \text{Minimise} \quad & f(x, y) = x^2 + y^2 \\ \text{subject to} \quad & x + y \geq 2 \\ & x \geq 0, \quad y \geq 0 \end{aligned}
  1. Rewrite the problem in standard form for KKT analysis (i.e., with all constraints as g_i(x) \leq 0).
  2. Write down the KKT conditions symbolically.
  3. Find all KKT points and identify whether they correspond to a global minimum.
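If you want to check your answer to part 3 numerically, a minimal scipy sketch is given below (the starting point is arbitrary, and this is not a substitute for the symbolic KKT analysis):

import scipy.optimize

def objective(v):
    return v[0] ** 2 + v[1] ** 2

constraints = [
    {'type': 'ineq', 'fun': lambda v: v[0] + v[1] - 2},  # x + y >= 2
    {'type': 'ineq', 'fun': lambda v: v[0]},  # x >= 0
    {'type': 'ineq', 'fun': lambda v: v[1]},  # y >= 0
]

print(scipy.optimize.minimize(objective, [2, 0], constraints=constraints).x)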

Exercise 2: Symbolic derivation of KKT conditions

Consider the problem:

\begin{aligned} \text{Maximise} \quad & \log(x) + \log(1 - x) \\ \text{subject to} \quad & 0.1 \leq x \leq 0.9 \end{aligned}
  1. Show that the objective function is concave on the interval (0, 1).
  2. Derive the KKT conditions for this problem.
  3. Solve for the optimal value of x and the associated Lagrange multipliers.
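For part 1, a minimal sympy sketch (assuming sympy is available) checks the sign of the second derivative on (0, 1):

import sympy as sym

x = sym.symbols("x", positive=True)
f = sym.log(x) + sym.log(1 - x)

# The second derivative is -1/x**2 - 1/(1 - x)**2, which is negative on (0, 1),
# so the objective is concave there.
print(sym.diff(f, x, 2))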

Programming

Solving constrained optimisation problems with Scipy

Scipy's scipy.optimize.minimize function can solve constrained optimisation problems. Note that inequality constraints are passed with the convention fun(x) >= 0, which is the opposite sign convention to the g_i(x) \leq 0 used above, and that we minimise the negative of the objective in order to maximise it.

import numpy as np
import scipy.optimize

def objective(x):
    """Negative of the total enjoyment: minimising this maximises enjoyment."""
    x1, x2 = x
    return -(np.log(1 + x1) + 2 * np.sqrt(x2) * np.log(1 + 12 - x1) + 2 * np.sqrt(8 - x2))

# Scipy expects inequality constraints in the form fun(x) >= 0.
def g_1(x):
    return x[0]  # x1 >= 0

def g_2(x):
    return x[1]  # x2 >= 0

def g_3(x):
    return 12 - x[0]  # x1 <= 12

def g_4(x):
    return 8 - x[1]  # x2 <= 8

constraints = [
    {'type': 'ineq', 'fun': g_1},
    {'type': 'ineq', 'fun': g_2},
    {'type': 'ineq', 'fun': g_3},
    {'type': 'ineq', 'fun': g_4},
]

# Minimise the negated objective from the starting point (6, 4)
res = scipy.optimize.minimize(objective, [6, 4], constraints=constraints)
res
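For this problem, res.x should be approximately (1.243, 6.869), matching the candidate point verified against the KKT conditions above, and -res.fun gives the corresponding total enjoyment.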

Notable Research

The Karush-Kuhn-Tucker conditions were first described by Karush in his master's thesis (Karush, 1939), but this work was unpublished and largely unknown until Kuhn and Tucker presented the conditions at a mathematics symposium (Kuhn & Tucker, 1951). Some textbooks still refer to them as the Kuhn-Tucker conditions, but most now use a name that rightfully credits all three authors.

Before Kuhn & Tucker (1951), a generalisation of the conditions that holds in more complex cases, even when the feasible region is not well behaved, was published in John (1948).

The Karush-Kuhn-Tucker conditions provide a way to check whether a candidate point is optimal, but the collection of algorithms that actually finds such candidates is referred to as interior point methods (Boyd & Vandenberghe, 2004).

Conclusion

The Karush-Kuhn-Tucker (KKT) conditions provide a unified framework to characterise optimal solutions to constrained optimisation problems. They extend the method of Lagrange multipliers to include inequality constraints, introducing complementary slackness and dual feasibility as key components.

Table A1.1 summarises the key concepts introduced in this appendix.

Table A1.1: Summary of the Karush-Kuhn-Tucker conditions

| Condition | Meaning |
| --- | --- |
| Stationarity | The objective's gradient is balanced by the gradients of active constraints |
| Primal feasibility | The solution satisfies all original constraints |
| Dual feasibility | Lagrange multipliers for inequality constraints are nonnegative |
| Complementary slackness | Inactive constraints have zero multipliers; active constraints may have positive multipliers |

The KKT conditions are necessary for optimality when constraint qualifications hold, and sufficient for optimality when the problem is convex. In this way, they serve both as a diagnostic tool to verify candidate solutions, and as a theoretical foundation for many modern optimisation algorithms.

References
  1. Karush, W. (1939). Minima of functions of several variables with inequalities as side constraints. M. Sc. Dissertation. Dept. of Mathematics, Univ. of Chicago.
  2. Kuhn, H. W., & Tucker, A. W. (1951). Nonlinear Programming. Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 481–492.
  3. John, F. (1948). Extremum Problems with Inequalities as Side Conditions [Technical Report]. Courant Institute of Mathematical Sciences, New York University.
  4. Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.