Lagrange Multipliers I
On the last slide, we found the maximum of a function on the boundary of
a square, and a maximum of a function on the boundary of a disk, using
single-variable methods. This isn't always possible, especially when
our region is 3-dimensional, so that the boundary is a 2-dimensional
surface. In this slide we'll see how to find the maximum or minimum
of a function on such a curve or surface.
Such so-called constrained optimization problems
come up frequently in applications. Suppose, for instance, that you want
to maximize production by allocating a fixed amount of capital between labor and manufacturing costs,
or to know where a weather balloon at the end of a fixed length of cable ends up,
or to pack food for a trip when you've got only
a limited amount of space in your car.
We solve these problems with the method of Lagrange multipliers.
This is a particularly powerful technique
in engineering, the sciences and economics because it works in a wide
variety of problems and in any number of variables. There are two
ways to think about the method. One, based on gradients, is explained in
the following video. The other, based on contour maps, is explained in
the subsequent text.
Let's set up an idealized mathematical model by taking a
'hike in the mountains':
Problem: Find the maximum and minimum values of $$f(x, y)\ = \ x^2 - y^2$$ subject to the constraint $$g(x,\,y)\ =\ x^2+y^2-4\ = \ 0\,.$$
We could use single variable methods: simply eliminate $y$ from
$f(x,\, y)$ using $g(x,\,y) = 0$.
However, it will be instructive to investigate the problem using a lot
of what has been learned of late! The graph of $x^2+y^2 - 4=0$ is a circle of radius
$2$ centered at the origin in the $xy$-plane. Without restrictions on
$x,\, y$ there would be no maximum or minimum values of $f(x,\,y)=x^2-y^2$,
just the saddle point at the origin!
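For comparison, the single-variable elimination mentioned above can be sketched as follows, taking $f(x,\,y) = x^2 - y^2$. On the circle, $y^2 = 4 - x^2$ with $-2 \le x \le 2$, so
$$f\ =\ x^2 - (4 - x^2)\ =\ 2x^2 - 4\,, \qquad -2 \le x \le 2\,.$$
This function of $x$ alone is largest at the endpoints $x = \pm 2$ (where $y = 0$ and $f = 4$) and smallest at $x = 0$ (where $y = \pm 2$ and $f = -4$).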
The constraint places restrictions on the values of $x,\,y$ in
$f(x,\,y)$. Let's explore the effect of this constraint condition
both in $3$-space and via contour maps in the $xy$-plane. In $3$-space
we look at the surface $z=f(x,\,y)$ and think of $z$ as elevation above
sea level.
The trouble is that the graph of $f$ is a surface in
$3$-space, while the constraint $g(x,\,y) = 0$ describes a circle in the $xy$-plane. But
in $3$-space the graph of $x^2+y^2 - 4 = 0$ is a
circular cylinder. So the constraint $g(x,\,y)= 0$ says we
look only for the highest and lowest points on the path
where this cylinder intersects the graph of $z = x^2-y^2$,
as shown in orange to the left below. At these maximum and minimum
points you are walking horizontally along the contour through
that point; otherwise you'd still be going uphill or downhill! How
can this be seen in the contour map of $z = x^2-y^2$ as shown to the
right below?
Do you see which point on the orange curve
corresponds to the point $P$ on the contour map? What about $Q$ and
$R$? (Don't forget to rotate the surface for different viewpoints.)
The crucial idea is that when the contour line and the circle have a
common tangent at a point $(a,\,b)$ in the right hand graphic, then
the corresponding point on the orange curve will be at a local maximum or
local minimum on the curve because here you will be
walking horizontally along the contour. But then both gradient
vectors $\nabla f(a,\,b)$ and $\nabla g(a,\,b)$ will be perpendicular
to this common tangent at $(a,\,b)$, hence parallel. Since two
vectors are parallel when one is a scalar multiple of the
other, we thus get:
Method of Lagrange Multipliers:
The maximum and minimum values of $z = f(x, \,y)$ subject to the
constraint $g(x, \,y) = 0$ occur at points $(a, \,b)$ for which there exists a scalar $\lambda$
such that
$$ (\nabla f)(a, \,b) \ =\ \lambda (\nabla g)(a, \,b), \qquad g(a,\, b) \ = \ 0\,,$$
and $(\nabla g)(a, \,b) \ne 0$. Such points $(a,\,b)$ will be called critical points.
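As an illustration, here is a sketch of how these equations might be solved for the hiking problem, taking $f(x,\,y) = x^2 - y^2$ (the surface $z = x^2 - y^2$) and $g(x,\,y) = x^2 + y^2 - 4$. The gradients are $\nabla f = \langle 2x,\, -2y \rangle$ and $\nabla g = \langle 2x,\, 2y \rangle$, so the conditions read
$$2x \ =\ 2\lambda x\,, \qquad -2y \ =\ 2\lambda y\,, \qquad x^2 + y^2 \ =\ 4\,.$$
The first equation gives $x(1 - \lambda) = 0$ and the second gives $y(1 + \lambda) = 0$. If $x \ne 0$, then $\lambda = 1$, which forces $y = 0$ and hence $x = \pm 2$. If $x = 0$, then $y = \pm 2$ and $\lambda = -1$. Since $\nabla g \ne 0$ everywhere on the circle, the critical points are
$$(\pm 2,\, 0) \ \text{ with } \ f = 4\,, \qquad (0,\, \pm 2) \ \text{ with } \ f = -4\,,$$
so the constrained maximum value is $4$ and the constrained minimum value is $-4$.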
The method of Lagrange multipliers works just as well
when $f(x, \,y,\,z)$ and $g(x, \,y, \,z)$ are functions of $3$ variables
(or any greater number of variables for that matter). Since $\nabla f$ and
$\nabla g$ are still well-defined, we can still solve the
equations $\nabla f = \lambda \nabla g$ and $g=0$.
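To make the three-variable case concrete, here is a minimal numerical sketch in Python. The example is hypothetical and not taken from the slides: it assumes $f(x,\,y,\,z) = x + 2y + 2z$ and the unit sphere $g(x,\,y,\,z) = x^2 + y^2 + z^2 - 1 = 0$. Solving $\nabla f = \lambda \nabla g$ and $g = 0$ by hand gives $\lambda = \pm 3/2$ and the candidate points $\pm(1/3,\, 2/3,\, 2/3)$. The helper names (`is_parallel`, `random_point_on_sphere`) are illustrative only. The code checks that the gradients are parallel at the candidates and compares the candidate values against random samples on the sphere.

```python
import math
import random


def f(p):
    # Hypothetical objective function (an assumption, not from the slides).
    x, y, z = p
    return x + 2 * y + 2 * z


def grad_f(p):
    # Gradient of f; constant because f is linear.
    return (1.0, 2.0, 2.0)


def g(p):
    # Constraint g = 0 describes the unit sphere.
    x, y, z = p
    return x * x + y * y + z * z - 1.0


def grad_g(p):
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)


def is_parallel(u, v, tol=1e-9):
    # Two vectors are parallel when every 2x2 minor u[i]*v[j] - u[j]*v[i]
    # vanishes. This works in any number of variables. A zero vector passes
    # this test too, which is why the method also requires grad g != 0.
    n = len(u)
    return all(
        abs(u[i] * v[j] - u[j] * v[i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


def random_point_on_sphere(rng):
    # Normalize a Gaussian vector to get a point on the unit sphere.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in v))
        if r > 1e-12:
            return tuple(c / r for c in v)


# Candidates found by hand from grad f = lambda * grad g and g = 0.
candidates = [(1 / 3, 2 / 3, 2 / 3), (-1 / 3, -2 / 3, -2 / 3)]

for p in candidates:
    assert abs(g(p)) < 1e-12                   # the point lies on the sphere
    assert is_parallel(grad_f(p), grad_g(p))   # the gradients are parallel

max_value = f(candidates[0])
min_value = f(candidates[1])

# Sanity check: no randomly sampled point on the sphere beats the candidates.
rng = random.Random(0)
samples = [f(random_point_on_sphere(rng)) for _ in range(10000)]

print("candidate max and min:", max_value, min_value)
print("samples stay within bounds:",
      max(samples) <= max_value + 1e-9 and min(samples) >= min_value - 1e-9)
```

Random sampling only supports the result and does not prove it. The Lagrange equations still have to be solved (here by hand) to produce the candidates, and the code merely confirms that they behave as the method predicts.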