Another service from Omega

The Test of Second Partials


*****

Do you remember how to check for the presence of a local max or local min at a point where the derivative is zero?

It went like this: if f'(a) = 0 then you need to check the sign of f"(a). If f"(a) < 0 then there is a local max at a, if f"(a) > 0 there is a local min, and if f"(a) = 0 the test is inconclusive (there may be a max, a min, or an inflection point at a). This was in calc1. In calc3 the situation is similar but unfortunately more complicated. Here is the proposition:

Theorem:

Let f(x,y) be a function with continuous second order partial derivatives at the point (a,b). If the gradient at (a,b) is zero, i.e. both partials w.r.t. x and w.r.t. y are zero at (a,b), then there could be a local max, a local min, a saddle point, or we can't tell, depending on the sign of the matrix of second partials of f at (a,b). If the sign is positive there is a local min, and if it is negative there is a local max. Otherwise there is a saddle point or the test is inconclusive, depending on whether the determinant of the matrix of second partials at (a,b) is negative or zero.

In order to understand the statement above we need to define what we mean by the sign of a matrix. Let's do this for a 2 by 2 matrix,

> A := matrix(2,2,[fxx,fxy,fyx,fyy]);

                                    [fxx    fxy]
                               A := [          ]
                                    [fyx    fyy]

we say that A is positive (or positive definite) if,

> vector([a,b]) &* evalm(A) &* matrix(2,1,[a,b]) > 0;

                                     [fxx    fxy]    [a]
                       0 < [a, b] &* [          ] &* [ ]
                                     [fyx    fyy]    [b]

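and the inequality is true for all possible values of a and b as long as they are not both zero (here a and b are the components of an arbitrary vector, not the coordinates of the point in the theorem). A is negative (or negative definite) when the reverse inequality holds for all such a and b. Notice that the inequality above simplifies to,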

> collect(evalm(vector([a,b]) &* evalm(A) &* matrix(2,1,[a,b]))[1],[a,b]) > 0;

                              2                      2
                     0 < fxx a  + (fyx + fxy) b a + b  fyy

if the matrix A is symmetric (i.e. when fyx = fxy) this simplifies even more to,

> subs(fyx=fxy,");

                                 2                2
                        0 < fxx a  + 2 fxy b a + b  fyy
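To make the definition concrete, here is a minimal Python sketch (not part of the Maple session) that evaluates the symmetric form above and samples it on the unit circle. The names quadratic_form and looks_positive are illustrative assumptions, and a finite sample is only numerical evidence of positivity, not a proof.

```python
import math


def quadratic_form(fxx, fxy, fyy, a, b):
    """Value of fxx*a^2 + 2*fxy*a*b + fyy*b^2 (the symmetric case fyx = fxy)."""
    return fxx * a * a + 2.0 * fxy * a * b + fyy * b * b


def looks_positive(fxx, fxy, fyy, samples=360):
    """Return True if the form is > 0 at every sampled point of the unit circle.

    Sampling the unit circle is enough in principle, because scaling (a, b)
    by s scales the form by s^2, which does not change its sign.
    """
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        value = quadratic_form(fxx, fxy, fyy, math.cos(theta), math.sin(theta))
        if value <= 0.0:
            return False
    return True


if __name__ == "__main__":
    print(looks_positive(2, 1, 2))  # form is 2 + sin(2*theta) on the circle
    print(looks_positive(1, 2, 1))  # form is 1 + 2*sin(2*theta) on the circle
```

The first matrix passes the check and the second fails it, since 1 + 2 sin(2 theta) becomes negative for some directions.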

Now notice that by factoring out fxx (assuming fxx is not zero) and completing the square, the right hand side of the above inequality becomes:

> fxx*( (a + (fxy/fxx)*b)^2 - ((fxy/fxx)*b)^2 + (fyy/fxx)*b^2);

                         /                  2  2        2\
                         |/    fxy b\2   fxy  b    fyy b |
                     fxx ||a + -----|  - ------- + ------|
                         |\     fxx /        2      fxx  |
                         \                fxx            /

and the last two terms can be combined as,

> R := fxx*( (a + (fxy/fxx)*b)^2 + (1/fxx^2)*(fxx*fyy-fxy^2)*b^2);

                          /                             2   2\
                          |/    fxy b\2   (fxx fyy - fxy ) b |
                 R := fxx ||a + -----|  + -------------------|
                          |\     fxx /              2        |
                          \                      fxx         /

so if we call DET the determinant of the matrix A (see above) when fxy=fyx,

> DET := fxx*fyy - fxy^2;

                                                 2
                             DET := fxx fyy - fxy

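When DET > 0 the expression inside the big parentheses is positive whenever a and b are not both zero, so the sign of R is the sign of fxx: A is positive when fxx > 0 and negative when fxx < 0. When DET < 0, R takes both signs (compare b = 0 with a = -(fxy/fxx) b). This will be useful later.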
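As a sanity check on the algebra, the following Python sketch compares the original quadratic form with the completed-square expression R at random inputs. The function names, the random ranges and the seed are assumptions made for illustration only. fxx is kept away from zero because R divides by it.

```python
import random


def form(fxx, fxy, fyy, a, b):
    """The symmetric quadratic form fxx*a^2 + 2*fxy*a*b + fyy*b^2."""
    return fxx * a * a + 2.0 * fxy * a * b + fyy * b * b


def completed_square(fxx, fxy, fyy, a, b):
    """R = fxx*((a + fxy*b/fxx)^2 + DET*b^2/fxx^2); requires fxx != 0."""
    det = fxx * fyy - fxy ** 2
    return fxx * ((a + fxy * b / fxx) ** 2 + det * b * b / fxx ** 2)


def max_identity_error(trials=1000, seed=0):
    """Largest |form - R| seen over random inputs with |fxx| >= 0.5."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        fxx = rng.uniform(0.5, 3.0) * rng.choice([-1.0, 1.0])
        fxy, fyy, a, b = (rng.uniform(-3.0, 3.0) for _ in range(4))
        diff = form(fxx, fxy, fyy, a, b) - completed_square(fxx, fxy, fyy, a, b)
        worst = max(worst, abs(diff))
    return worst


if __name__ == "__main__":
    print(max_identity_error())  # should be at the level of rounding error
```

A tiny reported error only says the two expressions agree numerically at the sampled points. The Maple steps above are the actual derivation.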

Let us now prove the theorem above,

Proof: (second partials test)

Notice that f(x,y) has a local max at (a,b) if and only if g(t)=f(x(t),y(t)) has a local max at t=0 for every smooth curve r(t)=(x(t),y(t),g(t)) on the surface z=f(x,y) passing through (a,b,f(a,b)) at t=0. In other words, (a,b,f(a,b)) is at the top of a hill on the surface z=f(x,y) when and only when it is a maximum along any path we take through that point. This trick pushes the problem of finding the max of a function of several variables back to calc1, since now all we need to do is check for the max of the functions g(t), which are real valued functions of one real variable t.

As we know from calc1, g(t) will have a local max at t=0 provided that g'(0)=0 and g"(0)<0. From the chain rule we get,

> Dg := diff(f(x(t),y(t)),t);

                                /d      \                       /d      \
      Dg := D[1](f)(x(t), y(t)) |-- x(t)| + D[2](f)(x(t), y(t)) |-- y(t)|
                                \dt     /                       \dt     /

where D[i](f) is Maple's notation for the partial derivative of f w.r.t. the ith variable. Here, i=1 means w.r.t. x and i=2 w.r.t. y. If we denote by fx,fy,fxx,fxy,fyy the first and second order partial derivatives of f at (a,b) then the above expression when t=0 simplifies to,

> Dg0 := fx*u + fy*v;

                              Dg0 := fx u + fy v

where x'(0)=u and y'(0)=v. The only way in which g'(0)=0 FOR ALL paths is that Dg0=0 for all (u,v). In particular, when (u,v)=(fx,fy) we get Dg0 = fx^2 + fy^2 = |(fx,fy)|^2 = 0 and thus fx=fy=0. Let us now turn to the computation of g"(0). Taking the derivative of Dg w.r.t. t we get,

> D2g := diff(Dg,t);

       /                       /d      \                          /d      \\
D2g := |D[1, 1](f)(x(t), y(t)) |-- x(t)| + D[1, 2](f)(x(t), y(t)) |-- y(t)||
       \                       \dt     /                          \dt     //

                                    / 2      \
    /d      \                       |d       |
    |-- x(t)| + D[1](f)(x(t), y(t)) |--- x(t)| +
    \dt     /                       |  2     |
                                    \dt      /

    /                       /d      \                          /d      \\
    |D[1, 2](f)(x(t), y(t)) |-- x(t)| + D[2, 2](f)(x(t), y(t)) |-- y(t)||
    \                       \dt     /                          \dt     //

                                    / 2      \
    /d      \                       |d       |
    |-- y(t)| + D[2](f)(x(t), y(t)) |--- y(t)|
    \dt     /                       |  2     |
                                    \dt      /

This looks complicated but it is nothing but the product rule and the chain rule applied to the expression Dg. Now when t=0, the above expression simplifies (with our notation) to:

> D2g0 := (fxx*u + fxy*v)*u + (fxy*u+fyy*v)*v;

                 D2g0 := (fxx u + fxy v) u + (fxy u + fyy v) v

Hey, what happened to the other two terms with the second derivatives of x(t) and y(t)? Well... recall that at t=0 we are at (a,b), where fx=fy=0, so the two terms with the second derivatives are just 0. Notice also that since we are assuming that the second order partials are continuous at (a,b), fxy=fyx. Moreover, D2g0 is nothing but the expression R (above) with u and v in place of a and b, since,

> sort(expand(D2g0),[u,v]);

                               2                    2
                          fxx u  + 2 fxy u v + fyy v

Hence, the sign of the above expression controls whether we are at a local max, a local min or a saddle point on the surface z=f(x,y). This expression is nothing but R (with u and v in place of a and b) and thus its sign is, by definition, the sign of the matrix of second derivatives A (see above). So if you understand sign of second derivative to mean the sign of the Hessian (by the way, this is the usual name of the matrix of second derivatives of a function of several variables), then the calc1 theorem reads the same as the calc3 theorem! But in the calc3 version there is the possibility that the Hessian is neither positive nor negative without being zero. In that case there are two possibilities: either DET<0, in which case the Hessian is INDEFINITE (the expression above is positive along some paths and negative along others) and we are in the presence of a saddle point, or DET=0 and we can't use this theorem to find local maxes, mins or saddle points.
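The claim that g"(0) depends only on the velocity (u,v) of the path, and not on its curvature, can be checked numerically. The sketch below is an illustration under stated assumptions: the example function f(x,y) = cos(x) + x y is not from the text (it has a critical point at (0,0) with fxx = -1, fxy = 1, fyy = 0), the paths are the hypothetical curves x(t) = u t + cx t^2, y(t) = v t + cy t^2, and g"(0) is estimated with a central difference.

```python
import math


def f(x, y):
    # Assumed example: grad f(0,0) = (0,0), fxx = -1, fxy = 1, fyy = 0 there.
    return math.cos(x) + x * y


def second_derivative_along_path(u, v, cx=0.0, cy=0.0, h=1e-3):
    """Central-difference estimate of g''(0) for g(t) = f(x(t), y(t)),
    where x(t) = u*t + cx*t^2 and y(t) = v*t + cy*t^2 pass through (0,0)."""
    def g(t):
        return f(u * t + cx * t * t, v * t + cy * t * t)
    return (g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)


def hessian_form(u, v, fxx=-1.0, fxy=1.0, fyy=0.0):
    """fxx*u^2 + 2*fxy*u*v + fyy*v^2 with the second partials of f at (0,0)."""
    return fxx * u * u + 2.0 * fxy * u * v + fyy * v * v


if __name__ == "__main__":
    # Straight and curved paths with the same velocity (1, 2).
    print(second_derivative_along_path(1, 2), hessian_form(1, 2))
    print(second_derivative_along_path(1, 2, cx=5.0, cy=-3.0), hessian_form(1, 2))
    # A different direction gives the opposite sign, so (0,0) is a saddle.
    print(second_derivative_along_path(1, 0), hessian_form(1, 0))
```

Here DET = -1 < 0, and the form is positive for (u,v) = (1,2) and negative for (u,v) = (1,0), which is exactly the saddle behaviour described above.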

We can summarize what we have shown above in the following useful,

Corollary: (Second Partials Test Proper)

If f(x,y) has continuous second order partials in a disk centered at (a,b) and grad(f)(a,b)=0, then, with fxx, fxy, fyy and DET = fxx fyy - fxy^2 evaluated at (a,b):
There is a LOCAL MAX at (a,b) if,
DET > 0 and fxx < 0.
There is a LOCAL MIN at (a,b) if,
DET > 0 and fxx > 0.
There is a SADDLE POINT at (a,b) if,
DET < 0
The test is INCONCLUSIVE if,
DET = 0.
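The test is easy to mechanize once the second partials at the critical point are known. What follows is a minimal Python sketch, not part of the Maple worksheet. The function name second_partials_test and the example functions in the comments are assumptions chosen for illustration, and the caller is expected to have already checked that the gradient vanishes at the point.

```python
def second_partials_test(fxx, fxy, fyy):
    """Classify a critical point from the second partials evaluated there.

    Assumes grad f = 0 at the point and fxy = fyx. The comparison with
    zero is exact, so inputs computed in floating point may need a
    tolerance instead.
    """
    det = fxx * fyy - fxy ** 2
    if det > 0:
        # det > 0 forces fxx != 0, so one of the two cases applies.
        return "local max" if fxx < 0 else "local min"
    if det < 0:
        return "saddle point"
    return "inconclusive"


if __name__ == "__main__":
    # f(x,y) = x^3 - 3x + y^2 has critical points (1,0) and (-1,0),
    # with fxx = 6x, fxy = 0, fyy = 2.
    print(second_partials_test(6, 0, 2))    # at (1, 0)
    print(second_partials_test(-6, 0, 2))   # at (-1, 0)
    # f(x,y) = -(x^2 + y^2) at (0,0): fxx = fyy = -2, fxy = 0.
    print(second_partials_test(-2, 0, -2))
    # f(x,y) = x^4 + y^4 at (0,0): all second partials vanish.
    print(second_partials_test(0, 0, 0))
```

The last example shows why DET = 0 is called inconclusive: x^4 + y^4 does have a local min at the origin, but the second partials alone cannot detect it.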


Carlos Rodriguez <carlos@math.albany.edu>
Last modified: Fri Nov 7 12:31:00 EST 1997