All About the Hessian Matrix, Convexity, and Optimization

Cadence CFD Solutions

Key Takeaways

The Hessian matrix is a mathematical structure that deals with second-order derivatives.
The Hessian matrix will always be a square matrix with a dimension equal to the number of variables of the function.
If the Hessian matrix of a twice continuously differentiable function is positive semi-definite at all points of a convex set A, then the function is convex on set A.

Hessian matrices are used in various computational algorithms for optimization

Solving engineering problems effectively requires optimization, and most optimization problems in engineering are non-convex. One approach to non-convex problems uses DC (difference of convex) functions, which are functions that can be written as the difference of two convex functions. The optimization of such non-convex problems is collectively called DC optimization, and the theory of such problems is referred to as DC programming.
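As a small illustration of a DC decomposition, a non-convex function of one variable can be written as the difference of two convex functions. The example function below is an assumption chosen for this sketch, not one taken from a specific engineering application.

```latex
f(x) = \underbrace{x^4}_{g(x)} - \underbrace{x^2}_{h(x)},
\qquad g''(x) = 12x^2 \ge 0, \quad h''(x) = 2 > 0,
\qquad f''(x) = 12x^2 - 2 < 0 \ \text{ for } |x| < 1/\sqrt{6}.
```

Both g and h are convex, while f itself is not convex near the origin.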

Most algorithms used in DC programming take advantage of the Hessian matrix and its ability to determine convexity in order to iterate toward a better solution. In this article, we will discuss the fundamentals of the Hessian matrix, which is central to optimization with computational algorithms.

The Hessian Matrix

Hessian matrices are extensively used in engineering problem-solving. They are employed to optimize the functions that represent a system. The Hessian matrix is a mathematical structure that deals with second-order derivatives. Consider a function f of n variables. The matrix of all second-order partial derivatives of this function forms the Hessian matrix of f: the element in row i and column j is the second-order partial derivative of f with respect to the i-th and j-th variables.
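Written out in full, the definition takes the following form. The variable names x_1, ..., x_n and the symbol H_f are notation assumed here for illustration.

```latex
H_f(x) =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \, \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \, \partial x_n} \\[2ex]
\dfrac{\partial^2 f}{\partial x_2 \, \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \, \partial x_n} \\[2ex]
\vdots & \vdots & \ddots & \vdots \\[1ex]
\dfrac{\partial^2 f}{\partial x_n \, \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \, \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix},
\qquad
(H_f)_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}.
```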

The Order of a Hessian Matrix

From the definition of the Hessian matrix, it follows that it will always be a square matrix with a dimension equal to the number of variables of the function. For a function of n variables, the Hessian matrix will be of order n × n.

The Symmetry of a Hessian Matrix

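The Hessian matrix for a function f(x, y) of 2 variables is:

\[ H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} \]

Here fxy denotes the second-order partial derivative of f taken with respect to x and y.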

In the Hessian matrix of a 2-variable function, the mixed derivative fxy appears twice, as the second element of the first row and the first element of the second row. According to Schwarz’s theorem (also known as Clairaut’s theorem), when the second-order partial derivatives are continuous, the order of differentiation does not matter, so the two elements are equal even though the function is differentiated with respect to x and y in different orders.

Under the same continuity assumption, the condition Hij=Hji applies to Hessian matrices of any order, where i and j denote the row and column numbers, respectively. Whenever the elements of a square matrix obey the condition Hij=Hji, the matrix is symmetric. From the discussion so far, it can be concluded that the Hessian matrix is a square matrix that satisfies the symmetry condition. Hence the Hessian matrix of any function with continuous second-order partial derivatives is a symmetric matrix.
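The order and symmetry properties can be checked numerically. The following is a minimal sketch that approximates a Hessian with central finite differences; the helper name numerical_hessian, the step size h, and the test function example_f are all assumptions made for this illustration, not part of any particular library.

```python
def numerical_hessian(f, x, h=1e-3):
    """Approximate the Hessian of f at the point x using central differences."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def f_shifted(si, sj):
                # Evaluate f with x[i] moved by si*h and x[j] moved by sj*h.
                p = list(x)
                p[i] += si * h
                p[j] += sj * h
                return f(p)
            H[i][j] = (f_shifted(1, 1) - f_shifted(1, -1)
                       - f_shifted(-1, 1) + f_shifted(-1, -1)) / (4 * h * h)
    return H


def example_f(p):
    # Hypothetical test function: f(x, y) = x^2 * y + x * y^3
    x, y = p
    return x**2 * y + x * y**3


H = numerical_hessian(example_f, [1.0, 2.0])
# Analytically at (1, 2): f_xx = 2y = 4, f_xy = f_yx = 2x + 3y^2 = 14, f_yy = 6xy = 12
for row in H:
    print([round(v, 3) for v in row])
```

A function of 2 variables yields a 2 × 2 result, and the two off-diagonal entries agree up to rounding error, as Schwarz’s theorem predicts for this smooth function.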

Hessian Matrix vs. Jacobian Matrix

A Hessian matrix consists of the second-order partial derivatives formed from all pairs of variables that the function depends on. A Jacobian matrix is also built from partial derivatives of a function, but it contains the first-order partial derivatives. For a function with m component functions of n variables, the Jacobian is the m × n matrix whose element in row i and column j is the partial derivative of the i-th component with respect to the j-th variable.
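In matrix form, the Jacobian can be written as follows. The names F_1, ..., F_m for the component functions and x_1, ..., x_n for the variables are notation assumed here for illustration.

```latex
J_F(x) =
\begin{bmatrix}
\dfrac{\partial F_1}{\partial x_1} & \cdots & \dfrac{\partial F_1}{\partial x_n} \\[2ex]
\vdots & \ddots & \vdots \\[1ex]
\dfrac{\partial F_m}{\partial x_1} & \cdots & \dfrac{\partial F_m}{\partial x_n}
\end{bmatrix},
\qquad
(J_F)_{ij} = \frac{\partial F_i}{\partial x_j}.
```

The two matrices are closely related: the Hessian of a scalar function f is the Jacobian of its gradient.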

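A Jacobian matrix can be used to test the invertibility of a function. By the inverse function theorem, if a continuously differentiable function with as many outputs as variables has a non-zero Jacobian determinant at a point, the function is invertible in a neighborhood of that point. If the determinant of the Jacobian matrix is equal to zero, the test is inconclusive: the function may or may not be invertible there.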
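As a hedged illustration of the determinant test, the sketch below approximates the Jacobian of the polar-to-Cartesian map (r, θ) → (r cos θ, r sin θ) by central differences. The map, the helper names, and the step size are assumptions chosen for this example; the determinant of this map is known analytically to equal r.

```python
import math


def numerical_jacobian(F, x, h=1e-6):
    """Approximate the Jacobian of a vector-valued F at x by central differences."""
    n = len(x)
    columns = []
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = F(x_plus), F(x_minus)
        # Column j holds the derivatives of every component with respect to x[j].
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    m = len(columns[0])
    # Transpose so that J[i][j] = dF_i / dx_j.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def det2(J):
    """Determinant of a 2 x 2 matrix."""
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]


def polar_to_cartesian(p):
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]


det_away_from_origin = det2(numerical_jacobian(polar_to_cartesian, [2.0, 0.5]))
det_at_origin = det2(numerical_jacobian(polar_to_cartesian, [0.0, 0.5]))
print(det_away_from_origin, det_at_origin)
```

At r = 2 the determinant is non-zero, so the map is locally invertible there. At r = 0 the determinant vanishes, and in this particular case the map is indeed not invertible, since every angle maps to the origin.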

With the help of the Jacobian matrix, the critical points of a multivariable function can be calculated by finding where all first-order partial derivatives vanish. However, to classify the critical points as minima, maxima, or saddle points, a Hessian matrix is required. Let’s see how a Hessian matrix helps in finding the maxima and minima.

Minimum, Maximum, and the Saddle Point

For a scalar-valued function, the Jacobian matrix is a single row of first-order partial derivatives, which is the gradient of the function written as a row vector. When the gradient of a function f is zero at some point x, the function has a critical point at x. To determine whether the critical point is a local minimum, maximum, or saddle point, a Hessian matrix can be utilized.

If the Hessian matrix evaluated at the critical point is positive definite, the critical point is a local minimum of the function. If the Hessian matrix is negative definite there, the critical point is a local maximum of the function.

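If the Hessian matrix is indefinite at the critical point, that is, it has both positive and negative eigenvalues, the critical point is a saddle point. If the Hessian matrix is only semi-definite and has a zero eigenvalue, the test is inconclusive. Similarly, the Hessian matrix can be utilized for identifying the convexity and concavity of a function.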
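The classification rules above can be sketched for the 2-variable case, where the eigenvalues of a symmetric 2 × 2 matrix have a closed form. The function name classify_critical_point, the tolerance, and the example Hessians are assumptions made for this illustration.

```python
import math


def classify_critical_point(H, tol=1e-9):
    """Classify a critical point from its symmetric 2 x 2 Hessian H."""
    a, b, d = H[0][0], H[0][1], H[1][1]
    # Eigenvalues of [[a, b], [b, d]] are mean -/+ radius.
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    smallest, largest = mean - radius, mean + radius
    if smallest > tol:
        return "local minimum"      # positive definite
    if largest < -tol:
        return "local maximum"      # negative definite
    if smallest < -tol and largest > tol:
        return "saddle point"       # indefinite
    return "inconclusive"           # semi-definite with a zero eigenvalue


# Hessians at the origin, which is a critical point of each example function:
print(classify_critical_point([[2.0, 0.0], [0.0, 2.0]]))    # f = x^2 + y^2
print(classify_critical_point([[-2.0, 0.0], [0.0, -2.0]]))  # f = -x^2 - y^2
print(classify_critical_point([[2.0, 0.0], [0.0, -2.0]]))   # f = x^2 - y^2
print(classify_critical_point([[0.0, 0.0], [0.0, 0.0]]))    # f = x^4 + y^4
```

The last case shows why the test can be inconclusive: x^4 + y^4 has a minimum at the origin, but its Hessian there is the zero matrix, so the second-order information alone cannot decide.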

Using a Hessian Matrix for Convexity Determination of a Function

For a function f whose second derivatives are continuous, the Hessian matrix can be used for determining its convexity and concavity. If the Hessian matrix is positive semi-definite at all points of a convex set A, then the function is convex on set A. If the Hessian matrix is positive definite at all points of set A, the function is strictly convex on A; this condition is sufficient but not necessary. Likewise, the function is concave on A if the Hessian matrix is negative semi-definite at all points of A.
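A minimal sketch of this check is shown below, assuming the Hessian is available in closed form and sampling it on a grid of points. Sampling can only refute convexity, not prove it, so this is a sanity check rather than a proof. The helper names and the two example functions are assumptions chosen for illustration.

```python
import math


def min_eigenvalue_2x2(H):
    """Smallest eigenvalue of a symmetric 2 x 2 matrix."""
    a, b, d = H[0][0], H[0][1], H[1][1]
    return (a + d) / 2 - math.hypot((a - d) / 2, b)


def hessian_psd_on_grid(hess, xs, ys, tol=1e-12):
    """Return True if hess(x, y) is positive semi-definite at every grid point."""
    return all(min_eigenvalue_2x2(hess(x, y)) >= -tol for x in xs for y in ys)


def hess_quadratic(x, y):
    # Hessian of f(x, y) = x^2 + x*y + y^2 (constant, eigenvalues 1 and 3)
    return [[2.0, 1.0], [1.0, 2.0]]


def hess_cubic(x, y):
    # Hessian of f(x, y) = x^3 + y^2
    return [[6.0 * x, 0.0], [0.0, 2.0]]


full_range = [i / 10 for i in range(-20, 21)]   # samples of [-2, 2]
nonnegative = [i / 10 for i in range(0, 21)]    # samples of [0, 2]

quadratic_ok = hessian_psd_on_grid(hess_quadratic, full_range, full_range)
cubic_on_full = hessian_psd_on_grid(hess_cubic, full_range, full_range)
cubic_on_half = hessian_psd_on_grid(hess_cubic, nonnegative, full_range)
print(quadratic_ok, cubic_on_full, cubic_on_half)
```

The cubic example illustrates that convexity depends on the set A: its Hessian is positive semi-definite wherever x is non-negative, but not on the whole plane.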

Knowledge of first derivatives, the Hessian matrix, and convexity is essential for employing gradient-based algorithms to obtain optimized solutions to engineering problems. In most computational software, gradient-based algorithms such as sequential quadratic programming (SQP), the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method, Levenberg-Marquardt, and Gauss-Newton are employed for optimization.
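To show how the gradient and the Hessian work together in an iteration, here is a minimal sketch of a plain Newton iteration on a 2-variable function. It is not an implementation of any of the algorithms named above, several of which approximate the Hessian rather than computing it exactly. The objective function, starting point, and tolerances are assumptions chosen for this example.

```python
import math

# Assumed objective: f(x, y) = exp(x + y) - (x + y) + (x - y)^2,
# which is strictly convex with its minimum at (0, 0).


def grad(x, y):
    e = math.exp(x + y)
    return (e - 1 + 2 * (x - y), e - 1 - 2 * (x - y))


def hess(x, y):
    e = math.exp(x + y)
    return ((e + 2, e - 2), (e - 2, e + 2))


def newton_minimize(x, y, tol=1e-10, max_iter=50):
    """Repeat x <- x - H^{-1} * gradient until the gradient is nearly zero."""
    for _ in range(max_iter):
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < tol:
            break
        (a, b), (c, d) = hess(x, y)
        det = a * d - b * c
        # Solve H * step = -gradient with Cramer's rule (fine for a 2 x 2 system).
        step_x = (-gx * d + gy * b) / det
        step_y = (-gy * a + gx * c) / det
        x, y = x + step_x, y + step_y
    return x, y


x_min, y_min = newton_minimize(1.0, 0.5)
print(x_min, y_min)
```

Because the Hessian of this objective is positive definite everywhere, each Newton step is a descent direction and the iteration converges to the minimizer from the chosen starting point. For non-convex objectives, practical solvers add safeguards such as line searches or damping.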

Cadence’s tools can assist you in solving optimization problems in highly complex engineering systems. Subscribe to our newsletter for the latest CFD updates or browse Cadence’s suite of CFD software, including Fidelity and Fidelity Pointwise, to learn more about how Cadence has the solution for you.
