DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: What is the gradient of a scalar function defined on a subspace of square matrices?

Journal Article · Indian Journal of Pure and Applied Mathematics

We illustrate a technique for calculating the gradient of scalar functions defined on an arbitrary matrix subspace. It generalizes our earlier work, “What is the gradient of a scalar function of a symmetric matrix?” (Indian Journal of Pure and Applied Mathematics (2022), https://doi.org/10.1007/s13226-022-00313-x), in which we considered the special case of the subspace of symmetric matrices. Extant methods for calculating the gradient in such cases have an inherent flaw that leads to spurious results, which populate several publications as well as respected textbooks and handbooks on matrix calculus. Here, we examine these sources and results in the rigorous and concrete mathematical setting of a finite-dimensional inner-product space, and identify both the flaw and a remedy. We demonstrate two ways to calculate the derivative/gradient and second derivative of scalar functions of matrices defined over an arbitrary matrix subspace. The first method considers any (differentiable) extension of the function to the space of square matrices and projects its gradient onto the given subspace. The second method uses an ordered basis and computes each component of the gradient by evaluating the directional derivative. All the ideas presented are illustrated by non-trivial examples, namely the subspaces of 3 × 3 circulant and Toeplitz matrices, and we present the results of gradient descent with both the spurious and the correct gradients. Moreover, our bibliography makes it clear that a rigorous approach to matrix calculus is not common in practice; our presentation of matrix calculus in the language of inner-product spaces will be significant and meaningful for applied mathematicians, engineers and researchers working in interdisciplinary fields, helping them avoid the conceptual pitfalls that exist.
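The two methods named in the abstract can be sketched numerically for the subspace of 3 × 3 circulant matrices. This is a minimal illustration and not the authors' code. It rests on several assumptions that do not come from the record: the Frobenius inner product, a hypothetical test function `f(X) = tr(MX) + ½ tr(XᵀX)` with a random matrix `M`, an orthonormal basis built from powers of the cyclic shift matrix, and central finite differences standing in for the directional derivative.

```python
import numpy as np

n = 3

# Powers of the cyclic shift matrix span the n x n circulant matrices.
# Dividing by sqrt(n) makes them orthonormal under the Frobenius
# inner product <A, B> = tr(A^T B) (an assumption of this sketch).
shift = np.roll(np.eye(n), 1, axis=1)
basis = [np.linalg.matrix_power(shift, k) / np.sqrt(n) for k in range(n)]

rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))  # hypothetical data for the test function


def f(X):
    """Hypothetical scalar function, defined for every square matrix."""
    return np.trace(M @ X) + 0.5 * np.trace(X.T @ X)


def grad_extension(X):
    """Gradient of f on the full space of square matrices: M^T + X."""
    return M.T + X


def inner(A, B):
    """Frobenius inner product."""
    return np.sum(A * B)


def project(G):
    """Orthogonal projection of G onto the circulant subspace."""
    return sum(inner(G, E) * E for E in basis)


def grad_by_projection(X):
    """Method 1: project the gradient of the extension onto the subspace."""
    return project(grad_extension(X))


def grad_by_basis(X, h=1e-5):
    """Method 2: one directional derivative per (orthonormal) basis element,
    approximated here by central finite differences."""
    comps = [(f(X + h * E) - f(X - h * E)) / (2 * h) for E in basis]
    return sum(c * E for c, E in zip(comps, basis))


# A random circulant point at which to evaluate both gradients.
coeffs = rng.standard_normal(n)
X = sum(c * E for c, E in zip(coeffs, basis))

g1 = grad_by_projection(X)
g2 = grad_by_basis(X)
print(np.allclose(g1, g2, atol=1e-6))
```

Because the basis is orthonormal, the directional derivatives are the gradient's components directly. A basis that is not orthonormal would additionally require its Gram matrix to recover the gradient from those derivatives.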

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program
Grant/Contract Number:
89233218CNA000001
OSTI ID:
2426769
Report Number(s):
LA-UR--23-23471
Journal Information:
Journal Name: Indian Journal of Pure and Applied Mathematics; ISSN 0975-7465; ISSN 0019-5588
Country of Publication:
United States
Language:
English

References (29)

Maximum likelihood solution to factor analysis when some factors are completely specified · journal · June 1971
Structured Matrices and Their Application in Neural Networks: A Survey · journal · July 2023
What is the gradient of a scalar function of a symmetric matrix? · journal · August 2022
Tensor products and matrix differential calculus · journal · June 1985
On some pattern-reduction matrices which appear in statistics · journal · June 1985
About the concept of the matrix derivative · journal · November 1992
The matrix minimum principle · journal · November 1967
On the concept of matrix derivative · journal · October 2010
On the use of coordinate-free matrix calculus · journal · January 2015
Matrix differential calculus with applications in the multivariate linear model and its diagnostics · journal · March 2022
On Kronecker products, tensor products and matrix differential calculus · journal · November 2013
Optimization of feedback systems with constrained information flow · journal · January 1981
Some Applications of Matrix Derivatives in Multivariate Analysis · journal · June 1967
On Circulant Matrices · journal · March 2012
Finding patterned complex-valued matrix derivatives by using manifolds · conference · October 2008
Estimation of structured covariance matrices · journal · January 1982
Patterned complex-valued matrix derivatives · conference · July 2008
Derivative operations on matrices · journal · April 1970
On calculating gradient matrices · journal · August 1976
The gradient with respect to a symmetric matrix · journal · April 1977
On symmetric matrices and the matrix minimum principle · journal · December 1977
Downlink MIMO-RSMA With Successive Null-Space Precoding · journal · November 2022
Matrix Calculus Operations and Taylor Expansions · journal · April 1973
Symbolic Matrix Derivatives · journal · December 1948
Toeplitz and Circulant Matrices: A Review · journal · January 2005
Multivariate Maxima and Minima with Matrix Derivatives · journal · December 1969
Patterned matrix derivatives · journal · December 1988
Vec and vech operators for matrices, with some uses in jacobians and multivariate statistics · journal · January 1979
The Influence Function of Graphical Lasso Estimators · preprint · January 2022