NOTES ON STATISTICS, PROBABILITY and MATHEMATICS


LU Decomposition:


require(Matrix)
## Loading required package: Matrix
set.seed(0)

m = matrix(rnorm(9,6,6), nrow=3) 
(aTa = t(m) %*% m)
##          [,1]      [,2]      [,3]
## [1,] 396.1040 174.15240 106.31543
## [2,] 174.1524 268.44089  22.43549
## [3,] 106.3154  22.43549  53.67675

We need PSD matrix to eventually be able to do the Cholesky decomposition - Our “workhorse” matrix will be, from here on, aTa.

But first, the LU Decomposition will be:

(U = expand(lu(aTa))$U)
## 3 x 3 Matrix of class "dtrMatrix"
##      [,1]      [,2]      [,3]     
## [1,] 396.10397 174.15240 106.31543
## [2,]         . 191.87247 -24.30751
## [3,]         .         .  22.06197
(L = expand(lu(aTa))$L)
## 3 x 3 Matrix of class "dtrMatrix" (unitriangular)
##      [,1]       [,2]       [,3]      
## [1,]  1.0000000          .          .
## [2,]  0.4396633  1.0000000          .
## [3,]  0.2684028 -0.1266858  1.0000000
L%*%U
## 3 x 3 Matrix of class "dgeMatrix"
##          [,1]      [,2]      [,3]
## [1,] 396.1040 174.15240 106.31543
## [2,] 174.1524 268.44089  22.43549
## [3,] 106.3154  22.43549  53.67675

LDL Decomposition:


In LDL’ decomposition we separate the diagonal of \(U\) from \(U\). To make it work \(U\) will have to be divided by its diagonal:

(D = (diag(diag(U))))
##         [,1]     [,2]     [,3]
## [1,] 396.104   0.0000  0.00000
## [2,]   0.000 191.8725  0.00000
## [3,]   0.000   0.0000 22.06197
(Udiv = diag(1/diag(U)) %*% U)
## 3 x 3 Matrix of class "dgeMatrix"
##      [,1]      [,2]       [,3]
## [1,]    1 0.4396633  0.2684028
## [2,]    0 1.0000000 -0.1266858
## [3,]    0 0.0000000  1.0000000
L %*% (D %*% Udiv)
## 3 x 3 Matrix of class "dgeMatrix"
##          [,1]      [,2]      [,3]
## [1,] 396.1040 174.15240 106.31543
## [2,] 174.1524 268.44089  22.43549
## [3,] 106.3154  22.43549  53.67675
#compare to:
aTa
##          [,1]      [,2]      [,3]
## [1,] 396.1040 174.15240 106.31543
## [2,] 174.1524 268.44089  22.43549
## [3,] 106.3154  22.43549  53.67675

CHOLESKY Decomposition:


We start with the \({\bf LDU} = \begin{bmatrix} \text{lower triangular L}\\ \text{1s on diagonal} \end{bmatrix} \begin{bmatrix} \text{pivot matrix}\\ \text{D is diagonal} \end{bmatrix} \begin{bmatrix} \text{upper triangular U}\\ \text{1s on diagonal} \end{bmatrix}\)

Requirements: No row exchanges.

If \({\bf X}\) is symmetric, then \({\bf A = LDL^T}\).

Going further back, the \({\bf LU}\) is asymmetric in that \({\bf U}\) has the pivots on its diagonal, while \({\bf L}\) has \(1\)’s. To address this we can divide \(U\) by the pivots that will now be placed in a diagonal:

\[U=\begin{bmatrix}d_1&&&\\&d_2&&\\&&\ddots&\\&&&d_n \end{bmatrix} \begin{bmatrix}1&u_{12}/d_1 & u_{13}/d_1 & \cdots & u_{1n}/d_1\\ &1&u_{23}/d_2&\cdots&u_{2n}/d_2\\&&&\ddots\\&&&&1 \end{bmatrix}\]

To perform the Cholesky decomposition, the matrix has to be symmetric positive semidefinite, so that there are no negative values on the diagonal, because we will get square roots of the diagonal elements. All the pivots have to be positive.

So we take the square root:

\[\begin{bmatrix}\color{red}{\sqrt{d_1}}\sqrt{d_1}&&&\\&\color{red}{\sqrt{d_2}}\sqrt{d_2}&&\\&&\ddots&\\&&&\color{red}{\sqrt{d_n}} \sqrt{d_n}\end{bmatrix}\]

Now the red square roots will multiply the columns of \({\bf L}\), while the black square roots will multiply the rows of \({\bf U}\).

We’ll end up with a form \({\bf LL^T}\). It is in a way taking the square root of a matrix.


So to get the Cholesky we need to sqrt the diagonal, and multiply both the columns of \(L\) and the rows of Udiv:

(sqrtD = sqrt(D))
##          [,1]    [,2]     [,3]
## [1,] 19.90236  0.0000 0.000000
## [2,]  0.00000 13.8518 0.000000
## [3,]  0.00000  0.0000 4.697018
(Lmod = L %*% sqrtD)
## 3 x 3 Matrix of class "dgeMatrix"
##           [,1]      [,2]     [,3]
## [1,] 19.902361  0.000000 0.000000
## [2,]  8.750339 13.851804 0.000000
## [3,]  5.341850 -1.754826 4.697018
(Umod = sqrtD %*% Udiv)
## 3 x 3 Matrix of class "dgeMatrix"
##          [,1]      [,2]      [,3]
## [1,] 19.90236  8.750339  5.341850
## [2,]  0.00000 13.851804 -1.754826
## [3,]  0.00000  0.000000  4.697018

This is a good decomposition. It’s the Cholesky:

Lmod %*% Umod
## 3 x 3 Matrix of class "dgeMatrix"
##          [,1]      [,2]      [,3]
## [1,] 396.1040 174.15240 106.31543
## [2,] 174.1524 268.44089  22.43549
## [3,] 106.3154  22.43549  53.67675
# returns aTa:

aTa
##          [,1]      [,2]      [,3]
## [1,] 396.1040 174.15240 106.31543
## [2,] 174.1524 268.44089  22.43549
## [3,] 106.3154  22.43549  53.67675
(Chol = Umod)
## 3 x 3 Matrix of class "dgeMatrix"
##          [,1]      [,2]      [,3]
## [1,] 19.90236  8.750339  5.341850
## [2,]  0.00000 13.851804 -1.754826
## [3,]  0.00000  0.000000  4.697018
chol(aTa)
##          [,1]      [,2]      [,3]
## [1,] 19.90236  8.750339  5.341850
## [2,]  0.00000 13.851804 -1.754826
## [3,]  0.00000  0.000000  4.697018

Home Page