PRACTICAL EXAMPLES AND REVIEW OF TENSORS:

From this question in Math.SE:

Tensor Contraction:

Original Question:

According to Wikipedia, if the basis vectors are fixed, the tensor product of the two tensors \(A\) and \(B\) is given by the Kronecker product:

\[A = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}\]

\[B = \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}\]

\[\small T = A \otimes B =\small \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \otimes \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \\=\begin{bmatrix} a_{1,1}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} & a_{1,2}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \\ a_{2,1}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} & a_{2,2}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_{1,1}b_{1,1} & a_{1,1}b_{1,2} & a_{1,2}b_{1,1} & a_{1,2}b_{1,2} \\ a_{1,1}b_{2,1} & a_{1,1}b_{2,2} & a_{1,2}b_{2,1} & a_{1,2}b_{2,2} \\ a_{2,1}b_{1,1} & a_{2,1}b_{1,2} & a_{2,2}b_{1,1} & a_{2,2}b_{1,2} \\ a_{2,1}b_{2,1} & a_{2,1}b_{2,2} & a_{2,2}b_{2,1} & a_{2,2}b_{2,2} \end{bmatrix} \]

I presume that \(T\) would have two upper and two lower indices, as in \(T^{ij}{}_{kl}\).

Now I should be able to contract \(\color{red} i\) and \(\color{red} k\), which is equivalent to taking the trace of the \(A^i{}_{k}\) matrix and multiplying it by \(B\), i.e. \(\hbox{Tr}(A)B\). This would correspond to \(T^{ij}{}_{il}\).

Given that \(A\) and \(B\) are each \(2 \times 2\), this contraction would have to reduce the last matrix above from its \(4 \times 4\) dimensions to a \(2 \times 2\) result.

How would this be implemented in a given numerical calculation if there is no possibility to recover the initial coefficients of \(A\) and \(B\)?

The entries involved appear in the final matrix already multiplied with entries of \(B\):

\[ \begin{bmatrix} \color{red}{a_{1,1}}b_{1,1} & \color{red}{a_{1,1}}b_{1,2} & a_{1,2}b_{1,1} & a_{1,2}b_{1,2} \\ \color{red}{a_{1,1}}b_{2,1} & \color{red}{a_{1,1}}b_{2,2} & a_{1,2}b_{2,1} & a_{1,2}b_{2,2} \\ a_{2,1}b_{1,1} & a_{2,1}b_{1,2} & \color{red}{a_{2,2}}b_{1,1} & \color{red}{a_{2,2}}b_{1,2} \\ a_{2,1}b_{2,1} & a_{2,1}b_{2,2} & \color{red}{a_{2,2}}b_{2,1} & \color{red}{a_{2,2}}b_{2,2} \end{bmatrix} \]


Answer:

The Wikipedia article is slightly confusing by omitting some brackets from the final tensor. It looks like a \(4\times4\) matrix, i.e., a rank \(2\) tensor with \(4^2=16\) entries, but actually it is a \(2\times2\times2\times2\) tensor of rank \(4\) having \(2^4=16\) entries.

\[\small T = A \otimes B = \begin{bmatrix} \begin{bmatrix}a_{{1,1}}b_{1,1} & a_{1,1}b_{1,2}\\ a_{1,1}b_{2,1} & a_{{1,1}}b_{2,2}\end{bmatrix} & \begin{bmatrix}a_{{1,2}}b_{1,1} & a_{1,2}b_{1,2} \\ a_{1,2}b_{2,1} & a_{{1,2}}b_{2,2} \end{bmatrix} \\ \begin{bmatrix}a_{{2,1}}b_{1,1} & a_{2,1}b_{1,2}\\ a_{2,1}b_{2,1} & a_{{2,1}}b_{2,2}\end{bmatrix} & \begin{bmatrix}a_{{2,2}}b_{1,1} & a_{2,2}b_{1,2}\\ a_{2,2}b_{2,1} & a_{{2,2}}b_{2,2}\end{bmatrix} \end{bmatrix} \]

Contraction happens by taking the trace of four \(2\times2\) “submatrices” separately:

\[T^{\color{red}{i}j}{}_{\color{red}{k}l} = \begin{bmatrix} \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{1,1} & a_{1,2}b_{1,1}\\ a_{2,1}b_{1,1} & a_{\color{red}{2,2}}b_{1,1}\end{bmatrix} & \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{1,2} & a_{1,2}b_{1,2} \\ a_{2,1}b_{1,2} & a_{\color{red}{2,2}}b_{1,2} \end{bmatrix} \\ \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{2,1} & a_{1,2}b_{2,1}\\ a_{2,1}b_{2,1} & a_{\color{red}{2,2}}b_{2,1}\end{bmatrix} & \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{2,2} & a_{1,2}b_{2,2}\\ a_{2,1}b_{2,2} & a_{\color{red}{2,2}}b_{2,2}\end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_{\color{red}{1,1}}b_{1,1} + a_{\color{red}{2,2}}b_{1,1} & a_{\color{red}{1,1}}b_{1,2} + a_{\color{red}{2,2}}b_{1,2} \\ a_{\color{red}{1,1}}b_{2,1} + a_{\color{red}{2,2}}b_{2,1} & a_{\color{red}{1,1}}b_{2,2} + a_{\color{red}{2,2}}b_{2,2} \end{bmatrix}\]
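A minimal numeric sketch in R of how the contraction can be carried out with nothing but the \(4\times4\) Kronecker matrix (the layout assumption is the one used above: the entry \(a_{i,k}b_{j,l}\) sits in row \((i-1)\cdot 2+j\) and column \((k-1)\cdot 2+l\); the test values are arbitrary):

A = matrix(c(1, 2, -1, 3), 2, 2)   # arbitrary test values (they happen to match the Y and Z used further below)
B = matrix(c(-1, 1, 0, 2), 2, 2)
K = kronecker(A, B)                # the 4x4 matrix of T^{ij}_{kl} = a_{i,k} b_{j,l}
C = matrix(0, 2, 2)                # C[j, l] = sum over i of K[(i-1)*2 + j, (i-1)*2 + l]
for (j in 1:2) for (l in 1:2) for (i in 1:2)
  C[j, l] = C[j, l] + K[(i - 1) * 2 + j, (i - 1) * 2 + l]
all.equal(C, sum(diag(A)) * B)     # the i-k contraction equals Tr(A) B
## [1] TRUE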


Comments:

The matrix in the answer is a different block matrix from the starting one, i.e. \(A \otimes B\):

\[\begin{bmatrix} \begin{bmatrix}a_{\color{red}{1,1}}b_{1,1} & a_{1,2}b_{1,1}\\ a_{2,1}b_{1,1} & a_{\color{red}{2,2}}b_{1,1}\end{bmatrix} & \begin{bmatrix}a_{\color{red}{1,1}}b_{1,2} & a_{1,2}b_{1,2} \\ a_{2,1}b_{1,2} & a_{\color{red}{2,2}}b_{1,2} \end{bmatrix} \\ \begin{bmatrix}a_{\color{red}{1,1}}b_{2,1} & a_{1,2}b_{2,1}\\ a_{2,1}b_{2,1} & a_{\color{red}{2,2}}b_{2,1}\end{bmatrix} & \begin{bmatrix}a_{\color{red}{1,1}}b_{2,2} & a_{1,2}b_{2,2}\\ a_{2,1}b_{2,2} & a_{\color{red}{2,2}}b_{2,2}\end{bmatrix} \end{bmatrix} =\begin{bmatrix} b_{1,1} \begin{bmatrix}a_{\color{red}{1,1}} & a_{1,2}\\ a_{2,1} & a_{\color{red}{2,2}}\end{bmatrix} & b_{1,2}\begin{bmatrix}a_{\color{red}{1,1}} & a_{1,2} \\ a_{2,1} & a_{\color{red}{2,2}} \end{bmatrix} \\ b_{2,1}\begin{bmatrix}a_{\color{red}{1,1}} & a_{1,2}\\ a_{2,1} & a_{\color{red}{2,2}}\end{bmatrix} & b_{2,2}\begin{bmatrix}a_{\color{red}{1,1}} & a_{1,2}\\ a_{2,1} & a_{\color{red}{2,2}}\end{bmatrix} \end{bmatrix} = \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \otimes \small \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \] It actually reverses the order, corresponding to \(B \otimes A\). What would happen if we used the original matrix? We’d be contracting \(\color{red}j\) and \(\color{red}l\):

\[T^{i\color{red}j}{}_{k\color{red}l} = \begin{bmatrix} \hbox{Tr}\begin{bmatrix}b_{\color{red}{1,1}}a_{1,1} & b_{1,2}a_{1,1} \\ b_{2,1}a_{1,1} & b_{\color{red}{2,2}}a_{1,1} \end{bmatrix} & \hbox{Tr} \begin{bmatrix}b_{\color{red}{1,1}}a_{1,2} & b_{1,2}a_{1,2} \\ b_{2,1}a_{1,2} & b_{\color{red}{2,2}}a_{1,2} \end{bmatrix} \\ \hbox{Tr}\begin{bmatrix}b_{\color{red}{1,1}}a_{2,1} & b_{1,2}a_{2,1}\\ b_{2,1}a_{2,1} & b_{\color{red}{2,2}}a_{2,1}\end{bmatrix} & \hbox{Tr}\begin{bmatrix}b_{\color{red}{1,1}}a_{2,2} & b_{1,2}a_{2,2}\\ b_{2,1}a_{2,2} & b_{\color{red}{2,2}}a_{2,2}\end{bmatrix} \end{bmatrix} \leftarrow \begin{bmatrix} a_{1,1}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} & a_{1,2}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \\ a_{2,1}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} & a_{2,2}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix} \end{bmatrix}\]

What is the system behind rearranging the block matrices?

The system behind rearranging the block matrices is that the grouping (in order to write the tensor as a 4×4 matrix) involves an arbitrary choice of two indices, and the contraction may be over a different pair of indices.


What would be the mechanics if instead I wanted to contract \(j\) and \(l\), or \(i\) and \(l\) in \(T^{ij}{}_{kl}\)?

You would still take traces of 4 submatrices at a time, only different ones.

I have to qualify my above comment that it is always traces of submatrices. That holds if you contract over the two indices of \(A\) or over the two indices of \(B\), but if you contract over, say, \(\color{red}i\) and \(\color{orange}l\) - \(T^{\color{red}{i}j}{}_{k\color{orange}{l}}\) - then you would have a matrix product.


Interesting how all the info from the entries not involved in the contraction is literally thrown out. Tempting to say “no trace of it”. So all that matters is the correct arrangement of the indices involved, which, outside the constraints of a matrix layout, would not be relevant.

Any contraction throws out information from all components where the two indices involved in the contraction have different values. The ordinary trace of a square matrix is just the simplest example of that.


What about this answer?

Question:

So I’m having trouble computing tensor contractions with “actual” numbers from the matrix representations of the tensors. I have only seen abstract theoretical examples on the internet, so I’m asking for a bit of help on how to find the contractions given the expressions of the tensors and the pair of indices over which we carry out the contraction. I’ll show a simple example and I hope you can help me.

Let’s suppose we have two tensors of the type (1,1) (that means 1 contravariant, 1 covariant). They will be called Y and Z and knowing their coordinate forms, we can represent them through matrices in this way:

Y = \[\begin{pmatrix}1&-1\\2&3\end{pmatrix}\]

Z = \[\begin{pmatrix}-1&0\\1&2\end{pmatrix}\]

Now, we could compute the Kronecker product Y x Z in order to get a type (2,2) tensor (2 contravariant, 2 covariant). The result would be:

\[\begin{pmatrix}-1&0&1&0\\1&2&-1&-2\\-2&0&-3&0\\2&4&3&6\end{pmatrix}\]

So, how could we get the contractions of this (2,2) tensor (with actual numbers instead of the parameters that you see in other examples)?

Note that here we have 2 contravariant indices and 2 covariant indices so there are 4 possible contractions depending on the pair of indices that you choose, I guess. As far as I know, all possible contractions with the selection of pairs of indices are the following:

1. 1st covariant index and 1st contravariant index.

2. 1st covariant index and 2nd contravariant index.

3. 2nd covariant index and 1st contravariant index.

4. 2nd covariant index and 2nd contravariant index.

So what matrices would be the result of those contractions? Just in case I didn’t state it clearly before, these matrices are the representations of tensors whose coordinate forms we know (explicit forms in a specific basis, and that info is known). So we want to perform contractions on the last 4x4 matrix that I showed (which represents a (2,2) tensor), and there are 4 different possibilities. My question has to do with the fact that I can’t find a way to do this with actual numbers, and I can’t figure out the result for each possible contraction (I also don’t know what the differences would be in the calculation of a contraction with one pair of indices rather than another).

I would really appreciate it if someone could find a specific result for the contractions that I proposed (I guess the calculation is easy but I just don’t know how it could be done). Thank you very much for reading.

Answer:

Representing tensors using matrix notation is often confusing, but let’s assume that

\(Y = \begin{pmatrix}y^1_1&y^1_2\\y^2_1&y^2_2\end{pmatrix}\)

and similarly for Z. If \(W = Y \otimes Z\) then the components of \(W\) are

\(w^{ik}_{jl} = y^i_j z^k_l\)

You have represented W as a 4x4 matrix, but it would be more accurate to represent it as a 2x2 matrix, each of whose entries is another 2x2 matrix.

Anyway, the four possible contractions of W are (summation over the repeated index is implied):

(1): \(w^{ik}_{il} = y^i_i z^k_l\)

(2): \(w^{ik}_{jk} = y^i_j z^k_k\)

(3): \(w^{ik}_{ji} = y^i_j z^k_i\)

(4): \(w^{ij}_{jl} = y^i_j z^j_l\)

In terms of matrix operations:

(1) is the component representation of \(Tr(Y)Z\)

(2) is the component representation of \(Tr(Z)Y\)

(3) is the component representation of \(ZY\)

(4) is the component representation of \(YZ\)
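A short R sketch of these four contractions for the \(Y\) and \(Z\) of the question (the index order chosen for the array, \(W[i,k,j,l]=y^i_j z^k_l\), is an assumption made here for convenience):

Y = matrix(c(1, 2, -1, 3), 2, 2)    # Y from the question
Z = matrix(c(-1, 1, 0, 2), 2, 2)    # Z from the question
W = array(0, dim = c(2, 2, 2, 2))   # W[i, k, j, l] = y^i_j * z^k_l
for (i in 1:2) for (k in 1:2) for (j in 1:2) for (l in 1:2)
  W[i, k, j, l] = Y[i, j] * Z[k, l]
c1 = W[1, , 1, ] + W[2, , 2, ]      # (1) set j = i and sum
c2 = W[, 1, , 1] + W[, 2, , 2]      # (2) set l = k and sum
c3 = W[1, , , 1] + W[2, , , 2]      # (3) set l = i and sum (rows k, columns j)
c4 = W[, 1, 1, ] + W[, 2, 2, ]      # (4) set j = k and sum (rows i, columns l)
all.equal(c1, sum(diag(Y)) * Z)     # TRUE
all.equal(c2, sum(diag(Z)) * Y)     # TRUE
all.equal(c3, Z %*% Y)              # TRUE
all.equal(c4, Y %*% Z)              # TRUE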

The answer that you refer to remains correct. It only appears that we take 4 different traces; if you apply the distributive law, we are really just taking the trace of \(A\), multiplied by the four components of \(B\) separately. But I wrote it that way because you asked for a computation that did not rely on separate knowledge of \(A\) or \(B\).

Sounds correct:

The matrix of traces in the answer is

\[\begin{bmatrix} a_{\color{red}{1,1}}b_{1,1} + a_{\color{red}{2,2}}b_{1,1} & a_{\color{red}{1,1}}b_{1,2} + a_{\color{red}{2,2}}b_{1,2} \\ a_{\color{red}{1,1}}b_{2,1} + a_{\color{red}{2,2}}b_{2,1} & a_{\color{red}{1,1}}b_{2,2} + a_{\color{red}{2,2}}b_{2,2} \end{bmatrix} = (a_{\color{red}{1,1}} + a_{\color{red}{2,2}}) \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \hbox{Tr}(A) B\]

For \(\small Y = \begin{pmatrix}1&-1\\2&3\end{pmatrix}\) and \(Z = \small \begin{pmatrix}-1&0\\1&2\end{pmatrix}\),

\[\begin{align} a_{11}b_{11} + a_{22}b_{11} &= -1 + (-3) = -4\\ a_{11}b_{12} + a_{22}b_{12} &= 0 + 0 = 0\\ a_{11}b_{21} + a_{22}b_{21} &=1 + 3 = 4\\ a_{11}b_{22} + a_{22}b_{22} &= 2+ 6 = 8 \end{align}\]

are the traces in the answer when applied to \(Y\) and \(Z\), and correspond to \(\text{Tr}(Y)Z=4 \begin{pmatrix}-1&0\\1&2\end{pmatrix}\)


Question:

If I want to contract \(\color{red}i\) and \(\color{red} l\), as in \(T^{\color{red}{i}j}{}_{k\color{red}l}\), what would be the mechanics (algorithm) if all I have is the final Kronecker block matrix above (no \(A\) or \(B\) explicitly given)?

I am not sure that I understand what contraction would mean in this setup, but since I know it amounts to matrix multiplication, and that contraction means equating two indices and summing, I think it could be calculated in matrix form as follows (not sure it’s correct, please correct as needed):

\[\begin{matrix} j = 1, k = 1: & T^{i1}{}_{1i} = a_{1,1}b_{1,1} + a_{2,1}b_{1,2}\\ j = 2, k = 1: & T^{i2}{}_{1i} = a_{1,1}b_{2,1} + a_{2,1}b_{2,2} \\ j = 1, k = 2: & T^{i1}{}_{2i} = a_{1,2}b_{1,1} + a_{2,2}b_{1,2} \\ j = 2, k = 2: & T^{i2}{}_{2i} = a_{1,2}b_{2,1} + a_{2,2}b_{2,2} \end{matrix} \]

or

\[ \begin{array}{l||r|r} & i = l = 1 & i = l = 2 \\ \hline & a^\color{red}{1}{}_k b^{j}{}_\color{red}{1} & a^\color{red}{2}{}_k b^{j}{}_\color{red}{2} \\ \hline a_\color{orange}{1} b^{\color{orange}{1}} & a_{\color{red}{1}\color{orange}{1}} b_{\color{orange}{1}\color{red}{1}} & a_{\color{red}{2}\color{orange}{1}} b_{\color{orange}{1}\color{red}{2}} \\ a_\color{orange}{1} b^{\color{orange}{2}} & a_{\color{red}{1}\color{orange}{1}} b_{\color{orange}{2}\color{red}{1}} & a_{\color{red}{2}\color{orange}{1}} b_{\color{orange}{2}\color{red}{2}} \\ a_\color{orange}{2} b^{\color{orange}{1}} & a_{\color{red}{1}\color{orange}{2}} b_{\color{orange}{1}\color{red}{1}} & a_{\color{red}{2}\color{orange}{2}} b_{\color{orange}{1}\color{red}{2}} \\ a_\color{orange}{2} b^{\color{orange}{2}} & a_{\color{red}{1}\color{orange}{2}} b_{\color{orange}{2}\color{red}{1}} & a_{\color{red}{2}\color{orange}{2}} b_{\color{orange}{2}\color{red}{2}} \\ \end{array} \]

which could be arranged in a matrix multiplication as:

\[Y^{j}{}_{k}=\begin{bmatrix} a_{11} & a_{21}\\ a_{12} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{21}\\ b_{12} & b_{22} \end{bmatrix} =\begin{bmatrix} a_{\color{red}1,1}b_{1,\color{red}1} + a_{\color{red}2,1}b_{1,\color{red}2} & a_{\color{red}1,1}b_{2,\color{red}1} + a_{\color{red}2,1}b_{2,\color{red}2} \\ a_{\color{red}1,2}b_{1,\color{red}1} + a_{\color{red}2,2}b_{1,\color{red}2} & a_{\color{red}1,2}b_{2,\color{red}1} + a_{\color{red}2,2}b_{2,\color{red}2} \end{bmatrix}\]

again reducing the rank from \(4\) to \(2\).

Compare this to the permutations in the indices when contracting \(i\) and \(k\) in

\(T^{\color{red}{i}j}{}_{\color{red}{k}l}\):

\[\begin{array}{l||r|r} &i = k = 1 & i = k = 2\\ \hline &a^{\color{red}{1}}{}_{\color{red}1} b^{j}{}_{l} & a^{\color{red}{2}}{}_{\color{red}2} b^{j}{}_{l}\\ \hline b^\color{orange}{1}{}_{\color{orange}{1}} & a_{\color{red}{11}} b_{\color{orange}{11}} & a_{\color{red}{22}} b_{\color{orange}{11}}\\ b^\color{orange}{1}{}_{\color{orange}{2}}&a_{\color{red}{11}} b_{\color{orange}{12}} & a_{\color{red}{22}} b_{\color{orange}{12}}\\ b^\color{orange}{2}{}_{\color{orange}{1}}&a_{\color{red}{11}} b_{\color{orange}{21}} & a_{\color{red}{22}} b_{\color{orange}{21}}\\ b^\color{orange}{2}{}_{\color{orange}{2}}&a_{\color{red}{11}} b_{\color{orange}{22}} & a_{\color{red}{22}} b_{\color{orange}{22}} \end{array} \]

which can be arranged in matrix form as

\[ \begin{bmatrix} \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{1,1} & a_{1,2}b_{1,1}\\ a_{2,1}b_{1,1} & a_{\color{red}{2,2}}b_{1,1}\end{bmatrix} & \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{1,2} & a_{1,2}b_{1,2} \\ a_{2,1}b_{1,2} & a_{\color{red}{2,2}}b_{1,2} \end{bmatrix} \\ \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{2,1} & a_{1,2}b_{2,1}\\ a_{2,1}b_{2,1} & a_{\color{red}{2,2}}b_{2,1}\end{bmatrix} & \hbox{Tr}\begin{bmatrix}a_{\color{red}{1,1}}b_{2,2} & a_{1,2}b_{2,2}\\ a_{2,1}b_{2,2} & a_{\color{red}{2,2}}b_{2,2}\end{bmatrix} \end{bmatrix}=\begin{bmatrix} a_{\color{red}{1,1}}b_{1,1} + a_{\color{red}{2,2}}b_{1,1} & a_{\color{red}{1,1}}b_{1,2} + a_{\color{red}{2,2}}b_{1,2} \\ a_{\color{red}{1,1}}b_{2,1} + a_{\color{red}{2,2}}b_{2,1} & a_{\color{red}{1,1}}b_{2,2} + a_{\color{red}{2,2}}b_{2,2} \end{bmatrix}\]


Answer:

First a remark on the side: the distinction between upper and lower indices does not mean much in this context where we are not dealing with a metric; only the order of the indices matters.

The result of contracting over the first and fourth indices can be written as a square \(2\times2\) matrix composed of four traces of \(2\times2\) submatrices like in the previous question; for example, the \((j=1,k=1)\) component of the contraction is the trace of the submatrix consisting of elements in rows \(1\) and \(3\) and in columns \(1\) and \(2\) of the block matrix:

\[\small T = A \otimes B = \begin{bmatrix} \begin{bmatrix}\color{red}{a_{{1,1}}b_{1,1}} & \color{red}{a_{1,1}b_{1,2}}\\ a_{1,1}b_{2,1} & a_{{1,1}}b_{2,2}\end{bmatrix} & \begin{bmatrix}a_{{1,2}}b_{1,1} & a_{1,2}b_{1,2} \\ a_{1,2}b_{2,1} & a_{{1,2}}b_{2,2} \end{bmatrix} \\ \begin{bmatrix}\color{red}{a_{{2,1}}b_{1,1}} & \color{red}{a_{2,1}b_{1,2}}\\ a_{2,1}b_{2,1} & a_{{2,1}}b_{2,2}\end{bmatrix} & \begin{bmatrix}a_{{2,2}}b_{1,1} & a_{2,2}b_{1,2}\\ a_{2,2}b_{2,1} & a_{{2,2}}b_{2,2}\end{bmatrix} \end{bmatrix} \]

\[a_{1,1}b_{1,1}+a_{2,1}b_{1,2}=\hbox{Tr} \begin{bmatrix} a_{1,1}b_{1,1} & a_{1,1}b_{1,2} \\ a_{2,1}b_{1,1} & a_{2,1}b_{1,2} \\ \end{bmatrix}\]

For these two indices in particular, you correctly note that it can be interpreted as a matrix product but it may appear awkward that you have to write the transposes of the original matrices, and the result is transposed as well. These transpositions can be avoided if you switch the order of the matrices:

\[Y^{jk}= \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \\ \end{bmatrix} \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \\ \end{bmatrix}= \begin{bmatrix} a_{1,1}b_{1,1}+a_{2,1}b_{1,2} & a_{1,2}b_{1,1}+a_{2,2}b_{1,2} \\ a_{1,1}b_{2,1}+a_{2,1}b_{2,2} & a_{1,2}b_{2,1}+a_{2,2}b_{2,2} \end{bmatrix} \]

If you would contract over the middle indices \(j\) and \(k\) instead of \(i\) and \(l\) then the result corresponds to the usual definition of a matrix product without changing the order.
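A numeric sketch of those two statements in R, using the same component convention as above, \(T^{ij}{}_{kl}=a^i{}_k\,b^j{}_l\) (the test matrices are arbitrary):

A = matrix(c(1, 2, -1, 3), 2, 2); B = matrix(c(-1, 1, 0, 2), 2, 2)
T4 = array(0, dim = c(2, 2, 2, 2))          # T4[i, j, k, l] = a^i_k * b^j_l
for (i in 1:2) for (j in 1:2) for (k in 1:2) for (l in 1:2)
  T4[i, j, k, l] = A[i, k] * B[j, l]
il = T4[1, , , 1] + T4[2, , , 2]            # contract i with l: result indexed [j, k]
jk = T4[, 1, 1, ] + T4[, 2, 2, ]            # contract j with k: result indexed [i, l]
all.equal(il, B %*% A)                      # TRUE: the i-l contraction is B A (order switched)
all.equal(jk, A %*% B)                      # TRUE: the j-k contraction is the usual product A B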


Comments:

Frankly, I still don’t know if your convenient re-arrangement of the block matrices in the prior answer was completely ad hoc to accommodate my request of keeping a matrix / basic lin alg solution, or followed a more systematic rearrangement (I’d be very interested in it). But in any case, I applied the first principle of ad hoc reshuffling to make it work - it is much more elegant now with the switching of the order in your answer.

There is always some arbitrariness in representing a rank 4 tensor as a two-dimensional row/column arrangement. Two of the indices will indicate rows and columns within 2×2 submatrices, and the two other indices will index the submatrices themselves in the larger arrangement. If we know in advance that we are going to contract over 2 of the 4 indices, then it helps to choose those indices to represent the dimensions of the inner 2×2 matrices (see the short R sketch after these comments).

Right! The tidy packaging is also constraining: the order of rows and columns functions as a proxy for the basis vectors, whereas in a less constraining notation, such as \(Y^{jk}e_j\otimes e_k\), the basis would be carried explicitly as part of the tensor.

At the end of this couple of questions / answers, it seems that there is no difference in the mechanics between contracting a tensor and tensor multiplication (for instance, a tensor \(\otimes\) “eating” a vector).

Another final reflection (slash prompt for you to comment upon) is that the reason both this contraction and the one in the prior question boiled down to a sum (trace) is that the pair-wise multiplication had already occurred in the Kronecker product \(\otimes\).

I can only say that I agree with your last remarks. Matrix multiplication can be viewed very naturally as a contraction of a tensor product.
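A small R sketch of the “choose which indices run inside the blocks” idea from these comments, again assuming \(T^{ij}{}_{kl}=a^i{}_k\,b^j{}_l\); aperm() permutes the axes before flattening to a \(4\times4\) block matrix (the block layout below is R’s column-major one, not the layout used in the displays above):

A = matrix(c(1, 2, -1, 3), 2, 2); B = matrix(c(-1, 1, 0, 2), 2, 2)
T4 = array(0, dim = c(2, 2, 2, 2))               # T4[i, j, k, l] = a^i_k * b^j_l
for (i in 1:2) for (j in 1:2) for (k in 1:2) for (l in 1:2)
  T4[i, j, k, l] = A[i, k] * B[j, l]
block_traces = function(M)                       # trace of each 2x2 block of a 4x4 matrix
  sapply(1:2, function(cb) sapply(1:2, function(rb)
    sum(diag(M[(rb - 1) * 2 + 1:2, (cb - 1) * 2 + 1:2]))))
M1 = matrix(T4, 4, 4)                            # inner indices (i, k); blocks labelled by (j, l)
all.equal(block_traces(M1), sum(diag(A)) * B)    # TRUE: block traces give the i-k contraction, Tr(A) B
M2 = matrix(aperm(T4, c(1, 2, 4, 3)), 4, 4)      # permute axes so the inner indices are (i, l); blocks (j, k)
all.equal(block_traces(M2), B %*% A)             # TRUE: block traces now give the i-l contraction, B A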


Toy example calculating the Ricci tensor as a contraction of the Riemann curvature tensor:

Riemann Curvature Tensor: 2-dimensional space with the following non-zero components of the Riemann curvature tensor \(R^\rho_{\sigma\mu\nu}\): \(R^1_{212} = 1\), \(R^2_{121} = -1\)

Contraction to Ricci Tensor: The Ricci tensor \(R_{\sigma\nu}\) is obtained by contracting the first and third indices of the Riemann tensor: \(R_{\sigma\nu} = R^\rho_{\sigma\rho\nu}\). For this example, we need to sum over the repeated index \(\rho\).

Calculate Ricci Tensor Components:

\[ R_{11} = R^1_{111} + R^2_{121}=0-1=-1\]
\[ R_{12} = R^1_{112} + R^2_{122}=0\]
\[R_{21} = R^1_{211} + R^2_{221}=0\]
\[R_{22} = R^1_{212} + R^2_{222} =1\]

The Ricci tensor \(R_{\sigma\nu}\) is therefore:

\[R_{\sigma\nu} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}\]
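A quick numerical check of this toy contraction in R, storing the components as a \(2\times2\times2\times2\) array and summing over the repeated index \(\rho\):

Riem = array(0, dim = c(2, 2, 2, 2))   # Riem[rho, sigma, mu, nu] = R^rho_{sigma mu nu}
Riem[1, 2, 1, 2] = 1                   # R^1_{212} = 1
Riem[2, 1, 2, 1] = -1                  # R^2_{121} = -1
Ric = matrix(0, 2, 2)                  # R_{sigma nu} = sum over rho of R^rho_{sigma rho nu}
for (s in 1:2) for (n in 1:2)
  Ric[s, n] = Riem[1, s, 1, n] + Riem[2, s, 2, n]
Ric
##      [,1] [,2]
## [1,]   -1    0
## [2,]    0    1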


From this question:

If we have two covectors in \(V^*\):

\[\beta=\begin{bmatrix}1 &2 &3 \end{bmatrix}\] and \[\gamma=\begin{bmatrix}2 &4 &6 \end{bmatrix}\]

the \((2,0)\)-tensor \(\beta\otimes \gamma\) is the outer product:

\[\beta\otimes_o \gamma=\begin{bmatrix}2\,e^1\otimes e^1&4\,e^1\otimes e^2&6\,e^1\otimes e^3\\4\,e^2\otimes e^1&8\,e^2\otimes e^2&12\,e^2\otimes e^3\\6\,e^3\otimes e^1&12\,e^3\otimes e^2&18\,e^3\otimes e^3\end{bmatrix}\]

Now, if we apply this tensor to the vectors

\[v=\begin{bmatrix}1\\-1\\5\end{bmatrix}, \; w = \begin{bmatrix}2\\0\\3\end{bmatrix}\]

\[\begin{align} (\beta \otimes \gamma)[v,w]&=\\[2ex] & 2 \times 1 \times 2 \quad+\quad 4 \times 1 \times 0 \quad +\quad 6 \times 1 \times 3 \\ +\;&4 \times -1 \times 2 \quad + \quad 8 \times -1 \times 0 \quad + \quad 12 \times -1 \times 3 \\ +\;&6 \times 5 \times 2 \quad + \quad 12 \times 5 \times 0 \quad + \quad 18 \times 5 \times 3 \\[2ex] &= 308\end{align}\]

Is this correct?

Yes, this is correct. The answer should be \(\left<\beta,v\right> \cdot \left<\gamma,w\right>\), where \(\langle\,,\,\rangle\) is the usual inner product:

\[\left<\beta,v\right> \cdot \left<\gamma,w\right> = 14 \times 22 = 308.\]

v = c(1,-1,5); w = c(2,0,3); beta = 1:3; gamma = c(2,4,6); beta %*% v * gamma %*% w
##      [,1]
## [1,]  308
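Equivalently, the same number comes out of sandwiching the outer-product matrix between \(v\) and \(w\) (a small check with the same vectors):

v = c(1,-1,5); w = c(2,0,3); beta = 1:3; gamma = c(2,4,6)
M = outer(beta, gamma)          # the 3x3 coefficient matrix of beta ⊗ gamma shown above
t(v) %*% M %*% w                # v^T M w = <beta, v> <gamma, w>
##      [,1]
## [1,]  308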

From this source:



Applied to the case in this question, the change-of-basis matrix is \(\small\begin{bmatrix}3&4&-1\\0&3&7\\1&3&0.5\end{bmatrix}\), and its inverse (rounded to one decimal) is \(\small\begin{bmatrix}0.7&0.2&-1.1\\-0.3&-0.1&0.8\\0.1&0.2&-0.3\end{bmatrix}\). The vectors \(v\) and \(w\) in the new coordinate system are

\(v =\small\begin{bmatrix}0.7&0.2&-1.1\\-0.3&-0.1&0.8\\0.1&0.2&-0.3\end{bmatrix}\begin{bmatrix}1\\2\\3\end{bmatrix} =\begin{bmatrix}-2.3\\1.9\\-0.5\end{bmatrix}\) and \(w=\small\begin{bmatrix}0.7&0.2&-1.1\\-0.3&-0.1&0.8\\0.1&0.2&-0.3\end{bmatrix}\begin{bmatrix}1\\0\\0\end{bmatrix}=\begin{bmatrix}0.7\\-0.3\\0.1\end{bmatrix}\) (the coordinates are computed with the exact inverse and then rounded to one decimal).

Therefore,

\[\begin{align} \large v\otimes w &=\left(-2.3\,\tilde x + 1.9\,\tilde y -0.5\, \tilde z\right)\otimes \left(0.7\,\tilde x -0.3\,\tilde y + 0.1\,\tilde z\right)\\[2ex] &=-1.61\;\tilde x\otimes \tilde x + 0.69\;\tilde x\otimes \tilde y -0.23 \;\tilde x\otimes \tilde z + 1.33\;\tilde y\otimes \tilde x -0.57\;\tilde y\otimes \tilde y+ 0.19\;\tilde y\otimes \tilde z -0.35\;\tilde z\otimes \tilde x +0.15 \;\tilde z\otimes \tilde y-0.05\;\tilde z\otimes \tilde z\end{align}\]

So what’s the point?

Starting off by defining the tensor product of two vector spaces (\(V\otimes W\)) with the same bases, we end up calculating the outer product of two vectors:

\[\large v\otimes_o w=\small \begin{bmatrix}-2.3\\1.9\\-0.5\end{bmatrix}\begin{bmatrix}0.7&-0.3&0.1\end{bmatrix}=\begin{bmatrix}-1.61&0.69&-0.23\\1.33&-0.57&0.19\\-0.35&0.15&-0.05\end{bmatrix}\]
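The same computation in R (P is just a name used here for the change-of-basis matrix above; the coordinates are rounded to one decimal before the outer product, to match the matrices shown):

P = matrix(c(3, 0, 1, 4, 3, 3, -1, 7, 0.5), 3, 3)   # change-of-basis matrix, filled column by column
v = round(drop(solve(P) %*% c(1, 2, 3)), 1)         # -2.3  1.9 -0.5
w = round(drop(solve(P) %*% c(1, 0, 0)), 1)         #  0.7 -0.3  0.1
outer(v, w)                                         # the matrix of v ⊗ w in the new basis
##       [,1]  [,2]  [,3]
## [1,] -1.61  0.69 -0.23
## [2,]  1.33 -0.57  0.19
## [3,] -0.35  0.15 -0.05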

This connects this post to this more general question.


From the answer to this question about Spivak’s definition of tensor product as:

\[(S \otimes T)(v_1,...v_n,v_{n+1},...,v_{n+m})= S(v_1,...v_n) * T(v_{n+1},...,v_{n+m})\]

Let’s first set some terminology.

Let \(V\) be an \(n\)-dimensional real vector space, and let \(V^*\) denote its dual space. We let \(V^k = V \times \cdots \times V\) (\(k\) times).

  • A tensor of type \((r,s)\) on \(V\) is a multilinear map \(T\colon V^r \times (V^*)^s \to \mathbb{R}\).

  • A covariant \(k\)-tensor on \(V\) is a multilinear map \(T\colon V^k \to \mathbb{R}\).

In other words, a covariant \(k\)-tensor is a tensor of type \((k,0)\). This is what Spivak refers to as simply a “\(k\)-tensor.”

  • A contravariant \(k\)-tensor on \(V\) is a multilinear map \(T\colon (V^*)^k\to \mathbb{R}\).

In other words, a contravariant \(k\)-tensor is a tensor of type \((0,k)\).

  • We let \(T^r_s(V)\) denote the vector space of tensors of type \((r,s)\). So, in particular,

\[\begin{align*} T^k(V) := T^k_0(V) & = \{\text{covariant $k$-tensors}\} \\ T_k(V) := T^0_k(V) & = \{\text{contravariant $k$-tensors}\}. \end{align*}\] Two important special cases are: \[\begin{align*} T^1(V) & = \{\text{covariant $1$-tensors}\} = V^* \\ T_1(V) & = \{\text{contravariant $1$-tensors}\} = V^{**} \cong V. \end{align*}\] This last line means that we can regard vectors \(v \in V\) as contravariant 1-tensors. That is, every vector \(v \in V\) can be regarded as a linear functional \(V^* \to \mathbb{R}\) via \[v(\omega) := \omega(v),\] where \(\omega \in V^*\).

The rank of an \((r,s)\)-tensor is defined to be \(r+s\).

In particular, vectors (contravariant 1-tensors) and dual vectors (covariant 1-tensors) have rank 1.


If \(S \in T^{r_1}_{s_1}(V)\) is an \((r_1,s_1)\)-tensor, and \(T \in T^{r_2}_{s_2}(V)\) is an \((r_2,s_2)\)-tensor, we can define their tensor product \(S \otimes T \in T^{r_1 + r_2}_{s_1 + s_2}(V)\) by

\[(S\otimes T)(v_1, \ldots, v_{r_1 + r_2}, \omega_1, \ldots, \omega_{s_1 + s_2}) = \\ S(v_1, \ldots, v_{r_1}, \omega_1, \ldots,\omega_{s_1})\cdot T(v_{r_1 + 1}, \ldots, v_{r_1 + r_2}, \omega_{s_1 + 1}, \ldots, \omega_{s_1 + s_2}).\]

Taking \(s_1 = s_2 = 0\), we recover Spivak’s definition as a special case.

Example: Let \(u, v \in V\). Again, since \(V \cong T_1(V)\), we can regard \(u, v \in T_1(V)\) as \((0,1)\)-tensors. Their tensor product \(u \otimes v \in T_2(V)\) is a \((0,2)\)-tensor defined by \[(u \otimes v)(\omega, \eta) = u(\omega)\cdot v(\eta)\]


As I suggested in the comments, every bilinear map – i.e. every rank-2 tensor, be it of type \((0,2)\), \((1,1)\), or \((2,0)\) – can be regarded as a matrix, and vice versa.

Admittedly, sometimes the notation can be constraining. That is, we’re used to considering vectors as column vectors, and dual vectors as row vectors. So, when we write something like \[u^\top A v,\] our notation suggests that \(u^\top \in T^1(V)\) is a dual vector and that \(v \in T_1(V)\) is a vector. This means that the bilinear map \(V \times V^* \to \mathbb{R}\) given by \[(v, u^\top) \mapsto u^\top A v\] is a type \((1,1)\)-tensor.

Example: Let \(V = \mathbb{R}^3\). Write \(u = (1,2,3) \in V\) in the standard basis, and \(\eta = (4,5,6)^\top \in V^*\) in the dual basis. For the inputs, let’s also write \(\omega = (x,y,z)^\top \in V^*\) and \(v = (p,q,r) \in V\). Then \[\begin{align*} (u \otimes \eta)(\omega, v) & = u(\omega) \cdot \eta(v) \\ & = (x,y,z)\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \cdot (4,5,6) \begin{pmatrix} p \\ q \\ r \end{pmatrix} \\ & = (x + 2y + 3z)(4p + 5q + 6r) \\ & = 4px + 5qx + 6rx \\ & \ \ \ \ \ + 8py + 10qy + 12ry \\ & \ \ \ \ \ + 12pz + 15qz + 18rz \\ & = (x,y,z)\begin{pmatrix} 4 & 5 & 6 \\ 8 & 10 & 12 \\ 12 & 15 & 18 \end{pmatrix}\begin{pmatrix} p \\ q \\ r \end{pmatrix} \\ & = \omega \begin{pmatrix} 4 & 5 & 6 \\ 8 & 10 & 12 \\ 12 & 15 & 18 \end{pmatrix} v. \end{align*}\]

Conclusion: The tensor \(u \otimes \eta \in T^1_1(V)\) is the bilinear map \((\omega, v)\mapsto \omega A v\), where \(A\) is the \(3 \times 3\) matrix above.

The Wikipedia article you linked to would then regard the matrix \(A\) as being equal to the tensor product \(u \otimes \eta\).
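A quick numeric check in R: the matrix \(A\) is simply the outer product of \(u\) and \(\eta\), and evaluating the bilinear map agrees with \(\omega A v\) (the particular \(\omega\) and \(v\) below are arbitrary test values):

u = c(1, 2, 3); eta = c(4, 5, 6)
A = outer(u, eta)                        # A[i, j] = u_i * eta_j, the 3x3 matrix above
omega = c(-1, 0, 2); v = c(3, 1, -2)     # arbitrary covector and vector
sum(omega * u) * sum(eta * v)            # u(omega) * eta(v)  -> 25
drop(omega %*% A %*% v)                  # omega A v          -> 25 as well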


Finally, I should point out two things that you might encounter in the literature.

First, some authors take the definition of an \((r,s)\)-tensor to mean a multilinear map \(V^s \times (V^*)^r \to \mathbb{R}\) (note that the \(r\) and \(s\) are reversed). This also means that some indices will be raised instead of lowered, and vice versa. You’ll just have to check each author’s conventions every time you read something.

Second, note that there is also a notion of tensor products of vector spaces. Many textbooks, particularly ones focused on abstract algebra, regard this as the central concept. I won’t go into this here, but note that there is an isomorphism \[T^r_s(V) \cong \underbrace{V^* \otimes \cdots \otimes V^*}_{r\text{ copies}} \otimes \underbrace{V \otimes \cdots \otimes V}_{s \text{ copies}}.\]

Confusingly, some books on differential geometry define the tensor product of vector spaces in this way, but I think this is becoming rarer.



NOTE: These are tentative notes on different topics for personal use - expect mistakes and misunderstandings.