NOTES ON STATISTICS, PROBABILITY and MATHEMATICS


Tensor Product and Multilinear Maps:


In the Wikipedia example of the tensor product of vector spaces, included in my OP, as well as in my previous post here, the tensor product is of the form \(V\otimes V,\) a \((2,0)\) tensor, and results in a form akin to \((1)\) in the OP:

\[A^0 B^0 e_0 \otimes e_0 + A^0 B^1 e_0 \otimes e_1 + \cdots + A^3 B^3 e_3 \otimes e_3\]

equivalent to an outer product, as illustrated in this post:

The tensor product of two vectors \(v\in V\) and \(w \in W\), i.e. \(v\otimes w \in V\otimes W,\) is akin to calculating the outer product of the two vectors:

\[\large v\otimes_o w=\small \begin{bmatrix}-2.3\;e_1\\+1.9\;e_2\\-0.5\;e_3\end{bmatrix}\begin{bmatrix}0.7\;e_1&-0.3\;e_2&0.1\;e_3\end{bmatrix}= \begin{bmatrix}-1.61\;e_1\otimes e_1&+0.69\;e_1\otimes e_2&-0.23\;e_1\otimes e_3\\+1.33\;e_2 \otimes e_1&-0.57\;e_2 \otimes e_2&+0.19\;e_2 \otimes e_3\\-0.35\;e_3 \otimes e_1&+0.15\;e_3 \otimes e_2&-0.05\;e_3 \otimes e_3\end{bmatrix}\]
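As a quick numerical check (a minimal sketch using the coefficients of the example above), the coefficient matrix is just the outer product of the two coefficient vectors in R:

> v = c(-2.3, 1.9, -0.5); w = c(0.7, -0.3, 0.1)
> v %o% w    # same as outer(v, w): the coefficient matrix of v ⊗ w above
      [,1]  [,2]  [,3]
[1,] -1.61  0.69 -0.23
[2,]  1.33 -0.57  0.19
[3,] -0.35  0.15 -0.05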


This is equivalent to the tensor product space \(V^*\otimes V^*\) (the set of all \((0,2)\) tensors) on the slide in the OP. The presenter is tensor-multiplying two covectors expressed in the basis of \(V^*\), without coefficients, yielding the \(16\) pairs of basis covectors of \(V^*\otimes V^*\): \[\{e^0\otimes e^0, \; e^0\otimes e^1, \; e^0\otimes e^2, \;\cdots, e^3\otimes e^3\}.\]


The key is to distinguish these forms of the tensor product of vector spaces from their application to vectors (or covectors), i.e. when contractions such as \[\langle\beta_\mu\,e^\mu\;,\;A^\nu\,e_\nu \rangle\;=\beta_\mu\,A^\nu\,\langle e^\mu\;,\;e_\nu\rangle \;=\beta_\mu\,A^\nu\,\delta^\mu_{\;\nu}\;=\beta_\mu\,A^\mu\;\in \mathbb R\] are carried out, yielding a real number - which is what is explained in the video.

These bilinear maps \(\beta\otimes\gamma:V\times V \to \mathbb R,\) properly interpreted as \([\beta\otimes\gamma](v,w)=\langle \beta,v\rangle\langle\gamma,w\rangle\) (i.e. the tensor \(\beta\otimes\gamma\) acting on two vectors, \(v\) and \(w\)), would correct the \((2)\) part of the OP (after Professor Shifrin’s answer) as:

\(\begin{align} &(\beta\otimes\gamma)\left(\sum A^\mu e_\mu,\sum B^\nu e_\nu\right)= \\[2ex] &=\left [ \beta_0\gamma_0\;e^0\otimes e^0+ \; \beta_0\gamma_1\;e^0\otimes e^1+ \;\beta_0\gamma_2\; e^0\otimes e^2+\cdots+ \;\beta_3\gamma_3\; e^3\otimes e^3 \right]\,\small{\left(\sum A^\mu e_\mu,\sum B^\nu e_\nu\right) } \\[2ex] &= \beta_0\gamma_0 A^\mu B^\nu \langle e^0,e_\mu \rangle \; \langle e^0,e_\nu \rangle \; + \; \beta_0\gamma_1 A^\mu B^\nu \langle e^0,e_\mu \rangle \; \langle e^1,e_\nu \rangle +\cdots +\beta_3\gamma_3 A^\mu B^\nu \langle e^3,e_\mu \rangle \; \langle e^3,e_\nu \rangle \\[2ex] &=\beta_0\gamma_0 A^\mu B^\nu \; \delta^0_{\;\mu}\; \delta^0_{\;\nu} \; + \; \beta_0\gamma_1 A^\mu B^\nu \; \delta^0_{\;\mu}\; \delta^1_{\;\nu} +\cdots +\beta_3\gamma_3 A^\mu B^\nu \; \delta^3_{\;\mu}\; \delta^3_{\;\nu} \\[2ex]&= \sum \beta_\mu\gamma_\nu A^\mu B^\nu \end{align}\)

indeed a real number, exemplifying the mapping \(V\times V \to \mathbb R.\) The tensor is defined as

\[\begin{align}\beta\otimes \gamma&:= \beta_0\gamma_0\, e^0\otimes e^0+\beta_0\gamma_1\, e^0\otimes e^1 + \beta_0\gamma_2\, e^0\otimes e^2+\cdots+\beta_3\gamma_3\, e^3\otimes e^3\\[2ex] &=T_{00}\, e^0\otimes e^0+T_{01}\, e^0\otimes e^1 + T_{02}\, e^0\otimes e^2+\cdots+T_{33}\, e^3\otimes e^3\\[2ex] &= T_{\mu\nu}\,e^\mu\otimes\,e^\nu \end{align}\]


Tensor Product as the Kronecker product:


As an example, I believe we could illustrate this as follows:

\(\beta \in V^*\) is \(\beta=\color{blue}{\begin{bmatrix}\sqrt{\pi} & \sqrt[3]{\pi} &\sqrt[5]{\pi} \end{bmatrix}}\) and \(\gamma\in V^*\) is \(\gamma=\color{red}{\begin{bmatrix}\frac{1}{3} &\frac{1}{5} &\frac{1}{7} \end{bmatrix}}\). The \((0,2)\)-tensor \(\beta\otimes \gamma\) is the outer product:

\[\begin{align}\beta\otimes_o \gamma&= \begin{bmatrix}\color{blue}{\sqrt\pi}\times \color{red}{\frac{1}{3}}\quad e^1\otimes e^1 &\color{blue}{\sqrt\pi}\times\color{red}{\frac{1}{5}}\quad e^1\otimes e^2 &\color{blue}{\sqrt\pi}\times\color{red}{\frac{1}{7}}\quad e^1\otimes e^3\\ \color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{3}}\quad e^2\otimes e^1 &\color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{5}}\quad e^2\otimes e^2 &\color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{7}}\quad e^2\otimes e^3 \\\color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{3}}\quad e^3\otimes e^1 &\color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{5}}\quad e^3\otimes e^2 &\color{blue}{\sqrt[5]{\pi}}\times \color{red}{\frac{1}{7}}\quad e^3\otimes e^3\end{bmatrix}\\[2ex] &=\begin{bmatrix}\color{red}{\frac{1}{3}}\color{blue}{\sqrt\pi}\quad e^1\otimes e^1&\color{red}{\frac{1}{5}}\color{blue}{\sqrt\pi}\quad e^1\otimes e^2&\color{red}{\frac{1}{7}}\color{blue}{\sqrt\pi}\quad e^1\otimes e^3\\\color{red}{\frac{1}{3}}\color{blue}{\sqrt[3]{\pi}}\quad e^2\otimes e^1&\color{red}{\frac{1}{5}}\color{blue}{\sqrt[3]{\pi}}\quad e^2\otimes e^2&\color{red}{\frac{1}{7}}\color{blue}{\sqrt[3]{\pi}}\quad e^2\otimes e^3\\\color{red}{\frac{1}{3}}\color{blue}{\sqrt[5]{\pi}}\quad e^3\otimes e^1&\color{red}{\frac{1}{5}}\color{blue}{\sqrt[5]{\pi}}\quad e^3\otimes e^2&\color{red}{\frac{1}{7}} \color{blue}{\sqrt[5]{\pi}}\quad e^3\otimes e^3\end{bmatrix} \end{align}\]

Now, if we apply this tensor to the vectors

\[v=\color{magenta}{\begin{bmatrix}1\\7\\5\end{bmatrix}}, w = \color{orange}{\begin{bmatrix}2\\0\\3\end{bmatrix}}\]

\[\begin{align} (\beta \otimes \gamma)[v,w]=&\\[2ex] & \;\color{blue}{\sqrt\pi}\times \color{red}{\frac{1}{3}} \times \color{magenta} 1 \times \color{orange}2 \quad+\quad \color{blue}{\sqrt\pi}\times\color{red}{\frac{1}{5}} \times \color{magenta}1 \times \color{orange} 0 \quad+\quad \color{blue}{\sqrt\pi}\times\,\color{red}{\frac{1}{7}} \times \color{magenta}1 \times \color{orange}3 \\ + &\;\color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{3}} \times \color{magenta}{7} \times \color{orange}2 \quad+\quad \color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{5}} \times \color{magenta}{7} \times \color{orange}0 \quad+\quad \color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{7}} \times \color{magenta}{7} \times \color{orange}3 \\ \;+ &\;\color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{3}} \times \color{magenta} 5 \times \color{orange}2 \quad+\quad \color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{5}} \times \color{magenta} 5 \times \color{orange}0 \quad+\quad \color{blue}{\sqrt[5]{\pi}}\times \color{red}{\frac{1}{7}} \times \color{magenta}5 \times \color{orange}3 \\[2ex] =&\\ & \color{blue}{\sqrt{\pi}}\;\times\color{magenta} 1 \quad\left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange} 0 \quad+\quad \color{red}{\frac{1}{7}} \times \color{orange}3\right) \\ + &\,\color{blue}{\sqrt[3]\pi} \times \color{magenta}{7}\quad\left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange}0 \quad+\quad \color{red}{\frac{1}{7}} \times \color{orange}3\right) \\ \;+ &\,\color{blue}{\sqrt[5]{\pi}}\times \color{magenta} 5\quad\left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange}0 \quad+\quad \color{red}{\frac{1}{7}} \times \color{orange}3 \right)\\[2ex] =&\\&\small \left(\color{blue}{\sqrt\pi} \times \color{magenta} 1 \quad+\quad \color{blue}{\sqrt[3]\pi} \times \color{magenta}{7} \quad +\quad \color{blue}{\sqrt[5]\pi} \times \color{magenta}5 \right) \times \left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange} 0 \quad +\quad \color{red}{\frac{1}{7}} \times \color{orange} 3 \right)\\[2ex] =&\\[2ex]&\langle \color{blue}\beta,\color{magenta}v \rangle \times \langle \color{red}\gamma,\color{orange}w \rangle\\[2ex] =& 20.05487\end{align}\]

The elements of the first vector, \(v,\) multiply separate rows of the outer product \(\beta \otimes_o \gamma,\) while the elements of the second vector, \(w,\) multiply separate columns. Hence, swapping \(v\) and \(w\) generally changes the result: the map is not symmetric.

Here is the idea with R code:

> v = c(1,7,5); w = c(2,0,3); beta=c(pi^(1/2),pi^(1/3),pi^(1/5)); gamma = c(1/3,1/5,1/7)
> sum(((beta %o% gamma) * v) %*% w) # same as sum((beta %*% t(gamma) * v) %*% w)
[1] 20.05487
> sum(((beta %o% gamma) * w) %*% v) # not a commutative operation:
[1] 17.90857

Or more simply, \(\vec \beta \cdot \vec v \;\times\; \vec \gamma \cdot \vec w = 20.05487:\)

\[\begin{align} (\beta \otimes \gamma)[v,w]&=\langle \beta,v \rangle \times \langle \gamma,w \rangle\\[2ex] & =\small \left(\color{blue}{\sqrt\pi} \times \color{magenta} 1 \quad+\quad \color{blue}{\sqrt[3]\pi} \times \color{magenta}{7} \quad +\quad \color{blue}{\sqrt[5]\pi} \times \color{magenta}5 \right) \times \left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange} 0 \quad +\quad \color{red}{\frac{1}{7}} \times \color{orange} 3 \right) \\[2ex] &=18.31097\times 1.095238\\[2ex] &= 20.05487\end{align}\]

> v = c(1,7,5); w = c(2,0,3); beta=c(pi^(1/2),pi^(1/3),pi^(1/5)); gamma = c(1/3,1/5,1/7)
> beta %*% v * gamma %*% w
         [,1]
[1,] 20.05487

Does it obey bilinearity?

\[(\beta\otimes \gamma)[v,w]\overset{?}=(\beta\otimes \gamma)\Bigg[\left(\frac{1}{5}v\right),\left(5\,w\right)\Bigg] \]

> v_prime = 1/5 * v
> w_prime = 5 * w
> beta %*% v_prime * gamma %*% w_prime
         [,1]
[1,] 20.05487   #Check!

\[(\beta\otimes \gamma)[v, u + w]\overset{?}=(\beta\otimes \gamma)[v,u] + (\beta\otimes \gamma)[v,w] \]

> u = c(-2, 5, 9)    # Introducing a new vector...
> beta %*% v * gamma %*% (u + w)
        [,1]
[1,] 49.7012
> (beta %*% v * gamma %*% u) + (beta %*% v * gamma %*% w)
        [,1]
[1,] 49.7012 #... And check!

But the evaluation is not symmetric in \(v\) and \(w\):

    v = c(1,7,5); w = c(2,0,3); beta=c(pi^(1/2),pi^(1/3),pi^(1/5)); gamma = c(1/3,1/5,1/7)
    beta %*% w * gamma %*% v
##          [,1]
## [1,] 17.90857
    beta %*% v * gamma %*% w
##          [,1]
## [1,] 20.05487

Motivating Examples:

From this Quora answer:

Any vector space imbued with an inner product has a natural \(T^0_2=(\underbrace{0}_{\text{takes 0 cov's}},\underbrace{2}_{\text{takes 2 vec's}})\)-tensor sitting there: the inner product itself: it linearly takes in a pair of vectors and spits out their inner product, an element of the base field.

Similarly, any linear transformation of a vector space acts naturally as a \(T^1_1=(\underbrace{1}_{\text{takes 1 cov}},\underbrace{1}_{\text{takes 1 vec}})\)-tensor.

An example of a higher-order tensor is the determinant: given any linear transformation \(A\), from a vector space (of dimension \(n\)) to itself, \(\det(A)\) is a \(T^0_n=(\underbrace{0}_{\text{takes 0 cov's}},\underbrace{n}_{\text{takes n vec's}})\)-tensor: \(\det(A) (v_1,\cdots, v_n) = (A(v_1)) \land (A(v_2)) \land\cdots\land (A(v_n))\), where “\(\land\)” is the fully-antisymmetrized tensor product (the “wedge” product).

And of course as others have mentioned, differential topology and geometry are littered with tensors (and tensor fields / densities).


Practically the same, but more mathy:

\(\newcommand{\Reals}{\mathbf{R}}\newcommand{\Basis}{\mathbf{e}}\newcommand{\Brak}[1]{\left\langle #1\right\rangle}\)Let \((\Basis_{j})_{j=1}^{n}\) denote the standard basis of \(V = \Reals^{n}\) and let \((\Basis^{i})_{i=1}^{n}\) be the dual basis of \(V^{*} = (\Reals^{n})^{*}\). (Where possible below, I’ve taken care to use the dummy indices \(i\) and \(j\) “globally”.)

  • The identity transformation \(I_{n}:\Reals^{n} \to \Reals^{n}\) is \[ \sum_{i,j=1}^{n} \delta_{i}^{j}\, \Basis_{j} \otimes \Basis^{i} = \sum_{j=1}^{n} \Basis_{j} \otimes \Basis^{j}. \] Specifically, if \(v = \sum\limits_{j=1}^{n} v^{j} \Basis_{j}\), then \[ I_{n}(v) = \sum_{j=1}^{n} \Basis_{j} \otimes \Basis^{j}(v) = \sum_{j=1}^{n} v^{j}\Basis_{j} = v. \] Similarly, if \(A = [a_{i}^{j}]\) is an \(n \times n\) matrix, the tensor \[ T = \sum_{i,j=1}^{n} a_{i}^{j}\, \Basis_{j} \otimes \Basis^{i} \in T_{1}^{1}\Reals^{n} \] is the linear operator whose standard matrix is \(A\).

    If \(\Basis_{j}\) is written as an \(n \times 1\) column matrix with a \(1\) in the \(j\)th row and \(0\)’s elsewhere, then \(\Basis^{i}\) is the \(1 \times n\) row matrix with a \(1\) in the \(i\)th column and \(0\)’s elsewhere, and the tensor product \(\Basis_{j}^{i} := \Basis_{j} \otimes \Basis^{i}\) may be computed by ordinary matrix multiplication, the outer product of a column and a row: the \(n \times n\) matrix with a \(1\) in the \((j, i)\)-entry and \(0\)’s elsewhere.

  • The Euclidean inner product is \[ \Brak{\ ,\ } = \sum_{i=1}^{n} \Basis^{i} \otimes \Basis^{i}. \] If \(u\) and \(v\) are arbitrary vectors, then \[ \Brak{u, v} = \sum_{i=1}^{n} \Basis^{i}(u)\, \Basis^{i}(v) = \sum_{i=1}^{n} u^{i}\, v^{i}. \]

  • If \(n = 2\), the determinant viewed as a bilinear function of two vectors in \(\Reals^{2}\) is \[\begin{align*} \det &= \Basis^{1} \otimes \Basis^{2} - \Basis^{2} \otimes \Basis^{1} \in T_{2}^{0} \Reals^{2}; \\ \det(u, v) &= \Basis^{1}(u)\, \Basis^{2}(v) - \Basis^{2}(u)\, \Basis^{1}(v) \\ &= u^{1} v^{2} - u^{2} v^{1}. \end{align*}\] (See the R sketch after this list.)

  • Similarly, if \(n = 3\), the ordinary cross product is

\[\begin{align}&(\Basis^{2} \otimes \Basis^{3} - \Basis^{3} \otimes \Basis^{2})\otimes \Basis_{1} \\ + &(\Basis^{3} \otimes \Basis^{1} - \Basis^{1} \otimes \Basis^{3})\otimes \Basis_{2} \\ + &(\Basis^{1} \otimes \Basis^{2} - \Basis^{2} \otimes \Basis^{1})\otimes \Basis_{3} \in T_{2}^{1} \Reals^{3}. \end{align}\]
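A minimal numeric sketch in R of the identity bullet and the \(n=2\) determinant bullet above (the test vectors are made up): the identity is \(\textstyle\sum_j \mathbf e_j\otimes \mathbf e^j\) built from outer products of columns and rows, and the determinant is the bilinear form \(\mathbf e^1\otimes\mathbf e^2-\mathbf e^2\otimes\mathbf e^1\).

> E = diag(3)                      # columns are e_1, e_2, e_3; rows are the dual basis e^1, e^2, e^3
> I3 = matrix(0, 3, 3)
> for (j in 1:3) I3 = I3 + E[, j] %o% E[j, ]   # sum_j e_j ⊗ e^j
> all(I3 == diag(3))               # the identity transformation
[1] TRUE
> u = c(3, 1); v = c(2, 5)         # n = 2: det = e^1 ⊗ e^2 - e^2 ⊗ e^1
> u[1]*v[2] - u[2]*v[1]
[1] 13
> det(cbind(u, v))                 # agrees with the built-in determinant
[1] 13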


From this answer on Quora:

Tensors are useful when you’ve got a whole lot of coordinates that are related to each other in some structured way. A simple vector like \(\mathbf v=(1,4,2)\) is an example of a tensor, but it’s simple enough so that you don’t need to understand tensors in general to understand vectors. Likewise, matrices are examples of tensors, but again, they’re best understood just as matrices. The best way to understand a mathematical construct is not individually, but in context of the set (or space) of all the constructs of that type. Rather than trying to define a number, instead define what a field of numbers is; instead of defining what a vector is, consider instead all the vectors that make up a vector space. So to understand tensors of a particular type, instead consider all those tensors of the same type together.

Covariant tensor products

The simplest tensors are vectors, so we’ll build tensors up from vector spaces, which I’ll assume we already know about. Suppose we’ve got two vector spaces \(V\) and \(W\) over a field \(F\). You can start with more than two, but so I don’t have to use indices to begin with, I’ll take just two. You can join them together to get their tensor product \(V\otimes W\) which will be another vector space.
The individual elements of \(V\otimes W\) are named as linear combinations of elements of the form \(\mathbf v\otimes\mathbf w\) where \(\mathbf v\in V\) and \(\mathbf w\in W\). Since they’re linear combinations, if you have \(k\) scalars \(a_1,\ldots,a_k\) in \(F\), vectors \(\mathbf v_1,\ldots,\mathbf v_k\) in \(V\), and vectors \(\mathbf w_1,\ldots,\mathbf w_k\) in \(W\), then \(\displaystyle\sum_{i=1}^k a_i\mathbf v_i\otimes\mathbf w_i=a_1\mathbf v_1\otimes\mathbf w_1+\cdots+a_k\mathbf v_k\otimes\mathbf w_k\) is a typical tensor in \(V\otimes W\). But there’s a requirement that we make on the symbol \(\otimes\), and that’s that it be linear in each coordinate. So we require that \((a_1\mathbf v_1+a_2\mathbf v_2)\otimes\mathbf w=a_1\mathbf v_1\otimes\mathbf w+a_2\mathbf v_2\otimes\mathbf w\) and \(\mathbf v\otimes (a_1\mathbf w_1+a_2\mathbf w_2)=a_1\mathbf v\otimes\mathbf w_1+a_2\mathbf v\otimes\mathbf w_2\). That condition allows us to specify a basis for the vector space \(V\otimes W\) if we have bases for both \(V\) and \(W\).

Suppose that \(V\) is a vector space of dimension \(m\) with basis \(\mathbf b_1,\ldots,\mathbf b_m\), and that \(W\) is a vector space of dimension \(n\) with basis \(\mathbf c_1,\ldots,\mathbf c_n\). Then \(V\otimes W\) is a vector space of dimension \(mn\) with a basis whose elements are \(\mathbf b_i\otimes\mathbf c_j\) where \(i\) varies from \(1\) through \(m\) and \(j\) varies from \(1\) through \(n\). That means a typical element of \(V\otimes W\) can be written as \(\displaystyle\sum_{i=1}^m\sum_{j=1}^n a_{ij}\mathbf b_i\otimes\mathbf c_j.\) Rather than write this as a double sum with specified ranges for the indices, you can assume that those can be determined by context, and write the sum more simply as \(\displaystyle\sum_{ij} a_{ij}\mathbf b_i\otimes\mathbf c_j,\) and if the bases of the component vector spaces \(V\) and \(W\) are fixed, they don’t have to be mentioned either, and the \(\sum\) symbol can be suppressed. That yields the greatly abbreviated notation \(a_{ij}\) for this tensor.

Now, of course you can take a tensor product of more than two vector spaces. If you have three vector spaces with specified dimensions and bases, then a typical element of the triple tensor product would be expressed as \(a_{ijk}\). Although that’s a simple expression, remember that each of the three subscripts varies over a range, and it’s a triple sum where each \(a_{ijk}\) is a coefficient in that sum.
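In R terms (a minimal sketch with made-up dimensions and coefficients), once bases are fixed an element of \(V\otimes W\) is just its \(m\times n\) coefficient array \(a_{ij}\), a pure tensor \(\mathbf v\otimes\mathbf w\) is the special case \(a_{ij}=v_i w_j\), and an element of a triple tensor product is a three-index array:

> a = matrix(1:6, nrow = 2, ncol = 3)          # an element of V ⊗ W with dim(V) = 2, dim(W) = 3
> v = c(1, 2); w = c(3, 0, -1)
> v %o% w                                      # the pure tensor v ⊗ w: a_ij = v_i * w_j
     [,1] [,2] [,3]
[1,]    3    0   -1
[2,]    6    0   -2
> a3 = array(0, dim = c(2, 3, 2))              # an element of a triple tensor product: a_ijk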

Dual vector spaces

If \(V\) is a vector space, there is a dual vector space \(V^{*}\) whose elements are linear transformations to the scalar field, \(V\to F\). If \(V\) has a finite basis \(\mathbf b_1,\ldots,\mathbf b_n\), then \(V^{*}\) is a vector space of the same dimension with a basis \(\mathbf b_1^{*},\ldots,\mathbf b_n^{*}\) where \(\mathbf b_i^{*}:V\to F\) is the transformation that sends the vector \(a_1\mathbf b_1+\cdots+a_n\mathbf b_n\) to the scalar \(a_i\). When writing the elements of \(V\) and \(V^{*}\) as vectors, you can write the elements of \(V\) as column vectors and \(V^{*}\) as row vectors (or vice versa).

Covariant and contravariant tensor products

These are tensor products where some of the component vector spaces are dual vector spaces. For those that are dual vector spaces, rather than using subscripts, superscripts are usually used. Take for example the tensor product \(V^{*}\otimes W\). A typical element would be written as \(a^i_j\). It is, as before, actually a sum. Here \(i\) is made a superscript because the tensor product is contravariant in \(V\), which is just another way of saying that the dual vector space \(V^{*}\) is being used instead of \(V\) itself.
You can think of \(a^i_j\) as being a linear transformation from \(V\) to \(W\), that is, a matrix. In that way, an ordinary matrix is one of these tensors, where one coordinate is contravariant.

Composition of tensors

If you have two tensor products, and a vector space \(V\) appears in one of the tensor products covariantly and in the other contravariantly, then you can compose them: \((V^{*}\otimes W)\times(V\otimes U)\to W\otimes U.\) Given \(a^i_j\in V^{*}\otimes W\) and \(b_{ik}\in V\otimes U\), the result is \(\sum_ia^i_jb_{ik}\in W\otimes U\). The summation sign in the result is usually suppressed so it’s written more simply as \(a^i_jb_{ik}\). More generally, if you have two complicated tensors where the same index appears as a superscript in one and a subscript in the other, they can be multiplied by writing them next to each other with an understood summation over that index. As an example, a matrix \(V^{*}\otimes W\) times a (column) vector in \(V\) gives a (column) vector in \(W\).
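A minimal R sketch of this contraction (made-up matrices, with the shared index \(i\) running along the rows of both): summing over \(i\) is just an ordinary matrix product.

> a = matrix(c(1, 0,  2, 1,  1, 3), nrow = 3, byrow = TRUE)   # a[i, j]: i indexes the V slot, j the W slot
> b = matrix(c(1, 4,  0, 1,  2, 5), nrow = 3, byrow = TRUE)   # b[i, k]: i indexes the V slot, k the U slot
> crossprod(a, b)                 # same as t(a) %*% b: entry (j, k) = sum_i a[i, j] * b[i, k]
     [,1] [,2]
[1,]    3   11
[2,]    6   16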


Coordinate-free Approach:

From this YouTube lecture.

Preliminary definitions:

A field is a triple \((k, +, \cdot)\): a set \(k\) equipped with two maps

\[+: k \times k \rightarrow k\]

and

\[\cdot: k\times k \rightarrow k\]

that satisfy

CANI conditions: commutativity (abelian), associativity, neutral element, and an inverse for every element (closure is already built into the definition of the two maps). Both the addition and the multiplication satisfy CANI, except that in the multiplication case the inverses are required only on \(k\setminus\{0\}\), i.e. excluding the neutral element of addition.


A ring \((R, +, \cdot)\) is similar, but CANI in full only applies to \(+\); for the multiplication operation the inverses (and, in general, commutativity) are dropped. So every field is a ring, but not every ring is a field.

Example: \((\mathbb Z, +, \cdot)\) is a ring but not a field - the multiplicative inverse of an integer (e.g. \(\tfrac{1}{2}\)) is generally not in the set.

Example: \(n\times n\) matrices over \(\mathbb R\) form a ring - matrix multiplication is not commutative, and not all matrices have inverses.


A vector space over a field \(k\) is a triple \((V, \color{red}{+}, \color{red}{\cdot})\) with two operations, distinct from the operations \(+\) and \(\cdot\) in the field, such that

\[+: V \times V \rightarrow V\]

and

\[\cdot: k\times V \rightarrow V\]

fulfilling CANI for the \(\color{red}{+}\) operation, and also ADDU for \(\color{red}{\cdot}\): associativity, two distributive laws (over the \(+\) of the field elements, and over the \(\color{red}{+}\) of the vector sum operation), and the scalar identity.

More formally,

  1. Associativity of addition
  2. Commutativity of addition
  3. Identity element of addition
  4. Inverse element of addition
  5. Compatibility of scalar multiplication with field multiplication \(a(b\mathbf v)=(ab)\mathbf v\)
  6. Identity element of scalar multiplication
  7. Distributivity of scalar multiplication w.r.t. vector addition
  8. Distributivity of scalar multiplication w.r.t. field addition

Basis of a vector space \((V, +, \cdot)\):

If there is no further structure on the vector space we can only define a subset \(B\subset V\) called a Hamel basis. Its conditions are:

  1. Every finite subset, \(\{b_1,\cdots,b_n\}\subset B\) is linearly independent.

  2. For every element \(\mathbf v\in V\) there exist finitely many basis elements \(b_1,\cdots,b_m\in B\) and scalars \(v^1,\cdots,v^m\in k\) such that \(\mathbf v=\sum_{i=1}^m v^i\,b_i.\)

The dimension of the vector space is the cardinality of the basis.
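A minimal R sketch (with a made-up basis of \(\mathbb R^3\)): writing a vector in a chosen basis amounts to solving for its components \(v^i\).

> B = cbind(c(1, 0, 1), c(1, 1, 0), c(0, 1, 1))   # columns b_1, b_2, b_3: a (non-standard) basis of R^3
> v = c(2, 3, 5)
> comps = solve(B, v)     # the components v^1, v^2, v^3 of v in this basis
> round(comps, 10)
[1] 2 0 3
> round(B %*% comps, 10)  # recovers v
     [,1]
[1,]    2
[2,]    3
[3,]    5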


A module is the equivalent of a vector space over a RING, as opposed to a FIELD.


A homomorphism is a structure-preserving map \(\large f\) between two algebraic structures of the same type (such as two groups, two rings, or two vector spaces); for vector spaces, a homomorphism is precisely a linear map. In the case of two vector spaces, each equipped with \(+\) and \(\cdot\) operations (\(V\) with \(\color{red}{+}\) and \(\color{red}{\cdot}\) and \(W\) with \(\color{orange}{+}\) and \(\color{orange}{\cdot}\)):

\[f: V \rightarrow W\] fulfills:

\[\forall v_1, v_2 \in V: f(v_1 \color{red}{+} v_2) = f(v_1) \color{orange}{+} f(v_2)\]

and

\[\forall \lambda \in k, v\in V: f(\lambda \color{red}{\cdot} v) = \lambda\color{orange}{\cdot}f(v).\]

A bijective linear map is called a vector space ISOMORPHISM.

Two vector spaces \(V\) and \(W\) are isomorphic, \(V\cong W,\) if \(\exists\) an isomorphism \(f: V\rightarrow W.\)

An ENDOMORPHISM is a linear map from \(V\) to itself. Hence, \(\text{End}(V) := \text{Hom}(V,V).\)

An AUTOMORPHISM is an invertible linear map from \(V\) to itself. Hence, \(\text{Aut}(V) := \{f: V \overset{\sim}\rightarrow V \,|\, f \text{ invertible}\}.\)

Automorphisms are a subset \(\text{Aut}(V) \subset \text{End}(V)\) of endomorphisms.


We can define the set of ALL LINEAR MAPS from \(V\) to \(W\) as \(\text{Hom}(V,W):= \{f:V \overset{\sim}\rightarrow W\},\) with the \(\sim\) denoting a linear map.

Is this set of all linear maps a vector space? Yes, by defining,

\[\boxed{\color{blue}{+}}: \text{Hom}(V, W) \times \text{Hom}(V,W) \rightarrow \text{Hom}(V,W)\] by taking the pair of linear functions and mapping it to

\[(f,g)\mapsto f\;\boxed{+}\;g,\]

where \(f\;\boxed{+}\;g\) is again a map from \(V\) to \(W\) defined by \(v\mapsto f(v) \color{orange}{+} g(v)\)

and also defining

\[\boxed{\color{blue}{\cdot}}: k\times \text{Hom}(V,W) \rightarrow \text{Hom}(V,W), \qquad (\lambda\,\boxed{\color{blue}{\cdot}}\,g) (v)= \lambda \color{orange}{\cdot} g(v)\]
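A minimal R sketch (with made-up matrices standing in for linear maps \(\mathbb R^3\to\mathbb R^2\)) of these pointwise operations on \(\text{Hom}(V,W)\):

> f = function(v) matrix(c(1, 0, 0,  0, 1, 0), nrow = 2, byrow = TRUE) %*% v
> g = function(v) matrix(c(0, 0, 1,  1, 1, 1), nrow = 2, byrow = TRUE) %*% v
> boxplus  = function(f, g) function(v) f(v) + g(v)          # the boxed +
> boxscale = function(lambda, g) function(v) lambda * g(v)   # the boxed scalar multiplication
> v = c(1, 2, 3)
> boxplus(f, g)(v)        # equals f(v) + g(v)
     [,1]
[1,]    4
[2,]    8
> boxscale(2, g)(v)       # equals 2 * g(v)
     [,1]
[1,]    6
[2,]   12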


\(V\) Star, Dual Vector Space or \(V^*:\)


\[\large V^* := \text{Hom}(V, k)\]

Here \(k\) is considered a vector space.

So \(V^*\) is the set of linear functionals from the vector space to the field \(k\), which is considered in this context as just another vector space, with addition and multiplication inherited from the field operations.


Multilinear maps:


So in general, the maps can be more and more complicated; for example, \[V\times V\times V^*\times V\times V^* \to \mathbb R\] would be the set of all tensors of the form \[T_{\mu\nu}{}^{\gamma}{}_\rho{}^\eta\quad e^\mu\otimes e^\nu\otimes e_\gamma\otimes e^\rho \otimes e_\eta\]

with \(T_{\mu\nu}{}^{\gamma}{}_\rho{}^\eta\) corresponding to the components of the tensor, which are the only part usually transcribed (the basis vectors are implicit). Hence, it is important to keep the spacing and order of the sub- and superscripted Greek letters.

However, there is a system to the madness: The vectors in the tensor product come first by convention: e.g. \(V\otimes V^*\) as opposed to \(V^* \otimes V.\)

The rank of a tensor is similarly expressed as \((\text{number of vectors, number of covectors}),\) so that \(V\otimes V^* \otimes V^*,\) symbolizes the set of all possible tensors of rank \((1,2):\)

\[\begin{align} T^\mu{}_{\nu\lambda}\left[e_\mu \otimes e^\nu \otimes e^\lambda\right]\left(B_\eta \,e^\eta, A^\delta\,e_\delta,C^\gamma\,e_\gamma\right)&=T^\mu{}_{\nu\lambda}\,\langle B_\eta\,e^\eta, e_\mu \rangle\,\langle e^\nu,A^\delta e_\delta \rangle\,\langle e^\lambda, C^\gamma e_\gamma\rangle\\[2ex] &=T^\mu{}_{\nu\lambda}\;B_\eta A^\delta C^\gamma \; \langle e^\eta, e_\mu \rangle\,\langle e^\nu, e_\delta \rangle\,\langle e^\lambda, e_\gamma\rangle\\[2ex] &=T^\mu{}_{\nu\lambda}\;B_\eta A^\delta C^\gamma \;\delta^\eta{}_\mu\; \delta^\nu{}_\delta\; \delta^\lambda{}_\gamma\\[2ex] &= T^\mu{}_{\nu\lambda}\;B_\mu\, A^\nu\, C^\lambda \end{align}\]

with the names of the (dummy) summation indices in the last line ours to choose.
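A minimal R sketch of this contraction (3-dimensional \(V\), made-up components): the last line is just \(\sum_{\mu,\nu,\lambda} T^\mu{}_{\nu\lambda}\, B_\mu\, A^\nu\, C^\lambda.\)

> Tarr = array(1:27, dim = c(3, 3, 3))    # made-up components T^mu_{nu lambda}
> B = c(1, 0, 2); A = c(3, 1, 1); C = c(0, 1, 4)
> sum(Tarr * outer(outer(B, A), C))       # the contraction T^mu_{nu lambda} B_mu A^nu C^lambda
[1] 1525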

A tensor \(A\) is a member of the \(T^3_2\) tensor product space, i.e. \(A\in T^3_2\,V,\) if it is of the form \(A^{\alpha\beta\gamma}{}_{\mu\nu}\, \underbrace {e_\alpha\otimes e_\beta\otimes e_\gamma}_{\text{basis vecs }\in V}\otimes \underbrace{e^\mu\otimes e^\nu}_{\text{covecs }\in V^*},\) meaning that it would “eat” \(3\) covectors and \(2\) vectors to produce a real number: \(V^*\times V^* \times V^* \times V\times V\to \mathbb R.\) So, \(T^{\text{ no. vecs in }\otimes \text{ prod.}}_{\text{ no. covecs in the }\otimes}\), or a \((3,2)\)-rank tensor.

Here is an example of the “eating” of vectors and covectors by a tensor:

\[T^{\alpha\beta\gamma}{}_{\mu\nu}\, \Big [ e_\alpha \otimes e_\beta \otimes e_\gamma\otimes e^\mu\otimes e^\nu\Big ]\left(\underbrace{B_\eta e^\eta, C_\omega e^\omega, F_\epsilon e^\epsilon}_{\text{eats 3 covectors}},\underbrace{Z^\theta e_\theta, Y^\rho e_\rho}_{\text{eats 2 vectors}}\right)\to \mathbb R\]

NOTE on the difference between tensor and tensor product (from the comments by XylyXylyX):

A “tensor” is an element of a “tensor product space”, i.e. \(T^i_j\,V.\) A tensor product space contains elements which are the “tensor product” of vectors and covectors. So a “tensor product” is a multilinear map built using the tensor product operator \((\otimes)\). So a tensor is the tensor product of some vectors and covectors. Keep in mind that a tensor product of rank \((1,0)\) or \((0,1)\) is not really “multi” linear, it is just “linear”, but we still call it a tensor, so \(V\) and \(V^*\) are tensor product spaces, but small ones. And rank \((0,0)\) tensor product spaces are just the real numbers.

Tensors are multilinear maps, meaning that if we multiply by a scalar any of the entries in \(V\times V\times\cdots\times V^*\times V^*\times \cdots \to \mathbb R,\) keeping every other component constant, the result will be multiplied by the same scalar.


Bottom line definition:

\[\begin{align}&\left(\mathbf e_1\otimes\mathbf e_2\otimes\dots\otimes\mathbf e_p\otimes\mathbf e^1\otimes\mathbf e^2\otimes\dots\otimes\mathbf e^q\right) \left({\underbrace{\alpha_1, \alpha_2,\dots,\alpha_p,}_{\text{linear functionals }\in V^*}} \underbrace{ \vec v_1,\vec v_2, \dots, \vec v_q}_{\vec v_i\in V}\right)\\[2ex] &=\underbrace{\alpha_1(\mathbf e_1)\cdot \alpha_2(\mathbf e_2)\cdots\alpha_p(\mathbf e_p)\;}_{\text{row vecs dotted with col vecs (basis of }V\text{) and multiplied}}\quad\quad\underbrace{\mathbf e^1(\vec v_1)\cdot\mathbf e^2(\vec v_2)\cdots \mathbf e^q(\vec v_q)}_{\text{ row vecs (basis of V}^*)\text{ dotted with vecs and multiplied}} \end{align}\]

For instance, \(\mathbf e_1=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix}\), and \(\mathbf e^1=\begin{bmatrix} 1 & 0 & \cdots &0\end{bmatrix}\). Hence, all we are doing is selecting components from either the row vectors \((\alpha_i)\) or the column vectors \((\vec v_i)\), and then multiplying them.
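A minimal R sketch of this “component selection” for one factor (made-up \(\alpha\) and \(\vec v\)): \((\mathbf e_1\otimes \mathbf e^1)(\alpha, \vec v)=\alpha(\mathbf e_1)\,\mathbf e^1(\vec v)=\alpha_1 v^1.\)

> alpha = c(2, 5, 7); v = c(3, 1, 4)
> e1 = c(1, 0, 0)                 # e_1 as a column and e^1 as a row hold the same numbers
> (alpha %*% e1) * (e1 %*% v)     # alpha(e_1) * e^1(v) = alpha[1] * v[1]
     [,1]
[1,]    6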

This would be a tensor space \(T^p_q\,V,\) mapping the Cartesian product of \(p\) elements of the dual, \(\underbrace{V^* \times V^* \times \cdots}_{p},\) and \(q\) elements of \(V\), i.e. \(\underbrace{\times V \times V\times\cdots}_q,\) to \(\mathbb R.\)

Notice that the components of the vectors are given with superscripts - the opposite of their basis vectors - because vectors are “contravariant”: the longer the basis vectors, the smaller the components needed to express any given vector. The gradient naturally transforms the opposite way: if the coordinates are \(x^i\), the gradient has components \(\partial_i=\frac{\partial}{\partial x^i}.\)


The definition of the space of \((p,q)\) tensors, as the set of multilinear maps from the Cartesian product of elements of a vector space and its dual onto the field, equipped with addition and s-multiplication rules, is given in this lecture series at this point as follows:

A \((p,q)\) tensor, \(T\) is a MULTILINEAR MAP that takes \(p\) copies of \(V^*\) and \(q\) copies of \(V\) and maps multilinearly (linear in each entry) to \(k:\)

\[\bbox[20px, border:2px solid red]{T: \underset{p}{\underbrace{V^*\times \cdots \times V^*}}\times \underset{q}{\underbrace{V\times \cdots \times V}} \overset{\sim}\rightarrow K}\tag1\]

The \((p,q)\) TENSOR SPACE (“the space of (p,q) tensors over a vector space V”) is defined as a set:

\[\bbox[20px, border:2px solid red]{T^p_q\,V = \underset{p}{\underbrace{V\color{darkorange}{\otimes}\cdots\color{darkorange}{\otimes} V}} \color{darkorange}{\otimes} \underset{q}{\underbrace{V^*\color{darkorange}{\otimes}\cdots\color{darkorange}{\otimes} V^*}}:=\{T\, |\, T\, \text{ is a (p,q) tensor}\}}\tag2\]

\[\bbox[20px, border:2px solid red]{\underbrace{=\{T: \underset{p}{\underbrace{V^*\times \cdots \times V^*}}\times \underset{q}{\underbrace{V\times \cdots \times V}} \overset{\sim}\rightarrow K\}}_{\text{the set of all maps from p covecs and q vecs into K}}}\tag3\]

This expression symbolizes the set of all \((p,q)\) tensors \(T\), equipped with pointwise addition and s-multiplication.

For example, for two \((1,1)\) tensors \(T, S \in T^1_1V\):

\[(T+S)(\underbrace{\sigma}_{\text{eat element of }V^*},\underbrace{\mathbf v}_{\text{eat element of }V}) :=T(\sigma,\mathbf v) \underbrace{+}_{\text{field}}S(\sigma,\mathbf v)\]

and

\[(\lambda T)(\sigma,\mathbf v) :=\lambda \underbrace{\cdot}_{\text{multip. in k}}T(\sigma, \mathbf v)\]

So this set of tensors, equipped with these two operations fulfills the requisites of a vector space \((T^p_qV, +, \cdot)\).

This is (not surprisingly) consistent with the Wikipedia definition of tensors as multilinear maps.

Example of tensor spaces:

  1. \(T^0_1 V \equiv V^*:=\{T: V\rightarrow k\}\). The whole tensor space is just the dual vector space, and an element of \(T^0_1 V\) is just a covector.

  2. \(T^1_1 V \equiv V \color{orange}{\otimes} V^*:= \{T:V^* \times V \rightarrow k\}\cong \text{End}(V^*)\)

This last statement implies a linear bijection. Proof:

Given \(T\in V\color{orange}{\otimes} V^*\) we can construct an element \(\hat T \in \text{End}(V^*)\), i.e. \(\hat T: V^* \rightarrow V^*\), by sending every \(\omega \in V^*\) to \(\underbrace{\underbrace{\color{red}{T(\underbrace{\cdot}_{\text{plug a vec here to get a no.}}, \omega)}}_{V\rightarrow K}}_{\in V^*}.\) Conversely, from \(\hat T\) we can reconstruct the \(T^1_1\) tensor: \(T(\mathbf v,\omega):= \underbrace{\hat T(\omega)}_{\in V^*}(\mathbf v).\)

  3. \(T^1_0 \equiv V\) only for finite-dimensional vector spaces.

  4. \(T^1_1 \equiv \text{End}(V)\) only for finite-dimensional vector spaces.

  5. \(\left(V^*\right)^*\equiv V\) only for finite-dimensional vector spaces.


Constructing new tensors out of old tensors:

The most important way is the tensor product (\(\color{blue}{\otimes}\), different from the \(\color{orange}{\otimes}\) used in the definition of the tensor space). It does not require tensors of the same rank (as addition does). It can take \(T\in T_q^p V\) and \(S\in T^r_s V\) and:

\[T\color{blue}{\otimes}S\in T_{q+s}^{p+r}V\]

defined as:

\[(T\color{blue}{\otimes}S)(\underbrace{ \omega_1,\cdots,\omega_p,\cdots,\omega_{p+r},\; v_1,\cdots,v_q,\cdots,v_{q+s}}_\text{"eats"... "the diet"}):= T(\underbrace{\omega_1,\cdots,\omega_p,\; v_1,\cdots,v_q}_{\text{"eats up" p covecs and q vecs, yielding a no.}})\underbrace{\cdot}_{\text{in k}}S(\underbrace{\omega_{p+1},\cdots,\omega_{p+r},\; v_{q+1},\cdots,v_{q+s}}_{\text{"eats up" r covecs and s vecs, yielding a no.}})\]
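A minimal functional sketch in R for the simplest case (two \((0,1)\) tensors, i.e. covectors, with made-up components): the tensor product feeds its first argument to the first factor, its second argument to the second factor, and multiplies the two numbers.

> covT = function(v) sum(c(1, 2, 0) * v)                  # a (0,1) tensor T
> covS = function(v) sum(c(0, 1, 3) * v)                  # a (0,1) tensor S
> tens = function(x, y) function(v1, v2) x(v1) * y(v2)    # T ⊗ S, a (0,2) tensor
> tens(covT, covS)(c(1, 1, 1), c(2, 0, 1))                # T(v1) * S(v2) = 3 * 3
[1] 9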


Components of a tensor:

We need a basis for a tensor space in order to talk about the components of a tensor. So suppose we have \(T\in T^p_q V\) with \(\dim V<\infty\), with \(\large\{e_1,\cdots,e_{\text{dim } V}\}\) being the basis of \(V\) and \(\large\{\epsilon^1,\cdots,\epsilon^{\text{dim } V}\}\) being the (dual) basis of \(V^*\). We define the components of this abstract thing that is the tensor (abstract in that it is simply a multilinear map); the components are more concrete: they are numbers, and they depend on the chosen bases of \(V\) and \(V^*\).

\[\color{green}{\large T^{\overbrace{a_1,\cdots,a_p}^{\text{numbers}}}_{\quad\quad\quad\quad\underbrace{b_1,\cdots,b_q}_{\text{numbers}}}}:= \underbrace{T(\epsilon^{a_1}, \epsilon^{a_2},\cdots,\epsilon^{a_p},e_{b_1},\cdots,e_{b_q})}_{\text{...yielding a number...}}\in K\tag4\]

The basis for \(V\) is chosen arbitrarily, and the basis for \(V^*\) is then induced by that choice, so that:

\[\Large\color{blue}{\epsilon^a(\underbrace{e_b}_{\text{eating a vec from basis V}})=\delta_b^a}\]

This is the dual basis of the dual space.


Once the bases of \(V\) and \(V^*\) are defined, and we have the tensor components, we can RECONSTRUCT THE ABSTRACT TENSOR, \(T \in T^p_q\,V\)! This is the point!

How?


Reconstruction of \(T\) from its components:


\[T=\underbrace{\sum_{a_1=1}^{\text{dim v sp.}}\cdots\sum_{b_1=1}^{\text{dim v sp.}}}_{\text{p + q sums (usually omitted)}}\underbrace{\color{green}{T^{\overbrace{a_1,\cdots,a_p}^{\text{numbers}}}_{\quad\quad\quad\quad\underbrace{b_1,\cdots,b_q}_{\text{numbers}}}}}_{\text{a number}}\underbrace{\cdot}_{\text{S-multiplication}}\underbrace{e_{a_1}\color{blue}{\otimes}\cdots\color{blue}{\otimes}e_{a_p}\color{blue}{\otimes} \epsilon^{b_1}\color{blue}{\otimes}\cdots\color{blue}{\otimes}\epsilon^{b_q}}_{(p,q)\text{ tensor}}\]

Notice that it is the “blue” tensor product because we are not symbolizing the tensor space (as the set of all maps between the Cartesian product of vectors and covectors and \(K\)), which would call for the “orange” \(\otimes\). Instead this is a tensor product of individual vectors and covectors (the \(e_i\) and \(\epsilon^i\)).

Why are the sum signs often omitted? Because the \(\color{green}{T^{a_1,\cdots,a_p}_{\quad\quad\quad\cdot\cdots}}\;e_{a_1}\color{blue}{\otimes}\cdots\color{blue}{\otimes}e_{a_p}\) combines up and down indices (Einstein summation convention), and the same happens in the other direction \(\color{green}{T^{\cdots\cdots}_{\quad\quad b_1,\cdots,b_q}}\;\epsilon^{b_1}\color{blue}{\otimes}\cdots\color{blue}{\otimes}\epsilon^{b_q}\).
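A minimal R sketch of the reconstruction for \(p=q=1\) (made-up components): summing \(T^a{}_b\; e_a\color{blue}{\otimes}\epsilon^b\) over all \(a, b\) recovers exactly the component matrix we started from.

> M = matrix(c(2, 0, 1,  1, 3, 0,  0, 1, 1), nrow = 3, byrow = TRUE)  # made-up components T^a_b
> E = diag(3)                # columns: basis e_a; rows: dual basis epsilon^b
> Trec = matrix(0, 3, 3)
> for (a in 1:3) for (b in 1:3) Trec = Trec + M[a, b] * (E[, a] %o% E[b, ])
> all(Trec == M)             # the double sum over a and b recovers the components
[1] TRUE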


Example:

Tensor (1,1) expressed as a map \(\phi: V^* \times V\rightarrow K\)

We evaluate \(\phi\) by plugging in some covector \(\underbrace{\mathbf \omega}_{\in V^*}\) and some vector \(\underbrace{\mathbf v}_{\in V}\):

\[\begin{align}\phi(\mathbf \omega, \mathbf v)&\underbrace{=}_{\text{if we have bases}}\phi\left(\sum_{a=1}^d\omega_a \epsilon^a, \sum_{b=1}^d v^b\,e_b\right)\\[2ex] &\underbrace{=}_{\text{lin'ty of }\phi}\sum_{a=1}^d\sum_{b=1}^d\phi\left(\omega_a\epsilon^a, v^b\,e_b\right)\\[2ex] &= \sum_{a=1}^d\sum_{b=1}^d\; \underbrace{\omega_a\,v^b}_{\text{comp'ts cov and vec}}\; \underbrace{\underbrace{\phi\left(\epsilon^a,\,e_b\right)}_{\text{comp'ts tensor }\phi^a_b\text{ Eq(4)}}}_{\text{this is a matrix}} \end{align}\]

We can write

\[\phi=\phi^{a=\text{rows}}_{b=\text{cols}}\;\left(e_a\otimes\epsilon^b\right)\in \text{End}(V)\equiv T^1_1 V\]

and

\[\phi=\begin{bmatrix} \phi^1_1&\phi^1_2&\cdots&\phi^1_d\\ \phi^2_1\\ \vdots\\ \phi_1^d&\cdots&\cdots&\phi^d_d \end{bmatrix}\]

Here’s a connection with matrix multiplication: a covector (\(\mathbf \omega\)) acting on a vector (\(\mathbf v\)):

\[\begin{align}\large\omega(v)&=\omega_a\,\epsilon^a(v^b\,e_b)\\[2ex] &= \omega_a\,v^b\,\underbrace{\epsilon^a\,(e_b)}_{\delta^a_b}\\[2ex] &=\omega_m\underbrace{\cdot}_{\text{matrix product}} v^m \end{align}\]

We could think of \(\omega_m\,v^m\) as \(\omega^\top \cdot v\) with

\(\mathbf \omega =\begin{bmatrix}\omega_1\\\vdots\\\omega_d\end{bmatrix}\) and \(\mathbf v =\begin{bmatrix}v^1\\\vdots\\v^d\end{bmatrix}\), but here we are tied to a given coordinate system.
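A minimal R sketch (made-up components): \(\omega_m v^m\) is just a \(1\times d\) row times a \(d\times 1\) column.

> omega = c(1, 4, 2); v = c(3, 0, 5)
> t(omega) %*% v          # omega_m v^m = 1*3 + 4*0 + 2*5
     [,1]
[1,]   13
> sum(omega * v)          # the same contraction, without matrix shapes
[1] 13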


Change of basis from \(e_i\) to \(\tilde e_i\):

\[\tilde e_a =\sum_{b=1}^d\, \underbrace{A^b_a}_{\text{in K}}\underbrace{\cdot}_{\text{S-multiplication}}\,e_b\]

and

\[ e_a =\sum_{b=1}^d\, \underbrace{B^b_a}_{\text{in K}}\underbrace{\cdot}_{\text{S-multiplication}}\,\tilde e_b\]

in which case \(B=A^{-1}\).
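A minimal R sketch (made-up, invertible \(A\)): with \(\tilde e_a = \sum_b A^b_a\,e_b\), the matrix taking us back from the new basis to the old one is just the inverse.

> A = matrix(c(1, 1, 0,  0, 2, 1,  1, 0, 3), nrow = 3, byrow = TRUE)  # A^b_a: column a holds the old-basis coefficients of e~_a
> B = solve(A)                         # B = A^{-1}
> round(B %*% A, 10)                   # recovers the identity
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1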


This answer on MathSE explains this notation:

Let’s first look at a very special type of tensor, namely the (0,1) tensor. What is it? Well, it is the tensor product of \(0\) copies of members of \(V\) and one copy of members of \(V^*\). That is, it is a member of \(V^*\).

But what is a member of \(V^*\)? Well, by the very definition of \(V^*\) it is a linear function \(\phi:V\to K\). Let’s write this explicitly: \[T^0_1V = V^* = \{\phi:V\to K\,|\,\phi \text{ is linear}\}\] You see, already at this point, where we didn’t even use a tensor product, we get a \(V^*\) on one side, and a \(V\) on the other, simply by inserting the definition of \(V^*\).

From this, it is obvious why \((0,q)\)-tensors have \(q\) copies of \(V^*\) in the tensor product \((2)\), but \(q\) copies of \(V\) in the domain of the multilinear function in \((3)\).

OK, but why do you have a \(V^*\) in the map in \((3)\) for each factor \(V\) in the tensor product? After all, vectors are not functions, are they?

Well, in some sense they are: There is a natural linear map from \(V\) to its double dual \(V^{**}\), that is, the set of linear functions from \(V^*\) to \(K\). Indeed, for finite dimensional vector spaces, you even have that \(V^{**} \cong V\). This natural map is defined by the condition that applying the image of \(v\) to \(\phi\in V^*\) gives the same value as applying \(\phi\) to \(v\). I suspect that the lecture assumes finite dimensional vector spaces. In that case, you can identify \(V\) with \(V^{**}\), and therefore you get \[T^1_0V = V = V^{**} = \{T:V^*\to K|T \text{ is linear}\}\] Here the second equality is exactly that identification.

Now again it should be obvious why \(p\) copies of \(V\) in the tensor product \((2)\) give \(p\) factors of \(V^*\) for the domain of the multilinear functions in \((3)\).

On the relations of those terms to the Kronecker product:

The tensor product \(\color{darkorange}{\otimes}\) in \((2)\) is a tensor product not of (co)vectors, but of (co)vector spaces. The result of that tensor product describes not one tensor, but the set of all tensors of a given type. The tensors are then elements of the corresponding set. And given a basis of \(V\), the tensors can then be specified by giving their coefficients in that basis.

This is completely analogous to the vector space itself. We have the vector space, \(V\), this vector space contains vectors \(v\in V\), and given a basis \(\{e_i\}\) of \(V\), we can write the vector in components, \(v = \sum_i v^i e_i\).

Similarly for \(V^*\), we can write each member \(\phi\in V^*\) in the dual basis \(\omega^i\) (defined by \(\omega^i(e_j)=\delta^i_j\)) as \(\sum_i \phi_i \omega^i\). An alternative way to get the components \(\phi_i\) is to notice that \(\phi(e_k) = \sum_i \phi_i \omega^i(e_k) = \sum_i \phi_i \delta^i_k = \phi_k\). That is, the components of the covector are just the function values at the basis vectors.

This way one also sees immediately that \(\phi(v) = \sum_i \phi(v^i e_i) = \sum_i v^i\phi(e_i) = \sum_i v^i \phi_i\), which is sort of like an inner product, but not exactly, because it behaves differently at change of basis.

Now let’s look at a \((0,2)\) tensor (a tensor with \(0\) vectors and \(2\) covectors), that is, a bilinear function \(f:V\times V\to K\) - I tentatively would look at it as \(\left(\langle\cdot,f\rangle,\langle\cdot,f\rangle\right).\) Note that \(f\in V^*\color{darkorange}{\otimes} V^*\), because \(V^*\color{darkorange}{\otimes} V^*\) is by definition the set of all such functions (see eq. \((3)\)). Now, by being a bilinear function, one again only needs to know the values at the basis vectors. In the following equation \(v\) and \(w\) are two arbitrary vectors, and \(f\) is the tensor with components \(f_{ij}\). The \(e_i\) and \(e_j\) are the basis vectors of \(V\), so that \(v = \sum_i v^i\, e_i\) and \(w = \sum_j\,w^j\,e_j:\)

\[\begin{align}f(v,w) &= f(\sum_i v^i e_i, \sum_j w^j e_j)\\[2ex] &= \sum_{i,j}v^i w^j f(e_i,e_j)\\[2ex] &\underset{*}{=} \sum_{i,j}\,f_{ij}\,v^i\,w^j \in K \end{align}\]

\(*\): we can define the components as \(f_{ij} = f(e_i,e_j)\) and get \(f(v,w)=\sum_{i,j}f_{ij}v^iw^j\).
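A minimal R sketch (made-up components \(f_{ij}\)): once the components are known, evaluating the \((0,2)\) tensor is just \(v^\top F\, w\).

> Fmat = matrix(c(1, 2, 0,  0, 1, 1,  3, 0, 2), nrow = 3, byrow = TRUE)   # components f_ij = f(e_i, e_j)
> v = c(1, 0, 2); w = c(2, 1, 1)
> t(v) %*% Fmat %*% w            # f(v, w) = sum_ij f_ij v^i w^j
     [,1]
[1,]   20
> sum(Fmat * (v %o% w))          # the same sum, written element-wise
[1] 20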

This goes also for general tensors: A single tensor \(T\in T^p_qV\) (\(p\) vectors and \(q\) covectors) is a multilinear function \(T:(V^*)^p\times V^q\to K\), and it is completely determined by the values you get when inserting basis vectors and basis covectors everywhere, giving the components (note that components are numbers!)

\[T^{i\ldots j}_{k\ldots l}=T(\underbrace{\omega^i,\ldots,\omega^j}_{p},\underbrace{e_k,\ldots,e_l}_{q})\in K\]

OK, we now have components, but we have still not defined the tensor product of tensors. This is a way to generate new tensors from old tensors. Another way is to add tensors, which is a map from the Cartesian product \(T^p_q\,V\,\times\,T^p_q\,V\rightarrow T^p_q\,V.\) Notice that the valence of the tensor does not change.

Tensor product:

Let \(x\in T^p_qV\), and \(y\in T^r_sV\). That is, \(x\) is a function that takes \(p\) covectors and \(q\) vectors, and gives a scalar, while \(y\) takes \(r\) covectors and \(s\) vectors to a scalar. Then the tensor product \(x\color{blue}{\otimes} y\) is a function that takes \(p+r\) covectors and \(q+s\) vectors, feeds the first \(p\) covectors and the first \(q\) vectors to \(x\), and the remaining \(r\) covectors and \(s\) vectors to \(y\), and then multiplies the results. That is, \[(x\color{blue}{\otimes} y)(\phi_1,\ldots,\phi_{p+r},v_1,\ldots,v_{q+s}) = x(\phi_1,\ldots,\phi_p,v_1,\ldots,v_q)\underset{\text{scalar}}{\cdot} y(\phi_{p+1},\ldots,\phi_{p+r},v_{q+1},\ldots,v_{q+s})\] It is not hard to check that this function is indeed also multilinear, and therefore \(x\color{blue}{\otimes} y\in T^{p+r}_{q+s}V\).

And now finally, we get to the question what the components of \(x\color{blue}{\otimes} y\) are. Well, the components of \(x\color{blue}{\otimes} y\) are just the function values when inserting basis vectors and basis covectors, and when you do that and use the definition of the tensor product, you find that indeed, the components of the tensor product are the Kronecker product of the components of the factors.
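A minimal numeric check in R, reusing the covectors and vectors from the earlier example: the components of \(\beta\color{blue}{\otimes}\gamma\) can be packed with kronecker(beta, gamma), and contracting them against kronecker(v, w) reproduces \(\langle\beta,v\rangle\langle\gamma,w\rangle\).

> beta = c(pi^(1/2), pi^(1/3), pi^(1/5)); gamma = c(1/3, 1/5, 1/7)
> v = c(1, 7, 5); w = c(2, 0, 3)
> kronecker(beta, gamma) %*% kronecker(v, w)   # Kronecker product of components, then contraction
         [,1]
[1,] 20.05487
> (beta %*% v) * (gamma %*% w)                 # same number as before
         [,1]
[1,] 20.05487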

Also, it can be shown that \(T^p_qV\) is a vector space in its own right, and therefore the \((p,q)\)-tensors can be written as linear combinations of basis tensors, each of which is \(1\) for exactly one combination of basis vectors and basis covectors and \(0\) for all other combinations. It can then easily be seen that each such basis tensor is just the tensor product of the corresponding dual covectors/vectors. Since, furthermore, in that basis the coefficients on the basis tensors are just the components of the tensor as introduced before, we finally arrive at the formula \[T = \sum T^{i\ldots j}_{k\ldots l}\underbrace{e_i\color{blue}{\otimes}\dots\color{blue}{\otimes} e_j}_{p}\color{blue}{\otimes}\underbrace{\omega^k\color{blue}{\otimes}\dots\color{blue}{\otimes}\;\omega^l}_{q}\]



NOTE: These are tentative notes on different topics for personal use - expect mistakes and misunderstandings.