ATTEMPT TO ORGANIZE THIS MATERIAL (now validated, at least in part, by Prof. Shifrin; any remaining errors are mine):

In the Wikipedia example of the tensor product of vector spaces, included in my OP as well as in my previous post here, the tensor product is of the form \(V\otimes V,\) a \((2,0)\) tensor, and results in a form akin to \((1)\) in the OP:

\[A^0 B^0 e_0 \otimes e_0 + A^0 B^1 e_0 \otimes e_1 + \cdots + A^3 B^3 e_3 \otimes e_3\]

equivalent to an outer product, as illustrated in this post:

The tensor product of two vectors \(v\in V\) and \(w \in W\), i.e. \(v\otimes w\in V\otimes W,\) is akin to calculating the outer product of the two vectors:

\[\large v\otimes_o w=\small \begin{bmatrix}-2.3\;e_1\\+1.9\;e_2\\-0.5\;e_3\end{bmatrix}\begin{bmatrix}0.7\;e_1&-0.3\;e_2&0.1\;e_3\end{bmatrix}= \begin{bmatrix}-1.61\;e_1\otimes e_1&+0.69\;e_1\otimes e_2&-0.23\;e_1\otimes e_3\\+1.33\;e_2 \otimes e_1&-0.57\;e_2 \otimes e_2&+0.19\;e_2 \otimes e_3\\-0.35\;e_3 \otimes e_1&+0.15\;e_3 \otimes e_2&-0.05\;e_3 \otimes e_3\end{bmatrix}\]
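The coefficient matrix above can be checked directly in R with the built-in outer-product operator `%o%` (a minimal sketch using the example values; the \(e_i\otimes e_j\) basis labels are implicit in the row and column positions):

```r
v <- c(-2.3, 1.9, -0.5)
w <- c(0.7, -0.3, 0.1)
M <- v %o% w    # M[i, j] = v_i * w_j; same as v %*% t(w)
M               # reproduces the 3 x 3 coefficient matrix above
```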

This is equivalent to the tensor product space \(V^*\otimes V^*\) (the set of all \((0,2)\) tensors) on the slide in the OP. The presenter is tensor-multiplying two covectors expressed in the basis of \(V^*\), without coefficients, yielding the \(16\) pairs of basis covectors of \(V^*\otimes V^*\): \[\{e^0\otimes e^0, \; e^0\otimes e^1, \; e^0\otimes e^2, \;\cdots, \; e^3\otimes e^3\}.\]

The key is to distinguish these forms of the tensor product of vector spaces from their application to other vectors (or covectors), i.e. from when operations such as \[\langle\beta_\mu\,e^\mu\;,\;A^\nu\,e_\nu \rangle\;=\beta_\mu\,A^\nu\,\langle e^\mu\;,\;e_\nu\rangle \;=\beta_\mu\,A^\nu\,\delta^\mu_{\;\nu}\;=\beta_\mu\,A^\mu\;\in \mathbb R\] are carried out, yielding a real number, which is what is explained in the video.

These bilinear maps \(\beta\otimes\gamma:V\times V \to \mathbb R,\) properly interpreted as \([\beta\otimes\gamma](v,w)=\langle \beta,v\rangle\langle\gamma,w\rangle\) (i.e. the tensor \(\beta\otimes\gamma\) acting on two vectors, \(v\) and \(w\)), would correct part \((2)\) of the OP (after Professor Shifrin’s answer) as:

\(\begin{align} &(\beta\otimes\gamma)\left(\sum A^\mu e_\mu,\sum B^\nu e_\nu\right)= \\[2ex] &=\left [ \beta_0\gamma_0\;e^0\otimes e^0+ \; \beta_0\gamma_1\;e^0\otimes e^1+ \;\beta_0\gamma_2\; e^0\otimes e^2+\cdots+ \;\beta_3\gamma_3\; e^3\otimes e^3 \right]\,\small{\left(\sum A^\mu e_\mu,\sum B^\nu e_\nu\right) } \\[2ex] &= \beta_0\gamma_0 A^\mu B^\nu \langle e^0,e_\mu \rangle \; \langle e^0,e_\nu \rangle \; + \; \beta_0\gamma_1 A^\mu B^\nu \langle e^0,e_\mu \rangle \; \langle e^1,e_\nu \rangle +\cdots +\beta_3\gamma_3 A^\mu B^\nu \langle e^3,e_\mu \rangle \; \langle e^3,e_\nu \rangle \\[2ex] &=\beta_0\gamma_0 A^\mu B^\nu \; \delta^0_{\;\mu}\; \delta^0_{\;\nu} \; + \; \beta_0\gamma_1 A^\mu B^\nu \; \delta^0_{\;\mu}\; \delta^1_{\;\nu} +\cdots +\beta_3\gamma_3 A^\mu B^\nu \; \delta^3_{\;\mu}\; \delta^3_{\;\nu} \\[2ex]&= \sum \beta_\mu\gamma_\nu A^\mu B^\nu \end{align}\)

indeed a real number, exemplifying the mapping \(V\times V \to \mathbb R.\) The tensor is defined as

\[\begin{align}\beta\otimes \gamma&:= \beta_0\gamma_0\, e^0\otimes e^0+\beta_0\gamma_1\, e^0\otimes e^1 + \beta_0\gamma_2\, e^0\otimes e^2+\cdots+\beta_3\gamma_3\, e^3\otimes e^3\\[2ex] &=T_{00}\, e^0\otimes e^0+T_{01}\, e^0\otimes e^1 + T_{02}\, e^0\otimes e^2+\cdots+T_{33}\, e^3\otimes e^3\\[2ex] &= T_{\mu\nu}\,e^\mu\otimes\,e^\nu \end{align}\]

As an example, I believe we could illustrate this as follows:

\(\beta \in V^*\) is \(\beta=\color{blue}{\begin{bmatrix}\sqrt{\pi} & \sqrt[3]{\pi} &\sqrt[5]{\pi} \end{bmatrix}}\) and \(\gamma\in V^*\) is \(\gamma=\color{red}{\begin{bmatrix}\frac{1}{3} &\frac{1}{5} &\frac{1}{7} \end{bmatrix}}\). The \((0,2)\)-tensor \(\beta\otimes \gamma\) is the outer product:

\[\begin{align}\beta\otimes_o \gamma&= \begin{bmatrix}\color{blue}{\sqrt\pi}\times \color{red}{\frac{1}{3}}\quad e^1\otimes e^1 &\color{blue}{\sqrt\pi}\times\color{red}{\frac{1}{5}}\quad e^1\otimes e^2 &\color{blue}{\sqrt\pi}\times\color{red}{\frac{1}{7}}\quad e^1\otimes e^3\\ \color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{3}}\quad e^2\otimes e^1 &\color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{5}}\quad e^2\otimes e^2 &\color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{7}}\quad e^2\otimes e^3 \\\color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{3}}\quad e^3\otimes e^1 &\color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{5}}\quad e^3\otimes e^2 &\color{blue}{\sqrt[5]{\pi}}\times \color{red}{\frac{1}{7}}\quad e^3\otimes e^3\end{bmatrix}\\[2ex] &=\begin{bmatrix}\color{red}{\frac{1}{3}}\color{blue}{\sqrt\pi}\quad e^1\otimes e^1&\color{red}{\frac{1}{5}}\color{blue}{\sqrt\pi}\quad e^1\otimes e^2&\color{red}{\frac{1}{7}}\color{blue}{\sqrt\pi}\quad e^1\otimes e^3\\\color{red}{\frac{1}{3}}\color{blue}{\sqrt[3]{\pi}}\quad e^2\otimes e^1&\color{red}{\frac{1}{5}}\color{blue}{\sqrt[3]{\pi}}\quad e^2\otimes e^2&\color{red}{\frac{1}{7}}\color{blue}{\sqrt[3]{\pi}}\quad e^2\otimes e^3\\\color{red}{\frac{1}{3}}\color{blue}{\sqrt[5]{\pi}}\quad e^3\otimes e^1&\color{red}{\frac{1}{5}}\color{blue}{\sqrt[5]{\pi}}\quad e^3\otimes e^2&\color{red}{\frac{1}{7}} \color{blue}{\sqrt[5]{\pi}}\quad e^3\otimes e^3\end{bmatrix} \end{align}\]
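As a quick numerical check of the matrix above, the coefficients \(T_{\mu\nu}=\beta_\mu\gamma_\nu\) are again just an outer product in R (`Tmat` is a name chosen here for illustration):

```r
beta  <- c(pi^(1/2), pi^(1/3), pi^(1/5))
gamma <- c(1/3, 1/5, 1/7)
Tmat  <- beta %o% gamma  # Tmat[mu, nu] = beta_mu * gamma_nu
Tmat[1, 3]               # coefficient of e^1 (x) e^3, i.e. sqrt(pi) * 1/7
```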

Now, if we apply this tensor to the vectors

\[v=\color{magenta}{\begin{bmatrix}1\\7\\5\end{bmatrix}}, w = \color{orange}{\begin{bmatrix}2\\0\\3\end{bmatrix}}\]

\[\begin{align} (\beta \otimes \gamma)[v,w]=&\\[2ex] & \;\color{blue}{\sqrt\pi}\times \color{red}{\frac{1}{3}} \times \color{magenta} 1 \times \color{orange}2 \quad+\quad \color{blue}{\sqrt\pi}\times\color{red}{\frac{1}{5}} \times \color{magenta}1 \times \color{orange} 0 \quad+\quad \color{blue}{\sqrt\pi}\times\,\color{red}{\frac{1}{7}} \times \color{magenta}1 \times \color{orange}3 \\ + &\;\color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{3}} \times \color{magenta}{7} \times \color{orange}2 \quad+\quad \color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{5}} \times \color{magenta}{7} \times \color{orange}0 \quad+\quad \color{blue}{\sqrt[3]{\pi}}\times\color{red}{\frac{1}{7}} \times \color{magenta}{7} \times \color{orange}3 \\ \;+ &\;\color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{3}} \times \color{magenta} 5 \times \color{orange}2 \quad+\quad \color{blue}{\sqrt[5]{\pi}}\times\color{red}{\frac{1}{5}} \times \color{magenta} 5 \times \color{orange}0 \quad+\quad \color{blue}{\sqrt[5]{\pi}}\times \color{red}{\frac{1}{7}} \times \color{magenta}5 \times \color{orange}3 \\[2ex] =&\\ & \color{blue}{\sqrt{\pi}}\;\times\color{magenta} 1 \quad\left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange} 0 \quad+\quad \color{red}{\frac{1}{7}} \times \color{orange}3\right) \\ + &\,\color{blue}{\sqrt[3]\pi} \times \color{magenta}{7}\quad\left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange}0 \quad+\quad \color{red}{\frac{1}{7}} \times \color{orange}3\right) \\ \;+ &\,\color{blue}{\sqrt[5]{\pi}}\times \color{magenta} 5\quad\left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange}0 \quad+\quad \color{red}{\frac{1}{7}} \times \color{orange}3 \right)\\[2ex] =&\\&\small \left(\color{blue}{\sqrt\pi} \times \color{magenta} 1 \quad+\quad \color{blue}{\sqrt[3]\pi} \times \color{magenta}{7} \quad +\quad 
\color{blue}{\sqrt[5]\pi} \times \color{magenta}5 \right) \times \left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange} 0 \quad +\quad \color{red}{\frac{1}{7}} \times \color{orange} 3 \right)\\[2ex] =&\\[2ex]&\langle \color{blue}\beta,\color{magenta}v \rangle \times \langle \color{red}\gamma,\color{orange}w \rangle\\[2ex] =& 20.05487\end{align}\]

The elements of the first vector, \(v,\) multiply separate rows of the outer product \(\beta \otimes_o \gamma,\) while the elements of the second vector \(w\) multiply separate columns. Hence, the operation is not commutative.

Here is the idea with R code:

> v = c(1,7,5); w = c(2,0,3); beta=c(pi^(1/2),pi^(1/3),pi^(1/5)); gamma = c(1/3,1/5,1/7)
> sum(((beta %o% gamma) * v) %*% w) # same as sum((beta %*% t(gamma) * v) %*% w)
[1] 20.05487
> sum(((beta %o% gamma) * w) %*% v) # not a commutative operation:
[1] 17.90857

Or more simply, \(\vec \beta \cdot \vec v \times \vec \gamma \cdot \vec w = 20.05487\)

\[\begin{align} (\beta \otimes \gamma)[v,w]&=\langle \beta,v \rangle \times \langle \gamma,w \rangle\\[2ex] & =\small \left(\color{blue}{\sqrt\pi} \times \color{magenta} 1 \quad+\quad \color{blue}{\sqrt[3]\pi} \times \color{magenta}{7} \quad +\quad \color{blue}{\sqrt[5]\pi} \times \color{magenta}5 \right) \times \left(\color{red}{\frac{1}{3}} \times \color{orange}2 \quad+\quad \color{red}{\frac{1}{5}} \times \color{orange} 0 \quad +\quad \color{red}{\frac{1}{7}} \times \color{orange} 3 \right) \\[2ex] &=18.31097\times 1.095238\\[2ex] &= 20.05487\end{align}\]

> v = c(1,7,5); w = c(2,0,3); beta=c(pi^(1/2),pi^(1/3),pi^(1/5)); gamma = c(1/3,1/5,1/7)
> beta %*% v * gamma %*% w
[1,] 20.05487

Does it obey bilinearity?

\[(\beta\otimes \gamma)[v,w]\overset{?}=(\beta\otimes \gamma)\Bigg[\left(\frac{1}{5}v\right),\left(5\,w\right)\Bigg] \]

> v_prime = 1/5 * v
> w_prime = 5 * w
> beta %*% v_prime * gamma %*% w_prime
[1,] 20.05487   #Check!

\[(\beta\otimes \gamma)[v, u + w]\overset{?}=(\beta\otimes \gamma)[v,u] + (\beta\otimes \gamma)[v,w] \]

> u = c(-2, 5, 9)    # Introducing a new vector...
> beta %*% v * gamma %*% (u + w)
[1,] 49.7012
> (beta %*% v * gamma %*% u) + (beta %*% v * gamma %*% w)
[1,] 49.7012 #... And check!

But the evaluation on the vectors is not commutative:

    v = c(1,7,5); w = c(2,0,3); beta=c(pi^(1/2),pi^(1/3),pi^(1/5)); gamma = c(1/3,1/5,1/7)
    beta %*% w * gamma %*% v
##          [,1]
## [1,] 17.90857
    beta %*% v * gamma %*% w
##          [,1]
## [1,] 20.05487

So, in general, maps can become increasingly complicated; for example, \[V\times V\times V^*\times V\times V^* \to \mathbb R\] would be the set of all possible tensor products of the form \[T_{\mu\nu}{}^{\gamma}{}_{\rho}{}^{\eta}\quad e^\mu\otimes e^\nu\otimes e_\gamma\otimes e^\rho \otimes e_\eta\]

with \(T_{\mu\nu}{}^{\gamma}{}_{\rho}{}^{\eta}\) corresponding to the components of the tensor, which are the only part usually transcribed (the basis vectors are left implicit). Hence, it is important to keep the spacing and order of the sub- and superscripted Greek letters.

However, there is a method to the madness: by convention, the vectors in the tensor product come first, e.g. \(V\otimes V^*\) as opposed to \(V^* \otimes V.\)

The rank of a tensor is similarly expressed as \((\text{number of vectors, number of covectors}),\) so that \(V\otimes V^* \otimes V^*\) symbolizes the set of all possible tensors of rank \((1,2):\)

\[\begin{align} T^\mu{}_{\nu\lambda}\left[e_\mu \otimes e^\nu \otimes e^\lambda\right]\left(B_\eta \,e^\eta, A^\delta\,e_\delta,C^\gamma\,e_\gamma\right)&=T^\mu{}_{\nu\lambda}\,\langle B_\eta\,e^\eta, e_\mu \rangle\,\langle e^\nu,A^\delta e_\delta \rangle\,\langle e^\lambda, C^\gamma e_\gamma\rangle\\[2ex] &=T^\mu{}_{\nu\lambda}\;B_\eta A^\delta C^\gamma \; \langle e^\eta, e_\mu \rangle\,\langle e^\nu, e_\delta \rangle\,\langle e^\lambda, e_\gamma\rangle\\[2ex] &=T^\mu{}_{\nu\lambda}\;B_\eta A^\delta C^\gamma \;\delta^\eta{}_\mu\; \delta^\nu{}_\delta\; \delta^\lambda{}_\gamma\\[2ex] &= T^\mu{}_{\nu\lambda}\;B_\mu\, A^\nu\, C^\lambda \end{align}\]

with the names of the dummy indices in the last line ours to choose.
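The same contraction can be sketched numerically in R. The components `Tarr`, `B`, `A`, `C` below are made up purely for illustration; the point is that the sum over the repeated indices is an ordinary triple sum:

```r
set.seed(1)
Tarr <- array(rnorm(27), dim = c(3, 3, 3))  # Tarr[mu, nu, lambda] plays T^mu_{nu lambda}
B <- c(1, 0, 2)   # covector components B_mu
A <- c(3, 1, 0)   # vector components A^nu
C <- c(0, 2, 1)   # vector components C^lambda

res <- 0          # explicit triple sum over the repeated indices
for (mu in 1:3) for (nu in 1:3) for (la in 1:3)
  res <- res + Tarr[mu, nu, la] * B[mu] * A[nu] * C[la]

# the same contraction, written as a sum against the outer product B (x) A (x) C
stopifnot(all.equal(res, sum(Tarr * (B %o% A %o% C))))
```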

A tensor \(A\) is a member of the \(T^3_2\) tensor product space, i.e. \(A\in T^3_2(V),\) if it is of the form \(A^{\alpha\beta\gamma}{}_{\mu\nu}\, \underbrace {e_\alpha\otimes e_\beta\otimes e_\gamma}_{\text{basis vecs }\in V}\otimes \underbrace{e^\mu\otimes e^\nu}_{\text{covecs }\in V^*},\) meaning that it would “eat” \(3\) covectors and \(2\) vectors to produce a real number: \(V^*\times V^* \times V^* \times V\times V\to \mathbb R.\) So the notation is \(T^{\text{ no. vecs in }\otimes \text{ prod.}}_{\text{ no. covecs in the }\otimes},\) i.e. a rank-\((3,2)\) tensor.

Here is an example of a tensor “eating” covectors and vectors:

\[T^{\alpha\beta\gamma}{}_{\mu\nu}\, \Big [ e_\alpha \otimes e_\beta \otimes e_\gamma\otimes e^\mu\otimes e^\nu\Big ]\left(\underbrace{B_\eta e^\eta, C_\omega e^\omega, F_\epsilon e^\epsilon}_{\text{eats 3 covectors}},\underbrace{Z^\theta e_\theta, Y^\rho e_\rho}_{\text{eats 2 vectors}}\right)\to \mathbb R\]

NOTE on the difference between tensor and tensor product (from the comments by XylyXylyX):

A “tensor” is an element of a “tensor product space”, i.e. \(T^i_j(V).\) A tensor product space contains elements which are “tensor products” of vectors and covectors. So a “tensor product” is a multilinear map built using the tensor product operator \((\otimes)\), and a tensor is the tensor product of some vectors and covectors. Keep in mind that a tensor product of rank \((1,0)\) or \((0,1)\) is not really “multi”-linear, just “linear”, but we still call it a tensor; so \(V\) and \(V^*\) are tensor product spaces, albeit small ones. And rank-\((0,0)\) tensor product spaces are just the real numbers.

Tensors are multilinear maps: they are linear in each argument of \(V\times V\times\cdots\times V^*\times V^*\times \cdots \to \mathbb R\) separately. In particular, if we multiply any one entry by a scalar, keeping every other entry fixed, the result is multiplied by the same scalar, and the map is additive in each entry.

Bottom line definition:

\[\begin{align}&\left(\mathbf e_1\otimes\mathbf e_2\otimes\dots\otimes\mathbf e_p\otimes\mathbf e^1\otimes\mathbf e^2\otimes\dots\otimes\mathbf e^q\right) \left({\underbrace{\alpha_1, \alpha_2,\dots,\alpha_p,}_{\text{linear functionals }\in V^*}} \underbrace{ \vec v_1,\vec v_2, \dots, \vec v_q}_{\vec v_i\in V}\right)\\[2ex] &=\underbrace{\alpha_1(\mathbf e_1)\cdot \alpha_2(\mathbf e_2)\cdots\alpha_p(\mathbf e_p)\;}_{\text{row vecs dotted with col vecs (basis of }V)\text{ and multiplied}}\quad\quad\underbrace{\mathbf e^1(\vec v_1)\cdot\mathbf e^2(\vec v_2)\cdots \mathbf e^q(\vec v_q)}_{\text{ row vecs (basis of V}^*)\text{ dotted with vecs and multiplied}} \end{align}\]

For instance, \(\mathbf e_1=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix}\), and \(\mathbf e^1=\begin{bmatrix} 1 & 0 & \cdots &0\end{bmatrix}\). Hence, all we are doing is selecting components from either row vectors \((\alpha_1)\) or column vectors \((\vec v_i)\), and then multiplying them.
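This “selecting components” is trivial to verify in R (a minimal sketch; `e1_dual` is a name chosen here for the basis covector \(\mathbf e^1\)):

```r
e1_dual <- c(1, 0, 0)   # the basis covector e^1 as a row vector
v <- c(1, 7, 5)
e1_dual %*% v           # picks out the first component of v
```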

This would be a tensor space \(T^p_q(V),\) mapping the Cartesian product of \(p\) elements of the dual space, \(\underbrace{V^* \times V^* \times \cdots}_{p},\) and \(q\) elements of \(V\), i.e. \(\underbrace{\times V \times V\times\cdots}_q,\) to \(\mathbb R.\)

Notice that the components of vectors are written with superscripts, the opposite of their basis vectors: vectors are “contravariant”. The longer the basis vectors, the smaller the components needed to express any given vector, so the components vary contrary to the basis. The gradient naturally moves in the opposite direction: if the coordinates are \(x^i\), the gradient operator is \(\partial_i=\frac{\partial}{\partial x^i},\) carrying a lower index.
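The contravariance of components can be sketched in R: stretch every basis vector and the components of a fixed vector shrink by the same factor (a minimal sketch; `E` holds the basis vectors as columns):

```r
E  <- diag(3)          # columns are the basis vectors e_1, e_2, e_3
v  <- c(1, 7, 5)
comp  <- solve(E, v)   # components of v in this basis
E2 <- 2 * E            # stretch every basis vector to twice its length
comp2 <- solve(E2, v)  # the components shrink by the same factor
stopifnot(all.equal(comp2, comp / 2))
```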


From this Quora answer:

Any vector space imbued with an inner product has a natural \(T^0_2=(\underbrace{0}_{\text{takes 0 cov's}},\underbrace{2}_{\text{takes 2 vec's}})\)-tensor sitting there: the inner product itself: it linearly takes in a pair of vectors and spits out their inner product, an element of the base field.
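In R, with the components \(g_{ij}\) of the standard inner product stored as the identity matrix, this \((0,2)\)-tensor “eating” two vectors is a small sketch (variable names chosen here for illustration):

```r
g <- diag(3)                   # components g_{ij} of the standard inner product
v <- c(1, 7, 5); w <- c(2, 0, 3)
val <- drop(t(v) %*% g %*% w)  # g eats two vectors, returns a real number
stopifnot(all.equal(val, sum(v * w)))  # matches the usual dot product
```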

Similarly, any linear transformation of a vector space acts naturally as a \(T^1_1=(\underbrace{1}_{\text{takes 1 cov}},\underbrace{1}_{\text{takes 1 vec}})\)-tensor.

An example of a higher-order tensor is the determinant: given any linear transformation \(A\), from a vector space (of dimension \(n\)) to itself, \(\det(A)\) is a \(T^0_n=(\underbrace{0}_{\text{takes 0 cov's}},\underbrace{n}_{\text{takes n vec's}})\)-tensor: \(\det(A) (v_1,\cdots, v_n) = (A(v_1)) \land (A(v_2)) \land\cdots\land (A(v_n))\), where “\(\land\)” is the fully-antisymmetrized tensor product (the “wedge” product).
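The multilinearity of the determinant in its input vectors is easy to check in R (a minimal sketch with a made-up matrix `M`, whose columns play the role of the input vectors):

```r
M <- matrix(c(2, 1, 0,
              0, 3, 1,
              1, 0, 2), nrow = 3)  # columns are the n input vectors
M2 <- M
M2[, 1] <- 5 * M[, 1]             # scale the first input vector by 5
stopifnot(all.equal(det(M2), 5 * det(M)))  # det is linear in each slot
```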

And of course as others have mentioned, differential topology and geometry are littered with tensors (and tensor fields / densities).

Practically the same, but more mathy:

\(\newcommand{\Reals}{\mathbf{R}}\newcommand{\Basis}{\mathbf{e}}\newcommand{\Brak}[1]{\left\langle #1\right\rangle}\)Let \((\Basis_{j})_{j=1}^{n}\) denote the standard basis of \(V = \Reals^{n}\) and let \((\Basis^{i})_{i=1}^{n}\) be the dual basis of \(V^{*} = (\Reals^{n})^{*}\). (Where possible below, I’ve taken care to use the dummy indices \(i\) and \(j\) “globally”.)