NOTES ON STATISTICS, PROBABILITY and MATHEMATICS


Covariant and Contravariant Tensors:


A concrete explanation can be found here, and it proceeds as follows:

We have two inertial (no acceleration) coordinate systems, \(S\) and \(S'\), where \(S'\) moves away from \(S\) at constant velocity \(v\) along the \(x\) axis - for now we assume no rotation or movement along the other two axes, \(y\) and \(z\).

An event \(P\) happens and it is recorded in \(S\) as \((ct, x, y, z)\). Why \(ct\)? Multiplying the time of the observation by the speed of light expresses time in the same units as the space coordinates. This detail is not important for what follows.

The same event \(P\) will be expressed in relation to the \(S'\) coordinates as \(P=(ct', x', y',z').\)


Definition: CONTRAVARIANT VECTOR:


\(x^\mu = (x^0,x^1,x^2,x^3)\), such that

\(x^0= ct, \quad x^1=x, \quad x^2=y, \quad x^3=z\)

So \(\mu\) is an index, \(\mu= 0,1,2,3.\)


Definition: COVARIANT VECTOR:


\(x_\mu=(x_0,x_1,x_2,x_3)\), such that

\(x_0=ct, \quad x_1=\color{red}{-}x, \quad x_2=\color{red}{-}y, \quad x_3=\color{red}{-}z\)

Why do we need this?

The contravariant and covariant vectors of the same event in relation to the frame \(S'\) are:

\(x'^\mu=(x^{'0}, x^{'1},x^{'2},x^{'3})\quad \text{contravariant}\)

and

\(x'_\mu=(x'_0, x'_1,x'_2,x'_3)\quad\text{covariant}\)


Now we have to bring in the Lorentz transformations for the answer to make sense. They are the equations that change coordinates between \(S\) and \(S'\) taking into consideration time dilation and length contraction (along the \(x\) axis only, since the frames are only moving with respect to each other along \(x\)):

The transformation of time from \(t\) in frame \(S\) to \(t'\) in \(S'\) is given by \(t' = \gamma \left(t - \frac{v}{c^2}x\right)\), while the transformation of \(x\) from \(S\) to \(S'\) obeys the equation \(x' =\gamma(x - vt).\) This is explained here. Multiplying the first equation by \(c\) and adopting the contravariant notation with \(x^0 = ct\), the transformations read:

\(x'^0 = \gamma\left(x^0 - \frac{v}{c} x^1\right)\)

\(x'^1 = \gamma\left(x^1 - \frac{v}{c} x^0\right)\)

\(x'^2 = x^2\)

\(x'^3 = x^3\)

Here, \(\large\gamma=\frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\). Notice that the third and fourth components, \(x^2\) and \(x^3\), do not change in this simple model with two frames, one moving along the \(x\) axis of the other at constant velocity.

We can express the change in frames in matrix form as:

\[\begin{bmatrix}x'^0\\x'^1\\x'^2\\x'^3\end{bmatrix}=\begin{bmatrix}\gamma&-\gamma\beta&0&0\\-\gamma\beta&\gamma&0&0\\0&0&1&0\\0&0&0&1\end{bmatrix}\begin{bmatrix}x^0\\x^1\\x^2\\x^3\end{bmatrix}\]

with \(\beta=\frac{v}{c}.\)
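As a quick numerical check of the matrix form above, here is a minimal sketch in plain Python. The function names `boost_matrix` and `apply`, the event coordinates and the value \(\beta = 0.6\) are made up for illustration and are not part of the original notes.

```python
import math

def boost_matrix(beta):
    """4x4 Lorentz boost along x for beta = v/c, with |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(matrix, vector):
    """Matrix-vector product: x'^mu = sum over nu of Lambda^mu_nu x^nu."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

# Illustrative event (ct, x, y, z) as seen in S, and an illustrative speed.
event_S = [5.0, 3.0, 1.0, 2.0]
event_S_prime = apply(boost_matrix(0.6), event_S)
print(event_S_prime)  # approximately [4.0, 0.0, 1.0, 2.0], up to rounding
```

With \(\beta = 0.6\) we get \(\gamma = 1.25\), so \(x'^0 = 1.25\,(5 - 0.6 \cdot 3) = 4\) and \(x'^1 = 1.25\,(3 - 0.6 \cdot 5) = 0\), while \(x^2\) and \(x^3\) are untouched.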

If \(S'\) moves in more than just the \(x\) direction, the matrix formula above can be generalized as:

\[\begin{bmatrix}x'^0\\x'^1\\x'^2\\x'^3\end{bmatrix}=\begin{bmatrix}\Lambda^0_{\; 0}&\Lambda^0_{\; 1}&\Lambda^0_{\; 2}&\Lambda^0_{\; 3}\\\Lambda^1_{\; 0}&\Lambda^1_{\; 1}&\Lambda^1_{\; 2}&\Lambda^1_{\; 3}\\\Lambda^2_{\; 0}&\Lambda^2_{\; 1}&\Lambda^2_{\; 2}&\Lambda^2_{\; 3}\\\Lambda^3_{\; 0}&\Lambda^3_{\; 1}&\Lambda^3_{\; 2}&\Lambda^3_{\; 3}\end{bmatrix}\begin{bmatrix}x^0\\x^1\\x^2\\x^3\end{bmatrix}\]

This can be expressed more succinctly as:

\[\color{red}{x'^{\mu} = \Lambda^\mu_{\; \nu}x^\nu}\]

The same can be done with the covariant vector transformation:

\[\begin{bmatrix}x'_0\\x'_1\\x'_2\\x'_3\end{bmatrix}=\begin{bmatrix}\Lambda_0^{\; 0}&\Lambda_0^{\; 1}&\Lambda_0^{\; 2}&\Lambda_0^{\; 3}\\\Lambda_1^{\; 0}&\Lambda_1^{\; 1}&\Lambda_1^{\; 2}&\Lambda_1^{\; 3}\\\Lambda_2^{\; 0}&\Lambda_2^{\; 1}&\Lambda_2^{\; 2}&\Lambda_2^{\; 3}\\\Lambda_3^{\; 0}&\Lambda_3^{\; 1}&\Lambda_3^{\; 2}&\Lambda_3^{\; 3}\end{bmatrix}\begin{bmatrix}x_0\\x_1\\x_2\\x_3\end{bmatrix}\]

or

\[\color{blue}{x'_{\mu} = \Lambda_\mu^{\; \nu}x_\nu}\]

The Lorentz transformation \(\Lambda\) is defined so that \(x'^{\mu}x'_{\mu} = x^\mu x_\mu\) (i.e. \(\small \text{contravariant in S'} \times \text{covariant in S'}=\text{contravariant in S} \times \text{covariant in S}\)).

We can express it as:

\[\Large \color{red}{x'^{\mu}}\,\color{blue}{x'_{\mu}} = x^\mu \,x_\mu= \color{red}{\Lambda^\mu_{\;\;\nu}\,x^\nu}\,\color{blue}{\Lambda_\mu^{\;\;\rho}\,x_\rho}= \Lambda^{\mu}_{\;\;\nu}\,\Lambda_\mu^{\;\;\rho}\,\,x^\nu \,x_\rho=\delta^\rho_{\;\;\nu}\,x^\nu\,x_\rho\]

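Why does it work? The Kronecker delta \(\delta^\rho_{\;\;\nu}\) is one when \(\rho=\nu\) and zero otherwise, so only the terms with \(\rho=\nu\) survive and the final part of the equation above becomes \(x^\nu\,x_\nu,\) which is the same sum as \(x^\mu\,x_\mu\) because the name of a summed index is arbitrary. This fulfills the requisite \(x'^{\mu}x'_{\mu} = x^\mu x_\mu.\)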

Leading to the important result:

\[\large\color{orange}{\Lambda^\mu_{\;\;\nu}\,\Lambda_\mu^{\;\;\rho}=\delta^\rho_{\;\;\nu}}\]
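The relation above can be checked numerically for the boost along \(x\). This is only a sketch: the explicit covariant matrix used here (the same boost with the sign of \(\gamma\beta\) flipped) is not written out in these notes. It is an assumption that follows from flipping the sign of the spatial components on both sides of the contravariant transformation. The names `boost_pair` and `contract_first_index` are hypothetical.

```python
import math

def boost_pair(beta):
    """Return (Lambda^mu_nu, Lambda_mu^rho) for a boost along x.

    Assumption: the covariant matrix is the contravariant one with the
    sign of gamma*beta flipped, as implied by x_0 = x^0 and x_i = -x^i.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    gb = gamma * beta
    contra = [[gamma, -gb, 0.0, 0.0], [-gb, gamma, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    co = [[gamma, gb, 0.0, 0.0], [gb, gamma, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    return contra, co

def contract_first_index(contra, co):
    """result[rho][nu] = sum over mu of contra[mu][nu] * co[mu][rho]."""
    return [[sum(contra[mu][nu] * co[mu][rho] for mu in range(4))
             for nu in range(4)] for rho in range(4)]

lam_contra, lam_co = boost_pair(0.6)  # beta = 0.6 is an illustrative value
delta = contract_first_index(lam_contra, lam_co)

# Compare against the Kronecker delta entry by entry.
is_identity = all(
    math.isclose(delta[rho][nu], 1.0 if rho == nu else 0.0, abs_tol=1e-12)
    for rho in range(4) for nu in range(4)
)
print(is_identity)
```

The diagonal entries reduce to \(\gamma^2(1-\beta^2) = 1\) and the off-diagonal ones to \(\gamma^2\beta - \gamma^2\beta = 0\), which is why the check succeeds for any \(|\beta| < 1\).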


OK. So now we have covariant and contravariant vectors expressing the event with respect to \(S\) and \(S'\)… and Lorentz transformations… we can do this…

\[\large x^\mu x_\mu= x^0 x_0 + x^1 x_1 + x^2 x_2 + x^3 x_3 = c^2t^2 - \left(x^2 + y^2 + z^2 \right)=\color{red}{c^2t^2 - \vec{x}\cdot\vec{x}}\]

\(\vec{x}\cdot\vec{x}\) is the squared norm of the spatial part! And the result is a scalar, with the same value in \(S\) and in \(S'\).

How to express this operation generally? We use the Einstein summation convention, in which an index that appears once as a superscript and once as a subscript is summed over:

\[A^\mu B_\mu=A^0B_0+A^1B_1+A^2B_2+A^3B_3\]

where \(A^\mu =(A^0,A^1,A^2,A^3)\) has the covariant counterpart \(A_\mu=(A_0,A_1,A_2,A_3) = (A^0,-A^1,-A^2,-A^3).\) The same holds for \(B^\mu.\)
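The contraction can be sketched in a few lines of Python. The helper names `lower_index` and `contract` and the sample four-vectors are illustrative assumptions. The second vector is the first one after a boost with \(\beta = 0.6\) along \(x\), so both contractions should agree.

```python
def lower_index(a_contra):
    """Covariant components: keep the time part, flip the sign of space."""
    return [a_contra[0], -a_contra[1], -a_contra[2], -a_contra[3]]

def contract(a_contra, b_contra):
    """A^mu B_mu: lower the index of B, then sum over mu = 0..3."""
    return sum(a * b for a, b in zip(a_contra, lower_index(b_contra)))

event_S = [5.0, 3.0, 1.0, 2.0]        # (ct, x, y, z) in S, made-up numbers
event_S_prime = [4.0, 0.0, 1.0, 2.0]  # the same event in S' for beta = 0.6

print(contract(event_S, event_S), contract(event_S_prime, event_S_prime))
# prints 11.0 11.0
```

Both frames give \(25 - 9 - 1 - 4 = 16 - 0 - 1 - 4 = 11\), which is the invariance \(x'^{\mu}x'_{\mu} = x^\mu x_\mu\) in a concrete case.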


Tensors:



Change of Coordinates between Cartesian and Oblique:

From this video.

This introductory online series on differential geometry introduces the contravariant (\(A^i\)) and covariant (\(A_i\)) components of a vector through a change of coordinates:

In the setting of oblique coordinates, the \(y\) axis is tilted, but the \(x\) axis stays unchanged:

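The contravariant components are calculated as follows, with \(a = c_x\) and \(b = c_y\) the Cartesian components of the vector.

Assigning the symbol \(*\) to the pink segment (the horizontal leg of the right triangle with angle \(\alpha\) and opposite side \(b\)):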

\(\text{tan}(\alpha) = \frac{b}{*}\); and therefore, \(* = \frac{b}{\text{tan}(\alpha)}\)

and \(c^u = a - * = a - \frac{b}{\text{tan}(\alpha)}\)

while \(c^v = \frac{b}{\text{sin}(\alpha)}\)

\[H = \begin{bmatrix} 1 & - \frac{1}{\text{tan}(\alpha)} \\ 0 & \frac{1}{\text{sin}(\alpha)} \end{bmatrix}\]

Therefore,

\[\begin{bmatrix} 1 & - \frac{1}{\text{tan}(\alpha)} \\ 0 & \frac{1}{\text{sin}(\alpha)} \end{bmatrix} \begin{bmatrix} c_x \\ c_y \end{bmatrix}= \begin{bmatrix} c^u \\ c^v \end{bmatrix} = \begin{bmatrix} c^1 \\ c^2 \end{bmatrix} \]
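A minimal sketch of the contravariant calculation, assuming the oblique basis consists of the unit vectors \(e_u = (1, 0)\) and \(e_v = (\cos\alpha, \sin\alpha)\). That basis is an assumption consistent with the formulas above, not something stated in these notes. The function name and the sample vector are also illustrative.

```python
import math

def contravariant_components(a, b, alpha):
    """Apply H to the Cartesian components (a, b)."""
    c_u = a - b / math.tan(alpha)
    c_v = b / math.sin(alpha)
    return c_u, c_v

alpha = math.radians(60)       # illustrative tilt of the second axis
a, b = 2.0, math.sqrt(3.0)     # illustrative Cartesian components
c_u, c_v = contravariant_components(a, b, alpha)

# Contravariant components are the coefficients of the basis vectors,
# so c_u * e_u + c_v * e_v should rebuild the Cartesian vector.
rebuilt = (c_u + c_v * math.cos(alpha), c_v * math.sin(alpha))
print(c_u, c_v, rebuilt)       # roughly 1, 2 and (2, 1.732)
```

Here \(c^v = \sqrt{3}/\sin 60^\circ = 2\) and \(c^u = 2 - \sqrt{3}/\tan 60^\circ = 1\), up to floating-point rounding.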



On the other hand, the covariant components are calculated as:

\(c_u = a\) and

\(\bf c_v = a\,\text{cos}(\alpha) + b\, \text{sin}(\alpha)\)


This expression is derived as follows:

The angles \(\alpha\) and \(\beta\) are complementary, and \(\sin \alpha = \cos \beta\).

\(\varphi= b \cos \beta = b \sin \alpha\)

\(\psi = a \cos \alpha\)

\(c_v = \psi + \varphi = a \cos \alpha + b \sin \alpha\)

It follows that

\[M = \begin{bmatrix} 1 & 0 \\ \cos \alpha & \sin \alpha \end{bmatrix}\]

Therefore,

\[\begin{bmatrix} 1 & 0 \\ \cos\alpha & \sin \alpha \end{bmatrix} \begin{bmatrix} c_x \\ c_y \end{bmatrix}= \begin{bmatrix} c_u \\ c_v \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \]
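The covariant components can be sketched the same way. The reading of them as projections (dot products) onto unit vectors \(e_u = (1, 0)\) and \(e_v = (\cos\alpha, \sin\alpha)\) is an assumption that matches the matrix \(M\). The names and numbers below are illustrative.

```python
import math

def covariant_components(a, b, alpha):
    """Apply M to the Cartesian components (a, b)."""
    c_u = a
    c_v = a * math.cos(alpha) + b * math.sin(alpha)
    return c_u, c_v

alpha = math.radians(60)       # illustrative tilt of the second axis
a, b = 2.0, math.sqrt(3.0)     # illustrative Cartesian components
c_u, c_v = covariant_components(a, b, alpha)

# The same numbers obtained as dot products with the assumed basis vectors.
e_u = (1.0, 0.0)
e_v = (math.cos(alpha), math.sin(alpha))
projections = (a * e_u[0] + b * e_u[1], a * e_v[0] + b * e_v[1])
print(c_u, c_v, projections)   # c_v is roughly 2.5
```

For this vector \(c_v = 2\cos 60^\circ + \sqrt{3}\sin 60^\circ = 1 + 1.5 = 2.5\), which differs from the contravariant value of 2 for the same vector and axis.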


\(G\), the METRIC TENSOR (here with upper indices, \(g^{ij}\)), turns the covariant components into the contravariant ones:

\[\begin{align} \bf G &= H\,M^{-1}\\ &= \frac{1}{\sin\alpha}\begin{bmatrix}1&-\frac{1}{\tan\alpha}\\0&\frac{1}{\sin\alpha}\end{bmatrix} \begin{bmatrix} \sin\alpha & 0\\-\cos\alpha &1\end{bmatrix}\\ &=\frac{1}{\sin\alpha} \begin{bmatrix}\sin\alpha+\frac{\cos\alpha}{\tan\alpha} & -\frac{1}{\tan \alpha}\\ -\frac{\cos\alpha}{\sin\alpha} & \frac{1}{\sin\alpha} \end{bmatrix}\\ &=\frac{1}{\sin^2\alpha}\begin{bmatrix}1 & - \cos\alpha\\-\cos\alpha & 1\end{bmatrix} \end{align}\]
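The product \(H\,M^{-1}\) can be checked numerically against the closed form \(\frac{1}{\sin^2\alpha}\begin{bmatrix}1 & -\cos\alpha\\-\cos\alpha & 1\end{bmatrix}\). The sketch below uses hypothetical helpers `matmul2` and `inverse2` and an illustrative angle of 60 degrees, then raises the index of a sample covariant vector.

```python
import math

def matmul2(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

alpha = math.radians(60)
s, c = math.sin(alpha), math.cos(alpha)
H = [[1.0, -c / s], [0.0, 1.0 / s]]  # Cartesian -> contravariant
M = [[1.0, 0.0], [c, s]]             # Cartesian -> covariant
G = matmul2(H, inverse2(M))
G_closed = [[1 / s**2, -c / s**2], [-c / s**2, 1 / s**2]]

# Covariant components of the Cartesian vector (2, sqrt(3)) at 60 degrees.
covariant = [2.0, 2.5]
raised = [G[i][0] * covariant[0] + G[i][1] * covariant[1] for i in range(2)]
print(raised)  # close to [1.0, 2.0], the contravariant components
```

Since \(M\) maps Cartesian to covariant components and \(H\) maps Cartesian to contravariant ones, \(H\,M^{-1}\) goes from covariant back to Cartesian and then on to contravariant, which is the index-raising step.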


We can prove that the length of a vector in oblique coordinates is INVARIANT under the change of coordinates.

\[L^2 = c_x^2 + c_y^2 = c_1\,c^1 + c_2\,c^2=c_k\,c^k\]

However, when we go from Cartesian to oblique coordinates we can’t describe the vector by a single set of components \(A = (A_1,A_2)\) with the squared length calculated as \(A_1^2 + A_2^2\) (the dot product of the vector with itself). Instead, we need covariant \((A_1,A_2)\) and contravariant \((A^1,A^2)\) components, with the squared length calculated as the scalar product \(A_1A^1+A_2A^2.\)
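The invariance of the length can be illustrated numerically by combining the two sets of components for several tilt angles. The function name, the sample vector and the chosen angles are illustrative assumptions.

```python
import math

def squared_length_oblique(a, b, alpha):
    """c_1 c^1 + c_2 c^2 for the Cartesian vector (a, b) and tilt alpha."""
    contra = (a - b / math.tan(alpha), b / math.sin(alpha))
    co = (a, a * math.cos(alpha) + b * math.sin(alpha))
    return co[0] * contra[0] + co[1] * contra[1]

a, b = 2.0, 5.0                      # illustrative Cartesian components
cartesian = a**2 + b**2              # 29.0
results = [squared_length_oblique(a, b, math.radians(deg))
           for deg in (30, 60, 90, 120)]
print(cartesian, results)            # every entry is close to 29
```

Algebraically, \(a\left(a - \frac{b}{\tan\alpha}\right) + (a\cos\alpha + b\sin\alpha)\frac{b}{\sin\alpha} = a^2 + b^2\) because the two cross terms \(\mp\, ab\,\frac{\cos\alpha}{\sin\alpha}\) cancel. Summing the squares of the contravariant components alone would not give this result unless \(\alpha = 90^\circ\).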


The \(G\) matrix turns the covariant components into the contravariant ones: from subscript indices to superscript indices.

In Einstein notation this is \(u^i = g^{ij}\,u_j\), where the repeated index \(j\) is summed over all the dimensions.

The “famous” use of the metric tensor is the line element:

\[dS^2 = g^{ij}\,dx_i\,dx_j= dx^j\,dx_j = dx^1dx_1 +dx^2dx_2\]



NOTE: These are tentative notes on different topics for personal use - expect mistakes and misunderstandings.