\(\color{blue}{\text{Expected}}\) value of a \(\color{red}{\text{power}}\) of a random variable.

We use LOTUS to calculate this expectation: the expected value of a function of the variable (in this case a power) is obtained by multiplying that function by the pdf and integrating:

\(\large \mathbb{E}[\color{blue}{X^k}] = \displaystyle\int_{-\infty}^{\infty}\color{blue}{x^k}\,\,\color{green}{\text{pdf}}\,\,\,dx\tag{def. of MOMENT *}\)

\(k\) is the *order* of the moment.

For the *mean*:

\(\large \mathbb{E}[{X^1}] = \displaystyle\int_{-\infty}^{\infty}x^1\,\,\text{pdf}\,\,\,dx\)
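As a numerical sketch of this definition, the raw moments of an assumed standard normal distribution can be computed by integrating \(x^k\) against its pdf with `scipy.integrate.quad` (the choice of distribution is illustrative):

```python
# Numerical check of E[X^k] = integral of x^k * pdf(x) dx,
# using the standard normal pdf as an assumed example distribution.
import math
from scipy.integrate import quad

def pdf(x):
    # standard normal density
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def raw_moment(k):
    value, _err = quad(lambda x: x**k * pdf(x), -math.inf, math.inf)
    return value

print(raw_moment(1))  # first raw moment (the mean), ~0 for N(0, 1)
print(raw_moment(2))  # second raw moment, ~1 for N(0, 1)
```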

There are two types of moments:

- *Raw moment* (moment about the origin): fits the above definition exactly, \(\mu'_k=\mathbb{E}\,[X^k]= \mathbb{E}\,[(X-0)^k]\). The mean is the first *raw* moment.
- *Central moment*: it is centered around the mean, \(\mu_k=\mathbb{E}\,[(X-\mu)^k]\). This is the moment that we need, for instance, to calculate the variance:

Variance is \(\mathrm{Var}[X] = \mathbb{E}\left [ \,(X-\mu)^2 \right ] = \displaystyle\int_{-\infty}^{\infty} (x-\mu)^2 \, \text{pdf}\,dx\).

Alternatively, it can be defined as the difference between the second and the \(\color{red}{\text{squared}}\) first raw moments: \(\mathbb{E}[X^2]\,-\,\mathbb{E}[X]^2.\)
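A quick sketch checking that the two definitions of the variance agree, on an assumed Gaussian sample (mean 5, standard deviation 2, so the true variance is 4):

```python
# Sketch: verify Var[X] = E[X^2] - E[X]^2 = E[(X - mu)^2] on sample data
# (the normal sample below is an assumed example).
import random

random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(100_000)]  # mean 5, sd 2

n = len(sample)
ex  = sum(sample) / n                 # first raw moment, E[X]
ex2 = sum(v * v for v in sample) / n  # second raw moment, E[X^2]

var_raw     = ex2 - ex**2                           # E[X^2] - E[X]^2
var_central = sum((v - ex)**2 for v in sample) / n  # E[(X - mu)^2]

print(var_raw, var_central)  # both should be close to 4 (= 2^2)
```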

- Mean: first *raw* moment.
- Variance: second *central* moment.
- Skew (asymmetry): third *central* moment, \(\color{blue}{\text{standardized}}\), i.e. divided by \(\sigma^3\).
- Kurtosis (peakedness): fourth *central* moment, divided by \(\sigma^4\); subtracting \(3\) from the result gives the *excess* kurtosis.
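These standardized moments can be computed from first principles and checked against `scipy.stats` (the normal sample below is an assumed example; for a normal distribution both skewness and excess kurtosis should be near 0):

```python
# Sketch: skewness (3rd standardized central moment) and excess kurtosis
# (4th central moment / sigma^4, minus 3), computed by hand and compared
# with scipy.stats on an assumed standard normal sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(size=200_000)

mu = x.mean()
sigma = x.std()

skew = np.mean((x - mu)**3) / sigma**3        # ~0 for a normal
kurt = np.mean((x - mu)**4) / sigma**4 - 3.0  # excess kurtosis, ~0 for a normal

print(skew, stats.skew(x))
print(kurt, stats.kurtosis(x))  # scipy's default is also *excess* kurtosis
```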

We can also find \(\mathbb E[X^n]\) directly from the definition and calculate moments through:

\[\large \mathbb E[X^n]= \displaystyle \int_{-\infty}^{\infty} x^n f_X(x)\,dx.\]
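For instance, a symbolic sketch of this integral with `sympy`, assuming an exponential density \(f_X(x)=\lambda e^{-\lambda x}\) on \([0,\infty)\) as the example distribution:

```python
# Sketch: E[X^n] by symbolic integration, for an assumed exponential
# density f(x) = lambda * exp(-lambda * x) on [0, oo).
import sympy as sp

x = sp.symbols('x')
lam = sp.symbols('lambda', positive=True)

f = lam * sp.exp(-lam * x)  # exponential pdf

# E[X^2] = integral of x^2 * f(x) over [0, oo); for this density it is 2/lambda^2
moment2 = sp.integrate(x**2 * f, (x, 0, sp.oo))
print(sp.simplify(moment2))
```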

Probability generating functions only work for discrete distributions.

The equation is:

\[\large \color{blue}{G(z) = \mathbb E[z^X]= \displaystyle \sum_{x=0}^\infty p_x\, z^x}\]

where \(p_x = \Pr\{X=x\}.\)

By differentiating and evaluating at \(1\) we get **factorial** moments (not raw moments):

\[G_X^{(r)}(1)=\mathbb E[X(X-1)\cdots(X-r+1)]\]
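A sketch of this, taking the Poisson PGF \(G(z)=e^{\lambda(z-1)}\) as an assumed example; its \(r\)-th factorial moment is \(\lambda^r\):

```python
# Sketch: factorial moments by differentiating a PGF and evaluating at 1,
# using the Poisson PGF G(z) = exp(lambda*(z - 1)) as an assumed example.
import sympy as sp

z = sp.symbols('z')
lam = sp.symbols('lambda', positive=True)

G = sp.exp(lam * (z - 1))  # Poisson probability generating function

# r-th factorial moment E[X(X-1)...(X-r+1)] = G^(r)(1) = lambda^r
for r in (1, 2, 3):
    print(r, sp.simplify(sp.diff(G, z, r).subs(z, 1)))
```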

The moment generating function (MGF) is defined as:

\[\large M_X(t)=\displaystyle\int_{-\infty}^\infty e^{tx}\,dF(x)\]

Notice that the MGF is the Laplace transform with \(-s\) replaced by \(t\):

\(\large \mathscr L\{f\}(s)= \mathbb E [e^{-sX}].\)

MGFs do not always exist. When they do, the moments are calculated as:

\[\mathbb E[X^n]=M_X^{(n)}(0)=\frac{d^nM_X}{dt^n}(0)\]
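A symbolic sketch, assuming the exponential MGF \(M_X(t)=\lambda/(\lambda-t)\) (valid for \(t<\lambda\)), whose raw moments are \(n!/\lambda^n\):

```python
# Sketch: raw moments as derivatives of an MGF at t = 0, using the
# exponential MGF M(t) = lambda/(lambda - t) as an assumed example.
import sympy as sp

t = sp.symbols('t')
lam = sp.symbols('lambda', positive=True)

M = lam / (lam - t)  # MGF of the exponential distribution (t < lambda)

# E[X^n] = M^(n)(0) = n! / lambda^n
for n in (1, 2, 3):
    print(n, sp.simplify(sp.diff(M, t, n).subs(t, 0)))
```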

Characteristic functions, in contrast, always exist:

\[\large \phi(t)=\mathbb E[e^{itX}]= \displaystyle \int_{-\infty}^{\infty} e^{itx}f_X(x)dx\]

Notice that the characteristic function is essentially the Fourier transform of the probability density function, with one caveat: in probability theory, the characteristic function \(\displaystyle \phi\) of the probability density function \(\displaystyle f\) of a random variable \(\displaystyle X\) of continuous type is defined without a negative sign in the exponential, and, since the units of \(\displaystyle x\) are ignored, there is no \(\displaystyle 2\pi\) either (from Wikipedia):

\(\large \mathscr F\{f(x)\}=\displaystyle \int_{-\infty}^{\infty}e^{-2\pi i k x}\,f(x)\,dx\)

The moments are calculated as:

\[\mathbb E[X^k]=(-i)^k\phi_X^{(k)}(0)\]
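A sketch of this formula, using the standard normal characteristic function \(\phi(t)=e^{-t^2/2}\) as an assumed example (its moments are \(0, 1, 0, 3, \dots\)):

```python
# Sketch: moments from a characteristic function via E[X^k] = (-i)^k phi^(k)(0),
# with the standard normal phi(t) = exp(-t^2/2) as an assumed example.
import sympy as sp

t = sp.symbols('t', real=True)
phi = sp.exp(-t**2 / 2)  # characteristic function of N(0, 1)

# E[X^k] = (-i)^k * phi^(k)(0); for N(0, 1): 0, 1, 0, 3
for k in (1, 2, 3, 4):
    moment = (-sp.I)**k * sp.diff(phi, t, k).subs(t, 0)
    print(k, sp.simplify(moment))
```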