### MOTIVATION FOR THE $$t$$ DISTRIBUTION:

[What follows was first written in a post in SE Mathematics]

The $$t$$-Student distribution is a “natural” $$pdf$$ to define.

If $$X_1, \ldots, X_n$$ are iid observations $$\sim N(\mu,\sigma^2)$$,

$$\Large\frac{\bar{X}\,-\,\mu}{\sigma/\sqrt{n}} \sim N(0,1)$$.

And $$\Large\frac{\bar{X}\,-\,\mu}{\sigma/\sqrt{n}}$$ is the formula of the Z-statistic.

If the $$X_1,\ldots,X_n$$ are not normally distributed, the expression will tend to $$\sim N(0,1)$$ as $$n\mapsto\infty$$, the basis for the central limit theorem (CLT).

If the standard deviation of the population, $$\sigma$$, is unknown we can replace it by the estimation based on a sample, $$S$$, but then the expression (one-sample t-test statistic) will follow a $$t$$-distribution:

$$\Large \frac{\bar{X}\,-\,\mu}{S/\sqrt{n}}\sim t_{n-1}$$

with $$\large S=\sqrt{\frac{\sum(X_i-\bar X)^2}{n-1}}$$ (Bessel’s correction).

Why?

$\Large\frac{\bar{X}\,-\,\mu}{S/\sqrt{n}} = \frac{\bar{X}\,-\,\mu}{\frac{\sigma}{\sqrt{n}}} \frac{1}{\frac{S}{\sigma}}= Z\,\frac{1}{\frac{S}{\sigma}} = \frac{Z}{\sqrt{\frac{\sum(X_i-\bar X)^2}{(n-1)\,\sigma^2}}} \sim\frac{Z}{\sqrt{\frac{\chi_{n - 1}^2}{n-1}}} \sim t_{n-1}\small \tag 1$

How did the chi square ($$\chi^2$$) made its entry into the $$pdf$$? It’s the distribution that models $$X^2$$ with $$X\sim N(0,1)$$.

In general, if $$X_1, \ldots, X_n$$ are $$\sim \chi^2_{1\,df}$$, then $$X_1 +\cdots+X_n \sim \chi^2\,\small(n\text{ df})$$ here.

And in the case of the $$t$$-distribution the chi-square is suitable to model the sum of squared normals $$\large \sum(X_i-\bar X)^2$$ in (1)), a well known property derived here, typically with $$n$$ degrees of freedom, but why is it $$n\,-\,1$$ here? In other words, why…

$$\Large \frac{\sum(X_i - \bar X)^2}{\sigma^2}$$ in equation (1) becomes $$\Large \chi_{n-1}^2$$ and not $$\Large \chi_{n}^2$$?

The explanation is in the answer here.

In a nutshell, there aren’t $$n$$ $$X_i$$ independently distributed random variables corresponding to the different observations, each one following a $$\chi^2$$ distribution when squared ($$\small(X-\bar X)^2$$). In fact there are only $$n-1$$ because of the insertion of the $$\bar X$$ in the mix.

### DERIVATION OF THE $$pdf$$ OF THE $$t$$ DISTRIBUTION:

First let’s recapute equation (1) by multiplying by $$\frac{\sigma}{\sigma}$$ the Z-statistic $$\Large \frac{\bar X - \mu}{\frac{\sigma}{\sqrt{n}}}$$:

$$\Large t= \frac{\bar{X}-\mu}{S/\sqrt{n}}=\frac{\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{\frac{\sum(X_i-\bar X)^2}{(n-1)\,\sigma^2}}} \small \tag1$$

In the above expression, $$(\bar{X}-\mu)/(\mu/\sqrt{n})\,\,\sim \,\,N(0,1)$$

and $$\sqrt{S^2/\sigma^2} =\sqrt{ \frac{\sum_{x=1}^n(X - \bar{X})^2}{n-1}/\sigma^2} \,\,\sim \,\,\sqrt{\chi_{(n-1)}^2/(n-1)}$$.

Assigning $$U = (\bar{X}-\mu)/(\mu/\sqrt{n})$$ and $$V=\sqrt{S^2/\sigma^2}$$, characterized as $$U \sim N(0,1)$$ and $$V \sim \chi_k^2$$, expression 1 becomes $$U/\sqrt{V/k}$$.

With the premise of independence, the joint density is:

$$\Large f_{U,V}(u,v) = \frac{1}{(2\pi)^{1/2}} e^{-u^2/2} \frac{1}{\Gamma(\frac{k}{2})\,2^{k/2}}\,v^{(k/2)-1}\, e^{-v/2}$$ with $$-\infty<u<\infty$$ and $$0<v<\infty$$.

Making the transformation $$\Large t=\frac{u}{\sqrt{v/k}}$$ and $$\Large w=v$$, (hence, $$\Large u=t\,(\frac{w}{k})^{1/2}$$), and with $$\Large (w/k)^{1/2}$$ as the Jacobian, the marginal pdf will be:

$$\large f_T(t) = \int_0^\infty \,f_{U,V}\bigg(t\,(\frac{w}{k})^{1/2},w\bigg)(w/k)^{1/2}dw$$

$$=\Large \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}}\, \int_0^\infty\, e^{-\frac{\bigg(t(\frac{w}{k})^{1/2}\bigg)^2}{2}} w^{(k/2)-1} e^{-(\frac{w}{2})} \frac{w^{1/2}}{k^{1/2}}\,dw$$

$$=\Large \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}k^{1/2}}\, \Large \int_0^\infty\, w^{((k+1)/2)-1}\, e^{-(1/2)(1 + t^2/k)w} \,dw$$

The next step entails identifying in the previous equation the kernel of a gamma distribution pdf:

$$\Large x^{\alpha-1}\,e^{x\,\lambda}$$

with parameters $$\large (\alpha=(k+1)/2,\,\lambda=(1/2)(1+t^2/k))$$.

The generic pdf for the gamma distribution is,

$$\Large \frac{\lambda^\alpha}{\Gamma(\alpha)}\,x^{\alpha-1}\,e^{x\,\lambda}$$

The strategy is then to synthesize the entire gamma pdf within the improper integral in our $$f_T(t)$$ pdf in progress, so that we can simplify it as just $$1$$, as we know to be true of all pdf’s. To get away with it we need to multiply numerator and denominator by the same coefficient:

$$\Large \frac{\Gamma(\alpha)\,\lambda^\alpha}{\Gamma(\alpha)\,\lambda^\alpha}$$. And since neither $$\alpha$$ nor $$\lambda$$ include the integrating factor $$w$$ we can include them inside the integral, or leave them out. Naturally, we want to leave within the integral $$\Large \frac{\lambda^\alpha}{\Gamma(\alpha)}$$, and keep $$\Large \frac{\Gamma(\alpha)}{\lambda^\alpha}$$ outside the integral. Now $$f_T(t)$$ will look hideous for just one second:

$$\Large= \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}k^{1/2}}\, \int_0^\infty\frac{((1/2)(1+t^2/k))^{(k+1)/2}}{\Gamma((k+1)/2)}$$ $$w^{((k+1)/2)-1} e^{-(1/2)(1 + t^2/k)w} dw\, \frac{\Gamma((k+1)/2)}{((1/2)(1+t^2/k))^{(k+1)/2}}$$

… because everything between $$\int$$ and $$dw$$ is just the gamma $$pdf$$ integrated over its entire support, so it becomes $$1$$, and we are left with:

$$=\Large \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}k^{1/2}}\, \frac{\Gamma((k+1)/2)}{((1/2)(1+t^2/k))^{(k+1)/2}}$$

$$=\Large \frac{1}{(2\pi)^{1/2}}\frac{1}{\Gamma(\frac{k}{2})2^{k/2}k^{1/2}}\,\Gamma((k+1)/2)\, \Big[\frac{2}{(1+t^2/k)}\Big]^{(k+1)/2}$$

$$=\Large \frac{\Gamma(\frac{k+1}{2})}{\Gamma(\frac{k}{2})}\, \frac{1}{(2\pi)^{1/2}2^{k/2}k^{1/2}}\, \Big[\frac{2}{(1+t^2/k)}\Big]^{(k+1)/2}$$

$$=\Large \frac{\Gamma(\frac{k+1}{2})}{\Gamma(\frac{k}{2})}\, \frac{1}{(2\pi)^{1/2}2^{k/2}k^{1/2}}\, \frac{2^{(k+1)/2}}{(1+t^2/k)^{(k+1)/2}}$$

$$=\Large \frac{\Gamma(\frac{k+1}{2})}{\Gamma(\frac{k}{2})}\, \frac{1}{(\pi)^{1/2}k^{1/2}}\, \frac{1}{(1+t^2/k)^{(k+1)/2}}$$

$$\Large f_T(t)= \frac{\Gamma(\frac{k+1}{2})}{\Gamma(\frac{k}{2})}\, \frac{1}{\sqrt{k\,\pi}}\, \Big(1+\frac{t^2}{k}\Big)^{-\frac{k+1}{2}}$$

which is the $$pdf$$ of the $$t$$-Student or Gosset distribution with $$k$$ degrees of freedom.