A population version of Spearman’s rank correlation has been defined in the case of continuous variables.
Consider a population distributed according to two variates X_1 and X_2. Two members (X_1, X_2) and (X_1', X_2') of the population will be called concordant if:
X_1 < X_1',\ X_2 < X_2' \quad \text{or} \quad X_1 > X_1',\ X_2 > X_2'.
They will be called discordant if:
X_1 < X_1',\ X_2 > X_2' \quad \text{or} \quad X_1 > X_1',\ X_2 < X_2'. The probabilities of concordance and discordance are denoted P_c and P_d, respectively. The population version of Spearman's \rho is defined as proportional to the difference between the probability of concordance and the probability of discordance for two vectors (X_1, X_2) and (X_1', X_2'), where (X_1, X_2) has distribution F_{X_1 X_2} with marginal distribution functions F_{X_1} and F_{X_2}, the components X_1' and X_2' are independent with those same marginals, and (X_1, X_2) and (X_1', X_2') are independent:
\rho = 3\left(\Pr\left[(X_1 - X_1')(X_2 - X_2') > 0\right] - \Pr\left[(X_1 - X_1')(X_2 - X_2') < 0\right]\right) = 3\,(P_c - P_d)
The above definition is valid only for populations for which the probabilities of X_1 = X_1' or X_2 = X_2' are zero. The main types of such populations are an infinite population with both X_1 and X_2 distributed continuously, or a finite population where X_1 and X_2 have disjoint ranges.
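As a sanity check, here is a minimal Monte Carlo sketch in R (assuming, purely for illustration, a bivariate normal with correlation \rho = 0.8 and the MASS package for sampling); it approximates P_c and P_d and compares 3(P_c - P_d) with the known closed form (6/\pi)\arcsin(\rho/2) for the Gaussian case:

```r
# Monte Carlo approximation of the population Spearman's rho as 3*(Pc - Pd).
library(MASS)

set.seed(1)
rho   <- 0.8
Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
n     <- 1e5

xy  <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)  # (X1, X2) ~ F_{X1 X2}
x1p <- rnorm(n)                                 # X1' and X2': independent draws
x2p <- rnorm(n)                                 # with the same (standard normal) marginals

s  <- (xy[, 1] - x1p) * (xy[, 2] - x2p)
Pc <- mean(s > 0)
Pd <- mean(s < 0)

3 * (Pc - Pd)              # simulated population Spearman's rho
(6 / pi) * asin(rho / 2)   # closed form for the bivariate normal, ~0.786
```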
This seems virtually identical to the concept of population Kendall’s tau:
In trying to illustrate pictorially the intuition behind the definition of the population Kendall’s tau as
\tau (X_1, X_2) =\Pr\left[(X_1-X_1')\,(X_2 - X_2') >0 \right]- \Pr\left[(X_1-X_1')\,(X_2 - X_2') <0 \right]\tag 1
and the relationship (1) between the random vector (X_1,X_2) and its independent copy (X_1', X_2'), and (2) between these random vectors (2-tuples of dependent random variables) and their joint and marginal densities and distribution functions (pdf's and cdf's), we can take a look at a somewhat illustrative value of X_1 in a bivariate normal distribution with covariance \mathrm{cov}(X_1,X_2)=0.2.
In the example, X_1\sim N(0,0.25) and X_2 \sim N(0,0.25). The Pearson correlation is, therefore,
\rho(X_1,X_2)=\frac{\mathrm{cov}(X_1,X_2)}{\sigma_{X_1}\,\sigma_{X_2}}=\frac{0.2}{\sqrt{0.25\times0.25}}=0.8.
If we consider an illustrative draw from X_1 towards the left of the distribution, e.g. x_1=-1, the conditional mean of X_2 given this value of X_1 will be given by
\begin{align} \mathbb E(X_2|X_1=x)=\mu_{X_2}+\rho \dfrac{\sigma_{X_2}}{\sigma_{X_1}}(x-\mu_{X_1})= 0 + 0.8\times\frac{0.5}{0.5}\times(-1)=-0.8 \end{align}
and these conditional mean values increase linearly from left to right. Assuming a constant conditional variance, its value will be \mathrm{var}(X_2\vert X_1=x)=\sigma^2_{X_2}\,(1-\rho^2)=0.25\times(1-0.64)=0.09.
Now, looking at the first factor in equation (1): (X_1-X_1') will be negative whenever the independent draw from the identically distributed X_1' exceeds x_1=-1, which is highly probable: \Pr(X_1' > -1) = 0.977.
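A quick numerical check of these values, assuming the N(0, 0.25) marginals of the example (standard deviations of 0.5):

```r
mu <- 0; sigma <- 0.5; rho <- 0.8; x1 <- -1

mu + rho * (sigma / sigma) * (x1 - mu)               # E[X2 | X1 = -1]  -> -0.8
sigma^2 * (1 - rho^2)                                 # var(X2 | X1 = x) ->  0.09
pnorm(-1, mean = 0, sd = sigma, lower.tail = FALSE)   # Pr(X1' > -1)     ->  0.977
```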
Looking at the second factor, (X_2-X_2'), we know that \mathbb E[X_2\vert X_1=-1] is below the mean of the marginal distribution of X_2, which was designed to be \mu_{X_2}=0.
Therefore, precisely because it is highly likely that x_1' > x_1, rendering (X_1-X_1')<0, it is also more likely that x_2'>x_2, since x_2' will more probably come from a conditional normal distribution with \mathbb E[X_2'\vert X_1'=x_1' ] > \mathbb E[X_2\vert X_1=-1 ]; in that case (X_2-X_2') will also be negative, rendering (X_1-X_1')\,(X_2 - X_2') >0.
The same argument (inverted) holds if we were to look at a value of x_1 =+1.
Therefore, the positive correlation imposed on this bivariate normal would indeed result in a positive Kendall \tau if we swept (integrated) from -\infty to +\infty, to the extent that \Pr\left[(X_1-X_1')\,(X_2 - X_2') >0 \right]> \Pr\left[(X_1-X_1')\,(X_2 - X_2') <0 \right].
The converse would be easy to show if the chosen correlation had been negative.
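A minimal simulation makes this sweep concrete (a sketch, again assuming the bivariate normal of the example, with \mathrm{cov}(X_1,X_2)=0.2 and variances 0.25, and using MASS for sampling); the difference of the two probabilities should be close to the closed form \tau = \frac{2}{\pi}\arcsin(\rho) for the Gaussian case:

```r
# Check that Pr[(X1-X1')(X2-X2') > 0] exceeds Pr[... < 0] and that their
# difference approximates Kendall's tau for rho = 0.8.
library(MASS)

set.seed(2)
rho   <- 0.8
Sigma <- matrix(c(0.25, 0.2, 0.2, 0.25), 2, 2)  # covariance used in the example
n     <- 1e5

pair1 <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)  # (X1, X2)
pair2 <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)  # independent copy (X1', X2')

s <- (pair1[, 1] - pair2[, 1]) * (pair1[, 2] - pair2[, 2])
mean(s > 0) - mean(s < 0)   # simulated Kendall's tau
(2 / pi) * asin(rho)        # closed form for the bivariate normal, ~0.590
```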
The objective is to look at dependency structure in a vector of random variables, knowing their marginal distributions.
We want to transform each of the random variables to U(0,1).
It is well known that a real-valued, continuous, and strictly monotone function of a single variable possesses an inverse on its range. It is also known that one can drop the assumptions of continuity and strict monotonicity (even the assumption of considering points in the range) to obtain the notion of a generalized inverse. One can often work with generalized inverses as one does with ordinary inverses.
For an increasing function T : \mathbb R \to \mathbb R with T(-\infty) = \lim_{x \downarrow -\infty} T(x) and T(\infty) =\lim_{x \uparrow \infty} T(x), the generalized inverse
T^- : \mathbb R \to \bar{\mathbb R} = [-\infty, \infty]
of T is defined by
T^-(y)=\inf\{x\in \mathbb R: T(x) \geq y\}, y\in \mathbb R,
with the convention that \inf \emptyset =\infty. If T:\mathbb R\to [0,1] is a distribution function, T^-:[0,1]\to \bar {\mathbb R} is also called a quantile function of T.
In this context T is the cdf F. For continuous F we have that F(F^-(y))=y, and if U \sim U(0,1), then X \overset{D}{=} F^-(U).
[From here on, instead of y we will use u]
So if we have a vector of random variables \{X_1, \dots,X_d\}, we have that X_i \overset{D}{=} F^-_i(U_i) and F_i(X_i) \sim U(0,1). So each variable is transformed through its own marginal, \{F_1,\dots,F_d\}.
That is, F_i(X_i) \overset{D}{=} U_i and F_i^- (U_i) \overset{D}{=} X_i.
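A one-line check of this probability integral transform (a sketch that uses a normal marginal purely as an example):

```r
# F(X) should look standard uniform, and F^-(U) should recover the marginal of X.
set.seed(3)
x  <- rnorm(1e5, mean = 2, sd = 3)          # some continuous marginal
u  <- pnorm(x, mean = 2, sd = 3)            # F(X): approximately U(0,1)
hist(u, breaks = 50)                         # flat histogram
x2 <- qnorm(runif(1e5), mean = 2, sd = 3)    # F^-(U): same distribution as x
qqplot(x, x2)                                # points close to the identity line
```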
We can look at the vector \vec u =(F_1(X_1), \dots, F_d(X_d)), and call the joint distribution of the vector \vec u the copula:
\Pr(U_1 \leq u_1, \dots, U_d \leq u_d).
So the copula, \color{red}C is a joint distribution function on [0,1]^d with marginal distributions standard uniform:
For example, if we assume independence between X_i variables,
C(\vec u) = \Pr(U_1 \leq u_1, \dots, U_d \leq u_d) = \Pr(F_1(X_1)\leq u_1, \dots, F_d(X_d) \leq u_d)=\prod_{i=1}^d \Pr(F_i(X_i)\leq u_i)= \prod_{i=1}^d u_i.
This is the independence copula.
In the case of a bivariate independence copula (here), C(u,v)=uv, the surface can be plotted directly.
Looking at the surface plot from the side, it is easy to see that the conditions of a copula hold: C(u,0)=C(0,v)=0, and the margins are uniform, C(u,1)=u and C(1,v)=v.
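A sketch of that surface in R (nothing copula-specific is needed here, just the product uv):

```r
# Surface of the bivariate independence copula C(u, v) = u * v on [0, 1]^2.
u   <- seq(0, 1, length.out = 50)
v   <- seq(0, 1, length.out = 50)
Cuv <- outer(u, v)   # Cuv[i, j] = u[i] * v[j]
persp(u, v, Cuv, theta = -30, phi = 25,
      xlab = "u", ylab = "v", zlab = "C(u,v)", ticktype = "detailed")
```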
Sklar’s theorem:
Given a joint distribution function F with marginal distribution functions F_1,\dots, F_d, there exists a function C (copula), such that F(x_1,\dots, x_d)=C(F_1(x_1),\dots,F_d(x_d)), and for F continuous, C is uniquely given by C(u_1,\dots,u_d)= F(F_1^-(u_1),\dots,F_d^-(u_d)).
Linear correlation does not translate well into the copula space, but rank correlation does - Spearman's or Kendall's:
\begin{align} \tau(X_1, X_2) &= 4 \,\mathbb E\left[ C(U_1, U_2) \right]-1\\[2ex] &=4\int C(u,v)dC(u,v)-1 \end{align}
\begin{align} \rho (X_1, X_2) &=12 \, \mathbb E [U_1 U_2] -3 \\[2ex] &=12\int\int uv\,dC(u,v)-3\\[2ex] &=r(U_1, U_2) \end{align}
where r(U_1, U_2) denotes the Pearson correlation of the uniform-transformed variables U_1 = F_1(X_1) and U_2 = F_2(X_2).
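These identities are easy to check by simulation: Spearman's rho of (X_1, X_2) is just the Pearson correlation of the probability-transformed (uniform) variables. A sketch, assuming a standard bivariate normal with \rho = 0.8 and MASS for sampling:

```r
library(MASS)
set.seed(4)
xy <- mvrnorm(1e5, mu = c(0, 0), Sigma = matrix(c(1, 0.8, 0.8, 1), 2, 2))
u1 <- pnorm(xy[, 1]); u2 <- pnorm(xy[, 2])   # U1 = F1(X1), U2 = F2(X2)

cor(u1, u2)                                  # Pearson correlation of the uniforms
cor(xy[, 1], xy[, 2], method = "spearman")   # sample Spearman's rho: ~ the same
```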
The coefficient of upper tail dependence tells us how high values of one variable relate to high values of the other:
\lambda_U(X_1, X_2) = \lim_{u\to 1^-}\frac{1 - 2u + C(u, u)}{1-u}
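For instance, for the independence copula C(u,u)=u^2 the numerator becomes 1 - 2u + u^2 = (1-u)^2, so \lambda_U = \lim_{u\to 1^-}(1-u) = 0: independent variables have no upper tail dependence.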
Random variables X_1, \dots, X_d are comonotonic if their joint distribution can be written as
(X_1,\dots,X_d)\overset{D}{=}(\alpha_1(Z), \dots, \alpha_d(Z))
for the same stochastic variable Z and strictly increasing functions \alpha_i.
Countermonotonicity, in contrast, is only defined for two variables, whose joint distribution can be expressed as
(X_1, X_2) \overset{D}{=}(\alpha(Z), \beta(Z)) with \alpha increasing and \beta decreasing.
The Gaussian copula was blamed for the financial meltdown of 2008, because it was used to calculate the risk of default of financial instruments.
If \vec X = (X_1, \dots, X_d) is normally distributed \mathcal N(\vec 0,\Sigma), with \Sigma being the correlation matrix (1's in the diagonal), the Gaussian copula is
C_\Sigma^\mathrm{Ga}(\vec u)=\Phi_\Sigma \left(\Phi^{-1}(u_1), \dots, \Phi^{-1}(u_d) \right) with \Phi_\Sigma being the joint distribution function; and \Phi^{-1} the inverse for the standard normal.
We can take \vec Y as the linear transformation \vec Y = \vec \mu + \beta \vec X with \beta corresponding to a diagonal matrix (0’s off diagonal) with \sigma_1,\dots,\sigma_d in the diagonal. Then,
Y\sim \mathcal N(\vec \mu, \beta^\top \Sigma \beta)
which has the same copula.
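A sampling sketch for the bivariate Gaussian copula (the value \rho = 0.8 and the use of MASS are illustrative assumptions): draw from \mathcal N(\vec 0, \Sigma) and push each margin through the standard normal cdf.

```r
library(MASS)
set.seed(5)
Sigma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)
z <- mvrnorm(1e4, mu = c(0, 0), Sigma = Sigma)
u <- pnorm(z)        # each column ~ U(0,1); their dependence is the Gaussian copula
plot(u, pch = ".", xlab = "u1", ylab = "u2")
# Arbitrary margins can then be imposed via quantile functions, e.g. qexp(u[, 1]).
```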
In the bivariate case, suppose X and Y are marginally distributed as \mathcal N(0,1) and have a joint distribution determined by the covariance matrix \begin{bmatrix}1 &\rho\\\rho&1 \end{bmatrix}, with \rho being the correlation. To build up such a joint distribution we can resort to Z\sim \mathcal N(0,1) with \mathrm{cor}(X,Z)=0, and define Y= \rho X+\sqrt{1-\rho^2}\, Z. Because X and Z have expected value zero, so will Y. As for the variance of Y,
\begin{align} \sigma^2(Y)&=\sigma^2\left( \rho X+\sqrt{1-\rho^2} Z \right)\\[2ex] &= \rho^2\mathrm{var}(X)+\left(\sqrt{1-\rho^2} \right)^2\mathrm{var}(Z)+2\rho\sqrt{1-\rho^2}\mathrm{cov}(X,Z)\\[2ex] &= \rho^2 + 1 -\rho^2=1 \end{align}
because \mathrm{cov}(X,Z)=0.
So, since Y is a linear combination of two mutually independent normal random variables, it will be Gaussian with mean 0, variance 1, and correlated with X through \rho.
This is reflected in this answer in CV.SE in the function
```r
correlatedValue <- function(x, r) {
  r2 <- r^2
  ve <- 1 - r2                                # variance of the noise term: 1 - rho^2
  SD <- sqrt(ve)
  e  <- rnorm(length(x), mean = 0, sd = SD)   # e = sqrt(1 - rho^2) * Z, with Z ~ N(0,1)
  y  <- r * x + e                             # y = rho * x + sqrt(1 - rho^2) * Z
  return(y)
}
```
where r=\rho and Y= \rho X+ \sqrt{1-\rho^2}Z
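A usage sketch (the sample size and r = 0.8 are arbitrary choices):

```r
set.seed(6)
x <- rnorm(1e5)
y <- correlatedValue(x, r = 0.8)
cor(x, y)   # ~ 0.8
var(y)      # ~ 1
```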
An Archimedean copula is a copula that can be expressed as
C(u,v) = \varphi^{[-1]}\left( \varphi(u) + \varphi(v) \right)
where \varphi : [0,1] \to [0,\infty] is a continuous, strictly decreasing, convex function with \varphi(1)=0, called the generator,
and \varphi^{[-1]} is the pseudo-inverse of it.
\varphi^{[-1]}(t) is \varphi^{-1} (i.e. the actual inverse function) if 0\leq t \leq \varphi(0), and \varphi^{[-1]}(t)=0 if \varphi(0)\leq t < \infty.
Example
Let \varphi_\theta(t) = (1-t)^\theta for 1\leq \theta < \infty
Now \varphi_\theta(0) =1 and
\varphi^{[-1]}(t)= 1 - t^{1/\theta} at 0\leq t \leq 1 and zero otherwise.
The copula through this generator is
\begin{align} C(u,v) &= \varphi^{[-1]}(\varphi(u)+\varphi(v))\\[2ex] &= \varphi^{[-1]}\left( (1-u)^\theta + (1-v)^\theta \right)\\[2ex] &=\max\left\{1 - \left [(1-u)^\theta + (1-v)^\theta \right]^{1/\theta},\ 0\right\} \end{align}
So, for instance, in the case of Clayton's copula (one of the Archimedean families tabulated on Wikipedia), the generator is
\varphi(u) =\frac{u^{-\theta}-1}{\theta} which can be proven to be strictly decreasing and convex (second derivative).
The copula will be given by C(u_1, u_2) = \varphi^{[-1]}\left(\varphi(u_1) + \varphi(u_2) \right). Setting t = (u^{-\theta}-1)/\theta and solving for u gives
\varphi^{-1}(t) = \left(t\,\theta + 1\right)^{-1/\theta} with \theta > 0; since \varphi(0)=\infty, the pseudo-inverse coincides with the ordinary inverse. Now we can go back to the general AC formula and substitute
C(u_1, u_2) = \left( u_1^{-\theta} + u_2^{-\theta} -1 \right)^{-1/\theta}
For Archimedean copulas, Kendall's tau reduces to
\tau = 1 + 4 \int_0^1 \frac{\varphi(u)}{\varphi'(u)}du
and the sample estimate of Kendall's tau replaces the probabilities with the corresponding sample proportions,
\hat \tau = \widehat{\Pr}\left[(x_i- x_j)(y_i - y_j) >0 \right ]- \widehat{\Pr}\left[ (x_i - x_j)(y_i -y_j) <0\right]= \frac{c}{\binom{n}{2}}-\frac{d}{\binom{n}{2}}
where c and d are the numbers of concordant and discordant pairs among the \binom{n}{2} pairs of observations.
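A direct, if O(n^2), implementation of this count in R, checked against the built-in estimator (a sketch with simulated data; the choice of n = 200 and \rho = 0.8 is arbitrary):

```r
# Sample Kendall's tau by counting concordant minus discordant pairs.
sample_tau <- function(x, y) {
  n <- length(x)
  s <- 0
  for (i in 1:(n - 1))
    for (j in (i + 1):n)
      s <- s + sign((x[i] - x[j]) * (y[i] - y[j]))
  s / choose(n, 2)
}

set.seed(8)
x <- rnorm(200)
y <- 0.8 * x + sqrt(1 - 0.8^2) * rnorm(200)
sample_tau(x, y)
cor(x, y, method = "kendall")   # same value (no ties with continuous data)
```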
Equating this estimate to \tau = 1 + 4 \int_0^1 \frac{\varphi(u)}{\varphi'(u)}du, we can solve for the parameter \theta of the generator. For instance, tau is
\tau = \frac{\theta}{\theta+2}
for Clayton’s copula.
\tau = 1 - \frac{4}{\theta}\left[1 - \frac{1}{\theta}\int_0^\theta \frac{t}{\exp(t)-1}\,dt \right]
for Frank’s copula.
\tau = \left(\frac{3\theta-2}{3\theta} \right)-\frac{2}{3} \left(1 -\frac{1}{\theta} \right)^2 \ln(1-\theta)
for Ali-Mikhail-Haq copula. And
\tau = \frac{\theta -1}{\theta}
for Gumbel-Hougaard.
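As a check, plugging the Clayton generator \varphi(u) = (u^{-\theta}-1)/\theta into the integral formula above recovers the first of these expressions:
\begin{align} \frac{\varphi(u)}{\varphi'(u)} &= \frac{(u^{-\theta}-1)/\theta}{-u^{-\theta-1}} = -\frac{u - u^{\theta+1}}{\theta}\\[2ex] \int_0^1 \frac{\varphi(u)}{\varphi'(u)}\,du &= -\frac{1}{\theta}\left(\frac{1}{2} - \frac{1}{\theta+2}\right) = -\frac{1}{2(\theta+2)}\\[2ex] \tau &= 1 + 4\left(-\frac{1}{2(\theta+2)}\right) = 1 - \frac{2}{\theta+2} = \frac{\theta}{\theta+2} \end{align}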
Simulation with a particular copula to generate a dependent pair U and V:
U \sim U(0,1);\quad T\sim U(0,1);\quad S= \varphi'(U)/T;\quad W=(\varphi')^{-1}(S);\quad V=\varphi^{[-1]}\left(\varphi(W) - \varphi(U)\right). The pair (U, V) then has the desired copula.
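A sketch of this algorithm for the Clayton generator (the choice \theta = 2, which implies \tau = \theta/(\theta+2) = 0.5, is purely illustrative):

```r
# Draw n pairs (U, V) from a Clayton copula via the generator-based steps above.
set.seed(7)
theta <- 2
n     <- 2000

phi      <- function(x) (x^(-theta) - 1) / theta       # generator
phi_inv  <- function(x) (x * theta + 1)^(-1 / theta)   # phi^{[-1]} (ordinary inverse here)
dphi     <- function(x) -x^(-theta - 1)                # phi'
dphi_inv <- function(x) (-x)^(-1 / (theta + 1))        # (phi')^{-1}

u  <- runif(n)
tt <- runif(n)
s  <- dphi(u) / tt
w  <- dphi_inv(s)
v  <- phi_inv(phi(w) - phi(u))

plot(u, v, pch = ".")             # lower-tail clustering, typical of Clayton
cor(u, v, method = "kendall")     # should be close to 0.5
```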
NOTE: These are tentative notes on different topics for personal use - expect mistakes and misunderstandings.