From this online post.

Survival analysis examines and models the time it takes for events to occur, termed survival time. The Cox proportional-hazards regression model is the most common tool for studying the dependency of survival time on predictor variables.

The documentation for the package is here.

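In type 1 censoring, the investigator fixes the length of the observation period in advance, and any subject who has not failed by the end of that period is censored. In type 2 censoring, the study stops once a predetermined number \(d\) of failures has occurred, so the length of the observation period is a random variable.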
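The difference between the two schemes can be sketched with a small simulation. This is a minimal illustration, not part of the original analysis: the exponential lifetimes, the cutoff of 12 time units, and `d = 3` are all hypothetical values chosen for the example.

```python
import random

def type1_censor(lifetimes, end_time):
    """Type 1: stop at a fixed time; return (observed_time, event) pairs."""
    return [(min(t, end_time), int(t <= end_time)) for t in lifetimes]

def type2_censor(lifetimes, d):
    """Type 2: stop at the d-th failure; the stopping time depends on the data."""
    stop = sorted(lifetimes)[d - 1]  # time of the d-th failure
    observed = [(min(t, stop), int(t <= stop)) for t in lifetimes]
    return observed, stop

random.seed(1)
# Hypothetical exponential lifetimes with mean 10 (illustrative only)
lifetimes = [random.expovariate(0.1) for _ in range(8)]

type1 = type1_censor(lifetimes, end_time=12.0)
type2, stop_time = type2_censor(lifetimes, d=3)

print("Type 1 events:", sum(e for _, e in type1), "of", len(type1))
print("Type 2 events:", sum(e for _, e in type2), "stopping time:", round(stop_time, 2))
```

Under type 1 the number of events is random and the study length is fixed; under type 2 the number of events is fixed at `d` and the study length is random.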

- Survival Function \(S(t)\):

The probability that a person survives longer than a specified time \(t\). It is often estimated with the Kaplan-Meier curve. The survival function can be thought of as the complement of the cumulative distribution function.

\[S(t) = P(T>t) = 1- P(T \le t)\]
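As a rough illustration of how a Kaplan-Meier curve estimates \(S(t)\) from censored data, here is a minimal sketch of the product-limit estimator. The function name `kaplan_meier` and the toy data are hypothetical; a real analysis would normally rely on an established survival library.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : observed follow-up times
    events : 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Hypothetical toy data: 6 subjects, two of them censored (event = 0)
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects leave the risk set without lowering the curve, which is how the estimator uses partial information from subjects whose event was never observed.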

- Probability density function (pdf) \(f_{T}(t)\):

\[f_{T}(t) = -S'(t) = \lim_{\delta \to 0^+} \frac{P(t \le T < t+\delta)}{\delta}\]

To recover the survivor function, integrate the probability density function from \(t\) to infinity: \(S(t) = \int_t^\infty f_{T}(u)\,du\).
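The relationship between the pdf and the survivor function can be checked numerically. The sketch below assumes a Weibull distribution purely for illustration (the shape and scale values are hypothetical) and compares a trapezoid-rule integral of the pdf with the closed-form survivor function.

```python
import math

def weibull_pdf(t, shape, scale):
    """Weibull probability density function."""
    return (shape / scale) * (t / scale) ** (shape - 1) * math.exp(-((t / scale) ** shape))

def weibull_survival(t, shape, scale):
    """Closed-form Weibull survivor function S(t) = exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))

def survival_from_pdf(pdf, t, upper=200.0, steps=20_000):
    """Approximate S(t) as the integral of the pdf from t to `upper` (trapezoid rule).

    `upper` stands in for infinity; it must be large enough that the
    remaining tail probability is negligible.
    """
    width = (upper - t) / steps
    total = 0.5 * (pdf(t) + pdf(upper))
    for k in range(1, steps):
        total += pdf(t + k * width)
    return total * width

shape, scale = 1.5, 10.0  # hypothetical parameters
t = 5.0
approx = survival_from_pdf(lambda u: weibull_pdf(u, shape, scale), t)
exact = weibull_survival(t, shape, scale)
print(f"integrated pdf: {approx:.6f}, closed form: {exact:.6f}")
```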

- Hazard Function:

The hazard function \(h(t)\) is defined as the instantaneous potential per unit time for the event to occur, given that the individual has survived up to time \(t\). In contrast to the survival function, the hazard function focuses on the event of failing. The higher the hazard, the worse the impact on survival. Because the hazard function is a rate rather than a probability, its values range from zero to infinity.

According to Cox and Oakes, knowing the survivor function is sufficient to derive the hazard function.

\[h(t) = \lim_{\delta \to 0} \Pr(t\le T < t+\delta|T>t)/ \delta\]

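By the definition of conditional probability,

\[h(t) = \lim_{\delta \to 0} \frac{\Pr\left((t\le T < t+\delta) \cap (T > t)\right)}{\delta \times \Pr(T>t)}\]

For a continuous \(T\), the intersection has the same probability as the event \(t \le T < t+\delta\), so

\[h(t) = \lim_{\delta \to 0} \frac{\Pr(t\le T < t+\delta)/\delta}{\Pr(T>t)}\]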

In the limit, the numerator is the pdf and the denominator is the survivor function, so the hazard function is just the pdf divided by the survivor function.

\[h(t) = f_{T}(t) / S(t)\]
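As a worked example, assume (for illustration only) that \(T\) follows a Weibull distribution with shape \(k > 0\) and scale \(\lambda > 0\). These symbols are not part of the original text. Then

\[S(t) = \exp\left(-(t/\lambda)^{k}\right), \qquad f_{T}(t) = -S'(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}\exp\left(-(t/\lambda)^{k}\right)\]

and dividing the pdf by the survivor function gives

\[h(t) = \frac{f_{T}(t)}{S(t)} = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}\]

With \(k = 1\) this reduces to the exponential distribution, whose hazard \(h(t) = 1/\lambda\) is constant over time. With \(k > 1\) the hazard increases with \(t\), and with \(k < 1\) it decreases.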

The Cox proportional-hazards model is a semiparametric hazard model. Its equation has two major components:

\[h(t, X) = h_0(t) \exp\left(\sum_i \beta_i X_i\right)\]

where \(h_0(t)\) is the baseline hazard and \(\exp\left(\sum_i \beta_i X_i\right)\) is the exponential term. The major advantage of the Cox proportional-hazards model is that the baseline hazard does not need to be specified, which is what makes the model semiparametric. In addition, the exponential term ensures that the hazard function is always non-negative.
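One way to see why the baseline hazard can stay unspecified is that it cancels whenever two subjects are compared at the same time point. The sketch below illustrates this under made-up values: the coefficients, covariates, and baseline levels are all hypothetical.

```python
import math

def cox_hazard(baseline, beta, x):
    """Hazard under the Cox model: baseline hazard times exp(sum_i beta_i * x_i)."""
    return baseline * math.exp(sum(b * xi for b, xi in zip(beta, x)))

def hazard_ratio(beta, x_a, x_b):
    """Hazard ratio of subject A to subject B; the baseline hazard cancels."""
    diff = sum(b * (xa - xb) for b, xa, xb in zip(beta, x_a, x_b))
    return math.exp(diff)

# Hypothetical coefficients for two covariates (illustrative values only)
beta = [-0.4, 0.1]
subject_a = [1, 3]  # e.g. treated, 3 prior events
subject_b = [0, 3]  # e.g. untreated, 3 prior events

hr = hazard_ratio(beta, subject_a, subject_b)

# The ratio is the same whatever value the baseline hazard takes at time t
ratios = [
    cox_hazard(h0, beta, subject_a) / cox_hazard(h0, beta, subject_b)
    for h0 in (0.01, 0.5, 2.0)
]
print(f"hazard ratio A vs B: {hr:.4f}")
print("ratios for different baselines:", [round(r, 4) for r in ratios])
```

Because the ratio depends only on the coefficients and the covariate difference, it is constant over time, which is the "proportional hazards" property the model is named for.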

One assumption of the Cox proportional-hazards model is that the covariates are time-independent, that is, they must not change over time. A covariate such as smoking status easily fails this assumption, because people can change their smoking habits. For this reason, there are extensions of the Cox proportional-hazards model that use joint probability techniques.

There are three statistical objectives in the Cox proportional-hazards model:

1. Test for the significance of the effect.
2. Point estimate of the effect.
3. Confidence interval of the effect.
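These three objectives correspond to three standard large-sample quantities. The expressions below assume the usual Wald approach, with \(\hat\beta_i\) the estimated coefficient and \(\operatorname{se}(\hat\beta_i)\) its standard error. The notation is introduced here for illustration.

The significance test uses the Wald statistic, compared against a standard normal distribution:

\[z_i = \frac{\hat\beta_i}{\operatorname{se}(\hat\beta_i)}\]

The point estimate of the effect is the hazard ratio:

\[\widehat{HR}_i = \exp(\hat\beta_i)\]

An approximate 95% confidence interval for the hazard ratio is:

\[\exp\left(\hat\beta_i \pm 1.96 \times \operatorname{se}(\hat\beta_i)\right)\]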

```
##   week arrest fin age  race wexp         mar paro prio educ
## 1   20      1  no  27 black   no not married  yes    3    3
## 2   17      1  no  18 black   no not married  yes    8    4
```

```
## Call:
## coxph(formula = Surv(week, arrest) ~ fin + age + race + wexp +
## mar + paro + prio, data = Rossi)
##
## n= 432, number of events= 114
##
##                    coef exp(coef) se(coef)      z Pr(>|z|)
## finyes         -0.37942   0.68426  0.19138 -1.983  0.04742 *
## age            -0.05744   0.94418  0.02200 -2.611  0.00903 **
## raceother      -0.31390   0.73059  0.30799 -1.019  0.30812
## wexpyes        -0.14980   0.86088  0.21222 -0.706  0.48029
## marnot married  0.43370   1.54296  0.38187  1.136  0.25606
## paroyes        -0.08487   0.91863  0.19576 -0.434  0.66461
## prio            0.09150   1.09581  0.02865  3.194  0.00140 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##                exp(coef) exp(-coef) lower .95 upper .95
## finyes            0.6843     1.4614    0.4702    0.9957
## age               0.9442     1.0591    0.9043    0.9858
## raceother         0.7306     1.3688    0.3995    1.3361
## wexpyes           0.8609     1.1616    0.5679    1.3049
## marnot married    1.5430     0.6481    0.7300    3.2614
## paroyes           0.9186     1.0886    0.6259    1.3482
## prio              1.0958     0.9126    1.0360    1.1591
##
## Concordance= 0.64 (se = 0.027 )
## Rsquare= 0.074 (max possible= 0.956 )
## Likelihood ratio test= 33.27 on 7 df, p=2.362e-05
## Wald test = 32.11 on 7 df, p=3.871e-05
## Score (logrank) test = 33.53 on 7 df, p=2.11e-05
```
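The columns of this summary can be reproduced by hand from `coef` and `se(coef)`. The following sketch recomputes the z statistic, the two-sided p-value, the hazard ratio, and the 95% interval for three rows of the output above, assuming the summary uses the standard Wald (normal approximation) formulas. The helper name `wald_summary` is hypothetical.

```python
import math

def wald_summary(coef, se, z_crit=1.959964):
    """Return (z, p, hazard_ratio, lower, upper) from a coefficient and its standard error."""
    z = coef / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    hr = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    return z, p, hr, lower, upper

# coef and se(coef) copied from the summary output above
rows = {
    "finyes": (-0.37942, 0.19138),
    "age": (-0.05744, 0.02200),
    "prio": (0.09150, 0.02865),
}

results = {name: wald_summary(coef, se) for name, (coef, se) in rows.items()}
for name, (z, p, hr, lower, upper) in results.items():
    print(f"{name:8s} z={z:7.3f} p={p:.5f} HR={hr:.4f} 95% CI=({lower:.4f}, {upper:.4f})")
```

Read this way, the `finyes` row says that receiving financial aid is associated with a hazard of rearrest about 0.68 times that of subjects without aid, holding the other covariates fixed, with a 95% interval that just excludes 1.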