[What follows was initially posted in CrossValidated].

I will use the following toy tabulated data:

```
Antacid <- matrix(c(64, 178 - 64, 92, 190 - 92), nrow = 2)
dimnames(Antacid) = list(Symptoms = c("Heartburn", "Normal"),
Medication = c("Drug A", "Drug B"))
addmargins(Antacid)
```

```
## Medication
## Symptoms Drug A Drug B Sum
## Heartburn 64 92 156
## Normal 114 98 212
## Sum 178 190 368
```

So we have `368`

patients: `178`

on `Drug A`

, and `190`

on `Drug B`

and we try to see if there are differences in the proportion of heartburn symptoms between drug A and B, i.e. \(p1 = 64/178\) vs \(p2 = 92/190\).

**1.** FISHER EXACT TEST: There is a discussion on Wikipedia about “Controversies”. Based on the hypergeometric distribution, it is probably most adequate when the expected values in any of the cells of a contingency table are below 5 - 10. The story of the RA Fisher and the tea lady is great, and can be reproduced in [R] by simply grabbing the code here. [R] seems to tolerate without a pause the large numbers in our data (no problem with factorials):

```
Antacid <- matrix(c(64, 178 - 64, 92, 190 - 92), nrow = 2)
fisher.test(Antacid, alternative = "two.sided")
Fisher's Exact Test for Count Data
data: Antacid
p-value = 0.02011
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.3850709 0.9277156
sample estimates:
odds ratio
0.5988478
```

**2.** CHI-SQUARE TEST OF HOMOGENEITY: For larger samples (> 5 expected frequency count in each cell) the \(\chi^2\) provides an approximation of the significance value. The test is based on calculating the *expected frequency counts* obtained by cross-multiplying the marginals (assuming normal distribution of the marginals, it makes sense that we end up with a \(\chi^2\) distributed test statistic, since if \(X\sim N(\mu,\sigma^2)\), then \(X^2\sim \chi^2(1))\):

```
Medication
Symptoms Drug A Drug B
Heartburn 156 * 178 / 368 = 75 156 * 190 / 368 = 81
Normal 212 * 178 / 368 = 103 212 * 190 / 368 = 109
```

This can be more easily calculated as:

`(addmargins(expect <- chisq.test(Antacid)$expected))`

```
## Medication
## Symptoms Drug A Drug B Sum
## Heartburn 75.45652 80.54348 156
## Normal 102.54348 109.45652 212
## Sum 178.00000 190.00000 368
```

The degrees of freedom will be calculated as the {number of `populations`

(`Heartburn`

sufferers and `Normals`

, i.e. `2`

) * minus 1* } * {number of levels in the

`categorical variable`

(`Drug A`

and `Drug B`

, i.e. `2`

) `minus 1`

`2x2`

table we are dealing with `1 d.f`

. And crucially, a \(\chi^2\) of \(1\,df\) is exactly a `squared`

\(N \sim (0,1)\) (proof here), which explains the sentence “a chi-square test for equality of two proportions is exactly the same thing as a z-test. The chi-squared distribution with one degree of freedom is just that of a normal deviate, squared. You’re basically just repeating the chi-squared test on a subset of the contingency table” in this post.The Test Statistic is calculated as:

\(\chi^2=\frac{(64-75)^2}{75} + \frac{(92-81)^2}{81} +\frac{(114-103)^2}{103} + \frac{(98-109)^2}{109} = 5.39\), although this is an approximation excluding decimals. The precise calculation of these values is the sum of the cells in:

`(residuals <- chisq.test(Antacid)$residuals^2)`

```
## Medication
## Symptoms Drug A Drug B
## Heartburn 1.739437 1.629578
## Normal 1.279963 1.199124
```

`sum(residuals)`

`## [1] 5.848102`

This is calculated in `R`

with the function `prop.test()`

or `chisq.test()`

, which should yield the same result, as indicated here:

` prop.test(Antacid, correct = F)`

```
##
## 2-sample test for equality of proportions without continuity
## correction
##
## data: Antacid
## X-squared = 5.8481, df = 1, p-value = 0.01559
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.22976374 -0.02519514
## sample estimates:
## prop 1 prop 2
## 0.4102564 0.5377358
```

The proportions are calculated as: \(64/156 = 0.4102564\) and \(114/212 = 0.5377358\). The confidence interval makes reference to the difference in proportions: \(0.4102564 - 0.5377358 = -0.1274794.\)

We don’t need to feed a matrix. A vector of “successes” (in this case heartburn: `x <- c(64, 114)`

) with the total number of cases (`n <- c(156, 212)`

) will result in the same output:

`prop.test(x = c(64, 114), n = c(156, 212), correct = F)`

```
##
## 2-sample test for equality of proportions without continuity
## correction
##
## data: c(64, 114) out of c(156, 212)
## X-squared = 5.8481, df = 1, p-value = 0.01559
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.22976374 -0.02519514
## sample estimates:
## prop 1 prop 2
## 0.4102564 0.5377358
```

or..

` chisq.test(Antacid, correct = F)`

```
##
## Pearson's Chi-squared test
##
## data: Antacid
## X-squared = 5.8481, df = 1, p-value = 0.01559
```

**3.** G-TEST: The Pearson’s chi-test statistic is the second order Taylor expansion around 1 of the G test; hence they tend to converge. In `R`

:

```
library(DescTools)
GTest(Antacid, correct = 'none')
Log likelihood ratio (G-test) test of
independence without correction
data: Antacid
G = 5.8703, X-squared df = 1, p-value = 0.0154
```

**4.** Z-TEST OF PROPORTIONS: The normal distribution is a good approximation for a binomial when \(np>5\) and \(n(1-p)>5\). When the occurrences of successes are small in comparison with the total amount of observations, it is the actual number of expected observations that will determine if a normal approximation of a poisson process can be considered (\(\lambda \geq 5\)).

Although the post hyperlinked is old, I haven’t found in CV an `R`

function for it. This may be due to the fact explained above re: \(\chi^2_{(df=1)}\sim \, N_{(0,1)}^2\).

The Test Statistic (TS) is:

\(\displaystyle Z =\frac{\frac{x_1}{n_1}-\frac{x_2}{n_2}}{\sqrt{p\,(1-p)(1/n_1+1/n_2)}}\) with \(\displaystyle p = \frac{x_1\,+\,x_2}{n_1\,+\,n_2}\), where \(x_1\) and \(x_2\) are the number of “successes” (in our case, sadly, heartburn), over the number of subjects in that each one of the levels of the categorical variable (`Drug A`

and `Drug B`

), i.e. \(n_1\) and \(n_2\).

For a double-tailed test the \(p\)-value will be calcuated as the \(p(|Z|\geq TS)\), which in [R] corresponds to `2 * pnorm(ts,lower.tail = F)`

with `ts = test statistic`

.

In the linked page there is an *ad hoc* formula. I have been toying with a spin-off with a lot of loose ends. It defaults to a two-tailed alpha value of `0.05`

, but can be changed, as much as it can be turned into a one tailed `t = 1`

:

```
zprop = function(x1, x2, n1, n2, alpha=0.05, t = 2){
nume = (x1/n1) - (x2/n2)
p = (x1 + x2) / (n1 + n2)
deno = sqrt(p * (1 - p) * (1/n1 + 1/n2))
z = nume / deno
print(c("Z value:",abs(round(z,4))))
print(c("Cut-off quantile:",
abs(round(qnorm(alpha/t),2))))
print(c("pvalue:", pnorm(-abs(z))))
}
```

In our case:

```
zprop(64, 92 , 178, 190)
[1] Z value: 2.4183
[1] Cut-off quantile: 1.96
[1] pvalue: 0.0077
```

Giving the same z value as the function in the R-Bloggers: `z.prop(64, 178 , 92, 190) [1] -5.44273`

An alternate test statistic is the Wald test:

*test statistic* = \(\Large \frac{\hat p_1 - \hat p_2}{\sqrt{\frac{\hat p_1(1-\hat p_1)}{n_1}+\frac{\hat p_2(1-(\hat p_2))}{n_2}}}\)

which is useful to create confidence intervals for the difference:

\(\large \hat p_1 - \hat p_2 \pm Z_{1-\alpha/2} \sqrt{\frac{\hat p_1(1-\hat p_1)}{n_1}+\frac{\hat p_2(1-(\hat p_2))}{n_2}}\)

Original post here