### COMPARING SAMPLE TO POPULATION PROPORTION - Z-TEST, $$\chi$$-SQUARE, and BINOMIAL TEST:

Motivating example:

The national proportion of people experiencing complications after having a particular operation in hospitals is $$20 \%.$$ A hospital decides to take a sample of sized $$n=20$$ from their records. Find the critial regions, at $$5\%$$ level of significance, to test whether or not their proportion of complications differs from the national population. The probability in each tail should be as close to $$2.5\%$$ as possible.

The standardized z value at this level of significance is

qnorm(0.975)
##  1.959964

which corresponds to a proportion $$\hat p$$ value to reject $$H_0$$ of

$\hat p = p \pm z \sqrt{\frac{p(1-q)}{n}}$

or

p = 0.2
n = 20
p + c(-1,1) * qnorm(0.975) * sqrt((p * (1 - p)) / n) 
##  0.02469549 0.37530451 Second example:

Calculate the CI and margin of errors of proportion of university females based on the data survey within the package MASS.

This is extracted from here:

library(MASS)
gender.response = na.omit(survey\$Sex)

n = length(gender.response)
k = sum(gender.response == "Female")
pbar = k/n # Proportion of females among respondents
SE = sqrt((pbar * ( 1 - pbar)) / n)  # Standard error
E = qnorm(1 - 0.05/2) * SE        # Margin of error
CI = pbar + c(- E, E)              # Confidence interval
paste('At',100*(1 - 0.05),'percent confidence level, between', 100 * round(pbar-E,3),
'percent and', 100 * round(pbar+E, 3),
'percent of the university students are female.')
##  "At 95 percent confidence level, between 43.6 percent and 56.4 percent of the university students are female."
paste('The margin of error is', 100 * round(E,3),'percent.')
##  "The margin of error is 6.4 percent."

equivalently, this can be done with

prop.test(k, n, correct=F) 
##
##  1-sample proportions test without continuity correction
##
## data:  k out of n, null probability 0.5
## X-squared = 0, df = 1, p-value = 1
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.4367215 0.5632785
## sample estimates:
##   p
## 0.5

If we want to test whether a one sample proportion $$\hat p$$ is consistent with a population parameter $$p$$ the score test statistic is:

$\text{test statistic} = \large \frac{\hat p - p}{\sqrt{\frac{p\,(1\,-\,p)}{n}}}$

This is equivalent to the z test statistic for sample means, and does follow a Z distribution in large samples. Compare to the z test of proportions to compare two samples here.

The expression in the denominator is all under a squared root because the standard deviation of the Bernoulli distribution is $$\sqrt{p(1 - p)}.$$

Notice that since we are testing under the $$H_0$$ how likely it is to get the $$\hat p$$ we obtained, we use the population proportion $$p$$ to calculate the standard error, as opposed to the sample proportion we would use to build the confidence interval with a Wald interval when the population parameter is not known.

Example:

We want to test if the proportion of side effects is greater than $$p=0.1$$ for a new drug. In our sample of $$n=20$$ cases, $$11$$ had complications.

p     <- 0.1
p.hat <- 11/22
n     <- 20
paste('The z statistic is', z_stat <- (p.hat - p) / sqrt(p * (1 - p) / n))
##  "The z statistic is 5.96284793999944"

This is greater than

qnorm(0.95) # One-sided test
##  1.644854

We can do all this with prop.test() as explained here:

success <- 11
n <- 20
p <- 0.1
prop.test(success, n=n, p=p, alternative="greater", conf.level=0.95, correct = F)
## Warning in prop.test(success, n = n, p = p, alternative = "greater", conf.level
## = 0.95, : Chi-squared approximation may be incorrect
##
##  1-sample proportions test without continuity correction
##
## data:  success out of n, null probability p
## X-squared = 45, df = 1, p-value = 9.852e-12
## alternative hypothesis: true p is greater than 0.1
## 95 percent confidence interval:
##  0.3722077 1.0000000
## sample estimates:
##    p
## 0.55

Notice that the CI does not include the population probability $$p=0.1.$$ We reject the null hypothesis.

Alternatively, we can just run an exact binomial test, which doesn’t need to rely on the normal approximation:

$\Pr(\text{counts} \geq 11)= \displaystyle \sum_{11}^{20} {20\choose X} \,0.1^X\, 0.9^{20-X}$

With [R],

pbinom(10, 20, .1, lower.tail = F) # Lower.tail false implies that it will count 11 and above.
##  7.088606e-07

Alternatively (same result),

binom.test(11, 20, 0.1, alternative = "greater")
##
##  Exact binomial test
##
## data:  11 and 20
## number of successes = 11, number of trials = 20, p-value = 7.089e-07
## alternative hypothesis: true probability of success is greater than 0.1
## 95 percent confidence interval:
##  0.3469314 1.0000000
## sample estimates:
## probability of success
##                   0.55

For two-sided tests calculate both one-sided tests and double the smallest p-value.

Example:

This question appeared here.

A report says that $$82\%$$ of British Columbians over the age of $$25$$ are high school graduates. A survey of randomly selected residents of a certain city included $$1290$$ who were over the age of $$25$$, and $$1012$$ of them were high school graduates. Is the city’s result of $$1012$$ unusually high, low, or neither?

First off let me get the algebraic nomenclature out of the way - I find this extremely slippery and often implied:

1. $$\pi_0$$ is a reference value assumed to be true. It is not necessarily the population proportion, but rather a fixed fraction or proportion to which we compare the sample to. For instance, the problem reads something along the lines: “Is our sample consistent with a population proportion of $$\small \pi_0 = 0.7$$?”

2. $$\pi$$ (or $$p$$) stands for the actual population proportion, but it’s too bad that we usually don’t know it and have to use instead the…

3. $$\hat \pi$$ (or $$\hat p$$), which stands for the sample proportion. To make things more “friendly” sometimes $$p$$ denotes the sample proportion…

4. $$n$$ is the number of trials in the binomial experiment (or the number of sampled subjects in a poll).

5. $$\small Y$$ number of “successes” (“success” interpreted sometimes like the word “positive” in Medicine - you don’t necessarily want it for yourself).

In our case we have $$\small \pi = 0.82$$ and $$\small \hat \pi = 1012/1290 = 0.78$$. And $$\small n = 1290$$.

The MLE of $$\pi$$ is the sample proportion, $$\hat \pi = \small \text{successes/trials}$$, and the expectation for the number of $$\small\text{successes}$$ is $$n\pi$$. The sample proportion is an unbiased estimator of the population: $$\small E(\hat \pi)=\pi$$ (and $$\small E(Y)=n\pi$$) and the standard error behaves very similarly to that of sampling distributions of sample means: $$\small SE\,(\hat \pi) = \sqrt{\frac{\pi(1-\pi)}{n}}$$, remembering that $$\text{var}(\hat \pi)= \pi(1-\pi)$$ (and $$\text{var}(Y)=n\pi(1-\pi)$$).

The test here is a two-sided one-sample proportion test: $$H_0: \pi = \pi_0$$ versus $$H_A: \pi \neq \pi_0$$. Typically, a normal approximation with mean $$\pi$$ and $$\text{var} = \pi(1-\pi)/n$$ under the following conditions: $$n\hat\pi>5$$ and $$n(1-\hat\pi)>5$$. In our case this is clearly met ($$\small 1290 * 0.78 = 1006$$).

The $$z$$ test statistic is

$z=\Large\frac{\hat\pi-\pi_0}{\sqrt{\frac{\pi_0\,(1-\pi_0)}{n}}}$ In our case, $$z =\large \frac{0.78 - 0.82}{\sqrt{\frac{0.82(1-0.82)}{1290}}}=\large \frac{-0.04}{\sqrt{\frac{0.15}{1290}}}= \small-3.32$$

p <- 0.82
n <- 1290
p.hat <- 1012/n
paste('The z-statistic is', z.statistic  <- (p.hat - p) / sqrt((p * (1 - p))/n))
##  "The z-statistic is -3.31915431422705"

which is clearly significant since

c(-1,1) * qnorm(1 - 0.05/2)
##  -1.959964  1.959964

This latter expression corresponding to the [R] code for the two-tailed cut-off quantile values fixing the alpha significance level at 5%: $$z_{(1 - \alpha/2)}$$ where $$\small \alpha = 0.05$$.

As for the Wald confidence intervals, the calculation is:

$$\large \hat \pi \pm z_{(1-\alpha/2)}\,\sqrt{\frac{\hat\pi(1-\hat\pi)}{n}}$$. Coded in [R]:

p.hat + c(-1,1) * qnorm(1 - 0.05/2) * sqrt((p.hat * (1 - p.hat)) / n)
##  0.7620585 0.8069337

which does not include $$\small \pi_0 = 0.82$$.

Since we know the population proportion $$\pi$$ in this case as being $$0.82$$ we can use it to construct the confidence interval as:

p + c(-1,1) * qnorm(1 - 0.05/2) * sqrt((p * (1 - p)) / n)
##  0.7990349 0.8409651

This excludes the sample value $$0.78$$.

2 * (1 - pnorm(abs(z.statistic)))
##  0.0009029052

We can do all this with prop.test():

success <- 1012
n <- 1290
p <- 0.82
prop.test(success, n=n, p=p, alternative="two.sided", conf.level=0.95, correct = F)
##
##  1-sample proportions test without continuity correction
##
## data:  success out of n, null probability p
## X-squared = 11.017, df = 1, p-value = 0.0009029
## alternative hypothesis: true p is not equal to 0.82
## 95 percent confidence interval:
##  0.7612313 0.8060716
## sample estimates:
##         p
## 0.7844961

Reference:

Categorical Data Analysis, Second Edition by Alan Agresti (p. 14)