Motivating example:

The national proportion of people experiencing complications after having a particular operation in hospitals is \(20 \%.\) A hospital decides to take a sample of sized \(n=20\) from their records. Find the critial regions, at \(5\%\) level of significance, to test whether or not their proportion of complications differs from the national population. The probability in each tail should be as close to \(2.5\%\) as possible.

The standardized z value at this level of significance is

`qnorm(0.975)`

`## [1] 1.959964`

which corresponds to a proportion \(\hat p\) value to reject \(H_0\) of

\[\hat p = p \pm z \sqrt{\frac{p(1-q)}{n}}\]

or

```
p = 0.2
n = 20
p + c(-1,1) * qnorm(0.975) * sqrt((p * (1 - p)) / n)
```

`## [1] 0.02469549 0.37530451`

Second example:

Calculate the CI and margin of errors of proportion of university females based on the data

`survey`

within the package`MASS`

.

This is extracted from here:

```
library(MASS)
gender.response = na.omit(survey$Sex)
n = length(gender.response)
k = sum(gender.response == "Female")
pbar = k/n # Proportion of females among respondents
SE = sqrt((pbar * ( 1 - pbar)) / n) # Standard error
E = qnorm(1 - 0.05/2) * SE # Margin of error
CI = pbar + c(- E, E) # Confidence interval
paste('At',100*(1 - 0.05),'percent confidence level, between', 100 * round(pbar-E,3),
'percent and', 100 * round(pbar+E, 3),
'percent of the university students are female.')
```

`## [1] "At 95 percent confidence level, between 43.6 percent and 56.4 percent of the university students are female."`

`paste('The margin of error is', 100 * round(E,3),'percent.')`

`## [1] "The margin of error is 6.4 percent."`

equivalently, this can be done with

`prop.test(k, n, correct=F) `

```
##
## 1-sample proportions test without continuity correction
##
## data: k out of n, null probability 0.5
## X-squared = 0, df = 1, p-value = 1
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
## 0.4367215 0.5632785
## sample estimates:
## p
## 0.5
```

If we want to test whether a *one* sample proportion \(\hat p\) is consistent with a population parameter \(p\) the **score** test statistic is:

\[\text{test statistic} = \large \frac{\hat p - p}{\sqrt{\frac{p\,(1\,-\,p)}{n}}}\]

This is equivalent to the z test statistic for sample means, and does follow a Z distribution in large samples. Compare to the z test of proportions to compare two samples here.

The expression in the denominator is all under a squared root because the standard deviation of the Bernoulli distribution is \(\sqrt{p(1 - p)}.\)

Notice that since we are testing under the \(H_0\) how likely it is to get the \(\hat p\) we obtained, we use the population proportion \(p\) to calculate the standard error, as opposed to the sample proportion we would use to build the confidence interval with a Wald interval when the population parameter is not known.

Example:

We want to test if the proportion of side effects is greater than \(p=0.1\) for a new drug. In our sample of \(n=20\) cases, \(11\) had complications.

```
p <- 0.1
p.hat <- 11/22
n <- 20
paste('The z statistic is', z_stat <- (p.hat - p) / sqrt(p * (1 - p) / n))
```

`## [1] "The z statistic is 5.96284793999944"`

This is greater than

`qnorm(0.95) # One-sided test`

`## [1] 1.644854`

We can do all this with `prop.test()`

as explained here:

```
success <- 11
n <- 20
p <- 0.1
prop.test(success, n=n, p=p, alternative="greater", conf.level=0.95, correct = F)
```

```
## Warning in prop.test(success, n = n, p = p, alternative = "greater", conf.level
## = 0.95, : Chi-squared approximation may be incorrect
```

```
##
## 1-sample proportions test without continuity correction
##
## data: success out of n, null probability p
## X-squared = 45, df = 1, p-value = 9.852e-12
## alternative hypothesis: true p is greater than 0.1
## 95 percent confidence interval:
## 0.3722077 1.0000000
## sample estimates:
## p
## 0.55
```

Notice that the CI does not include the population probability \(p=0.1.\) We reject the null hypothesis.

Alternatively, we can just run an exact binomial test, which doesn’t need to rely on the normal approximation:

\[\Pr(\text{counts} \geq 11)= \displaystyle \sum_{11}^{20} {20\choose X} \,0.1^X\, 0.9^{20-X}\]

With [R],

`pbinom(10, 20, .1, lower.tail = F) # Lower.tail false implies that it will count 11 and above.`

`## [1] 7.088606e-07`

Alternatively (same result),

`binom.test(11, 20, 0.1, alternative = "greater")`

```
##
## Exact binomial test
##
## data: 11 and 20
## number of successes = 11, number of trials = 20, p-value = 7.089e-07
## alternative hypothesis: true probability of success is greater than 0.1
## 95 percent confidence interval:
## 0.3469314 1.0000000
## sample estimates:
## probability of success
## 0.55
```

For two-sided tests calculate both one-sided tests and double the smallest p-value.

Example:

This question appeared here.

A report says that \(82\%\) of British Columbians over the age of \(25\) are high school graduates. A survey of randomly selected residents of a certain city included \(1290\) who were over the age of \(25\), and \(1012\) of them were high school graduates. Is the city’s result of \(1012\) unusually high, low, or neither?

First off let me get the algebraic nomenclature out of the way - I find this extremely slippery and often implied:

\(\pi_0\) is a reference value assumed to be true. It is not necessarily the population proportion, but rather a

*fixed*fraction or proportion to which we compare the sample to. For instance, the problem reads something along the lines: “Is our sample consistent with a population proportion of \(\small \pi_0 = 0.7\)?”\(\pi\) (or \(p\)) stands for the actual population proportion, but it’s too bad that we usually don’t know it and have to use instead the…

\(\hat \pi\) (or \(\hat p\)), which stands for the sample proportion. To make things more “friendly” sometimes \(p\) denotes the sample proportion…

\(n\) is the number of trials in the binomial experiment (or the number of sampled subjects in a poll).

\(\small Y\) number of “successes” (“success” interpreted sometimes like the word “positive” in Medicine - you don’t necessarily want it for yourself).

In our case we have \(\small \pi = 0.82\) and \(\small \hat \pi = 1012/1290 = 0.78\). And \(\small n = 1290\).

The MLE of \(\pi\) is the sample proportion, \(\hat \pi = \small \text{successes/trials}\), and the expectation for the number of \(\small\text{successes}\) is \(n\pi\). The sample proportion is an unbiased estimator of the population: \(\small E(\hat \pi)=\pi\) (and \(\small E(Y)=n\pi\)) and the standard error behaves very similarly to that of sampling distributions of sample means: \(\small SE\,(\hat \pi) = \sqrt{\frac{\pi(1-\pi)}{n}}\), remembering that \(\text{var}(\hat \pi)= \pi(1-\pi)\) (and \(\text{var}(Y)=n\pi(1-\pi)\)).

The test here is a two-sided one-sample proportion test: \(H_0: \pi = \pi_0\) versus \(H_A: \pi \neq \pi_0\). Typically, a normal approximation with mean \(\pi\) and \(\text{var} = \pi(1-\pi)/n\) under the following conditions: \(n\hat\pi>5\) and \(n(1-\hat\pi)>5\). In our case this is clearly met (\(\small 1290 * 0.78 = 1006\)).

The \(z\) test statistic is

\[z=\Large\frac{\hat\pi-\pi_0}{\sqrt{\frac{\pi_0\,(1-\pi_0)}{n}}}\] In our case, \(z =\large \frac{0.78 - 0.82}{\sqrt{\frac{0.82(1-0.82)}{1290}}}=\large \frac{-0.04}{\sqrt{\frac{0.15}{1290}}}= \small-3.32\)

```
p <- 0.82
n <- 1290
p.hat <- 1012/n
paste('The z-statistic is', z.statistic <- (p.hat - p) / sqrt((p * (1 - p))/n))
```

`## [1] "The z-statistic is -3.31915431422705"`

which is clearly significant since

`c(-1,1) * qnorm(1 - 0.05/2)`

`## [1] -1.959964 1.959964`

This latter expression corresponding to the [R] code for the two-tailed cut-off quantile values fixing the alpha significance level at 5%: \(z_{(1 - \alpha/2)}\) where \(\small \alpha = 0.05\).

As for the Wald confidence intervals, the calculation is:

\(\large \hat \pi \pm z_{(1-\alpha/2)}\,\sqrt{\frac{\hat\pi(1-\hat\pi)}{n}}\). Coded in [R]:

`p.hat + c(-1,1) * qnorm(1 - 0.05/2) * sqrt((p.hat * (1 - p.hat)) / n)`

`## [1] 0.7620585 0.8069337`

which does not include \(\small \pi_0 = 0.82\).

Since we know the population proportion \(\pi\) in this case as being \(0.82\) we can use it to construct the confidence interval as:

`p + c(-1,1) * qnorm(1 - 0.05/2) * sqrt((p * (1 - p)) / n)`

`## [1] 0.7990349 0.8409651`

This excludes the sample value \(0.78\).

The p-value of this two-sided test is

`2 * (1 - pnorm(abs(z.statistic)))`

`## [1] 0.0009029052`

We can do all this with `prop.test()`

:

```
success <- 1012
n <- 1290
p <- 0.82
prop.test(success, n=n, p=p, alternative="two.sided", conf.level=0.95, correct = F)
```

```
##
## 1-sample proportions test without continuity correction
##
## data: success out of n, null probability p
## X-squared = 11.017, df = 1, p-value = 0.0009029
## alternative hypothesis: true p is not equal to 0.82
## 95 percent confidence interval:
## 0.7612313 0.8060716
## sample estimates:
## p
## 0.7844961
```

Reference:

*Categorical Data Analysis*, Second Edition by Alan Agresti (p. 14)