7-3. Large Sample Estimation of a Population Proportion

Since from Section 6.3 "The Sample Proportion" in Chapter 6 "Sampling Distributions" we know the mean, standard deviation, and sampling distribution of the sample proportion $\hat{p}$ , the ideas of the previous two sections can be applied to produce a confidence interval for a population proportion. Here is the formula.

Large Sample $100(1−α)\%$ Confidence Interval for a Population Proportion
$\hat{p}±z_{α∕2} \sqrt{ \frac{\hat{p}(1−\hat{p})}{n}}$
A sample is large if the interval lies wholly within the interval $[0,1]$ .

In actual practice the value of $p$ is not known, hence neither is . In that case we substitute the known quantity $\hat{p}$ for $p$ in making the check; this means checking that the interval

$[ \hat{p}-3 \sqrt{ \frac{\hat{p}(1−\hat{p})}{n}}, \hat{p}+3 \sqrt{ \frac{\hat{p}(1−\hat{p})}{n}}]$

lies wholly within the interval $[0,1]$ .

EXAMPLE 7. To estimate the proportion of students at a large college who are female, a random sample of 120 students is selected. There are 69 female students in the sample. Construct a 90% confidence interval for the proportion of all students at the college who are female.

[ Solution ]

The proportion of students in the sample who are female is $\hat {p}=69∕120=0.575$ .

Confidence level 90% means that $α=1−0.90=0.10$ , so $α∕2=0.05$ . From the last line of Figure 12.3 "Critical Values of " we obtain $z_{0.05}=1.645$ .

Thus $\hat{p}±z_{α∕2} \sqrt{ \hat p (1−\hat p) / n }=0.575±1.645 \sqrt{ (0.575)(0.425) / 120 }=0.575±0.074$ .

One may be 90% confident that the true proportion of all students at the college who are female is contained in the interval $(0.575−0.074,0.575+0.074)=(0.501,0.649)$ .

n <- 120                  # number of samples
p <- 69/120               # sample proportion
alpha <- 0.10             # 

se <- sqrt(p*(1-p)/n)     # std of sample proportion
z <- qnorm(1-alpha/2); z

ll <- p - z * se
ul <- p + z * se
ll   # lower limit
ul   # upper limit

> z <- qnorm(1-alpha/2); z
## [1] 1.644854
> 
> ll <- p - z * se
> ul <- p + z * se
> ll   # lower limit
## [1] 0.5007725
> ul   # upper limit
## [1] 0.6492275

Using Rstat Package : prob.ci()

library(Rstat)

prob.ci(n=120, x=69, alp=0.10, dig=3)

> prob.ci(n=120, x=69, alp=0.10, dig=3)
## [0.575 ± 1.645×√(0.575×0.425/120)] = [0.575 ± 0.074] = [0.501, 0.649]

Previous7-2. Exercises Next7-3. Exercises

Last updated 5 years ago

Was this helpful?