7-3. Large Sample Estimation of a Population Proportion

From Section 6.3 "The Sample Proportion" in Chapter 6 "Sampling Distributions" we know the mean, standard deviation, and sampling distribution of the sample proportion $\hat{p}$, so the ideas of the previous two sections can be applied to produce a confidence interval for a population proportion. Here is the formula.

Large Sample $100(1-\alpha)\%$ Confidence Interval for a Population Proportion

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

A sample is large if the interval $[p - 3\sigma_{\hat{p}},\ p + 3\sigma_{\hat{p}}]$, where $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$, lies wholly within the interval $[0,1]$.

In actual practice the value of $p$ is not known, hence neither is $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. In that case we substitute the known quantity $\hat{p}$ for $p$ in making the check; this means checking that the interval

$$\left[ \hat{p} - 3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + 3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \right]$$

lies wholly within the interval $[0,1]$.
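The check described above can be sketched in R as follows. This is a minimal illustration, not part of the original text: the helper name `is_large_sample` and its arguments `x` (number of successes) and `n` (sample size) are hypothetical, and the data are those of the example that follows (69 out of 120).

```r
# Hypothetical helper (name and arguments are assumptions, not from the text):
# checks whether [p_hat - 3*se, p_hat + 3*se] lies wholly within [0, 1].
is_large_sample <- function(x, n) {
  p_hat <- x / n                          # sample proportion
  se <- sqrt(p_hat * (1 - p_hat) / n)     # estimated standard deviation of p_hat
  lower <- p_hat - 3 * se
  upper <- p_hat + 3 * se
  list(lower = lower, upper = upper,
       large = (lower >= 0) && (upper <= 1))
}

check <- is_large_sample(x = 69, n = 120)
check$lower   # about 0.440
check$upper   # about 0.710
check$large   # TRUE: the sample is large enough

small <- is_large_sample(x = 2, n = 20)
small$large   # FALSE: the lower end falls below 0
```

In the second call the lower endpoint is negative, so the large-sample formula should not be relied on for a sample that small.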

EXAMPLE 7. To estimate the proportion of students at a large college who are female, a random sample of 120 students is selected. There are 69 female students in the sample. Construct a 90% confidence interval for the proportion of all students at the college who are female.

[ Solution ]

The proportion of students in the sample who are female is $\hat{p} = 69/120 = 0.575$.

Confidence level 90% means that $\alpha = 1 - 0.90 = 0.10$, so $\alpha/2 = 0.05$. From the last line of Figure 12.3 "Critical Values of $t$" we obtain $z_{0.05} = 1.645$.
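As an alternative to the table lookup, the critical value can be computed with base R's `qnorm()`. The short sketch below does this for three commonly used confidence levels; the variable names `conf_levels` and `z_crit` are illustrative only.

```r
# Critical values z_{alpha/2} for several confidence levels,
# computed from the standard normal quantile function.
conf_levels <- c(0.90, 0.95, 0.99)   # illustrative choice of levels
alpha <- 1 - conf_levels
z_crit <- qnorm(1 - alpha / 2)       # upper alpha/2 quantile of N(0, 1)
round(z_crit, 3)                     # approximately 1.645, 1.960, 2.576
```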

Thus $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n} = 0.575 \pm 1.645 \sqrt{(0.575)(0.425)/120} = 0.575 \pm 0.074$.

One may be 90% confident that the true proportion of all students at the college who are female is contained in the interval $(0.575 - 0.074,\ 0.575 + 0.074) = (0.501, 0.649)$.

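n <- 120                  # sample size
p <- 69/120               # sample proportion (p-hat)
alpha <- 0.10             # 1 - confidence level (90% interval)

se <- sqrt(p*(1-p)/n)     # estimated standard error of the sample proportion
z <- qnorm(1-alpha/2); z  # critical value z_{alpha/2}, about 1.645

ll <- p - z * se
ul <- p + z * se
ll   # lower limit, about 0.501
ul   # upper limit, about 0.649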
  • Using the Rstat package: prob.ci()

library(Rstat)

prob.ci(n=120, x=69, alp=0.10, dig=3)
