6-2. The Sampling Distribution of the Sample Mean
We constructed the probability distribution of the sample mean for samples of size two drawn from the population of four rowers in Section 6.1 "The Mean and Standard Deviation of the Sample Mean".
The probability distribution is:
\bar{x}: 152 154 156 158 160 162 164
P(\bar{x}): 1/16 2/16 3/16 4/16 3/16 2/16 1/16
The figure shows a side-by-side comparison of a histogram for the original population and a histogram for this distribution. Whereas the distribution of the population is uniform, the sampling distribution of the mean has a shape approaching that of the familiar bell curve.
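The distribution of the sample mean for the rowers can be reproduced by brute-force enumeration. The sketch below assumes the four rowers weigh 152, 156, 160, and 164 pounds and that the two rowers are drawn with replacement. Both details come from the earlier section rather than this one, so treat them as assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed population (not listed in this section): weights of the four rowers.
population = [152, 156, 160, 164]

# Every ordered sample of size two, drawn with replacement, is equally likely.
samples = list(product(population, repeat=2))
counts = Counter(Fraction(a + b, 2) for a, b in samples)

# Sampling distribution of the sample mean: value -> probability.
distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

for mean, prob in distribution.items():
    print(int(mean), prob)
```

The probabilities rise toward the middle value and fall off symmetrically, which is the first hint of the bell shape even though the population itself is uniform.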
This phenomenon of the sampling distribution of the mean taking on a bell shape even though the population distribution is not bell-shaped happens in general. Here is a somewhat more realistic example.
Suppose we take samples of size 1, 5, 10, or 20 from a population that consists entirely of the numbers 0 and 1, half the population 0, half 1, so that the population mean is 0.5. The sampling distributions are:
Histograms illustrating these distributions are shown as follows:
What we are seeing in these examples does not depend on the particular population distributions involved. In general, one may start with essentially any population distribution (any with a finite standard deviation), and the sampling distribution of the sample mean will increasingly resemble the bell-shaped normal curve as the sample size increases. This is the content of the Central Limit Theorem.
The Central Limit Theorem is illustrated for several common population distributions in Figure "Distribution of Populations and Sample Means".
The dashed vertical lines in the figures locate the population mean.
Regardless of the distribution of the population, as the sample size is increased the shape of the sampling distribution of the sample mean becomes increasingly bell-shaped, centered on the population mean. Typically by the time the sample size is 30 the distribution of the sample mean is practically the same as a normal distribution.
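A small simulation makes this concrete. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the seed, and the number of repetitions are all assumptions chosen for the demonstration. It draws many samples of size n, then compares the mean and standard deviation of the resulting sample means with \mu and \sigma/\sqrt{n}. It also reports the fraction of sample means within one standard deviation of \mu, which is about 68% for a normal distribution.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population for illustration: exponential with mean 1 and standard
# deviation 1, which is strongly right-skewed (far from bell-shaped).
mu, sigma = 1.0, 1.0


def sample_means(n, repetitions=10_000):
    """Draw `repetitions` samples of size n and return the list of their means."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]


for n in (1, 5, 30):
    means = sample_means(n)
    se = sigma / n ** 0.5  # theoretical standard deviation of the sample mean
    within = sum(abs(m - mu) < se for m in means) / len(means)
    print(
        f"n={n:2d}  mean={statistics.fmean(means):.3f}  "
        f"sd={statistics.stdev(means):.3f}  sigma/sqrt(n)={se:.3f}  "
        f"within one sd={within:.3f}"
    )
```

For n = 1 the "within one sd" fraction reflects the skewed population itself; as n grows it should drift toward the normal value.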
The importance of the Central Limit Theorem is that it allows us to make probability statements about the sample mean, specifically in relation to its value in comparison to the population mean, as we will see in the examples. But to use the result properly we must first realize that there are two separate random variables (and therefore two probability distributions) at play:
[ Solution ]
EXAMPLE 4. The numerical population of grade point averages at a college has mean 2.61 and standard deviation 0.5. If a random sample of size 100 is taken from the population, what is the probability that the sample mean will be between 2.51 and 2.71?
[ Solution ]
The Central Limit Theorem says that no matter what the distribution of the population is, as long as the sample is “large,” meaning of size 30 or more, the sample mean is approximately normally distributed. If the population is normal to begin with then the sample mean also has a normal distribution, regardless of the sample size.
For samples of any size drawn from a normally distributed population, the sample mean is normally distributed, with mean \mu_{\bar{X}} = \mu and standard deviation \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}, where n is the sample size.
The effect of increasing the sample size is shown in Figure "Distribution of Sample Means for a Normal Population".
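The narrowing of the distribution can also be seen numerically. The following sketch assumes an illustrative normal population with \mu = 50 and \sigma = 6 (values chosen only for the demonstration) and uses Python's `statistics.NormalDist` to show how \sigma_{\bar{X}} shrinks, and how the probability of landing within one unit of \mu grows, as n increases.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50.0, 6.0  # illustrative population values, assumed for this sketch

for n in (1, 4, 16, 64):
    # For a normal population the sample mean is exactly normal for every n.
    x_bar = NormalDist(mu, sigma / sqrt(n))
    within_one = x_bar.cdf(mu + 1) - x_bar.cdf(mu - 1)
    print(
        f"n={n:2d}  sd of sample mean={x_bar.stdev:.3f}  "
        f"P(|mean - mu| < 1)={within_one:.4f}"
    )
```

Quadrupling the sample size halves the standard deviation of the sample mean, since it scales as 1/\sqrt{n}.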
EXAMPLE 5. A prototype automotive tire has a design life of 38,500 miles with a standard deviation of 2,500 miles. Five such tires are manufactured and tested. On the assumption that the actual population mean is 38,500 miles and the actual population standard deviation is 2,500 miles, find the probability that the sample mean will be less than 36,000 miles. Assume that the distribution of lifetimes of such tires is normal.
[ Solution ]
That is, if the tires perform as designed, there is only about a 1.25% chance that the average of a sample of this size would be so low.
EXAMPLE 6. An automobile battery manufacturer claims that its midgrade battery has a mean life of 50 months with a standard deviation of 6 months. Suppose the distribution of battery lives of this particular brand is approximately normal.
On the assumption that the manufacturer’s claims are true, find the probability that a randomly selected battery of this type will last less than 48 months.
On the same assumption, find the probability that the mean of a random sample of 36 such batteries will be less than 48 months.
[ Solution ]
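Values of \bar{x} for n = 1: 0, 1
Values of \bar{x} for n = 5: 0, 0.2, 0.4, 0.6, 0.8, 1
Values of \bar{x} for n = 10: 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1
Values of \bar{x} for n = 20: 0, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1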
As n increases the sampling distribution of \bar{X} evolves in an interesting way: the probabilities on the lower and the upper ends shrink and the probabilities in the middle become larger in relation to them. If we were to continue to increase n then the shape of the sampling distribution would become smoother and more bell-shaped.
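The probabilities for the 0/1 population can be computed exactly. The sketch below assumes the n draws are independent, each equally likely to be 0 or 1, so the number of 1s in the sample is binomial. The function name `mean_distribution` is introduced here for illustration only.

```python
from fractions import Fraction
from math import comb


def mean_distribution(n):
    """Sampling distribution of the mean of n independent draws from a
    population that is half 0s and half 1s, as {x_bar: probability}."""
    return {Fraction(k, n): Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}


for n in (1, 5, 10, 20):
    dist = mean_distribution(n)
    ends = dist[Fraction(0)] + dist[Fraction(1)]
    middle = sum(
        p for x, p in dist.items() if Fraction(2, 5) <= x <= Fraction(3, 5)
    )
    print(
        f"n={n:2d}  P(mean is 0 or 1)={float(ends):.6f}  "
        f"P(0.4 <= mean <= 0.6)={float(middle):.4f}"
    )
```

The first column shrinks rapidly while the second grows, which is the shift of probability from the ends toward the middle described above.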
For samples of size 30 or more, the sample mean is approximately normally distributed, with mean \mu_{\bar{X}} = \mu and standard deviation \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}, where n is the sample size. The larger the sample size, the better the approximation.
X, the measurement of a single element selected at random from the population; the distribution of X is the distribution of the population, with mean the population mean \mu and standard deviation the population standard deviation \sigma;
\bar{X}, the mean of the measurements in a sample of size n; the distribution of \bar{X} is its sampling distribution, with mean \mu_{\bar{X}} = \mu and standard deviation \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
EXAMPLE 3. Let \bar{X} be the mean of a random sample of size 50 drawn from a population with mean 112 and standard deviation 40.
Find the mean and standard deviation of \bar{X}.
Find the probability that \bar{X} assumes a value between 110 and 114.
Find the probability that \bar{X} assumes a value greater than 113.
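Solution to Example 3. \mu_{\bar{X}} = \mu = 112 and \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{50}} = 5.65685.
Since the sample size is at least 30, \bar{X} is approximately normally distributed, so P(110 < \bar{X} < 114) = P(\frac{110 - \mu_{\bar{X}}}{\sigma_{\bar{X}}} < Z < \frac{114 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}) = P(\frac{110 - 112}{5.65685} < Z < \frac{114 - 112}{5.65685}) = P(-0.35 < Z < 0.35) = 0.6368 - 0.3632 = 0.2736.
P(\bar{X} > 113) = P(Z > \frac{113 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}) = P(Z > \frac{113 - 112}{5.65685}) = P(Z > 0.18) = 1 - P(Z < 0.18) = 1 - 0.5714 = 0.4286.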
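The Example 3 calculation can be checked in a few lines of Python. This is a sketch using the standard library's `statistics.NormalDist` in place of a printed table. Because it does not round z to two decimal places, its results may differ slightly from table-based answers.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 112, 40, 50

# Normal approximation to the distribution of the sample mean (n >= 30).
x_bar = NormalDist(mu, sigma / sqrt(n))

p_between = x_bar.cdf(114) - x_bar.cdf(110)  # P(110 < X-bar < 114)
p_above = 1 - x_bar.cdf(113)                 # P(X-bar > 113)

print(f"sd of sample mean   = {x_bar.stdev:.5f}")
print(f"P(110 < mean < 114) = {p_between:.4f}")
print(f"P(mean > 113)       = {p_above:.4f}")
```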
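Solution to Example 4. The sample mean has mean \mu_{\bar{X}} = \mu = 2.61 and standard deviation \sigma_{\bar{X}} = \sigma/\sqrt{n} = 0.5/\sqrt{100} = 0.05, so P(2.51 < \bar{X} < 2.71) = P(\frac{2.51 - \mu_{\bar{X}}}{\sigma_{\bar{X}}} < Z < \frac{2.71 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}) = P(\frac{2.51 - 2.61}{0.05} < Z < \frac{2.71 - 2.61}{0.05}) = P(-2 < Z < 2) = 0.9772 - 0.0228 = 0.9544.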
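The standardization step in Example 4 can be spelled out explicitly in code. In this sketch the helper `phi` is a name introduced for illustration. It evaluates the standard normal cumulative distribution function through `math.erf`, standing in for a lookup in a printed table.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function P(Z < z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma, n = 2.61, 0.5, 100
se = sigma / sqrt(n)  # standard deviation of the sample mean

# Convert the endpoints for the sample mean into z-scores.
z_low = (2.51 - mu) / se
z_high = (2.71 - mu) / se

p = phi(z_high) - phi(z_low)
print(f"z from {z_low:.2f} to {z_high:.2f}, probability = {p:.4f}")
```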
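Solution to Example 5. For simplicity we use units of thousands of miles. Then the sample mean has mean \mu_{\bar{X}} = \mu = 38.5 and standard deviation \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{5}} = 1.11803. Since the population is normally distributed, so is \bar{X}, hence P(\bar{X} < 36) = P(Z < \frac{36 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}) = P(Z < \frac{36 - 38.5}{1.11803}) = P(Z < -2.24) = 0.0125.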
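As a sanity check on Example 5, the normal calculation can be compared with a direct simulation of five-tire samples. The seed and the number of trials below are arbitrary choices for this sketch, and the simulated value will vary a little with them.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma, n = 38.5, 2.5, 5  # thousands of miles

# Calculation from the normal distribution of the sample mean.
exact = NormalDist(mu, sigma / sqrt(n)).cdf(36)

# Simulation: test five tires many times and count how often the mean is below 36.
trials = 100_000
low = sum(
    fmean([random.gauss(mu, sigma) for _ in range(n)]) < 36
    for _ in range(trials)
)

print(f"normal calculation: {exact:.4f}   simulated: {low / trials:.4f}")
```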
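Solution to Example 6. For a single battery, P(X < 48) = P(Z < \frac{48 - \mu}{\sigma}) = P(Z < \frac{48 - 50}{6}) = P(Z < -0.33) = 0.3707.
For the mean of a sample of 36 batteries, \mu_{\bar{X}} = \mu = 50 and \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{6}{\sqrt{36}} = 1, so P(\bar{X} < 48) = P(Z < \frac{48 - \mu_{\bar{X}}}{\sigma_{\bar{X}}}) = P(Z < \frac{48 - 50}{1}) = P(Z < -2) = 0.0228.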
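Example 6 turns on the difference between the distribution of a single measurement X and the distribution of the sample mean \bar{X}. The sketch below puts the two side by side using `statistics.NormalDist`. The variable names are chosen for illustration, and the single-battery value may differ in the last digit from a table answer that rounds z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50, 6  # manufacturer's claimed mean and standard deviation, in months

single = NormalDist(mu, sigma)                 # X: life of one battery
mean_of_36 = NormalDist(mu, sigma / sqrt(36))  # X-bar: mean life of 36 batteries

p_single = single.cdf(48)
p_mean = mean_of_36.cdf(48)

print(f"P(X < 48)     = {p_single:.4f}")
print(f"P(X-bar < 48) = {p_mean:.4f}")
```

A single battery falling short of 48 months is unremarkable, but a sample of 36 averaging below 48 months would be rare if the manufacturer's claims were true.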