11-2. Exercises

1. BASIC

Ex 1. A data sample is sorted into four categories with an assumed probability distribution.

| Factor Levels | Assumed Distribution | Observed Frequency |
|---|---|---|
| 1 | $p_1 = 0.1$ | 10 |
| 2 | $p_2 = 0.4$ | 35 |
| 3 | $p_3 = 0.4$ | 45 |
| 4 | $p_4 = 0.1$ | 10 |

  1. Find the size $n$ of the sample.

  2. Find the expected number $E$ of observations for each level, if the sampled population has a probability distribution as assumed (that is, just use the formula $E_i = n \times p_i$).

  3. Find the chi-square test statistic $\chi^2$.

  4. Find the number of degrees of freedom of the chi-square test statistic.

Ex 2. A data sample is sorted into five categories with an assumed probability distribution.

| Factor Levels | Assumed Distribution | Observed Frequency |
|---|---|---|
| 1 | $p_1 = 0.3$ | 23 |
| 2 | $p_2 = 0.3$ | 30 |
| 3 | $p_3 = 0.2$ | 19 |
| 4 | $p_4 = 0.1$ | 8 |
| 5 | $p_5 = 0.1$ | 10 |

  1. Find the size $n$ of the sample.

  2. Find the expected number $E$ of observations for each level, if the sampled population has a probability distribution as assumed (that is, just use the formula $E_i = n \times p_i$).

  3. Find the chi-square test statistic $\chi^2$.

  4. Find the number of degrees of freedom of the chi-square test statistic.
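The four steps asked for in Exercises 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `chi_square_gof` and the three-category data are hypothetical and are not taken from any exercise. The sketch uses the formula $E_i = n \times p_i$ from the exercises, the usual goodness-of-fit statistic $\chi^2 = \sum (O_i - E_i)^2 / E_i$, and degrees of freedom equal to the number of categories minus one.

```python
def chi_square_gof(observed, probs):
    """Return (n, expected, chi2, df) for a chi-square goodness-of-fit setup.

    observed: list of observed frequencies O_i, one per category
    probs:    list of assumed probabilities p_i, one per category
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("assumed probabilities must sum to 1")

    n = sum(observed)                      # step 1: sample size
    expected = [n * p for p in probs]      # step 2: E_i = n * p_i
    chi2 = sum((o - e) ** 2 / e            # step 3: sum of (O - E)^2 / E
               for o, e in zip(observed, expected))
    df = len(observed) - 1                 # step 4: categories minus one
    return n, expected, chi2, df


# Hypothetical data for illustration only (not from the exercises).
observed = [18, 22, 20]
probs = [0.25, 0.50, 0.25]

n, expected, chi2, df = chi_square_gof(observed, probs)
print(f"n = {n}, E = {expected}, chi2 = {chi2:.3f}, df = {df}")
```

Working the exercises by hand first and then checking the arithmetic with a helper like this is one reasonable way to use it.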

2. APPLICATIONS

Ex 3. Retailers of collectible postage stamps often buy their stamps in large quantities by weight at auctions. The prices the retailers are willing to pay depend on how old the postage stamps are. Many collectible postage stamps at auctions are described by the proportions of stamps issued at various periods in the past. Generally, the older the stamps, the higher the value. At one particular auction, a lot of collectible stamps is advertised to have the age distribution given in the table provided. A retail buyer took a sample of 73 stamps from the lot and sorted them by age. The results are given in the table provided. Test, at the 5% level of significance, whether there is sufficient evidence in the data to conclude that the age distribution of the lot is different from what was claimed by the seller.

| Year | Claimed Distribution | Observed Frequency |
|---|---|---|
| Before 1940 | 0.10 | 6 |
| 1940 to 1959 | 0.25 | 15 |
| 1960 to 1979 | 0.45 | 30 |
| After 1979 | 0.20 | 22 |

Ex 4. The litter size of Bengal tigers is typically two or three cubs, but it can vary between one and four. Based on long-term observations, the litter size of Bengal tigers in the wild has the distribution given in the table provided. A zoologist believes that Bengal tigers in captivity tend to have different (possibly smaller) litter sizes from those in the wild. To verify this belief, the zoologist searched all data sources and found 316 litter size records of Bengal tigers in captivity. The results are given in the table provided. Test, at the 5% level of significance, whether there is sufficient evidence in the data to conclude that the distribution of litter sizes in captivity differs from that in the wild.

| Litter Size | Wild Litter Distribution | Observed Frequency |
|---|---|---|
| 1 | 0.11 | 41 |
| 2 | 0.69 | 243 |
| 3 | 0.18 | 27 |
| 4 | 0.02 | 5 |

Ex 5. An online shoe retailer sells men’s shoes in sizes 8 to 13. In the past, orders for the different shoe sizes have followed the distribution given in the table provided. The management believes that recent marketing efforts may have expanded their customer base and, as a result, there may be a shift in the size distribution for future orders. To have a better understanding of its future sales, the shoe seller examined 1,040 sales records of recent orders and noted the sizes of the shoes ordered. The results are given in the table provided. Test, at the 1% level of significance, whether there is sufficient evidence in the data to conclude that the shoe size distribution of future sales will differ from the historic one.

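| Shoe Size | Past Size Distribution | Recent Size Frequency |
|---|---|---|
| 8.0 | 0.03 | 25 |
| 8.5 | 0.06 | 43 |
| 9.0 | 0.09 | 88 |
| 9.5 | 0.19 | 221 |
| 10.0 | 0.23 | 272 |
| 10.5 | 0.14 | 150 |
| 11.0 | 0.10 | 107 |
| 11.5 | 0.06 | 51 |
| 12.0 | 0.05 | 37 |
| 12.5 | 0.03 | 35 |
| 13.0 | 0.02 | 11 |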

Ex 6. An online shoe retailer sells women’s shoes in sizes 5 to 10. In the past, orders for the different shoe sizes have followed the distribution given in the table provided. The management believes that recent marketing efforts may have expanded their customer base and, as a result, there may be a shift in the size distribution for future orders. To have a better understanding of its future sales, the shoe seller examined 1,174 sales records of recent orders and noted the sizes of the shoes ordered. The results are given in the table provided. Test, at the 1% level of significance, whether there is sufficient evidence in the data to conclude that the shoe size distribution of future sales will differ from the historic one.

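| Shoe Size | Past Size Distribution | Recent Size Frequency |
|---|---|---|
| 5.0 | 0.02 | 20 |
| 5.5 | 0.03 | 23 |
| 6.0 | 0.07 | 88 |
| 6.5 | 0.08 | 90 |
| 7.0 | 0.20 | 222 |
| 7.5 | 0.20 | 258 |
| 8.0 | 0.15 | 177 |
| 8.5 | 0.11 | 121 |
| 9.0 | 0.08 | 91 |
| 9.5 | 0.04 | 53 |
| 10.0 | 0.02 | 31 |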

Ex 7. A chess opening is a sequence of moves at the beginning of a chess game. There are many well-studied named openings in chess literature. French Defense is one of the most popular openings for black, although it is considered a relatively weak opening since it gives black probability 0.344 of winning, probability 0.405 of losing, and probability 0.251 of drawing. A chess master believes that he has discovered a new variation of French Defense that may alter the probability distribution of the outcome of the game. In his many Internet chess games in the last two years, he was able to apply the new variation in 77 games. The wins, losses, and draws in the 77 games are given in the table provided. Test, at the 5% level of significance, whether there is sufficient evidence in the data to conclude that the newly discovered variation of French Defense alters the probability distribution of the result of the game.

| Result for Black | Probability Distribution | New Variation Games |
|---|---|---|
| Win | 0.344 | 31 |
| Loss | 0.405 | 25 |
| Draw | 0.251 | 21 |

Ex 8. The Department of Parks and Wildlife stocks a large lake with fish every six years. It is determined that a healthy diversity of fish in the lake should consist of 10% largemouth bass, 15% smallmouth bass, 10% striped bass, 10% trout, and 20% catfish, with the remaining 35% made up of other species. Therefore, each time the lake is stocked, the fish population in the lake is restored to maintain that particular distribution. Every three years, the department conducts a study to see whether the distribution of the fish in the lake has shifted away from the target proportions. In one particular year, a research group from the department observed a sample of 292 fish from the lake with the results given in the table provided. Test, at the 5% level of significance, whether there is sufficient evidence in the data to conclude that the fish population distribution has shifted since the last stocking.

| Fish | Target Distribution | Fish in Sample |
|---|---|---|
| Largemouth Bass | 0.10 | 14 |
| Smallmouth Bass | 0.15 | 49 |
| Striped Bass | 0.10 | 21 |
| Trout | 0.10 | 22 |
| Catfish | 0.20 | 75 |
| Other | 0.35 | 111 |
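Each test in this section follows the same pattern: compute $\chi^2$ from the observed and expected counts, then compare it with the upper-tail critical value $\chi^2_\alpha$ for the appropriate degrees of freedom and reject the null hypothesis when $\chi^2 \ge \chi^2_\alpha$. The sketch below shows that decision rule in Python. It is a minimal illustration, not a definitive implementation: the function name `goodness_of_fit_test`, the small `CRITICAL_VALUES` lookup (a few entries copied from a standard chi-square table), and the data are all assumptions introduced here, and the data are hypothetical rather than taken from any exercise.

```python
# Upper-tail critical values from a standard chi-square table,
# keyed by (alpha, degrees of freedom). Only a few entries are listed.
CRITICAL_VALUES = {
    (0.05, 2): 5.991,
    (0.05, 3): 7.815,
    (0.05, 5): 11.070,
    (0.01, 10): 23.209,
}


def goodness_of_fit_test(observed, probs, alpha):
    """Return (chi2, df, critical, reject) for a right-tailed chi-square test."""
    n = sum(observed)
    expected = [n * p for p in probs]              # E_i = n * p_i
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip(observed, expected))
    df = len(observed) - 1
    critical = CRITICAL_VALUES[(alpha, df)]        # KeyError if not tabulated above
    reject = chi2 >= critical                      # rejection region is the right tail
    return chi2, df, critical, reject


# Hypothetical data for illustration only (not from the exercises).
observed = [18, 22, 20]
probs = [0.25, 0.50, 0.25]

chi2, df, critical, reject = goodness_of_fit_test(observed, probs, alpha=0.05)
decision = "reject H0" if reject else "do not reject H0"
print(f"chi2 = {chi2:.3f}, df = {df}, critical value = {critical}: {decision}")
```

A hard-coded table keeps the sketch free of third-party dependencies; a statistics library could supply critical values for any $\alpha$ and degrees of freedom instead.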

3. LARGE DATA SET EXERCISE

Ex 9. Large Data Set 4 records the results of 500 tosses of a six-sided die. Test, at the 10% level of significance, whether there is sufficient evidence in the data to conclude that the die is not “fair” (or “balanced”), that is, that the probability distribution differs from probability 1/6 for each of the six faces on the die.
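Before the test can be run on raw toss data, the tosses have to be tallied into one observed frequency per face. The sketch below shows one way to do that in Python. It assumes the tosses are available as a list of integers from 1 to 6; the actual layout of Large Data Set 4 is not described here, so the loading step is left out, and the helper name `tally_faces` and the short list of tosses are hypothetical.

```python
from collections import Counter


def tally_faces(tosses, faces=range(1, 7)):
    """Count how many times each face appears, in face order."""
    counts = Counter(tosses)
    unexpected = set(counts) - set(faces)
    if unexpected:
        raise ValueError(f"unexpected face values: {sorted(unexpected)}")
    # Counter returns 0 for faces that never appeared.
    return [counts[face] for face in faces]


# Hypothetical tosses for illustration only (not Large Data Set 4).
tosses = [3, 1, 6, 6, 2, 3, 5, 4, 1, 6, 2, 3]

observed = tally_faces(tosses)
n = len(tosses)
expected = [n / 6] * 6          # a fair die gives each face probability 1/6

print("observed:", observed)
print("expected:", expected)
```

The resulting observed and expected lists are the inputs to the chi-square statistic, with five degrees of freedom for the six faces.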