4-6. Hypergeometric Distribution
1. Hypergeometric Distribution
Hypergeometric distribution, in statistics, distribution function in which selections are made from two groups without replacing members of the groups. The hypergeometric distribution differs from the binomial distribution in the lack of replacements. Thus, it often is employed in random sampling for statistical quality control. A simple everyday example would be the random selection of members for a team from a population of girls and boys.
In symbols, let the size of the population selected from be N , with r elements of the population belonging to one group (for convenience, called successes) and (Nโr) belonging to the other group (called failures). Further, let the number of samples drawn from the population be n , such that 0โคnโคN . Then the probability ( P ) that the number ( X ) of elements drawn from the successful group is equal to some number ( x ) is given by
P(X=x)=(nNโ)(xrโ)(nโxNโrโ)โ :XโผHG(n,N,r)
using the notation of binomial coefficients, or, using factorial notation,
P(X=x)=N!x!(rโx)!(nโx)!(Nโrโn+x)!n!r!(Nโn)!(Nโr)!โ (hypergeometric factorial formula)
2. Mean and Variance
The mean of the hypergeometric distribution is ฮผ=nโP(success)=nNrโ ,
and the variance (square of the standard deviation) is ฯ=N2(Nโ1)nr(Nโr)(Nโn)โ .
Var(X)=np(1โp)Nโ1Nโnโ , if p=nrโ
EXAMPLE 15. A batch of 100 piston rings is known to contain 10 defective rings. If two piston rings are drawn from the batch, write down the probabilities that:
the first ring is defective;
the second ring is defective given that the first one is defective.
[ Solution ]
10/ 100 = 1/10
9/99 = 1/11
EAXMPLE 16. A batch of 10 rocker cover gaskets contains 4 defective gaskets. If we draw samples of size 3 without replacement, from the batch of 10, find the probability that a sample contains 2 defective gaskets. And Find the expectation and variance of samples.
[ Solution ]
P(X=x)=(nNโ)(xrโ)(nโxNโrโ)โ, N=10,n=3,r=4ย andย x=2 => P(X=2)=10โC3โ4โC2โโ6โC1โโ=1206โ6โ=0.3

EXAMPLE 17. In the manufacture of car tyres, a particular production process is know to yield 10 tyres with defective walls in every batch of 100 tyres produced. From a production batch of 100 tyres, a sample of 4 is selected for testing to destruction. Find:
the probability that the sample contains 1 defective tyre
the expectation of the number of defectives in samples of size 4
the variance of the number of defectives in samples of size 4.
[ Solution ]
P(X=x)=(nNโ)(xrโ)(nโxNโrโ)โ, N=100,n=4,r=10ย andย x=1
P(X=1)=100โC4โ10โC1โโ(100โ10)โC(4โ1)โโ=392122510โ117480โโ0.3
E(X)=np=4โ0.1=0.4
V(X)=np(1โp)Nโ1NโMโ=0.4โ0.9โ9990โโ0.33

3. Using R
์ด๊ธฐํ๋ถํฌ์ ๋ฐ๋ ํจ์, ๋์ ๋ถํฌ ํจ์, ๋ถ์์ ํจ์, ๋์ ๋ฐ์์ ์ํ R ํจ์ ๋ฐ ๋ชจ์๋ ์๋์ ๊ฐ์ต๋๋ค.
๊ตฌ๋ถ
์ด๊ธฐํ๋ถํฌ R ํจ์ / ๋ชจ์
๋ฐ๋ ํจ์
d
dhyper(x, m, n, k)
๋์ ๋ถํฌ ํจ์
p
phyper(q, m, n, k, lower.tail = TRUE/FALSE)
๋ถ์์ ํจ์
q
qhyper(p, m, n, k, lower.tail = TRUE/FALSE)
๋์ ๋ฐ์
r
rhyper(nn, m, n, k)
์ฐธ๊ณ : ๋ชจ์ง๋จ์ด m๊ณผ n์ ๊ฐ์ฒด๋ก ๊ตฌ์ฑ๋์ด ์๋๋ฐ k๊ฐ์ ํ๋ณธ์ ์ถ์ถ. lower.tail = TRUE ์ด๋ฉด ํ๋ฅ ๋ณ์ x๋ฅผ ๊ธฐ์ค์ผ๋ก ์ผ์ชฝ ๊ผฌ๋ฆฌ๋ฅผ ์๋ฏธ
3-1. Random Number Generation & Plotting
Random Number Generation :
rhyper(nn, m, n, k)Plotting :
dhyper(x, m, n, k)
m=5, n=20 ์ธ ์ด๊ธฐํ๋ถํฌ์์ ๋น๋ณต์์ผ๋ก 4๊ฐ๋ฅผ ์ถ์ถํ๋ ๊ฒ์ 1000๋ฒ ๋ชจ์์คํํ ํ์ ๋์๋ถํฌํ๋ฅผ ๊ตฌํด๋ณด๊ฒ ์ต๋๋ค.

3-2. Probability Computation
1) P(X=4) ํ๋ฅ ๊ณ์ฐ : dhyper(x, m, n, k)
EXAMPLE 18. ์ด๋ค ๋ฐ๋ฆฌ์คํ๊ฐ ์๋ฉ๋ฆฌ์นด๋ ธ ํฅ ๋์๋ฅผ ๋งก์๋ณด๊ธฐ๋ง ํ๋ฉด "์ฝ๋กฌ๋น์ ์๋"๋ก ๋ง๋ ๊ฒ์ธ์ง ์๋์ง๋ฅผ ๋ง์ถ ์ ์๋ค๊ณ ์ฃผ์ฅํ์๋ค๊ณ ํฉ๋๋ค. ๊ทธ๋์ ๊ทธ ๋ฐ๋ฆฌ์คํ๋ฅผ ๋ฐ๋ ค๋ค๊ฐ ์คํ์ ํด๋ณด์์ต๋๋ค. "์ฝ๋กฌ๋น์ ์๋"๋ก ๋ง๋ ์๋ฉ๋ฆฌ์นด๋ ธ 5์ (m=5), ์ฝ๋กฌ๋น์ ์๋ ๋ง๊ณ ๋ค๋ฅธ ์ง์ญ ์๋๋ก ๋ง๋ ์๋ฉ๋ฆฌ์นด๋ ธ 20์ (n=20) ์ ๋ง๋ค์ด ๋๊ณ ๊ทธ ๋ฐ๋ฆฌ์คํ์๊ฒ "์ฝ๋กฌ๋น์ ์๋"๋ก ๋ง๋ ์๋ฉ๋ฆฌ์นด๋ ธ 5์(k)์ ๊ณจ๋ผ๋ด ๋ณด๋ผ๊ณ ์์ผฐ์ต๋๋ค. ์ด๋ "์ฝ๋กฌ๋น์ ์๋"๋ก ๋ง๋ ์๋ฉ๋ฆฌ์นด๋ ธ๋ฅผ 4์(x) ๊ณจ๋ผ๋ผ ํ๋ฅ ์?
[ Solution ]
m : "์ฝ๋กฌ๋น์ ์๋"๋ก ๋ง๋ ์๋ฉ๋ฆฌ์นด๋ ธ 5์ (์ํ๋ ๊ฒฐ๊ณผ ๋์)
n : ๋ค๋ฅธ ์ง์ญ ์๋๋ก ๋ง๋ ์๋ฉ๋ฆฌ์นด๋ ธ 20์ (์ํ์ง ์๋ ๊ฒฐ๊ณผ ๋์)
k : ๊ณจ๋ผ๋ด๋ ์ปคํผ 5์ (์ํํ์)
x : ์ํ๋ ๊ฒฐ๊ณผ์ ํ์ (4์)
=> P(X=4)=dhyper(x=4,m=5,n=20,k=5)
EXAMPLE 19. TV๋ฅผ ์์ฐํ๋ ์ ์กฐํ์ฌ์์ ์์ฐํ TV 100 ๋ ์ค์์ ํ์ง์ด ์ํธํ TV๊ฐ 95๋, ๋ถ๋ํ์ด 5๋๊ฐ ์ฌ๊ณ ์ฐฝ๊ณ ์ ๋ค์ด์๋ค๊ณ ํฉ๋๋ค. ์ด ์ฌ๊ณ ์ฐฝ๊ณ ์์ TV 10๊ฐ๋ฅผ ๋น๋ณต์์ถ์ถํ๋ค๊ณ ํ์ ๋ ๋ถ๋ํ์ด 3๊ฐ๊ฐ ํฌํจ๋์ด ์์ ํ๋ฅ ์?
[Solution]
m : ๋ถ๋ํ์ ๋์ 5๋
n : ์ํธํ TV ๋์ 95๋
k : 10๋ ๋น๋ณต์์ถ์ถ
x : ๋ถ๋ํ์ด 3๋
=> dhyper(3,m=5,n=95,k=10)
2) P(X<=4)
phyper(x, m, n, k, lower.tail=TRUE): lower.tail=TRUE ์ฌ์ฉ
EXAMPLE 20. EXAMPLE 18.์์ 4์ ์ดํ์ผ ํ๋ฅ ์ ๊ตฌํ๋ผ.
[ Solution ]
=> phyper(4,m=5,n=20,k=5,lower.tail=TRUE)
๋๋ P(X<=4)=P(X=1)+P(X=2)+P(X=3)+P(X=4)
3-3. ํน์ ํ๋ฅ ์ ํด๋นํ๋ ๋ถ์์ ๊ตฌํ๊ธฐ
qhyper(p, m, n, k, lower.tail = TRUE/FALSE)
EXAMPLE 21. EXAMPLE 18. ์์ ํ๋ฅ ์ด 0.03576134๊ฐ ๋๋ ์ํํ์๋ฅผ ๊ตฌํ๋ผ.
[ Solution ]
EXAMPLE 22. ๋์ ํ๋ฅ ์ด 0.998099๊ฐ ๋๋ ์ํํ์๋ฅผ ๊ตฌํ๋ผ.
[ Solution ]
EXAMPLE 23. ์ด 50๊ฐ์ ๊ฐ์ฒด๋ก ๊ตฌ์ฑ๋๋ฉฐ, ๊ฐ๊ฐ 10๊ฐ, 20๊ฐ, 40๊ฐ์ ์ฑ๊ณต๊ฐ์ฒด๊ฐ ์๋ ์ธ ์ข ๋ฅ์ ์ ํ๋ชจ์ง๋จ์์ 10๊ฐ์ ํ๋ณธ์ ์ทจํ์์ ๋, ์ฑ๊ณต๊ฐ์์ ํ๋ฅ ๋ถํฌ๋ฅผ ๊ตฌํ์ฌ ๋น๊ตํ๋ผ.
[ Solution ]

EXAMPLE 24. ๋ถ๋๋ฅ ์ด 5%์ด๊ณ 1,000๊ฐ์ ์ ํ์ผ๋ก ๊ตฌ์ฑ๋ Lot์์ 30๊ฐ์ ํ๋ณธ์ ์ถ์ถํ์์ ๋ ๋์ค๋ ๋ถ๋ํ์ ๊ฐฏ์๋ฅผ X ๋ผ ํ ๋, ๋ค์์ ๊ตฌํ์์ค.
X์ ํ๋ฅ ๋ถํฌํจ์
E(X) ์ Var(X)
P(X=3)
P(Xโค3)
[ Solution ]
P(X=3)
P(Xโค3)=P(X=0)+P(X=1)+P(X=2)+P(X=3)=0.942

4. Binomial Distribution and Hypergeometric Distribution
4.1 Hypergeometric Distribution from Binomial Distribution
์ดํญ ๋ถํฌ์ ์กฐ๊ฑด๋ถ ๋ถํฌ๊ฐ ๋ฐ๋ก ์ด๊ธฐํ ๋ถํฌ๊ฐ ๋๋ ๊ฒ์ ์ ์ ์์ต๋๋ค.
๋ ํ๋ฅ ๋ณ์ X์ Y๊ฐ ์๋ก ๋ ๋ฆฝ์ด๊ณ , ์ดํญ๋ถํฌ๋ฅผ ๋ฐ๋ฅผ ๋, ํ๋ฅ ๋ณ์ X+Y์ ๋ํ X์ ์กฐ๊ฑด๋ถ ๋ถํฌ๋ '์ด๊ธฐํ ๋ถํฌ'๋ฅผ ๋ฐ๋ฅธ๋ค.
4.2 Binomial Distribution from Hypergeometric Distribution
์ด์ ๋ฐ๋๋ก ์ด๊ธฐํ ๋ถํฌ์ ๊ทนํ (N โ โ)์ ์ทจํ๋ฉด ์ดํญ๋ถํฌ๊ฐ ๋ฉ๋๋ค.
์ด๊ธฐํ ๋ถํฌ์ ์ฑ๊ณตํ๋ฅ ์ p=Nrโ ์ด๋ผ ํ ๋, ์ด๊ธฐํ ๋ถํฌ์ ํ๋ฅ ์ง๋ํจ์
P(X=x)=(nNโ)(xrโ)(nโxNโrโ)โ=NโCnโrโCxโร(Nโr)โC(nโx)โโ=NโCrโnโCxโร(Nโn)โC(rโx)โโ =nโCxโr!(Nโr)!N!โ(rโx)!((Nโr)โn+x)!(Nโn)!โโ=nโCxโN!(rโx)!((Nโr)โn+x)!r!(Nโr)!(Nโn)!โ =nโCxโN(Nโ1)(Nโ2)โฏ(Nโn+1)r(rโ1)โฏ(rโx+1)(Nโr)(Nโrโ1)โฏ(Nโrโn+x+1)โ
๋ถ๋ชจ, ๋ถ์๋ฅผ Nn ์ผ๋ก ๋๋๋ฉด,
=nโCxโร(1โN1โ)(1โN2โ)โฏ(1โNnโ1โ)Nrโ(NrโโN1โ)โฏ(NrโโNxโ1โ)(1โNrโ)(1โNrโโN1โ)โฏ(1โNrโโNnโxโ1โ)โ
p=Nrโ์ด๊ณ , 1โp=q ์ด๋ฏ๋ก,
=nโCxโร(1โN1โ)(1โN2โ)โฏ(1โNnโ1โ)p(pโN1โ)โฏ(pโNxโ1โ)q(qโN1โ)โฏ(qโNnโxโ1โ)โ
AsNโโ , P(X=x)โnโCxโpxqnโxโBinomialDistribution
์ด๋ ๊ฒ ์ด๊ธฐํ ๋ถํฌ์ ๊ทนํ์ ์ทจํ์ ๊ฒฝ์ฐ ์ดํญ ๋ถํฌ๊ฐ ๋จ์ ๋ณด์์ต๋๋ค.
4.3 Binomial Distribution and Hypergeometric Distribution
๋ฐ๋ผ์ ์ดํญ ๋ถํฌ์ ์ด๊ธฐํ ๋ถํฌ๋ ๋ค์๊ณผ ๊ฐ์ ๊ด๊ณ๋ฅผ ๊ฐ์ง๋๋ค.

๊ทธ๋ฆฌ๊ณ ํ ๊ฐ์ง ๋ ์์๋์ ์ผ ํ ์ฌํญ์ ์ดํญ ๋ถํฌ๋ '๋ณต์ ์ถ์ถ'์ ์ ์ ๋ก, ์ด๊ธฐํ ๋ถํฌ๋ '๋น๋ณต์ ์ถ์ถ'์ ์ ์ ๋ก ํ๋ค๋ ๊ฒ ์ ๋๋ค.
Last updated