4-9. Negative Hypergeometric Distribution (*)
1. Negative Hypergeometric Distribution
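Suppose a box contains 6 white balls and 4 black balls, and balls are drawn one at a time. If drawing a white ball is called a 'success', then the number of black balls drawn before the first white ball is a random variable, and this random variable is called a 'Negative Hypergeometric random variable'.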
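Consider shuffling a deck of 52 playing cards thoroughly and dealing the cards one at a time. Here, an 'ace' counts as a success. If the first ace appears as the 7th card dealt, then the number of cards dealt before it, 6, is an observed value of a random variable that follows the 'Negative Hypergeometric Distribution'.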
One can view the Binomial distribution and Hypergeometric distribution as both considering a random variable that counts the number of successes in $n$ trials.
In the Binomial case, there are Bernoulli trials with constant probability of success from trial to trial, with independent trials.
In the Hypergeometric case, the sampling is without replacement so that probabilities change from selection to selection and trials are dependent. In the Bernoulli trials case, the Negative Binomial distribution is the distribution counting the number of trials required until a specified number (say $r$) of successes have been observed. In the sampling without replacement case, a similar situation is to consider the number of selections required until a success is obtained.
EXAMPLE 29. Suppose an urn contains 4 red and 10 blue balls and that balls are drawn one after another from this urn until a red ball is obtained. What is the probability that exactly six balls are drawn?
[ Solution ]
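The first five balls drawn must be blue and the sixth must be red, so

$$P(\text{exactly six balls are drawn}) = \frac{10}{14}\cdot\frac{9}{13}\cdot\frac{8}{12}\cdot\frac{7}{11}\cdot\frac{6}{10}\cdot\frac{4}{9} = \frac{8}{143} \approx 0.0559.$$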
This example is like "waiting for the first success in sampling without replacement" with success being obtaining a red ball.
Recall that in sampling with replacement the distribution analogous to this was the Geometric distribution, a special case of the Negative Binomial distribution.
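As a rough illustration, the "waiting for the first success" calculation in Example 29 can be sketched in Python by multiplying the conditional probabilities draw by draw. The helper name `first_success_on_draw` and its parameters are assumptions made for this sketch, not part of the original notes.

```python
from fractions import Fraction


def first_success_on_draw(successes, failures, draw):
    """Probability that the first success appears exactly on the given draw
    when sampling without replacement (hypothetical helper)."""
    # draw - 1 failures must come first, so draw cannot exceed failures + 1.
    if successes < 1 or draw < 1 or draw > failures + 1:
        return Fraction(0)
    total = successes + failures
    prob = Fraction(1)
    # Every selection before the last one must be a failure.
    for i in range(draw - 1):
        prob *= Fraction(failures - i, total - i)
    # The last selection must be a success among the objects that remain.
    prob *= Fraction(successes, total - (draw - 1))
    return prob


# Example 29: 4 red (success) and 10 blue balls, first red on the sixth draw.
p = first_success_on_draw(successes=4, failures=10, draw=6)
print(p)  # 8/143
```

Exact fractions are used here so the result can be compared directly with a hand calculation.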
EXAMPLE 30. A statistics department has purchased 24 calculators of which 4 are defective. Calculators are selected one-after-another without replacement and tested. What is the probability that the second calculator found to be defective is the eighth calculator selected?
[ Solution ]
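Exactly one of the first seven calculators selected must be defective, and the eighth must then be defective, so

$$P(\text{second defective is the eighth selected}) = \frac{\binom{4}{1}\binom{20}{6}}{\binom{24}{7}}\cdot\frac{3}{17} = \frac{20}{253} \approx 0.0791.$$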
In the above two examples, the final selection must be a success. The selections before this one simply involve a Hypergeometric situation involving one fewer selection than the total number and one fewer success than the number for which the procedure is waiting.
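The decomposition described above (a Hypergeometric probability for the earlier selections, times the probability that the final selection is a success) can be sketched as follows. The function name `draws_until_target_pmf` and its parameter names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def draws_until_target_pmf(draws, target, successes, total):
    """P(the target-th success occurs exactly on selection number `draws`)
    when sampling without replacement (hypothetical helper)."""
    if not (1 <= target <= successes <= total):
        raise ValueError("need 1 <= target <= successes <= total")
    failures = total - successes
    if draws < target or draws > failures + target:
        return Fraction(0)
    # Hypergeometric part: target - 1 successes among the first draws - 1 selections.
    earlier = Fraction(
        comb(successes, target - 1) * comb(failures, draws - target),
        comb(total, draws - 1),
    )
    # Final selection: one of the remaining successes among the remaining objects.
    final = Fraction(successes - (target - 1), total - (draws - 1))
    return earlier * final


# Example 29: first red ball on the sixth draw (4 red, 14 balls in total).
print(draws_until_target_pmf(draws=6, target=1, successes=4, total=14))  # 8/143
# Example 30: second defective calculator on the eighth selection.
print(draws_until_target_pmf(draws=8, target=2, successes=4, total=24))  # 20/253
```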
Let $X$ be a random variable counting the number of selections required until the $r$th success is obtained when sampling without replacement from a set of $N$ objects of which $M$ have a certain attribute (i.e. success). Then $X$ is said to have a Negative Hypergeometric distribution with parameters $r$, $M$ and $N$ and, for appropriate values $x$, its probability function is

$$P(X = x) = \frac{\binom{M}{r-1}\binom{N-M}{x-r}}{\binom{N}{x-1}} \cdot \frac{M-(r-1)}{N-(x-1)}$$

where the parameters $r$, $M$ and $N$ are non-negative integers which satisfy the condition $r \le M \le N$.
Comments:
In the above expression, the first quantity, $\binom{M}{r-1}\binom{N-M}{x-r} / \binom{N}{x-1}$, is just the Hypergeometric probability of exactly $r-1$ successes in the first $x-1$ draws, and the second, $\frac{M-(r-1)}{N-(x-1)}$, is the probability of another success on the $x$th draw based on what remains of the set of $N$ objects.
Again, it might be harder to try to remember the formula for the Negative Hypergeometric probability function than to simply solve the problem based on general knowledge of probability.
The value space of this random variable is $\{r, r+1, \ldots, N-M+r\}$.
The mean of this distribution is $E(X) = \frac{r(N+1)}{M+1}$ and its variance is $Var(X) = \frac{r(N+1)(N-M)(M+1-r)}{(M+1)^2(M+2)}$.
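As a sanity check, the mean and variance can be compared against a direct summation over the value space. The sketch below writes $N$ for the number of objects, $M$ for the number of successes among them and $r$ for the target number of successes (symbol names assumed here), and uses the standard closed forms $r(N+1)/(M+1)$ for the mean and $r(N+1)(N-M)(M+1-r)/((M+1)^2(M+2))$ for the variance. The helper names `pmf` and `moments` are hypothetical.

```python
from fractions import Fraction
from math import comb


def pmf(x, r, M, N):
    """P(X = x): the r-th success occurs on draw x (N objects, M successes)."""
    if x < r or x > N - M + r:
        return Fraction(0)
    hyper = Fraction(comb(M, r - 1) * comb(N - M, x - r), comb(N, x - 1))
    return hyper * Fraction(M - r + 1, N - x + 1)


def moments(r, M, N):
    """Total probability, mean and variance by summing over the value space."""
    support = range(r, N - M + r + 1)
    total = sum(pmf(x, r, M, N) for x in support)
    mean = sum(x * pmf(x, r, M, N) for x in support)
    var = sum(x * x * pmf(x, r, M, N) for x in support) - mean ** 2
    return total, mean, var


# Setting of Example 30: waiting for the 2nd of 4 defectives among 24 calculators.
r, M, N = 2, 4, 24
total, mean, var = moments(r, M, N)
closed_mean = Fraction(r * (N + 1), M + 1)
closed_var = Fraction(r * (N + 1) * (N - M) * (M + 1 - r), (M + 1) ** 2 * (M + 2))
print(total, mean == closed_mean, var == closed_var)
```

Because the arithmetic is done with exact fractions, the comparison with the closed forms is an equality check rather than a floating-point approximation.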
EXAMPLE 31. A land developer has plans for having 86 acreages in its development south of the city. During the development of the acreages, water testing has suggested that 12 of the sites have water problems such that the wells on these sites do not have water that meets local drinking standards. If a potential purchaser decides to visit several of the acreage sites chosen at random from the 86, what is the probability that the third site that the purchaser visits that has such water problems is the eighth site visited? What is the expected number of sites that this purchaser would visit so as to have found three with such water problems?
[ Solution ]
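Let $X$ be a random variable counting the number of sites visited up to and including the third one having these water problems. Then $X$ has a Negative Hypergeometric distribution with $N = 86$ sites, of which $M = 12$ have water problems, and $r = 3$. The probability of having to visit 8 sites is $P(X = 8) = \frac{\binom{12}{2}\binom{74}{5}}{\binom{86}{7}}\cdot\frac{10}{79} \approx 0.0250$. And $E(X) = \frac{3(86+1)}{12+1} = \frac{261}{13} \approx 20.08$.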
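Example 31 can also be checked by simulation. The following is a minimal Monte Carlo sketch, assuming the purchaser visits sites in a uniformly random order without revisiting any site. The function name `visits_until_target`, the seed and the number of trials are arbitrary choices for illustration.

```python
import random
from fractions import Fraction


def visits_until_target(total, problem_sites, target, rng):
    """Simulate one purchaser: count visits until `target` problem sites are found."""
    sites = [1] * problem_sites + [0] * (total - problem_sites)  # 1 = water problem
    rng.shuffle(sites)  # random visiting order, i.e. sampling without replacement
    found = 0
    for visit, has_problem in enumerate(sites, start=1):
        found += has_problem
        if found == target:
            return visit
    raise ValueError("target exceeds the number of problem sites")


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 20_000
results = [visits_until_target(86, 12, 3, rng) for _ in range(trials)]

estimated_prob = results.count(8) / trials
estimated_mean = sum(results) / trials
exact_mean = Fraction(3 * (86 + 1), 12 + 1)  # closed form for the expected number of visits

print(f"estimated P(third problem site is the 8th visited): {estimated_prob:.4f}")
print(f"estimated mean: {estimated_mean:.2f}, closed form: {float(exact_mean):.2f}")
```

The estimates fluctuate from run to run; they should land near the exact probability and the closed-form mean, not match them digit for digit.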
2. Binomial, Hypergeometric, Negative Binomial and Negative Hypergeometric Distribution
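Let us look back at the four distributions covered so far: 1. the Binomial distribution, 2. the Hypergeometric distribution, 3. the Negative Binomial distribution, and 4. the Negative Hypergeometric Distribution.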
2.1 Binomial Distribution and Hypergeometric Distribution
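In the Binomial distribution, the number of trials is fixed. For example, in the probability of answering all 25 exam questions correctly by guessing, the number of trials, '25', is fixed.
The same holds for the Hypergeometric distribution. As in drawing 20 items out of 1,000 products, the number of trials, '20', is fixed in advance.
The difference between the Binomial and Hypergeometric distributions is whether the sampling is 'with replacement' or 'without replacement'. The Binomial distribution assumes sampling with replacement, and the Hypergeometric distribution assumes sampling without replacement.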
2.2 Negative Binomial Distribution and Negative Hypergeometric Distribution
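The Negative Binomial distribution fixes not the number of trials but the number of successes. An example is the World Series, where the first team to win 4 games takes the title.
The Negative Hypergeometric Distribution also fixes the number of successes. For instance, it asks how many failures occur before the first success is obtained.
This can be summarized in the following table.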
| | With Replacement | Without Replacement |
| --- | --- | --- |
| Fixed Number of Trials | Binomial Distribution | Hypergeometric Distribution |
| Fixed Number of Successes | Negative Binomial Distribution | Negative Hypergeometric Distribution |
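The four cases can be sketched with one small simulator that varies only two choices: whether draws are made with or without replacement, and whether the number of trials or the number of successes is fixed. This is a hedged illustration using the box of 6 white and 4 black balls from the start of this section. All function names are hypothetical.

```python
import random
from itertools import islice


def draw_sequence(population, replace, rng):
    """Yield outcomes one at a time (1 = success, 0 = failure)."""
    if replace:
        while True:  # with replacement: the population never runs out
            yield rng.choice(population)
    else:
        pool = list(population)
        rng.shuffle(pool)  # without replacement: a random ordering of the population
        yield from pool


def successes_in_fixed_trials(population, n_trials, replace, rng):
    """Fixed number of trials: Binomial (replace) or Hypergeometric (no replace)."""
    return sum(islice(draw_sequence(population, replace, rng), n_trials))


def trials_until_fixed_successes(population, n_successes, replace, rng):
    """Fixed number of successes: Negative Binomial or Negative Hypergeometric."""
    found = 0
    for trial, outcome in enumerate(draw_sequence(population, replace, rng), start=1):
        found += outcome
        if found == n_successes:
            return trial
    raise ValueError("population ran out before reaching n_successes")


rng = random.Random(1)
box = [1] * 6 + [0] * 4  # 6 white balls (success) and 4 black balls

print("Binomial draw:", successes_in_fixed_trials(box, 5, True, rng))
print("Hypergeometric draw:", successes_in_fixed_trials(box, 5, False, rng))
print("Negative Binomial draw:", trials_until_fixed_successes(box, 3, True, rng))
print("Negative Hypergeometric draw:", trials_until_fixed_successes(box, 3, False, rng))
```

Each call produces a single random value from the corresponding distribution, so repeated calls are needed to see the shape of each distribution.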