2-3. Measures of Variability (Statistical Dispersion)
离散程度的统计量
Last updated
Was this helpful?
离散程度的统计量
Last updated
Was this helpful?
다음의 두 데이터 세트가 있다. 각각의 데이터 세트를 점으로 표현한 것이 dot plot이다.
Data Set 1
40
38
42
40
39
39
43
40
39
40
Data Set 2
46
37
40
33
42
36
40
47
34
45
참고사이트 : https://ggplot2.tidyverse.org/reference/geom_dotplot.html
The range of a data set is the number R defined by the formula
where is the largest measurement in the data set and is the smallest.
EXAMPLE 23. 앞의 2개의 데이터 세트의 range를 구하라.
[Solution]
Note : range( )
function of R returns the minimum value and the maxim value of the data set.
EXAMPLE 24. 위의 예에서 Data Set 2의 sample variance와 sample standard deviation을 구하라.
[Solution]
Note : In R, var()
returns the sample variance, i.e. the denominator used in var() function is (n-1)
.
which by algebra is equivalent to the formula
EXAMPLE 25. 무작위로 선발한 10명의 학생의 평균 평점은 다음과 같다. sample variance와 sample standard deviation을 구하라.
[풀이]
[ Difference between Two Data Sets ]
See : Degree of Dispersion
평균에 대한 상대적인 변동성의 크기를 설명할 때에 변동계수(Coefficient of Variation)을 사용한다.
평균에 대한 표준편차의 비율로 표현된다.
변동계수가 클수록, 즉 표준편차가 표본평균에 비해 클수록 자료의 퍼짐진 정도가 더 크다고 할 수 있다.
EXAMPLE 26. Example 25.의 CV를 구하라.
[ Solution ]
离散程度的统计量
1) 全距(range)
2) 标准差(standard deviation)
3) 方差(variance)
4) 变异系数(coefficient of variation)
The sample variance of a set of sample data is the number defined by the formula
The sample standard deviation of a set of sample data is the square root of the sample variance, hence is the number given by the formulas
The population variance and population standard deviation of a set of population data are the numbers and defined by the formulas
and
In probability theory and statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. It is often expressed as a percentage, and is defined as the ratio of the standard deviation to the mean (or its absolute value, ).