2-1. Three Popular Data Displays

1. Stem and Leaf Diagrams

EXAMPLE 1. The exam scores of 30 students taking a statistics course are as follows.

86 80 25 77  73 76 100 90 69 93
90 83 70 73  73 70  90 83 71 95
40 58 68 69 100 78  87 97 92 73

One way to visualize the data above is a stem and leaf diagram.

In R, the stem() function produces this kind of display.

Syntax :

stem(x, scale = 1, width = 80, atom = 1e-08)

arguments :

  • x : a numeric vector

  • scale = : controls the length of the plot (scale = 2 makes the plot roughly twice as long as the default)

  • width = : the desired width of the plot

  • atom = : tolerance

[Solution]

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
stem(score)

The output shows that the tens part of each score is the stem and the ones digit is the leaf; a score of 100 has stem 10 and leaf 0.
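The split into stems and leaves can also be checked by hand. The following is a minimal sketch, assuming integer scores as in Example 1; the names stems, leaves, and by_stem are illustrative and are not part of stem().

```r
score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

stems  <- score %/% 10    # integer division: the tens part (10 for a score of 100)
leaves <- score %% 10     # remainder: the ones digit

# collect the sorted leaves under each stem that occurs in the data
by_stem <- tapply(leaves, stems, function(v) paste(sort(v), collapse = ""))
print(by_stem)
```

Unlike stem(), this sketch lists only the stems that actually occur, so a stem with no scores (here, the 30s) is simply absent.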

EXAMPLE 2. Draw the diagram again with half as many stems.

[Solution]

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
           
stem(score, scale = 0.5)    # halve the number of stems -> stems 2, 4, 6, 8, 10

2. Frequency Histograms

A stem and leaf diagram is not suitable for large data sets.

For such data a frequency distribution is used instead, which hist() draws as a histogram.
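A frequency distribution groups the values into class intervals and counts how many fall into each. The sketch below does this for the Example 1 scores with base R's cut() and table(); the class width of 10 and the break points are an illustrative choice, not something fixed by the text.

```r
score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

breaks  <- seq(20, 100, by = 10)          # class boundaries: (20,30], (30,40], ..., (90,100]
classes <- cut(score, breaks = breaks)    # assign each score to its class interval
freq    <- table(classes)                 # count the scores in each class
print(freq)
```

A histogram is the graphical form of this table: one bar per class, with height equal to the class count.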

Syntax :

hist(x, main = paste("Histogram of", xname),
        xlim = range(breaks),
        ylim = NULL,
        xlab = xname,
        ylab,
        ... )

arguments :

  • x : a vector of values for the histogram

  • main = : the title of the histogram

  • xlim = : the range of the x axis

  • ylim = : the range of the y axis

  • xlab = : the label of the x axis

  • ylab = : the label of the y axis

EXAMPLE 3. Draw a frequency histogram of the data shown in the previous stem and leaf diagram.

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

hist(score,
     xlim = c(0, 110),
     ylim = c(0, 12))

EXAMPLE 4. Using histogram()

require(lattice)
require(openintro)

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
           
histogram(score, type = "count",
          xlim = c(0, 110),
          ylim = c(0, 12),
          breaks = seq(5, 105, by=10))

EXAMPLE 5. Histogram of iris

str(iris)       # iris is a dataset

# partitioning of Graphic Display, 2 by 2
par(mfrow = c(2,2))   

# 1. Drawing Histograms
for (k in 1:4) hist(iris[[k]])

# 2. Redrawing the Histograms

# 2-1) Making Main Title of the Histogram
title <- paste0("Histogram of ", colnames(iris[1:4])) ; title

# 2-2) Color
col <- c("yellow", "lightgreen", "lightpink", "skyblue"); col

# 2-3) Redrawing
for (k in 1:4) hist(iris[[k]], 
                    main=title[k], 
                    xlab=colnames(iris[k]),
                    ylab="Frequency",
                    col = col[k])

3. Relative Frequency Histogram

EXAMPLE 6. Relative Frequency Histogram of Example 3 using histogram()

require(lattice)
require(openintro)

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
           
histogram(score, type = "percent",
          xlim = c(0, 110),
          ylim = c(0, 40),
          breaks = seq(5, 105, by=10))

Note : the y axis shows percentages (percent) rather than counts (count).
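The percentages on the y axis are the class counts divided by the sample size. A minimal base R sketch, assuming the same break points as in Example 6 and calling hist() with plot = FALSE only to obtain the counts (the names h and rel_freq are illustrative):

```r
score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

h <- hist(score, breaks = seq(5, 105, by = 10), plot = FALSE)  # class counts, no plot
rel_freq <- 100 * h$counts / sum(h$counts)                      # percent of scores per class

print(data.frame(lower   = head(h$breaks, -1),
                 upper   = h$breaks[-1],
                 count   = h$counts,
                 percent = round(rel_freq, 1)))
```

The percentages sum to 100, which is why the two histograms have the same shape and differ only in the scale of the y axis.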

4. Sample size and Relative Frequency Histograms

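As the sample size grows, the overall shape of the relative frequency histogram becomes smoother; for data from a bell-shaped (for example, normal) population it approaches a symmetric bell shape.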
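The effect of sample size can be seen by simulation. The sketch below assumes a normal population (mean 70 and standard deviation 10, both chosen only for illustration) and draws histograms for increasing sample sizes; with a different population the limiting shape would differ.

```r
set.seed(1)                          # make the simulation reproducible
sizes <- c(30, 300, 3000, 30000)     # illustrative sample sizes

par(mfrow = c(2, 2))                 # 2 by 2 grid of plots
for (n in sizes) {
  x <- rnorm(n, mean = 70, sd = 10)  # sample from the assumed normal population
  hist(x, freq = FALSE,              # density scale: bar areas sum to 1
       main = paste("n =", n), xlab = "x")
}
par(mfrow = c(1, 1))                 # restore the single-plot layout
```

With freq = FALSE the y axis shows densities rather than relative frequencies; for equal-width classes the two differ only by a constant factor, so the shape is the same.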

5. A Very Fine Relative Frequency Histogram
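With a very large sample and very narrow classes, the histogram on the density scale comes close to a smooth curve. The following is a sketch only, assuming a normal population with mean 70 and standard deviation 10 (illustrative values) and overlaying that population's density for comparison.

```r
set.seed(2)
x <- rnorm(100000, mean = 70, sd = 10)       # large simulated sample

h <- hist(x, breaks = 200, freq = FALSE,     # roughly 200 narrow classes, density scale
          main = "Very fine histogram", xlab = "x")
curve(dnorm(x, mean = 70, sd = 10),          # density of the assumed population
      add = TRUE, lwd = 2)
```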

6. Frequency Table

EXAMPLE 7. Using the iris data set, find the frequency table of its 2nd column (Sepal.Width).

library(Rstat)

# import iris data set
data(iris)
# data structure of iris
str(iris)

# select the 2nd column
x <- iris[[2]]


# 1. frequency table
freq.table(x)

# 2. frequency table & the yellow histogram
freq.table(x, mp=TRUE, col=7)

# 3. Change the class interval to 0.5
(mycut <- seq(2, 4.5, by=0.5))
freq.table(x, cut=mycut)
freq.table(x, cut=mycut, mp=TRUE, col=0)
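If Rstat is not available, a similar table can be assembled with base R. This is a sketch only: it does not reproduce the exact output of freq.table(), and the column names are illustrative.

```r
x <- iris[[2]]                              # Sepal.Width
mycut <- seq(2, 4.5, by = 0.5)              # class boundaries with width 0.5

# include.lowest = TRUE keeps the minimum value 2.0 in the first class
classes <- cut(x, breaks = mycut, include.lowest = TRUE)
freq    <- as.vector(table(classes))

tab <- data.frame(class    = levels(classes),
                  freq     = freq,
                  rel_freq = round(freq / sum(freq), 3),
                  cum_freq = cumsum(freq))
print(tab)
```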

8. Unstable Histogram

  • Type-A : Isolated Island

  • Type-B : Multimodal

  • Type-C : Outliers

  • Type-D : Cliff
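These shapes can be imitated with simulated data in base R. The sketch below is one possible reading of the four types, not the definition used by Rstat's unstable.hist(); all sample sizes, means, and cut-off values are assumptions made for illustration.

```r
set.seed(3)
island     <- c(rnorm(190, 10, 1), rnorm(10, 17, 0.3))  # main group plus a small detached cluster
multimodal <- c(rnorm(100, 8, 1),  rnorm(100, 13, 1))   # two separated peaks
outliers   <- c(rnorm(195, 10, 1), 2, 3, 18, 19, 20)    # a few extreme values on both sides
z          <- rnorm(400, 10, 1)
cliff      <- z[z > 9.5]                                 # everything below 9.5 is cut off

par(mfrow = c(2, 2))                                     # 2 by 2 grid of plots
hist(island,     breaks = 20, main = "Type-A : Isolated Island", xlab = "x")
hist(multimodal, breaks = 20, main = "Type-B : Multimodal",      xlab = "x")
hist(outliers,   breaks = 20, main = "Type-C : Outliers",        xlab = "x")
hist(cliff,      breaks = 20, main = "Type-D : Cliff",           xlab = "x")
par(mfrow = c(1, 1))                                     # restore the single-plot layout
```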

EXAMPLE 8. Unstable Histogram

library(Rstat)

# 1. Types of Unstable Histogram
unstable.hist()           # refer to ch2.man(2)

# 2. Changing the Parameters of unstable.hist()
unstable.hist(N=100, m2=4, a=11, b=12, c=8, vc=rainbow(4))

See : Using Histograms to Understand Your Data

9. Contingency Table (Cross table)

EXAMPLE 9. Using the exa2_2 data set, find each of the following tables.

  1. Frequency table of the 2nd column

  2. Frequency table of the 3rd column

  3. Contingency table of the 2nd and the 3rd columns.

library(Rstat)

# data import
data(exa2_2)                # exa2_2 is a dataset of Rstat
x <- exa2_2
str(x)

# 1. Frequency table of the 2nd Column
x2 <- x[[2]] ; x2           # x2 : factor variable
x21 <- table(x2) ; x21      # frequency counts of the 2nd column
x22 <- prop.table(x21) ; round(x22,2)
x23 <- addmargins(x22) ; round(x23,2)

# 2. Frequency table of the 3rd column
x3 <- x[[3]] ; x3          # x3 : factor variable  
x31 <- table(x3)   ; x31   #  
x32 <- prop.table(x31) ; round(x32,2)
x33 <- addmargins(x32) ; round(x33,2)

# 3. Contingency table of the 2nd and the 3rd Columns
x41 <- table(x2, x3) ; x41
x42 <- prop.table(x41)       ; round(x42,2)
x43 <- addmargins(x42)       ; round(x43,2)
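By default prop.table() divides each cell by the grand total; its margin argument gives row or column proportions instead. The sketch below uses the built-in mtcars data (columns cyl and gear) as a stand-in, since exa2_2 requires Rstat; the choice of data set and variables is only for illustration.

```r
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)   # counts for each (cyl, gear) pair

overall <- prop.table(tab)               # each cell / grand total
by_row  <- prop.table(tab, margin = 1)   # each cell / its row total
by_col  <- prop.table(tab, margin = 2)   # each cell / its column total

print(round(addmargins(overall), 2))     # joint proportions with row and column sums
print(round(by_row, 2))                  # each row sums to 1
print(round(by_col, 2))                  # each column sums to 1
```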

1) 频数(分布)表 (frequency table). 2) 相对频数(分布)表 (relative frequency table). 3) 频数分布图 (frequency diagram). 4) 相对频数分布图 (relative frequency diagram). 5) 茎叶图 (stem and leaf diagram). 6) 情形分析表 (contingency table).
