2-1. Three Popular Data Displays


1. Stem and Leaf Diagrams

EXAMPLE 1. The exam scores of 30 students taking a statistics course are as follows.

86 80 25 77  73 76 100 90 69 93
90 83 70 73  73 70  90 83 71 95
40 58 68 69 100 78  87 97 92 73

One way to visualize these data is a stem and leaf diagram.

In R, the stem() function produces this display.

Syntax :

stem(x, scale = 1, width = 80, atom = 1e-08)

arguments :

  • x : a numeric vector

  • scale = : controls the length of the plot

  • width = : the desired width of the plot

  • atom = : a tolerance

[Solution]

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
stem(score)

The output shows that the tens digit becomes the stem and the ones digit becomes the leaf (for the score 100, the stem is 10 and the leaf is 0).
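The split into stems and leaves can also be reproduced by hand. The following is a minimal sketch, not how stem() is implemented; the names stems, leaves, and manual are chosen here for illustration only.

```r
score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

stems  <- score %/% 10   # integer division: the tens part (100 gives 10)
leaves <- score %% 10    # remainder: the ones digit

# Collect the sorted leaves of each stem into one string.
# Stems with no scores (such as 3) are simply absent from this result.
manual <- tapply(leaves, stems, function(v) paste(sort(v), collapse = ""))
manual
```

Each element of manual corresponds to one row of the diagram, with the stem as its name and the leaves as its value.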

EXAMPLE 2. Draw the diagram with the number of stems reduced by half.

[Solution]

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
           
stem(score, scale = 0.5)    # halve the number of stems -> 2, 4, 6, 8, 10, etc.

2. Frequency Histograms

A stem and leaf diagram is not suitable for large data sets.

In that case, a frequency distribution is used instead. In R, hist() draws it as a histogram.

hist(x, main = paste("Histogram of", xname),
        xlim = range(breaks),
        ylim = NULL,
        xlab = xname,
        ylab,
        ... )

arguments :

  • x : the vector of data to plot

  • main = : the title of the histogram

  • xlim = : the range of the x axis

  • ylim = : the range of the y axis

  • xlab = : the x-axis label

  • ylab = : the y-axis label

EXAMPLE 3. Draw the data from the previous stem and leaf diagram as a frequency histogram.

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

hist(score,
     xlim = c(0, 110),
     ylim = c(0, 12))
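To see the numbers behind the bars, hist() can return the histogram object without drawing it. This is a small sketch using the default breaks; the variable name h is arbitrary.

```r
score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(score, plot = FALSE)

h$breaks                         # class boundaries chosen by hist()
h$counts                         # number of scores in each class
sum(h$counts) == length(score)   # every score falls in exactly one class
```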

EXAMPLE 4. Using histogram()

require(lattice)
require(openintro)

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
           
histogram(score, type = "count",
          xlim = c(0, 110),
          ylim = c(0, 12),
          breaks = seq(5, 105, by=10))

EXAMPLE 5. Histogram of iris

str(iris)       # iris is a dataset

# split the graphics device into a 2-by-2 grid of panels
par(mfrow = c(2,2))   

# 1. Drawing Histograms
for (k in 1:4) hist(iris[[k]])

# 2. Redrawing the Histograms

# 2-1) Making Main Title of the Histogram
title <- paste0("Histogram of ", colnames(iris[1:4])) ; title

# 2-2) Color
col <- c("yellow", "lightgreen", "lightpink", "skyblue"); col

# 2-3) Redrawing
for (k in 1:4) hist(iris[[k]], 
                    main=title[k], 
                    xlab=colnames(iris[k]),
                    ylab="Frequency",
                    col = col[k])

3. Relative Frequency Histogram

EXAMPLE 6. Relative Frequency Histogram of Example 3 using histogram()

require(lattice)
require(openintro)

score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)
           
histogram(score, type = "percent",
          xlim = c(0, 110),
          ylim = c(0, 40),
          breaks = seq(5, 105, by=10))

Note : The y axis shows percentages (percent) rather than counts (count).
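The percentages are just the class counts divided by the sample size. The sketch below recomputes them in base R with the same breaks. It assumes the default right-closed classes of hist(), which may treat values on a boundary differently from lattice.

```r
score <- c(86, 80, 25, 77, 73,  76, 100, 90, 69, 93,
           90, 83, 70, 73, 73,  70,  90, 83, 71, 95,
           40, 58, 68, 69, 100, 78,  87, 97, 92, 73)

breaks <- seq(5, 105, by = 10)
h <- hist(score, breaks = breaks, plot = FALSE)   # counts per class, no plot

percent <- 100 * h$counts / sum(h$counts)         # relative frequency in percent
round(percent, 1)
sum(percent)                                      # the percentages add up to 100
```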

4. Sample size and Relative Frequency Histograms

As the sample size grows, the overall shape of the histogram becomes a symmetric bell shape (this holds for data drawn from a bell-shaped population such as the normal distribution).
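A quick simulation can illustrate this. The sketch below assumes a normal population with mean 70 and standard deviation 10. These values, the sample sizes, and the seed are illustrative choices, not taken from the text.

```r
set.seed(1)                       # arbitrary seed, for reproducibility
sizes <- c(30, 300, 3000)         # illustrative sample sizes

op <- par(mfrow = c(1, 3))        # three panels side by side
for (n in sizes) {
  x <- rnorm(n, mean = 70, sd = 10)   # assumed normal population
  # freq = FALSE puts density on the y axis, so the three panels are
  # comparable; with equal-width classes the shape matches the
  # relative frequency histogram.
  hist(x, freq = FALSE, main = paste("n =", n), xlab = "x")
}
par(op)                           # restore the previous layout
```

With the smallest sample the bars are typically ragged, and the bell shape becomes clearer as n increases.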

5. A Very Fine Relative Frequency Histogram

6. Frequency Table

EXAMPLE 7. Using the iris data set, find the frequency table of the 2nd column (Sepal.Width).

library(Rstat)

# import iris data set
data(iris)
# data structure of iris
str(iris)

# select the 2nd column
x <- iris[[2]]


# 1. frequency table
freq.table(x)

# 2. frequency table & the yellow histogram
freq.table(x, mp=TRUE, col=7)

# 3. Change the class interval to 0.5
(mycut <- seq(2, 4.5, by=0.5))
freq.table(x, cut=mycut)
freq.table(x, cut=mycut, mp=TRUE, col=0)
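If the Rstat package is not available, a similar table can be built with base R. This is a rough sketch and not the implementation of freq.table(); the column names and the right-closed classes are assumptions made here.

```r
x <- iris[[2]]                        # Sepal.Width
mycut <- seq(2, 4.5, by = 0.5)        # class boundaries, width 0.5

# include.lowest = TRUE keeps the minimum value (2.0) in the first class
cls  <- cut(x, breaks = mycut, include.lowest = TRUE)
freq <- table(cls)                    # count of observations per class

data.frame(class    = names(freq),
           freq     = as.vector(freq),
           rel.freq = round(as.vector(prop.table(freq)), 3),
           cum.freq = cumsum(as.vector(freq)))
```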

8. Unstable Histogram

  • Type-A : Isolated Island

  • Type-B : Multimodal

  • Type-C : Outliers

  • Type-D : Cliff

EXAMPLE 8. Unstable Histogram

library(Rstat)

# 1. Types of Unstable Histogram
unstable.hist()           # refer to ch2.man(2)

# 2. Changing the Parameters of unstable.hist()
unstable.hist(N=100, m2=4, a=11, b=12, c=8, vc=rainbow(4))

See : Using Histograms to Understand Your Data

9. Contingency Table (Cross table)

EXAMPLE 9. Using the exa2_2 data set, find each of the following tables.

  1. Frequency table of the 2nd column

  2. Frequency table of the 3rd column

  3. Contingency table of the 2nd and the 3rd columns.

library(Rstat)

# data import
data(exa2_2)                # exa2_2 is a dataset of Rstat
x <- exa2_2
str(x)

# 1. Frequency table of the 2nd Column
x2 <- x[[2]] ; x2           # x2 : factor variable
x21 <- table(x2) ; x21      # counts for each level of x2
x22 <- prop.table(x21) ; round(x22,2)
x23 <- addmargins(x22) ; round(x23,2)

# 2. Frequency table of the 3rd column
x3 <- x[[3]] ; x3          # x3 : factor variable  
x31 <- table(x3)   ; x31   # counts for each level of x3
x32 <- prop.table(x31) ; round(x32,2)
x33 <- addmargins(x32) ; round(x33,2)

# 3. Contingency table of the 2nd and the 3rd Columns
x41 <- table(x2, x3) ; x41
x42 <- prop.table(x41)       ; round(x42,2)
x43 <- addmargins(x42)       ; round(x43,2)
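The same three steps can be tried without Rstat on a small made-up data frame. The data below are hypothetical (they are not exa2_2), and the column names gender and grade are invented for illustration.

```r
# Hypothetical toy data: two categorical variables
d <- data.frame(
  gender = factor(c("F", "M", "F", "M", "F", "M", "F", "F")),
  grade  = factor(c("A", "B", "A", "A", "B", "B", "A", "B"))
)

ct <- table(d$gender, d$grade)         # joint counts (contingency table)
addmargins(ct)                         # counts with row and column totals
round(prop.table(ct), 2)               # proportions of the grand total
round(prop.table(ct, margin = 1), 2)   # row proportions: each row sums to 1
```

Passing margin = 1 (or 2) to prop.table() gives proportions within each row (or column) instead of proportions of the grand total.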

Key terms: 1) frequency (distribution) table; 2) relative frequency (distribution) table; 3) frequency diagram; 4) relative frequency diagram; 5) stem and leaf diagram; 6) contingency table.
