2-1. Three Popular Data Displays
1. Stem and Leaf Diagrams
EXAMPLE 1. The exam scores of 30 students taking a statistics course are as follows.
86 80 25 77 73 76 100 90 69 93
90 83 70 73 73 70 90 83 71 95
40 58 68 69 100 78 87 97 92 73
One way to summarize the data above is a stem and leaf diagram.
In R, the stem() function produces this kind of display.
Syntax :
stem(x, scale = 1, width = 80, atom = 1e-08)
arguments :
x : a numeric vector
scale = : controls the length of the plot
width = : the desired width of the plot
atom = : a tolerance
[Solution]
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
40, 58, 68, 69, 100, 78, 87, 97, 92, 73)
stem(score)
The output shows that the tens digit becomes the stem and the ones digit becomes the leaf.
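As a rough illustration of the split described above, the following base-R sketch separates each score into a tens part and a ones digit using integer division and remainder. The names stems, leaves and manual are introduced here for illustration only. This is not how stem() works internally, just the same idea done by hand.

```r
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
           90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
           40, 58, 68, 69, 100, 78, 87, 97, 92, 73)

stems  <- score %/% 10   # tens part: 86 -> 8, 100 -> 10
leaves <- score %% 10    # ones digit: 86 -> 6, 100 -> 0

# Group the leaves by stem and paste each group's sorted leaves into one string
manual <- sapply(split(leaves, stems),
                 function(v) paste(sort(v), collapse = ""))

# Print one "stem | leaves" row per stem that occurs in the data
cat(sprintf("%2s | %s", names(manual), manual), sep = "\n")
```

Unlike stem(), this sketch skips stems that have no observations (here the 30s).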
EXAMPLE 2. Draw the diagram again with the number of stems cut in half.
[Solution]
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
40, 58, 68, 69, 100, 78, 87, 97, 92, 73)
stem(score, scale = 0.5) # halve the number of stems -> 2, 4, 6, 8, 10, ...
2. Frequency Histograms
A stem and leaf diagram is not suitable for large data sets.
In that case, a frequency distribution is used instead.
Syntax :
hist(x, main = paste("Histogram of", xname),
     xlim = range(breaks),
     ylim = NULL,
     xlab = xname,
     ylab,
     ... )
arguments :
x : the data vector for the histogram
main = : title of the histogram
xlim = : range of the x axis
ylim = : range of the y axis
xlab = : label of the x axis
ylab = : label of the y axis
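To make the idea of a frequency distribution concrete, here is a minimal sketch that counts how many of the 30 scores fall into each class of width 10. The class boundaries in breaks are an assumption chosen for illustration. cut() and table() do the counting, and hist() with plot = FALSE returns the same counts without drawing anything.

```r
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
           90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
           40, 58, 68, 69, 100, 78, 87, 97, 92, 73)

# Assumed class boundaries of width 10; each class is (lower, upper]
breaks <- seq(20, 100, by = 10)

# Tabulate how many scores fall into each class
freq <- table(cut(score, breaks = breaks))
print(freq)

# hist() computes the same counts; plot = FALSE suppresses the drawing
h <- hist(score, breaks = breaks, plot = FALSE)
print(h$counts)
```

The heights of the bars in a frequency histogram are exactly these counts.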
EXAMPLE 3. Draw the data from the previous stem and leaf diagram as a frequency histogram.
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
40, 58, 68, 69, 100, 78, 87, 97, 92, 73)
hist(score,
     xlim = c(0, 110),
     ylim = c(0, 12))
EXAMPLE 4. Using histogram()
require(lattice)
require(openintro)
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
40, 58, 68, 69, 100, 78, 87, 97, 92, 73)
histogram(score, type = "count",
xlim = c(0, 110),
ylim = c(0, 12),
breaks = seq(5, 105, by=10))
EXAMPLE 5. Histogram of iris
str(iris) # iris is a dataset
# partitioning of Graphic Display, 2 by 2
par(mfrow = c(2,2))
# 1. Drawing Histograms
for (k in 1:4) hist(iris[[k]])
# 2. Redrawing the Histograms
# 2-1) Making Main Title of the Histogram
title <- paste0("Histogram of ", colnames(iris[1:4])) ; title
# 2-2) Color
col <- c("yellow", "lightgreen", "lightpink", "skyblue"); col
# 2-3) Redrawing
for (k in 1:4) hist(iris[[k]],
main=title[k],
xlab=colnames(iris[k]),
ylab="Frequency",
col = col[k])
3. Relative Frequency Histogram
EXAMPLE 6. Relative Frequency Histogram of Example 3 using histogram()
require(lattice)
require(openintro)
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
40, 58, 68, 69, 100, 78, 87, 97, 92, 73)
histogram(score, type = "percent",
xlim = c(0, 110),
ylim = c(0, 40),
breaks = seq(5, 105, by=10))
Note : the y axis now shows percentages (percent) rather than counts (count).
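The percentages on the y axis can be checked by hand. The sketch below reuses the class boundaries from the example and assumes right-closed classes (lower, upper]. It divides each class count by the total number of scores.

```r
score <- c(86, 80, 25, 77, 73, 76, 100, 90, 69, 93,
           90, 83, 70, 73, 73, 70, 90, 83, 71, 95,
           40, 58, 68, 69, 100, 78, 87, 97, 92, 73)

breaks <- seq(5, 105, by = 10)              # same class boundaries as the example
freq   <- table(cut(score, breaks = breaks)) # count per class
pct    <- 100 * prop.table(freq)             # share of each class, in percent

print(round(pct, 1))
```

The tallest class, (65,75], holds 10 of the 30 scores, which is why an upper y limit of 40 is enough.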
4. Sample size and Relative Frequency Histograms
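As the sample size grows, the overall shape of the relative frequency histogram approaches a symmetric bell shape, provided the data come from a bell-shaped population such as a normal distribution.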
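A small simulation can illustrate this. The sketch below assumes the data come from a standard normal population. The sample sizes in sizes and the class boundaries in breaks are arbitrary choices made for illustration.

```r
set.seed(1)                        # fixed seed so the run is repeatable
sizes  <- c(30, 300, 30000)        # assumed sample sizes
breaks <- seq(-8, 8, by = 0.5)     # common class boundaries for all samples

# Relative frequency of each class for a normal sample of size n
rel_freq <- lapply(sizes, function(n) {
  h <- hist(rnorm(n), breaks = breaks, plot = FALSE)
  h$counts / n
})

# Draw the three relative frequency histograms side by side
old_par <- par(mfrow = c(1, 3))
for (i in seq_along(sizes)) {
  barplot(rel_freq[[i]], space = 0, main = paste("n =", sizes[i]),
          ylab = "Relative frequency")
}
par(old_par)
```

With only 30 observations the bars tend to look ragged. With many more observations the outline should look much closer to a smooth bell.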

5. A Very Fine Relative Frequency Histogram

6. Frequency Table
EXAMPLE 7. Using the iris data set, find the frequency table of the 2nd column (Sepal.Width) of iris.
library(Rstat)
# import iris data set
data(iris)
# data structure of iris
str(iris)
# select the 2nd column
x <- iris[[2]]
# 1. frequency table
freq.table(x)
# 2. frequency table & the yellow histogram
freq.table(x, mp=TRUE, col=7)
# 3. Change the class interval to 0.5
(mycut <- seq(2, 4.5, by=0.5))
freq.table(x, cut=mycut)
freq.table(x, cut=mycut, mp=TRUE, col=0)
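For readers without the Rstat package, a similar table can be sketched in base R. This is not a reimplementation of freq.table(). The column names and the choice of right-closed classes are assumptions made here for illustration.

```r
x     <- iris[[2]]                   # Sepal.Width
mycut <- seq(2, 4.5, by = 0.5)       # class boundaries of width 0.5

# include.lowest = TRUE keeps the minimum value 2.0 in the first class
classes <- cut(x, breaks = mycut, include.lowest = TRUE)
freq    <- as.vector(table(classes))

# One row per class: count, relative frequency, cumulative count
tab <- data.frame(
  class    = levels(classes),
  freq     = freq,
  rel_freq = round(freq / sum(freq), 3),
  cum_freq = cumsum(freq)
)
print(tab)
```

The last value of cum_freq equals the number of rows in iris, which is a quick check that no observation fell outside the classes.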
8. Unstable Histogram
Type-A : Isolated Island
Type-B : Multimodal
Type-C : Outliers
Type-D : Cliff
EXAMPLE 8. Unstable Histogram
library(Rstat)
# 1. Types of Unstable Histogram
unstable.hist() # refer to ch2.man(2)
# 2. Changing the Parameters of unstable.hist()
unstable.hist(N=100, m2=4, a=11, b=12, c=8, vc=rainbow(4))
See : Using Histograms to Understand Your Data
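Two of the shapes listed above can also be imitated with simulated data in base R. The sketch below does not reproduce unstable.hist(). The means, standard deviations and extreme values are arbitrary assumptions chosen only to make the shapes visible.

```r
set.seed(2)   # fixed seed so the run is repeatable

# Multimodal: a mixture of two normal groups with different centers
multimodal <- c(rnorm(100, mean = 10, sd = 1),
                rnorm(100, mean = 16, sd = 1))

# Outliers: one normal group plus a few hand-picked extreme values
with_outliers <- c(rnorm(100, mean = 10, sd = 1), 25, 26, 27)

# Draw both histograms side by side
old_par <- par(mfrow = c(1, 2))
hist(multimodal, main = "Multimodal", xlab = "value")
hist(with_outliers, main = "Outliers", xlab = "value")
par(old_par)
```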
9. Contingency Table (Cross table)
EXAMPLE 9. Using the exa2_2 data set, find each of the following tables.
Frequency table of the 2nd column
Frequency table of the 3rd column
Contingency table of the 2nd and the 3rd columns.
library(Rstat)
# data import
data(exa2_2) # exa2_2 is a dataset of Rstat
x <- exa2_2
str(x)
# 1. Frequency table of the 2nd Column
x2 <- x[[2]] ; x2 # x2 : factor variable
x21 <- table(x2) ; x21 # frequency counts of the 2nd column
x22 <- prop.table(x21) ; round(x22,2)
x23 <- addmargins(x22) ; round(x23,2)
# 2. Frequency table of the 3rd column
x3 <- x[[3]] ; x3 # x3 : factor variable
x31 <- table(x3) ; x31 # frequency counts of the 3rd column
x32 <- prop.table(x31) ; round(x32,2)
x33 <- addmargins(x32) ; round(x33,2)
# 3. Contingency table of the 2nd and the 3rd Columns
x41 <- table(x2, x3) ; x41
x42 <- prop.table(x41) ; round(x42,2)
x43 <- addmargins(x42) ; round(x43,2)
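The same three steps (table, prop.table, addmargins) can be tried without Rstat on a small made-up data frame. The survey data below is hypothetical and exists only so that the counts are easy to verify by eye. The margin = 1 call additionally shows proportions within each row.

```r
# Hypothetical data: two factor columns for 8 respondents
survey <- data.frame(
  gender = factor(c("F", "M", "F", "M", "F", "M", "F", "F")),
  grade  = factor(c("A", "B", "A", "A", "B", "B", "A", "B"))
)

counts <- table(survey$gender, survey$grade)   # contingency table of counts
print(addmargins(counts))                      # counts with row and column sums

overall <- prop.table(counts)                  # each cell as a share of all 8
print(round(addmargins(overall), 2))

by_row <- prop.table(counts, margin = 1)       # shares within each gender
print(round(by_row, 2))
```

Each row of by_row sums to 1, while the cells of overall sum to 1 across the whole table.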
Key terms : frequency table, relative frequency table, frequency diagram, relative frequency diagram, stem and leaf diagram, contingency table.