2014/11/17

[R] Post 3: Data Structures (Part 2)

Data structures in R

4. Data Frame
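A data frame is the most commonly used data structure. Unlike the matrix covered earlier, a data frame can hold columns of different types.
To convert a data frame to a matrix, use data.matrix(). It returns a numeric matrix, so non-numeric columns are replaced by integer codes.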
> id <- c("a101", "b201", "c102")
> course <- c("chinese", "art", "english")
> stuCount <- c(30, 40, 35)
> df <- data.frame(id, course, stuCount)
> df
    id  course stuCount
1 a101 chinese       30
2 b201     art       40
3 c102 english       35
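
To make the data.matrix() conversion mentioned above concrete, here is a minimal sketch. It rebuilds the same data frame with stringsAsFactors = FALSE (an assumption made here so the text columns stay character on any R version) and then inspects the column types and the converted matrix. The variable names col_types and m are illustrative only.

```r
# Rebuild the example data frame; stringsAsFactors = FALSE keeps text columns as character
df <- data.frame(
  id = c("a101", "b201", "c102"),
  course = c("chinese", "art", "english"),
  stuCount = c(30, 40, 35),
  stringsAsFactors = FALSE
)

# Each column keeps its own type: two character columns and one numeric column
col_types <- sapply(df, class)
print(col_types)

# data.matrix() returns a numeric matrix; character columns are first
# converted to factors and then to their integer codes
m <- data.matrix(df)
print(m)
```

Because the text columns end up as integer codes, data.matrix() is mostly useful when the data frame is already numeric or when the codes themselves are what you need.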

## use these functions to get information about the data frame
> nrow(df)
[1] 3
> ncol(df)
[1] 3
> dim(df)
[1] 3 3
> names(df)
[1] "id"       "course"   "stuCount"

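## get elements from the data frame
## (the "Levels:" lines appear because data.frame() converted the character
## columns to factors, which was the default before R 4.0.0)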
> df$course
[1] chinese art     english
Levels: art chinese english
> df[2,]
    id course stuCount
2 b201    art       40
> df[,2]
[1] chinese art     english
Levels: art chinese english
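
Beyond $ and plain row or column indices, a data frame can also be indexed by column name or by a logical condition. The following is a small sketch on the same data, assuming stringsAsFactors = FALSE so the text columns stay character. The variable names counts, art_count and big are made up for illustration.

```r
df <- data.frame(
  id = c("a101", "b201", "c102"),
  course = c("chinese", "art", "english"),
  stuCount = c(30, 40, 35),
  stringsAsFactors = FALSE
)

# Select a column by name; [[ ]] returns the same vector as df$stuCount
counts <- df[["stuCount"]]

# Select a single cell by row index and column name
art_count <- df[2, "stuCount"]

# Keep only the rows whose stuCount is greater than 30
big <- df[df$stuCount > 30, ]
print(big)
```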

5. Factor
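A factor represents categorical data. Take gender as an example: here each value is either male or female, and a factor assigns every category an integer code, which makes later processing easier. Ordered data is also a good fit, for example education level (elementary school, junior high, senior high, university): every value belongs to exactly one category.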
> x <- factor(c("male", "female", "female", "male", "female"))
> x
[1] male   female female male   female
Levels: female male

## show more information about x
> table(x)
x
female   male 
     3      2 
> attr(x,"levels")
[1] "female" "male"  
> str(x)
 Factor w/ 2 levels "female","male": 2 1 1 2 1
> summary(x)
female   male 
     3      2 

## unclass() removes the class attribute and shows the underlying integer codes
> unclass(x)
[1] 2 1 1 2 1
attr(,"levels")
[1] "female" "male"
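
The gender example above is unordered. For ordered categories such as the education levels mentioned earlier, one possible sketch uses factor() with explicit levels and ordered = TRUE. The sample values and English level names below are hypothetical, chosen only to illustrate the idea.

```r
# Hypothetical sample data; levels are listed from lowest to highest
edu <- factor(
  c("college", "elementary", "senior high", "junior high", "college"),
  levels = c("elementary", "junior high", "senior high", "college"),
  ordered = TRUE
)
print(edu)

# The integer codes follow the order given in levels, not alphabetical order
codes <- as.integer(edu)

# Ordered factors support comparisons against a level name
above_junior <- edu > "junior high"

# max() also works because the levels have a defined order
highest <- max(edu)
```

Without the levels argument, R would sort the levels alphabetically, which is rarely the intended order for data like this.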

6. List
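A list is the more complex structure in R; when printed, its output shows [[ ]] markers. A list simply combines the data structures described above, and its elements may be of different types. Each element can be given a name and then accessed with $; an element without a name is accessed by its index in the list.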
> a <- "sketch"
> b <- c(25, 35, 21)
> c <- matrix(1:10, nrow = 5)
> ex <- list(name = a, TaAges = b, c)
> ex
$name
[1] "sketch"

$TaAges
[1] 25 35 21

[[3]]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

## get elements
> ex[[2]]
[1] 25 35 21
> ex$TaAges
[1] 25 35 21
## get the matrix, then take the element at row 5, column 2
> ex[[3]][5, 2]
[1] 10
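
One detail worth sketching is the difference between single and double brackets on a list, and how an unnamed element can be named afterwards. The example rebuilds the same list; the name "scores" and the variables sub, ages and cell are hypothetical.

```r
ex <- list(name = "sketch", TaAges = c(25, 35, 21), matrix(1:10, nrow = 5))

# Single brackets return a sub-list (still a list, here of length 1)
sub <- ex["TaAges"]

# Double brackets return the element itself (here a numeric vector)
ages <- ex[["TaAges"]]

# Give the third element a name so it can be reached with $
names(ex)[3] <- "scores"   # "scores" is an illustrative name
cell <- ex$scores[5, 2]
print(cell)
```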
