💻 R :: Apply function family

scdw.lg · September 8, 2020

Introduction

In R, the apply functions take the place of loops. You could think of them as doing a job similar to forEach in JavaScript (although, unlike forEach, they return the results, which makes them closer to map). I call it the apply function "family" because several functions are derived from apply(), such as lapply, sapply, and mapply. Anyway, since R programming mostly deals with data like vectors and matrices, the apply functions are essential.

Since apply itself is the most frequently used member of the family, for the other functions I will only go over roughly when to use them.

The three basic apply functions

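apply(Object, Margin, Function) : matrix (or array) objects; returns a vector when Function gives one value per row/column, otherwise a matrix or list
lapply(Object, Function) : vector, matrix, list, data frame objects; always returns a list
sapply(Object, Function) : vector, matrix, list, data frame objects; returns a vector when the results can be simplified, otherwise a matrix or list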

์œ„์˜ ์„ธ ๊ฐœ ํ•จ์ˆ˜๊ฐ€ ๊ฐ€์žฅ ๊ธฐ๋ณธ์ ์ธ apply ํ•จ์ˆ˜๋กœ ์“ฐ์ธ๋‹ค.

apply

  • apply runs a loop internally and applies Function to each element (row or column) of Object.
  • The Margin parameter sets the direction of the iteration. With Margin = 1 it iterates over every row, and with Margin = 2 it iterates over every column. (R treats rows as dimension 1 and columns as dimension 2, so don't forget this.)
  • apply only works on objects that have dimensions, that is, a matrix (or array). It cannot be applied to a plain vector!
  • If a data frame is passed in, it is first converted to a matrix and only then is the function applied. A matrix holds a single type, so a data frame with mixed column types becomes a character matrix.
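
The points above can be tried out on a tiny example. This is only a sketch: the matrix `m` and the data frame `df` are made up for illustration and are not part of the dataset used later in this post.

```r
# A small 2 x 3 matrix (hypothetical data):
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2, ncol = 3)

row_sums <- apply(m, 1, sum)   # Margin = 1: one result per row
col_sums <- apply(m, 2, sum)   # Margin = 2: one result per column

# A data frame with mixed column types is coerced to a character matrix first,
# so every column is reported as "character"
df <- data.frame(n = c(1, 2), s = c("a", "b"))
col_types <- apply(df, 2, typeof)

# A plain vector has no dimensions, so apply() signals an error
vec_result <- tryCatch(apply(1:3, 1, sum), error = function(e) "error")
```

Here `row_sums` is 9 12 and `col_sums` is 3 7 11, while `vec_result` ends up as "error" because apply() refuses an object without dimensions.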

lapply, sapply

  • Both lapply and sapply can be applied to vector, matrix, list, and data frame objects.
  • Like apply, they walk over each element of the object and apply the function (for a data frame, each element is a column). The difference between lapply and sapply is the return value: lapply returns a list, while sapply returns a simplified form, typically a vector.
  • I suspect that is where the names come from: lapply for "list apply" and sapply for "simplified apply".

Other functions

  • vapply() : similar to sapply(), but you state the output (return) format explicitly, which makes it safer to use.
  • tapply() : a function for group-wise processing. You pass the groups as a parameter (a factor), and the function is applied per group rather than per element.
  • mapply() : similar to sapply(), but multiple arguments are passed. In the other apply functions the data comes first, whereas in mapply the function comes first.
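
Roughly, the three functions above look like this in use. The inputs (`words`, `age`, `sex`) are hypothetical values chosen just for this sketch.

```r
# vapply(): FUN.VALUE declares that every result must be a single integer
words <- list("a", c("b", "c"))
lens <- vapply(words, length, FUN.VALUE = integer(1))   # 1 2

# tapply(): the factor defines the groups, and mean() runs once per group
age <- c(20, 30, 40, 50)
sex <- factor(c("f", "m", "f", "m"))
mean_age <- tapply(age, sex, mean)   # f = 30, m = 40

# mapply(): the function comes first, followed by the vectors to walk in parallel
sums <- mapply(function(x, y) x + y, 1:3, 4:6)   # 5 7 9
```

If the function handed to vapply() returned something other than a single integer, it would stop with an error instead of quietly changing the shape of the result, which is the "safer" part.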

Practices

> library(readxl)
> t1 <- read_excel(path = "titanic3.xls", sheet = "titanic3")
## Warning message(s): 
## In read_fun(path = enc2native(normalizePath(path)), sheet_i = sheet,  :
## Coercing text to numeric in M1306 / R1306C13: '328'
> str(t1)
## tibble [1,309 x 14] (S3: tbl_df/tbl/data.frame)
## $ pclass   : num [1:1309] 1 1 1 1 1 1 1 1 1 1 ...
## $ survived : num [1:1309] 1 1 0 0 0 1 1 0 1 0 ...
## $ name     : chr [1:1309] "Allen, Miss. Elisabeth Walton" "Allison, Master. Hudson Trevor" "Allison, Miss. Helen Loraine" "Allison, Mr. Hudson Joshua Creighton" ...
## $ sex      : chr [1:1309] "female" "male" "female" "male" ...
## $ age      : num [1:1309] 29 0.917 2 30 25 ...
## $ sibsp    : num [1:1309] 0 1 1 1 1 0 1 0 2 0 ...
## $ parch    : num [1:1309] 0 2 2 2 2 0 0 0 0 0 ...
## $ ticket   : chr [1:1309] "24160" "113781" "113781" "113781" ...
## $ fare     : num [1:1309] 211 152 152 152 152 ...
## $ cabin    : chr [1:1309] "B5" "C22 C26" "C22 C26" "C22 C26" ...
## $ embarked : chr [1:1309] "S" "S" "S" "S" ...
## $ boat     : chr [1:1309] "2" "11" NA NA ...
## $ body     : num [1:1309] NA NA NA 135 NA NA NA NA NA 22 ...
## $ home.dest: chr [1:1309] "St Louis, MO" "Montreal, PQ / Chesterville, ON" "Montreal, PQ / Chesterville, ON" "Montreal, PQ / Chesterville, ON" ...
> apply(t1, 2, typeof)
##      pclass    survived        name         sex         age       sibsp 
## "character" "character" "character" "character" "character" "character" 
##       parch      ticket        fare       cabin    embarked        boat 
## "character" "character" "character" "character" "character" "character" 
##        body   home.dest 
## "character" "character" 
> lapply(t1, typeof)
> sapply(t1, typeof)
# lapply(t1, typeof)
## $pclass
## [1] "double"
## 
## $survived
## [1] "double"
## 
## $name
## [1] "character"
## 
## $sex
## [1] "character"
## 
## $age
## [1] "double"
## 
## $sibsp
## [1] "double"
## 
## $parch
## [1] "double"
## 
## $ticket
## [1] "character"
## 
## $fare
## [1] "double"
## 
## $cabin
## [1] "character"
## 
## $embarked
## [1] "character"
## 
## $boat
## [1] "character"
## 
## $body
## [1] "double"
## 
## $home.dest
## [1] "character"

# sapply(t1, typeof)
##     pclass    survived        name         sex         age       sibsp 
##   "double"    "double" "character" "character"    "double"    "double" 
##      parch      ticket        fare       cabin    embarked        boat 
##   "double" "character"    "double" "character" "character" "character" 
##       body   home.dest 
##   "double" "character" 

Using for loops instead of the apply function

> m1 <- matrix(1:20, nrow=4, ncol=5)
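> myapplyCol <- function(Obj, Margin, Func, ...) {
    # Margin is kept only to mirror apply()'s signature; it is not used here
    result <- vector()
    for(i in seq_len(ncol(Obj)))   # seq_len() also copes with zero columns
      result <- c(result, Func(Obj[,i], ...))   # apply Func to column i
    return(result)
  }
> myapplyRow <- function(Obj, Margin, Func, ...) {
    # Row-wise counterpart; Margin is likewise unused
    result <- vector()
    for(i in seq_len(nrow(Obj)))
      result <- c(result, Func(Obj[i,], ...))   # apply Func to row i
    return(result)
  }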
> m1
> myapplyCol(m1, 2, sum)
> myapplyRow(m1, 1, sum)
# m1 (check the matrix we made)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    5    9   13   17
## [2,]    2    6   10   14   18
## [3,]    3    7   11   15   19
## [4,]    4    8   12   16   20

## [1] 10 26 42 58 74  # myapplyCol : sum() applied to every column
## [1] 45 50 55 60     # myapplyRow : sum() applied to every row
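
For comparison, here is a short sketch of the same sums using the built-in apply() on the same matrix. Bringing in colSums() and rowSums() as a cross-check is my own addition, not something covered above.

```r
m1 <- matrix(1:20, nrow = 4, ncol = 5)

col_sums <- apply(m1, 2, sum)   # 10 26 42 58 74
row_sums <- apply(m1, 1, sum)   # 45 50 55 60

# Base R also ships dedicated helpers for this particular case
same_cols <- all(col_sums == colSums(m1))
same_rows <- all(row_sums == rowSums(m1))
```

The one-line apply() calls give the same numbers as the hand-written loops, without growing a result vector step by step.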

๋ฐ์ดํ„ฐ ์‚ฌ์ด์–ธ์Šค ๊ณผ๋ชฉ ๊ณต๋ถ€ํ•˜๋ฉด์„œ ์ •๋ฆฌํ•œ ๋‚ด์šฉ์ž…๋‹ˆ๋‹ค ๐Ÿ˜Š

0๊ฐœ์˜ ๋Œ“๊ธ€