[EDA/Python] Playing with Pandas 📊🐼 Part 2

SengMin Youn (윤성민) · October 21, 2023

Applying Functions to Dataframes

Honestly, I'm more comfortable in English. Since I write this blog for my own learning, whenever the Korean doesn't come to mind I'll just write in English. 🐼

You can use apply() to apply a function to a DataFrame or a Series.

"Dataframe C"

display(C)
G = C.copy()  # without copy(), G and C are the same object, so C would be modified too
G['year'] = G['year'].apply(lambda x: "'{:02d}".format(x % 100))  # e.g. 1999 -> '99
display(G) 

"Dataframe G"

That's how you use it. But apply() can get a lot more involved, and a lot more useful.
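Here is a self-contained sketch of the same idea. Dataframe C is not defined in this post, so the `C` below is a hypothetical stand-in with made-up values; only the `year` column matters here.

```python
import pandas as pd

# Hypothetical stand-in for dataframe C (values are made up for illustration).
C = pd.DataFrame({
    "country": ["Afghanistan", "Brazil", "China"],
    "year": [1999, 2000, 2005],
})

G = C.copy()  # work on a copy so C itself is left untouched

# Series.apply calls the function once per element of the 'year' column.
G["year"] = G["year"].apply(lambda x: "'{:02d}".format(x % 100))

print(G["year"].tolist())  # two-digit year strings
print(C["year"].tolist())  # original integers, unchanged
```

Because `G` is a copy, the `year` column in `C` still holds the original integers after the call.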

Creating a new column from an operation on two columns


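First, keep the directions of axis = 0 and axis = 1 in mind: with DataFrame.apply, axis = 0 passes each column to the function, and axis = 1 passes each row.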
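As a quick reminder of what the two directions mean for `DataFrame.apply`, here is a minimal sketch. The dataframe and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({"cases": [10, 20, 30], "population": [100, 400, 300]})

# axis=0 (the default): the function receives each column as a Series,
# so the result has one value per column.
col_max = df.apply(lambda col: col.max(), axis=0)

# axis=1: the function receives each row as a Series,
# so the result has one value per row.
row_max = df.apply(lambda row: row.max(), axis=1)

print(col_max.to_dict())
print(row_max.tolist())
```

The result of `axis=0` is indexed by column name, while the result of `axis=1` is indexed by the row labels of `df`.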

G['prevalence'] = G['cases'] / G['population']

Of course, the line above is the simplest way to do it, but let's write a function that uses apply() instead.

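def calc_prevalence(G):
    # Both input columns must exist before we divide them.
    assert 'cases' in G.columns and 'population' in G.columns
    F = G.copy()  # leave the caller's dataframe untouched
    # axis=1 passes each row to the lambda, one row at a time.
    F['prevalence'] = F.apply(lambda row: row['cases'] / row['population'], axis=1)
    return F

display(calc_prevalence(G))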
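To check that the apply() version agrees with the plain column division, here is a small sketch. It repeats the function so that it runs on its own, and the dataframe is a hypothetical example with made-up values.

```python
import pandas as pd

def calc_prevalence(G):
    # Same row-wise approach as above, applied to a copy of the input.
    assert 'cases' in G.columns and 'population' in G.columns
    F = G.copy()
    F['prevalence'] = F.apply(lambda row: row['cases'] / row['population'], axis=1)
    return F

# Hypothetical data, made up for illustration.
G = pd.DataFrame({
    "country": ["A", "B"],
    "cases": [5, 30],
    "population": [1000, 600],
})

F = calc_prevalence(G)
vectorized = G["cases"] / G["population"]  # the simple column-wise version

print(F["prevalence"].tolist() == vectorized.tolist())  # do both approaches agree?
print("prevalence" in G.columns)  # was the input dataframe modified?
```

Both approaches compute the same values. The column division is usually the better choice for simple arithmetic, since apply() with axis=1 calls the Python function once per row.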

1๊ฐœ์˜ ๋Œ“๊ธ€

comment-user-thumbnail
2023๋…„ 10์›” 21์ผ

์™œ์ธ์ง€ ํ‘ธ๋ฐ”์˜ค ๋‹ฎ์œผ์…จ์„ ๊ฑฐ ๊ฐ™์•„์šค>< ์œ ์ตํ•œ ์ •๋ณด ๊ฐ์‚ฌํ•ด์šฉ!!

๋‹ต๊ธ€ ๋‹ฌ๊ธฐ