[Deep Learning] Overfitting, Underfitting

김희진 · April 5, 2021

📖 Reference: 케라스 창시자에게 배우는 딥러닝, the Korean edition of Deep Learning with Python (François Chollet, 박해선, 길벗)

🌸 Optimization & Generalization

Optimization is the process of adjusting a model to get the best possible performance on the training data; it is the "learning" in machine learning. Generalization refers to how well the trained model performs on data it has never seen before. The goal of building a model is good generalization, but generalization cannot be controlled directly: the model can only be adjusted based on its training data.
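One common way to write the two ideas down is sketched here. The notation is an assumption made for illustration and does not come from the post: f_θ is the model with parameters θ, ℓ is a per-example loss, (x_i, y_i) are the n training examples, and D is the unknown distribution that unseen data is drawn from.

```latex
% Optimization: choose the parameters that minimize the average training loss.
\hat{\theta} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f_{\theta}(x_i),\, y_i\bigr)

% Generalization: the expected loss of the trained model on unseen data.
R(\hat{\theta}) = \mathbb{E}_{(x, y) \sim \mathcal{D}} \Bigl[ \ell\bigl(f_{\hat{\theta}}(x),\, y\bigr) \Bigr]
```

Only the first quantity can be computed and minimized during training. The second can only be estimated, for example on held-out validation or test data.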

🌸 Underfitting

Underfitting is the opposite of overfitting: the model has not yet learned all of the relevant patterns in the training data. In this state, the test loss keeps decreasing as the training loss decreases, so the model still has room to improve.

🌸 Overfitting

A model whose learned result is optimized only for the training data is said to be overfit. In other words, its performance drops sharply on data that was not used to build the model. Overfitting occurs in every machine learning problem. The best way to prevent it is to collect more training data. In practice, however, it is usually not possible to gather more training data beyond what has already been provided.
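The two regimes can be told apart by comparing the training and validation loss recorded after each epoch. The sketch below is a simplified illustration, not a method from the book: the helper names `best_epoch` and `label_epochs` are hypothetical, the loss values are made up, and every epoch after the validation minimum is labeled as overfitting, which ignores noise in real loss curves.

```python
def best_epoch(val_losses):
    """Return the 0-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i])


def label_epochs(train_losses, val_losses):
    """Label each epoch by which side of the validation-loss minimum it falls on."""
    if len(train_losses) != len(val_losses):
        raise ValueError("loss histories must have the same length")
    turn = best_epoch(val_losses)
    labels = []
    for epoch in range(len(val_losses)):
        if epoch < turn:
            labels.append("underfitting")  # validation loss has not bottomed out yet
        elif epoch == turn:
            labels.append("best")          # lowest validation loss seen
        else:
            labels.append("overfitting")   # past the validation-loss minimum
    return labels


# Made-up loss histories, for illustration only.
train_losses = [0.90, 0.60, 0.40, 0.25, 0.15, 0.08]
val_losses = [0.95, 0.70, 0.55, 0.50, 0.58, 0.70]

print(best_epoch(val_losses))
print(label_epochs(train_losses, val_losses))
```

In this made-up history the training loss falls at every epoch, while the validation loss reaches its minimum at index 3 and then rises again, which is the typical signature of overfitting.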

For that reason, regularization is used: a family of techniques for avoiding overfitting when more training data cannot be collected.

L1 Regularization

L2 Regularization

Elastic Net

Dropout
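The four techniques listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the book's or any library's implementation: the function names, the `lam` and `l1_ratio` parameters, and the particular elastic net weighting are choices made for this example, and libraries differ on the exact convention.

```python
import random


def l1_penalty(weights, lam):
    """L1 regularization: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 regularization: lam times the sum of squared weight values."""
    return lam * sum(w * w for w in weights)


def elastic_net_penalty(weights, lam, l1_ratio):
    """Elastic net: a weighted mix of the L1 and L2 penalties (assumed convention)."""
    return (l1_ratio * l1_penalty(weights, lam)
            + (1.0 - l1_ratio) * l2_penalty(weights, lam))


def dropout(activations, rate, rng):
    """Inverted dropout for training time.

    Each activation is zeroed with probability `rate`; the survivors are
    scaled by 1 / (1 - rate) so the expected value stays the same.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


# Illustrative values only.
weights = [0.5, -1.0, 2.0]
print(l1_penalty(weights, lam=0.1))
print(l2_penalty(weights, lam=0.1))
print(elastic_net_penalty(weights, lam=0.1, l1_ratio=0.5))
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=random.Random(0)))
```

The penalties are added to the training loss, so large weights cost more and the optimizer is pushed toward simpler models. Dropout instead randomly silences units during training only; at inference time the activations are used unchanged.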
