🧃Optimization

MURRAIYA · Dec 3, 2022

About Optimization

If you are new to optimization, start by thinking of it this way: the goal is to find the maximum or minimum value of some multivariable scalar-valued function, and the point at which that value is attained.

  • Optimization has no single closed form; there are many different methods.
  • Least Squares was a way to find an approximate solution for a tall $A$.
    Optimization deals with a wide $A$.
    A wide, i.e. underdetermined, $A$ has too few conditions, so there are far too many candidate solutions; optimization is about finding the best one among them.
  • An optimization problem, too, will ultimately come down to solving a linear system.
  • Solving an optimization problem requires two ingredients:
    1. an objective function
    2. constraints

    form of optimization problem

    Find $x$ that minimizes a (cost) function $f(x)$ subject to $g(x)=0$ or $h(x)\leq 0$
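To make the generic form concrete, here is a toy instance (my own example, not from the lecture): minimize $f(x,y)=x^2+y^2$ subject to the equality constraint $g(x,y)=x+y-1=0$. The constraint is simple enough to eliminate $y$ and check the answer by brute force:

```python
import numpy as np

# Toy instance of the generic form (a made-up example):
# minimize f(x, y) = x^2 + y^2  subject to  g(x, y) = x + y - 1 = 0.
f = lambda x, y: x**2 + y**2
# The equality constraint lets us substitute y = 1 - x and search over x only.
xs = np.linspace(-2.0, 2.0, 400001)
vals = f(xs, 1.0 - xs)
i = np.argmin(vals)
print(xs[i], 1.0 - xs[i], vals[i])  # analytic optimum: x = y = 0.5, f = 0.5
```

Real solvers do not brute-force a grid, but the ingredients are exactly the ones above: a cost to minimize and a constraint that restricts the candidates.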

Cost Function

Of course it all depends on how the problem is designed, but let us assume the optimization problem has been set up well and our job is simply to solve it.
We want to minimize some value $f$. This $f$ is called the objective function, or cost function.

In the fields I am interested in, the cost function usually represents an error, and naturally we want to minimize it.
But in fields like economics you often want some value to be as large as possible, so why talk only about minimizing? Because minimizing and maximizing are really the same task: $\min f(x) = -\max(-f(x))$. To maximize $f$, just minimize $-f(x)$ and flip the sign of the result back, so we can study minimization only.
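A quick numeric sanity check of this equivalence, on a made-up function:

```python
# Numeric check (a made-up toy function) that minimizing f and maximizing -f
# pick the same point, and that min f(x) = -max(-f(x)).
f = lambda x: (x - 3.0) ** 2 + 1.0
xs = [i * 0.01 for i in range(-1000, 1001)]   # grid on [-10, 10]
x_min = min(xs, key=f)                        # argmin of f
x_max = max(xs, key=lambda x: -f(x))          # argmax of -f
print(x_min, x_max, f(x_min), -max(-f(x) for x in xs))
```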

Constraint

์œ„์—์„œ g(x)=0g(x)=0๊ฐ€ equality constraint์ด๊ณ  h(x)โ‰ค0h(x)\leq 0๊ฐ€ inequality constraint์ด๋‹ค.
์ œ์•ฝ์กฐ๊ฑด์ด๋ž€ ์‚ฌ์‹ค ๋‚˜์˜ ํ•ด๊ฐ€ ์กด์žฌํ•  ์ˆ˜ ์žˆ๋Š” ๋ฒ”์œ„๋ฅผ ์ขํ˜€์ฃผ๋Š” ๋…€์„์ด๋‹ค. ๊ทธ๋ž˜์„œ ๋‚˜์˜ candidiate๋ผ๊ณ  ์—ฌ๊ธธ ์ˆ˜๋„ ์žˆ๋‹ค. ๊ทธ ์•ˆ์—์„œ ํ•ด๊ฐ€ ๋‚˜์˜ค๋‹ˆ๊นŒ.

์‚ฌ์‹ค์€ equality constraint๋Š” ์—†๋‹ค๊ณ  ๋ณด๋ฉด ๋œ๋‹ค. machine precision ๋•Œ๋ฌธ์ด๋‹ค. ์ปดํ“จํ„ฐ๋“  ๊ณ„์‚ฐ๊ธฐ๋“  ๊ธฐ๊ณ„๋ฅผ ์ด์šฉํ•ด ํ’€ ๋•Œ ์ •ํ™•ํžˆ equality๋ฅผ ๋งŒ์กฑ์‹œํ‚ค๋Š” ๊ฒƒ์€ ๊ฑฐ์˜ ๋ถˆ๊ฐ€๋Šฅํ•ด์„œ ์ข์€ ๋ฒ”์œ„๋กœ inequality ์กฐ๊ฑด์„ ๋‘๋Š” ์‹์œผ๋กœ ์šฐํšŒํ•œ๋‹ค.

Problem of Optimization

Optimization, then, is minimizing a cost function under some constraints.
What we want is the global minimum value of $f(x)$ and the $x$ at which it occurs.

์ด ์ƒํ™ฉ์—์„œ ์ด๊ณผ ์ˆ˜๋Šฅ์„ ๋ณธ ์šฐ๋ฆฌ๋Š” ์ผ, ์ด์ฐจ ๋ฏธ๋ถ„์„ ์ƒ๊ฐํ•œ๋‹ค. ๋‹น์—ฐํžˆ fโ€ฒ(x)=0,f^\prime(x)=0, fโ€ฒโ€ฒ(x)>0f^{\prime\prime}(x)>0์„ ์ด์šฉํ•˜๋ฉด ๋œ๋‹ค.

The problem is that we cannot tell whether the point found this way is a local minimum or the global minimum.

For a convex function, under certain conditions, a local min is also the global min (exactly the point we want). This makes the problem dramatically simpler, so people like to formulate optimization problems with convex functions, and sometimes even convexify a problem to make it easier to solve.


When using an iterative method, there is no method guaranteed to find the global optimal point. An iterative method always needs a starting point from which to iterate; one approach is to try several different starting points and pick the best of the resulting solutions.
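The multi-start idea can be sketched with plain gradient descent (my own toy example; learning rate and step count are arbitrary choices) on a function with one local and one global minimum:

```python
# Multi-start sketch: gradient descent on f(x) = x^4 - x^2 + 0.3x,
# which has a local minimum near x ~ 0.61 and the global minimum
# near x ~ -0.77; run from several starts and keep the best result.
f = lambda x: x**4 - x**2 + 0.3 * x
df = lambda x: 4 * x**3 - 2 * x + 0.3      # analytic gradient

def descend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * df(x)                    # step downhill
    return x

starts = [-2.0, -0.5, 0.5, 2.0]            # arbitrary starting points
results = [descend(x0) for x0 in starts]
best = min(results, key=f)                 # pick the best local result
print([round(x, 3) for x in results], round(best, 3))
```

Starts on the right basin converge to the local minimum; only the multi-start selection recovers the global one.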

Two Methods for Optimization

  1. Iterative Method
  2. Direct Method
    using calculus, by solving $\nabla f(x) = 0$
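The direct method connects back to linear systems: for a quadratic cost $f(x)=\frac{1}{2}x^{\top}Ax - b^{\top}x$ with symmetric positive definite $A$, solving $\nabla f(x)=Ax-b=0$ is literally solving a linear system. A sketch with a small made-up $A$ and $b$:

```python
import numpy as np

# Direct method on a quadratic cost: grad f(x) = A x - b = 0
# is just a linear system when f(x) = 1/2 x^T A x - b^T x.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])      # symmetric positive definite (toy values)
b = np.array([1.0, 0.0])
x_star = np.linalg.solve(A, b)  # solve grad f = 0
print(x_star, np.allclose(A @ x_star, b))
```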

๐ŸHow to Minimize
Now let's focus on how to minimize $f$!

Unconstrained Minimization

Here we deal with the unconstrained case, where there are no constraints (or rather, the only constraint is $x \in \mathbb{R}^n$).


Gradient Descent

ํ•จ์ˆ˜์˜ ์ผ์ฐจ ๋ฏธ๋ถ„์ด 0์ด ๋˜๋Š” ์ ์„ critical point ๋˜๋Š” stationary point๋ผ๊ณ  ํ•œ๋‹ค. ํ•จ์ˆ˜์˜ ๊ทน์ , saddle point๊ฐ€ critical point์— ํ•ด๋‹น๋œ๋‹ค.
ํ•จ์ˆ˜์˜ ์ด์ฐจ๋ฏธ๋ถ„์œผ๋กœ์จ ๊ทธ์˜ ๊ณก๋ฃ”์„ ๋‚˜ํƒ€๋‚ด๋Š” Hessian์€ critical point์˜ ์ข…๋ฅ˜๋ฅผ ํŒ๋ณ„ํ•˜๋Š” ๋ฐ ์ด์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

Optimizing a multivariable scalar-valued function means defining some function and then finding its minimum.
Usually we find the point where the first derivative (the gradient) is zero, and then need to know whether that point is a min, a max, or a saddle point; that is where the Hessian comes in.
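As the section title suggests, when solving $\nabla f(x)=0$ in closed form is impractical, one can instead walk downhill along $-\nabla f$. A minimal 2-D gradient descent sketch on a convex toy function (step size and iteration count are my own arbitrary choices):

```python
import numpy as np

# Minimal gradient descent on f(x, y) = (x - 1)^2 + 2y^2,
# which has its unique minimum at (1, 0).
grad = lambda p: np.array([2 * (p[0] - 1.0), 4 * p[1]])  # analytic gradient
p = np.array([5.0, 5.0])                                 # arbitrary start
lr = 0.1                                                 # step size
for _ in range(500):
    p = p - lr * grad(p)                                 # step downhill
print(p, np.linalg.norm(grad(p)))  # converges to the minimizer (1, 0)
```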

Finding Optimal Point

  1. Find critical point
    $x : \nabla f(x) = 0$

  2. Evaluate the eigenvalues of $H(x)$ at that $x$
    all positive eigenvalues : local min
    all negative eigenvalues : local max
    both positive and negative : saddle point

The eigenvectors of the Hessian point along the principal directions of curvature (the dominant eigenvector gives the direction of greatest curvature), and the eigenvalues measure the curvature along those directions.
Since the Hessian is a symmetric matrix, an eigendecomposition always exists and its eigenvalues are real.
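The two steps above can be sketched directly (toy Hessians of my own choosing; note the test is inconclusive when an eigenvalue is exactly zero):

```python
import numpy as np

# Eigenvalue test on the Hessian at a critical point:
# f = x^2 + y^2 has Hessian diag(2, 2) at (0, 0)  -> local min,
# f = x^2 - y^2 has Hessian diag(2, -2) at (0, 0) -> saddle point.
def classify(H):
    w = np.linalg.eigvalsh(H)          # symmetric, so eigenvalues are real
    if np.all(w > 0):
        return "local min"
    if np.all(w < 0):
        return "local max"
    return "saddle point"              # mixed signs (zero not handled)

print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))   # local min
print(classify(np.array([[2.0, 0.0], [0.0, -2.0]])))  # saddle point
```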

The Hessian is exactly the second derivative we studied back in high school. Since it only looks at the curvature around a point, even when we say max or min we always mean a local max or local min. The minimum we actually want is the global min, and the trouble is that it is hard to tell whether what we found along the way is a local min or the global min.

As mentioned above, under the particular condition of a convex function, a local min is also the global min (exactly the point we want), which makes the problem much simpler; that is why people like to cast optimization problems as convex functions and sometimes convexify them to make them easier to solve. I will not cover this in depth here, but it is worth knowing it exists.


โ—๏ธ~ : ~ ๋ผ๋Š” ๋ฌธ๊ตฌ๋Š” "๋’ค์˜ ์กฐ๊ฑด์„ ๋งŒ์กฑํ•˜๋Š” ์•ž์˜ ๊ฒƒ"์ด๋ผ๊ณ  ์ฝ์œผ๋ฉด ๋œ๋‹คโ•

References:
Prof. Kim Kwang-Ki's Numerical Analysis lecture at Inha University
https://darkpgmr.tistory.com/132
https://angeloyeo.github.io/2020/06/17/Hessian.html

🙃SUJI KIM🙃 🚩 Inha University Undergraduate 🚗 Autonomous Driving Robots 📷 Computer Vision 💫 SLAM