📦 LimitNet: Progressive, Content-Aware Image Offloading for Extremely Weak Devices & Networks

Bard · June 25, 2025

Introduction

  • The problem addressed is the same one tackled in Progressive Neural Compression.
  • In low-power wide-area network (LPWAN) settings, where image transmission faces limited bandwidth and high latency, the goal is to send the important features of an image first and the less important features later.

Contributions

  • The paper proposes LimitNet, a progressive content-aware encoder that combines a lightweight saliency detector with a new Gradual Scoring mechanism.
  • LimitNet is a very lightweight image encoder with only 15K parameters, and it can run on extremely constrained devices such as the ARM Cortex-M series.
  • Classification and object detection performance is measured on ARM Cortex-M33 and M7, and RAM, Flash, CPU usage, and power consumption are evaluated and compared against the SOTA.

LimitNet

  • Encoder
  • Saliency Detection
  • Gradual Scoring
  • Offloading, Decoder and Classifier

Lightweight Encoder

  • The paper uses an asymmetric autoencoder: a lightweight encoder $\text{ENC}_{\theta_{\text{ENC}}}$ paired with a large, deep decoder $\text{DEC}_{\theta_{\text{DEC}}}$.

$$X^{C \times H \times W} \to \text{ENC}_{\theta_{\text{ENC}}}(X^{C \times H \times W}) \to Z^{L \times K \times K} \tag{1}$$

Saliency Detection

  • Saliency detection is used to identify the important features in an image.
  • However, embedded devices struggle to run complex models such as ROI detection, explainable AI, or saliency detection.
  • The paper therefore uses a lightweight model that mimics the output of such a complex saliency detector.
  • It uses a 4-layer $\text{SalDet}_{\theta_{\text{SalDet}}}$, trained by knowledge distillation from BASNet, one of the SOTA saliency detection models.
  • This branch takes the latent tensor $Z^{L \times K \times K}$ as input and outputs a saliency map $I^{K \times K}$.

$$Z^{L \times K \times K} \rightarrow \text{SalDet}_{\theta_{\text{SalDet}}}(Z^{L \times K \times K}) \rightarrow I^{K \times K} \tag{2}$$

  • This saliency detector uses only 5K parameters.
  • That is just 0.001% of the original saliency detection model.

Gradual Scoring: The Devil is NOT in the Details

  • Saliency-based scoring focuses only on the important region (the foreground) and sends it first, so the decoder lacks contextual information.
  • To supply that context efficiently, the paper proposes Gradual Scoring.
  • With Gradual Scoring applied, performance improves by as much as 40 percentage points (measured at a3).

$$I^{K \times K} \rightarrow \text{GS}(I^{K \times K}) \rightarrow S^{L \times K \times K} \tag{3}$$

Gradual Scoring (Cont'd)

  • The Gradual Scoring mechanism introduces $G_{Factor}$ and applies gradually decreasing scores to the saliency map.

$$S_{i,[0:K],[0:K]} = I_{[0:K],[0:K]} + G_{Factor} \times i, \quad \forall i \in \{0, 1, 2, \dots, L\} \tag{4}$$

  • The same mechanism is applied during training: a random $p$ between 0 and 100 is drawn, and every latent value outside the top-$p$% by score is set to 0.

$$Z^{L \times K \times K}, S^{L \times K \times K} \xrightarrow{\text{Dropping}} Z'^{L \times K \times K} \tag{5}$$

$$Z'_{i,j,k} = \begin{cases} Z_{i,j,k} & \text{if } S_{i,j,k} \geq p^{\text{th}} \text{ largest value of } S^{L \times K \times K} \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
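Equations (4) to (6) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names, the nested-list tensor layout, and the example value `g_factor = -0.5` are assumptions made here (a negative factor is chosen so that scores decrease with the channel index, matching the "gradually decreasing" description). Channels are indexed from 0 with `num_channels` playing the role of $L$.

```python
import random


def gradual_scores(saliency, num_channels, g_factor):
    """Eq. (4): S[i][j][k] = I[j][k] + g_factor * i for every channel i."""
    return [
        [[value + g_factor * i for value in row] for row in saliency]
        for i in range(num_channels)
    ]


def drop_low_scores(latent, scores, p):
    """Eqs. (5)-(6): keep latent entries whose score is in the top p%, zero the rest.

    Ties at the threshold are all kept, because Eq. (6) uses >=.
    """
    flat = sorted((s for ch in scores for row in ch for s in row), reverse=True)
    keep = round(len(flat) * p / 100)
    if keep == 0:
        return [[[0.0 for _ in row] for row in ch] for ch in latent]
    threshold = flat[keep - 1]  # score of the last entry that is still kept
    return [
        [
            [z if s >= threshold else 0.0 for z, s in zip(z_row, s_row)]
            for z_row, s_row in zip(z_ch, s_ch)
        ]
        for z_ch, s_ch in zip(latent, scores)
    ]


def random_training_drop(latent, saliency, g_factor, rng=random):
    """Training-time variant: draw p uniformly from [0, 100], then drop."""
    p = rng.uniform(0, 100)
    scores = gradual_scores(saliency, len(latent), g_factor)
    return drop_low_scores(latent, scores, p)


# Tiny example: L = 2 channels, K = 2, salient cells on the diagonal.
saliency = [[0.9, 0.1], [0.2, 0.8]]
latent = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
scores = gradual_scores(saliency, num_channels=2, g_factor=-0.5)
half = drop_low_scores(latent, scores, p=50)
# half == [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
```

In this toy case the top 50% of scores are the two salient cells in both channels, so only those positions survive the drop.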


Offloading, Decoder and Classifier

  • After encoding, the latent data is quantized to 6 bits and the saliency map to 5 bits.
  • The saliency map is sent first, before the encoded data (the map is at most 40 B, so the overhead is negligible).
  • The decoder and classifier then run as follows.

$$Z'^{L \times K \times K} \xrightarrow{\text{Quant., Huffman Enc.}} \hat{Z}'^{L \times K \times K} \tag{7}$$

$$\hat{Z}'^{L \times K \times K} \xrightarrow{\text{Lat. Rec.}} \hat{Z}^{L \times K \times K} \xrightarrow{\text{DEC}_{\theta_{\text{DEC}}}} \hat{X}^{C \times H \times W} \xrightarrow{\text{CLS}_{\theta_{\text{CLS}}}} \hat{y} \tag{8}$$
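The bit widths above can be turned into a quick size check. The post only gives the bit widths and the 40 B bound, so everything else in this sketch is an assumption: a uniform quantizer over a `[lo, hi]` range, plain bit-packing with no Huffman coding, and a hypothetical $K = 8$ chosen only because $8 \times 8 \times 5$ bits happens to equal 40 B.

```python
import math


def quantize(values, bits, lo=0.0, hi=1.0):
    """Uniformly map floats in [lo, hi] to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    codes = []
    for v in values:
        clipped = min(max(v, lo), hi)  # clamp out-of-range inputs
        codes.append(round((clipped - lo) / (hi - lo) * levels))
    return codes


def dequantize(codes, bits, lo=0.0, hi=1.0):
    """Map integer codes back to floats in [lo, hi]."""
    levels = (1 << bits) - 1
    return [lo + c / levels * (hi - lo) for c in codes]


def packed_size_bytes(num_values, bits):
    """Bytes needed to bit-pack num_values codes of `bits` bits each."""
    return math.ceil(num_values * bits / 8)


# Hypothetical K = 8: a 5-bit saliency map of 8 x 8 entries packs into 40 bytes.
saliency_map_bytes = packed_size_bytes(8 * 8, bits=5)

# 6-bit latent values: the round-trip error is at most half a quantization step.
codes = quantize([0.1, 0.37, 0.9], bits=6)
restored = dequantize(codes, bits=6)
```

With 6 bits there are 63 steps across the range, so each restored value is within 1/126 of the original when inputs lie in `[lo, hi]`.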




Training phases of LimitNet

| Phase | Epochs | LR | Loss Function | Training Notes |
|---|---|---|---|---|
| 1 | 100 | 0.001 | Rec Loss + Saliency Loss | Knowledge distillation + Gradual Scoring |
| 2 | 6 | 0.00005 | CLS Loss | CLS frozen + Gradual Scoring |

Evaluation

Benchmarks

  • Accuracy vs Data Size
  • Saliency Detection Evaluation
  • System-level Benchmarks on MCU
    • Flash and RAM usage, execution time, power consumption
  • Not used: MS-SSIM and PSNR

Comparing GFLOPs and #parameters

| Model | #GFLOPs | #Params |
|---|---|---|
| LimitNet | 0.004 | 15K |
| Starfish | 0.77 | 78K |
| Ballé et al. | 1.95 | 2.5M |

Accuracy vs Data Size


Evaluating LimitNet in Detail


Saliency Detection Evaluation


Resource requirements and inference time

