Losses used for character generation, re-organized from the paper Cartoon Image Processing: A Survey.
These are losses used in GANs; in practice there are many more, but these are the ones relevant to characters.
1. Typical Loss Functions
Pixel-level Loss
$$\mathcal L_1 = \sum_{i,j}^n |y^{(i,j)}-G(x)^{(i,j)}| \qquad (1)$$

$$\mathcal L_2 = \sum_{i,j}^n (y^{(i,j)}-G(x)^{(i,j)})^2 \qquad (2)$$
An element-wise measure between two images. Compared to L1, L2 is more sensitive to large errors and more forgiving of small ones, so its results are smoother. The most widely used loss function.
$y$ : real sample image
$G(x)$ : generated sample
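A minimal NumPy sketch of Eqs. (1)–(2); real training code would compute the same thing on framework tensors:

```python
import numpy as np

def l1_loss(y, gx):
    # Eq. (1): sum of absolute pixel differences
    return float(np.abs(y - gx).sum())

def l2_loss(y, gx):
    # Eq. (2): sum of squared pixel differences
    return float(((y - gx) ** 2).sum())

y  = np.array([[1.0, 2.0], [3.0, 4.0]])   # real image y
gx = np.array([[1.0, 2.5], [3.0, 2.0]])   # generated image G(x)
print(l1_loss(y, gx))  # 2.5
print(l2_loss(y, gx))  # 4.25
```

Note how L2's squaring penalizes the single large error (2.0 vs 4.0) far more than the small one, which is why L2 optima look smoother.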
Total Variation Loss
$$\mathcal L_{tv} = \sqrt{(G(x)^{(i, j+1)}-G(x)^{(i, j)})^2 + (G(x)^{(i+1, j)}-G(x)^{(i, j)})^2} \qquad (3)$$
Imposes spatial smoothness on the generated image and reduces high-frequency noise such as salt-and-pepper noise. It is defined by summing differences between neighboring pixels, and measures how much noise the image contains.
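A sketch of the idea behind Eq. (3), here summing the squared neighbor differences over the whole image before the square root (one common variant; exact aggregation differs across implementations):

```python
import numpy as np

def tv_loss(img):
    # differences between horizontally / vertically adjacent pixels
    dh = img[:, 1:] - img[:, :-1]
    dv = img[1:, :] - img[:-1, :]
    return float(np.sqrt((dh ** 2).sum() + (dv ** 2).sum()))

flat  = np.ones((8, 8))                                        # perfectly smooth
noisy = flat + np.random.default_rng(0).normal(0, 0.5, (8, 8)) # salt-and-pepper-ish
print(tv_loss(flat))   # 0.0 -- no variation at all
print(tv_loss(noisy))  # > 0 -- noise raises the total variation
```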
Content / Perceptual Loss
$$\mathcal L_{cont} = \sqrt{\sum_{i=1}^n (\phi^{l}(y)-\phi^{l}(G(x)))^2} \qquad (4)$$
The mean squared error between the semantic content of the image passed through $\phi$ and that of the input image
→ i.e., the MSE between the two images' feature maps
$\phi$ : pretrained image classification network
$l$ : layer
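A sketch of Eq. (4) with a stand-in feature extractor; in a real pipeline $\phi^l$ would be a convolutional layer of a pretrained classifier such as VGG, not the 2×2 average pooling used here purely for illustration:

```python
import numpy as np

def phi(img):
    # stand-in for layer-l features of a pretrained network: 2x2 average pooling
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def content_loss(y, gx):
    # Eq. (4): distance between the feature maps of real and generated images
    d = phi(y) - phi(gx)
    return float(np.sqrt((d ** 2).sum()))

y = np.arange(16.0).reshape(4, 4)
print(content_loss(y, y))        # 0.0 -- identical content
print(content_loss(y, y + 1.0))  # 2.0 -- a uniform shift still moves the features
```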
Adversarial Loss
$$\mathcal L_{adv} = \mathbb E_{y \sim P_{data}(B)}[\log D(y)] + \mathbb E_{x \sim P_{data}(A)}[\log (1- D(G(x)))] \qquad (5)$$
With $\mathcal L_{cont}$ alone there is no explicit constraint on the image space, so generated images tend to be inconsistent across different regions and typically contain small line segments.
$y \sim P_{data}(B)$ and $x \sim P_{data}(A)$ : data distributions
$y$ : real sample
$G(x)$ : generated sample; $G$ tries to produce images matching domain $B$
$D$ : distinguishes real samples $y$ from synthesized samples $G(x)$
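Eq. (5) in code, operating directly on the discriminator's probability outputs (a NumPy sketch; `d_real` and `d_fake` stand for $D(y)$ and $D(G(x))$):

```python
import numpy as np

def adv_loss(d_real, d_fake):
    # Eq. (5): D wants this high (real -> 1, fake -> 0); G wants the fake term low
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

good_D = adv_loss(np.array([0.9, 0.95]), np.array([0.05, 0.1]))  # confident, correct D
bad_D  = adv_loss(np.array([0.5, 0.5]),  np.array([0.5, 0.5]))   # D that cannot tell
print(good_D > bad_D)  # True
```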
Cycle Consistency Loss
$$\mathcal L_{cyc} = \mathbb E_{x \sim P_{data}(A)}[ \lVert F(G(x))-x \rVert_1] + \mathbb E_{y \sim P_{data}(B)}[ \lVert G(F(y))-y \rVert_1] \qquad (6)$$
Because $\mathcal L_{adv}$ cannot guarantee that the learned function maps each input $x$ to a desired output $y$, $\mathcal L_{cyc}$ further constrains the mapping between input and output. Each image $x$ from domain $A$ should return to the original image after a full translation cycle.
$x \to G(x) \to F(G(x)) \approx x$
$G: A \to B$ and $F: B \to A$
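The cycle $x \to G(x) \to F(G(x)) \approx x$ can be sketched with toy translators (simple invertible maps standing in for the two generator networks):

```python
import numpy as np

def cycle_loss(x, y, G, F):
    # Eq. (6): both translation cycles should reconstruct the original input
    return float(np.abs(F(G(x)) - x).mean() + np.abs(G(F(y)) - y).mean())

G = lambda t: t + 1.0      # toy A -> B translator
F = lambda t: t - 1.0      # toy B -> A translator (exact inverse of G)
x = np.array([0.0, 2.0])   # sample from domain A
y = np.array([1.0, 3.0])   # sample from domain B
print(cycle_loss(x, y, G, F))  # 0.0 -- perfect inverses leave no cycle error
```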
Style Loss
$$\mathcal L_{style} = \sum_{\lambda} \lVert \sigma(f_{\lambda}(G(x)))- \sigma (f_{\lambda}(x))\rVert_2 + \sum_{\lambda} \lVert \mu(f_{\lambda}(G(x)))- \mu(f_{\lambda}(x))\rVert_2 \qquad (7)$$
A formulation that appeared after AdaIN showed that style transfer can be done using only the mean and standard deviation of style features.
$\sigma(x), \mu(x)$ : channel-wise standard deviation and mean of input $x$
$f_\lambda(x)$ : the $\lambda$-th layer features corresponding to $x$
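Eq. (7) matches only the channel-wise statistics of each layer's features, as AdaIN suggests. A sketch, where `feats` lists one (C, H, W) feature array per layer $\lambda$:

```python
import numpy as np

def style_loss(feats_gx, feats_x):
    # Eq. (7): match channel-wise std and mean of the features, layer by layer
    loss = 0.0
    for fg, fx in zip(feats_gx, feats_x):   # each entry: (C, H, W)
        loss += np.linalg.norm(fg.std(axis=(1, 2)) - fx.std(axis=(1, 2)))
        loss += np.linalg.norm(fg.mean(axis=(1, 2)) - fx.mean(axis=(1, 2)))
    return float(loss)

rng = np.random.default_rng(0)
f = [rng.normal(size=(3, 4, 4))]          # one layer of reference features
shuffled = [f[0][:, ::-1, ::-1].copy()]   # same statistics, rearranged pixels
print(style_loss(f, f))         # 0.0
print(style_loss(shuffled, f))  # ~0.0 -- only statistics matter, not layout
```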
2. Loss Functions Specially Designed for Cartoon
Surface Loss
Learning to Cartoonize Using White-box Cartoon Representations(2020)
$$\mathcal L_{surface} = \log D_s(\mathcal F_{dgf}(I_c, I_c)) + \log (1-D_s(\mathcal F_{dgf}(G(I_p), G(I_p)))) \qquad (8)$$
Mimics the smooth surfaces that cartoons have, as if painted with a coarse brush, making the output resemble cartoon images in that respect.
To smooth the image while preserving the global semantic structure, $\mathcal F_{dgf}$ (a differentiable guided filter) is applied.
Given an input $I$, it uses the image itself as the guide map and returns the extracted surface representation $\mathcal F_{dgf}(I,I)$.
$I$ : input
$I_c$ : input cartoon image
$I_p$ : input photo image
$\mathcal F_{dgf}(I,I)$ : extracted surface representation, with textures and details removed
$D_s$ judges whether the model output's surface resembles that of cartoon images and guides $G$ to produce such images.
Structure Loss
Learning to Cartoonize Using White-box Cartoon Representations(2020)
$$\mathcal L_{structure} = \lVert VGG_n(G(I_p))-VGG_n(\mathcal F_{st}(G(I_p))) \rVert \qquad (9)$$
Extracts high-level features with a pretrained VGG16 and enforces a spatial constraint between the output and the extracted structure representation.
$\mathcal F_{st}$ : structure representation extraction
Extracts a structure that imitates the global content, sparse color blocks, and clear boundaries found in non-photorealistic cartoons.
Texture Loss
Learning to Cartoonize Using White-box Cartoon Representations(2020)
$$\mathcal F_{rcs}(I_{rgb})=(1-\alpha)(\beta_1 I_r+\beta_2 I_g+\beta_3 I_b)+\alpha Y \qquad (10)$$
Reduces the influence of color and luminance (the amount of reflected light) while retaining only high-quality textures.
$\mathcal F_{rcs}$ : single-channel texture representation extracted from a color image (a random color shift algorithm)
$I_{rgb}$ : the three color channels $I_r$, $I_g$, $I_b$
$Y$ : grayscale image converted from the RGB image
In the paper, $\alpha = 0.8$ and $\beta_1, \beta_2, \beta_3 \sim U(-1,1)$.
$$\mathcal L_{texture} = \log D_t(\mathcal F_{rcs}(I_c)) + \log (1-D_t(\mathcal F_{rcs}(G(I_p)))) \qquad (11)$$
$D_t$ distinguishes the texture of the model output from representations extracted from reference cartoon images.
Guides the generator to learn everything from clear contours to fine textures.
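Eq. (10) as code, with the paper's $\alpha = 0.8$ and $\beta \sim U(-1,1)$; the grayscale weights for $Y$ are an assumption here (ITU-R BT.601):

```python
import numpy as np

def random_color_shift(img_rgb, rng, alpha=0.8):
    # Eq. (10): random-color-shifted single-channel texture representation
    r, g, b = img_rgb[..., 0], img_rgb[..., 1], img_rgb[..., 2]
    b1, b2, b3 = rng.uniform(-1.0, 1.0, size=3)   # beta_1..3 ~ U(-1, 1)
    y = 0.299 * r + 0.587 * g + 0.114 * b         # grayscale Y (BT.601 weights, assumed)
    return (1 - alpha) * (b1 * r + b2 * g + b3 * b) + alpha * y

rng = np.random.default_rng(0)
img = rng.uniform(0, 1, size=(4, 4, 3))
tex = random_color_shift(img, rng)
print(tex.shape)  # (4, 4) -- one channel, color information suppressed
```

With `alpha=1.0` the random color term vanishes and the output is exactly the grayscale image, which shows how $\alpha$ trades off color suppression against luminance.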
Domain-Adversarial Loss
XGAN(2020)
$$\mathcal L_{dann} = \mathbb E_{x \sim P_{data}(A)}[\ell(A, c_{dann}(e_A(x)))] +\mathbb E_{x \sim P_{data}(B)}[\ell(B, c_{dann}(e_B(x)))] \qquad (12)$$
Embeds images from domains A and B into the same subspace and bridges the domain gap at the semantic level
→ used to train $c_{dann}$
$c_{dann}$ : a binary classifier
The encoders $e_A, e_B$ try to confuse the domain-adversarial classifier, while $c_{dann}$ is trained to maximize classification accuracy.
$\ell$ : classification loss function
Semantic Consistency Loss
XGAN(2020)
$$\mathcal L_{sem} = \mathbb E_{x \sim P_{data}(A)}\lVert e_A(x)-e_B(G(x)) \rVert + \mathbb E_{y \sim P_{data}(B)} \lVert e_B(y)-e_A(F(y)) \rVert \qquad (13)$$
Preserves the input semantics across domain translation
⇒ e.g., the semantics of an input $x \in \mathcal D_A$ are preserved when it is translated into the other domain as $G(x) \in \mathcal D_B$ (and vice versa).
This consistency property is hard to enforce with a pixel-level loss, since there is no paired data and pixel-wise comparison is sub-optimal; a feature-level semantic consistency loss is used instead
→ it encourages the network to preserve the learned embedding during domain translation.
$\lVert \cdot \rVert$ : a distance between vectors
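Eq. (13) sketched with toy encoders and translators, chosen so that the embeddings survive translation exactly (in XGAN, $e_A, e_B, G, F$ are all networks):

```python
import numpy as np

def semantic_loss(x, y, eA, eB, G, F):
    # Eq. (13): embeddings must agree before and after domain translation
    return float(np.linalg.norm(eA(x) - eB(G(x)))
                 + np.linalg.norm(eB(y) - eA(F(y))))

eA = lambda v: 2.0 * v   # toy encoder for domain A
eB = lambda v: v         # toy encoder for domain B
G  = lambda v: 2.0 * v   # toy A -> B translator: eB(G(x)) == eA(x)
F  = lambda v: v / 2.0   # toy B -> A translator: eA(F(y)) == eB(y)
x, y = np.array([1.0, -2.0]), np.array([4.0, 6.0])
print(semantic_loss(x, y, eA, eB, G, F))  # 0.0 -- semantics fully preserved
```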
Landmark Consistency Loss
CycleGAN(2019)
$$\mathcal L_{land} = \lVert R_B (G_{(A,L) \to B}(x,l)) - l \rVert_2 \qquad (14)$$
$L$ : input landmark heatmaps ($l \in L$)
$R$ : a pretrained U-Net landmark regressor with a 5-channel output for each domain
Identity Loss
U-GAT-IT(2019)
$$\mathcal L_{ide} = \mathbb E_{x \sim P_{data}(A)}\lVert x-F(x) \rVert_1 + \mathbb E_{y \sim P_{data}(B)} \lVert y-G(y) \rVert_1 \qquad (15)$$
Ensures that the color distribution of the input image is preserved.
CAM Loss (Class Activation Map)
CAM(2016), U-GAT-IT(2019)
$$\mathcal L_{CAM}^{A \to B} = -(\mathbb E_{x \sim P_{data}(A)}[\log (\eta_A(x))] + \mathbb E_{y \sim P_{data}(B)}[\log (1- \eta_B(y))] ) \qquad (16)$$

$$\mathcal L_{CAM}^{D} = -(\mathbb E_{y \sim P_{data}(B)}[(\eta_D(y))^2] + \mathbb E_{x \sim P_{data}(A)}[(1-\eta_D(G(x)))^2] ) \qquad (18)$$
Uses the CNN's global average pooling.
$\eta_A, \eta_D$ : auxiliary classifiers
$y \sim P_{data}(B)$ or $x \sim P_{data}(A)$
Attribute Matching Loss
StyleCariGAN(2019)
$$\mathcal L_{attr}^{p \to c} = -\mathbb E_{w \sim \mathcal W}[\phi_p(G_p(w)) \log \phi_c (G_{p \to c}(w)) + (1-\phi_p(G_p(w))) \log (1-\phi_c(G_{p \to c}(w)))] \qquad (19)$$

$$\mathcal L_{attr}^{c \to p} = -\mathbb E_{w \sim \mathcal W}[\phi_c(G_c(w)) \log \phi_p (G_{c \to p}(w)) + (1-\phi_c(G_c(w))) \log (1-\phi_p(G_{c \to p}(w)))] \qquad (20)$$

$$\mathcal L_{attr} = \mathcal L_{attr}^{p \to c} + \mathcal L_{attr}^{c \to p} \qquad (21)$$
Uses facial attribute classifiers for photos and caricatures.
Defined as a binary cross-entropy loss between the photo and caricature attribute predictions.
$\phi$ : attribute classifier
$G$ : StyleGAN
$G_{p \to c}$ : p2c-StyleCariGAN
$G_{c \to p}$ : c2p-StyleCariGAN
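Eqs. (19)–(21) are binary cross-entropy with the source classifier's prediction as a soft target. A sketch, where `p_src` stands for $\phi_p(G_p(w))$ and `p_trans` for $\phi_c(G_{p \to c}(w))$:

```python
import numpy as np

def attr_bce(p_src, p_trans):
    # Eq. (19)/(20): BCE between source attributes and translated attributes
    return float(-np.mean(p_src * np.log(p_trans)
                          + (1.0 - p_src) * np.log(1.0 - p_trans)))

preserved = attr_bce(np.array([0.9, 0.1]), np.array([0.9, 0.1]))  # attrs kept
flipped   = attr_bce(np.array([0.9, 0.1]), np.array([0.1, 0.9]))  # attrs lost
print(preserved < flipped)  # True -- translation should keep facial attributes
```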
Characteristic Loss
CariGAN(2018)
$$\mathcal L_{cha}^{B}(G) = \mathbb E_{x \sim P_{data}(A)}[1-\cos(x-\overline{P_{data}(A)},\ G(x) - \overline{P_{data}(B)})]$$
Proposed from the underlying idea that the difference between a face and the mean face represents the distinctive characteristics of a caricature, so facial characteristics should be preserved even after exaggeration.
$\overline{P_{data}(A)}$ : the mean of $P_{data}(A)$ (likewise for $B$)
A similar definition holds for the reverse direction $\mathcal L_{cha}^A(F)$.
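The characteristic loss in code: the cosine distance between each face's offset from its domain's mean face (NumPy sketch):

```python
import numpy as np

def characteristic_loss(x, gx, mean_a, mean_b):
    # 1 - cos(x - mean_A, G(x) - mean_B): exaggeration may scale the
    # deviation from the mean face, but should not change its direction
    u = (x - mean_a).ravel()
    v = (gx - mean_b).ravel()
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(1.0 - cos)

mean_a, mean_b = np.zeros(2), np.zeros(2)
x  = np.array([1.0, 0.0])   # deviation from the mean face of A
gx = np.array([3.0, 0.0])   # same direction, exaggerated 3x
print(characteristic_loss(x, gx, mean_a, mean_b))  # 0.0 -- direction preserved
```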
Smoothness Regularization Loss
AutoToon (2020)
$$\mathcal L_{reg}=\sum_{i,j\in \hat F}\left(2- \frac{\langle\hat F_{i, j-1}, \hat F_{i,j}\rangle}{\lVert \hat F_{i,j-1} \rVert \lVert \hat F_{i,j} \rVert} - \frac{\langle\hat F_{i-1, j}, \hat F_{i,j}\rangle}{\lVert \hat F_{i-1,j} \rVert \lVert \hat F_{i,j} \rVert}\right)$$
Uses cosine similarity between neighboring vectors to keep the warping field smooth.
$\langle\cdot,\cdot\rangle$ : dot product
$\hat F$ : warping field
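A sketch of the smoothness regularizer on a dense warping field of shape (H, W, 2); boundary terms are simply dropped here, which may differ from AutoToon's exact handling:

```python
import numpy as np

def smoothness_reg(F):
    # penalize direction changes between each warp vector and its
    # left / upper neighbor via cosine similarity
    def cos(a, b):
        num = (a * b).sum(axis=-1)
        den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-8
        return num / den
    ch = cos(F[:, :-1], F[:, 1:])   # horizontal neighbor pairs
    cv = cos(F[:-1, :], F[1:, :])   # vertical neighbor pairs
    return float((1.0 - ch).sum() + (1.0 - cv).sum())

smooth = np.ones((4, 4, 2))                     # uniform warp field
rough  = np.ones((4, 4, 2)) * np.array([1.0, -1.0])
rough[::2] *= -1.0                              # direction flips every row
print(smoothness_reg(smooth) < smoothness_reg(rough))  # True
```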
Distance Transform (DT) Loss
APDrawingGAN(2019)
$$d_{CM}(x_1, x_2) = \sum_{(j,k) \in \Theta_b(x_1)} I_{DT}(x_2)(j,k) + \sum_{(j,k) \in \Theta_w(x_1)} I'_{DT}(x_2)(j,k)$$
$\Theta_b(x_1), \Theta_w(x_1)$ : the sets of black / white pixels of $x_1$
$I_{DT}(x_2), I'_{DT}(x_2)$ : distance transform maps of $x_2$ with respect to black / white pixels, respectively
References
[1] Zhao, Y., Ren, D., Chen, Y., Jia, W., Wang, R., & Liu, X. (2022). Cartoon Image Processing: A Survey. International Journal of Computer Vision, 130(11), 2733-2769.
[2] XGAN (2020) : Royer, A., Bousmalis, K., Gouws, S., Bertsch, F., Mosseri, I., Cole, F., & Murphy, K. (2020). Xgan: Unsupervised image-to-image translation for many-to-many mappings. In Domain Adaptation for Visual Understanding (pp. 33-49). Cham: Springer International Publishing.
[3] CycleGAN (2019) : Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision (pp. 2223-2232).
[4] U-GAT-IT(2019) : Kim, J., Kim, M., Kang, H., & Lee, K. H. (2019, September). U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation. In International Conference on Learning Representations.
[5] CAM(2016) : Wang, C., Xiao, J., Han, Y., Yang, Q., Song, S., & Huang, G. (2021). CAM-loss: Towards Learning Spatially Discriminative Feature Representations. arXiv preprint arXiv:2109.01359.
[6] StyleCariGAN (2019) : Jang, W., Ju, G., Jung, Y., Yang, J., Tong, X., & Lee, S. (2021). StyleCariGAN: caricature generation via StyleGAN feature map modulation. ACM Transactions on Graphics (TOG), 40(4), 1-16.
[7] CariGAN (2018) : Li, W., Xiong, W., Liao, H., Huo, J., Gao, Y., & Luo, J. (2020). CariGAN: Caricature generation through weakly paired adversarial learning. Neural Networks, 132, 66-74.
[8] AutoToon (2020) : Gong, J., Hold-Geoffroy, Y., & Lu, J. (2020). Autotoon: Automatic geometric warping for face cartoon generation. In Proceedings of the IEEE/CVF winter conference on applications of computer vision (pp. 360-369).
[9] APDrawingGAN (2019) : Yi, R., Liu, Y. J., Lai, Y. K., & Rosin, P. L. (2019). Apdrawinggan: Generating artistic portrait drawings from face photos with hierarchical gans. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 10743-10752).
For convenience, the references above are labeled with the model names mentioned in the text.