[Paper Summary] A Neural Algorithm of Artistic Style

Eunjin Kim · April 20, 2022

The system uses neural representations to separate and recombine the content and style of arbitrary images.

Along the processing hierarchy of the CNN, the input image is transformed into representations that increasingly capture the actual content of the image rather than its detailed pixel values.

The representations of content and style in the CNN are separable

Previous approaches (non-photorealistic rendering) mainly rely on non-parametric techniques that directly manipulate the pixel representation of an image.

By using a DNN trained on object recognition, this method instead carries out manipulations in feature spaces that represent the high-level content of an image.

Performing a complex-cell-like computation would be a possible way to obtain a content-independent representation of the appearance of a visual input.

Methods

The method uses the feature space provided by the 16 convolutional and 5 pooling layers of the VGG-19 network. None of the fully connected layers are used.

For image synthesis, max-pooling is replaced with average pooling.
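To make the pooling swap concrete, here is a minimal sketch in plain Python (not the authors' code) that applies 2x2, stride-2 max-pooling and average pooling to a toy feature map. The helper names `pool2x2` and `mean` and the 4x4 input values are assumptions chosen for illustration.

```python
def pool2x2(fmap, reduce_fn):
    """Apply 2x2, stride-2 pooling to a 2D list, reducing each window with reduce_fn."""
    out = []
    for r in range(0, len(fmap) - 1, 2):
        row = []
        for c in range(0, len(fmap[0]) - 1, 2):
            window = [fmap[r][c], fmap[r][c + 1],
                      fmap[r + 1][c], fmap[r + 1][c + 1]]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


# Toy 4x4 feature map (assumed values).
fmap = [
    [1.0, 2.0, 0.0, 4.0],
    [3.0, 4.0, 0.0, 0.0],
    [5.0, 5.0, 1.0, 3.0],
    [5.0, 5.0, 2.0, 2.0],
]

max_pooled = pool2x2(fmap, max)   # keeps only the strongest activation per window
avg_pooled = pool2x2(fmap, mean)  # every activation in the window contributes equally
```

With average pooling every input in a window influences the output, so each one also receives a gradient during image synthesis. With max-pooling only the largest value in each window does.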

Content Loss

$\overrightarrow{p}$ : original image
$\overrightarrow{x}$ : generated image
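$P^l_{i,j}$ : the activation of the $i^{th}$ filter at position $j$ in layer $l$ for the original image ($F^l_{i,j}$ is the same quantity for the generated image)
$P^l$ , $F^l$ : respective feature representations of $\overrightarrow{p}$ and $\overrightarrow{x}$ in layer $l$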
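
With these definitions, the content loss given in the paper is the squared-error loss between the two feature representations:

$$\mathcal{L}_{content}(\overrightarrow{p}, \overrightarrow{x}, l) = \frac{1}{2} \sum_{i,j} \left( F^l_{i,j} - P^l_{i,j} \right)^2$$

A minimal sketch of this formula in plain Python follows. The function name `content_loss` and the toy feature matrices (2 filters, 3 positions) are assumptions for illustration; in practice $F^l$ and $P^l$ would be read from the network's activations.

```python
def content_loss(F, P):
    """Half the sum of squared differences between two feature matrices.

    F and P are lists of rows: one row per filter, one column per position.
    """
    return 0.5 * sum(
        (f - p) ** 2
        for f_row, p_row in zip(F, P)
        for f, p in zip(f_row, p_row)
    )


# Toy features (assumed values): 2 filters x 3 positions.
P = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]  # from the original image
F = [[1.0, 1.0, 0.0], [0.0, 1.0, 3.0]]  # from the generated image

loss = content_loss(F, P)
```

The loss is zero only when the generated image reproduces the original image's activations in layer $l$ exactly.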

Gram matrix $G^l \in \mathbb{R}^{N_l \times N_l}$, where $N_l$ is the number of filters in layer $l$

$G^l_{i,j}$ : inner product between the vectorised feature maps $i$ and $j$ in layer $l$
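
Written out, each entry is $G^l_{i,j} = \sum_k F^l_{i,k} F^l_{j,k}$, where $k$ runs over the positions of the vectorised feature maps. The sketch below computes this in plain Python. The function name `gram_matrix` and the toy values are assumptions for illustration, not the authors' implementation.

```python
def gram_matrix(F):
    """Return G where G[i][j] is the inner product of feature rows i and j."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in F] for fi in F]


# Toy features (assumed values): 2 filters x 3 positions.
F = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]
G = gram_matrix(F)

# The same activations with the positions permuted identically in every feature map.
F_shuffled = [[row[2], row[0], row[1]] for row in F]
G_shuffled = gram_matrix(F_shuffled)
```

Here `G_shuffled` equals `G`: the sum over positions discards where each activation occurred, which is why the Gram matrix describes texture-like statistics rather than spatial layout.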

Gradient descent is used to find another image that matches the style representation of the original image.

Style loss

The style loss minimises the mean-squared distance between the entries of the Gram matrix of the original image and those of the Gram matrix of the image to be generated.
$\overrightarrow{a}$ : original image
$\overrightarrow{x}$ : generated image
$A^l$ , $G^l$ : respective style representation in layer $l$

$w_l$ : weighting factors
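
In the paper's notation, with $N_l$ filters in layer $l$ and feature maps of size $M_l$, the contribution of one layer and the total style loss are:

$$E_l = \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} \left( G^l_{i,j} - A^l_{i,j} \right)^2$$

$$\mathcal{L}_{style}(\overrightarrow{a}, \overrightarrow{x}) = \sum_{l=0}^{L} w_l E_l$$

The following plain-Python sketch evaluates both expressions on toy data. The function names, the two-layer setup, and the equal weights are assumptions for illustration only.

```python
def gram_matrix(F):
    """Return G where G[i][j] is the inner product of feature rows i and j."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in F] for fi in F]


def layer_style_error(F_gen, F_art):
    """E_l for one layer: scaled squared distance between the two Gram matrices."""
    N = len(F_gen)     # number of filters in the layer
    M = len(F_gen[0])  # number of positions per feature map
    G = gram_matrix(F_gen)
    A = gram_matrix(F_art)
    squared = sum(
        (g - a) ** 2
        for g_row, a_row in zip(G, A)
        for g, a in zip(g_row, a_row)
    )
    return squared / (4.0 * N**2 * M**2)


def style_loss(gen_layers, art_layers, weights):
    """Weighted sum of the per-layer errors E_l."""
    return sum(
        w * layer_style_error(F_gen, F_art)
        for w, F_gen, F_art in zip(weights, gen_layers, art_layers)
    )


# Toy features for two layers (assumed values): 2 filters x 2 positions each.
art_layers = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 0.0]]]
gen_layers = [[[1.0, 1.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
weights = [0.5, 0.5]  # assumed: equal weight on each layer

loss = style_loss(gen_layers, art_layers, weights)
```

For these toy values the per-layer errors are 3/64 and 16/64, so the weighted sum is 0.1484375.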

Loss function

$\overrightarrow{p}$ : photograph
$\overrightarrow{a}$ : artwork

$\alpha$ : weighting factor for content reconstruction
$\beta$ : weighting factor for style reconstruction
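
With these symbols, the total loss minimised in the paper is a weighted sum of the content term and the style term:

$$\mathcal{L}_{total}(\overrightarrow{p}, \overrightarrow{a}, \overrightarrow{x}) = \alpha \, \mathcal{L}_{content}(\overrightarrow{p}, \overrightarrow{x}) + \beta \, \mathcal{L}_{style}(\overrightarrow{a}, \overrightarrow{x})$$

A larger ratio $\alpha / \beta$ keeps the generated image closer to the photograph's content, while a smaller ratio emphasises the artwork's style.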