Style Transfer

用户1145562

Published 2020-10-23 12:03:19

This article is part of the column: LC刷题

First, a look at the image after style transfer

VGG19

The following is the network structure printed at run time; it matches the figure above exactly.

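LAYER GROUP 1 # convolutional layer group 1
# Two convolutional layers and one pooling layer follow; relu is the rectified linear unit layer, applied after every convolution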
--conv1_1 | shape=(1, 663, 1000, 64) | weights_shape=(3, 3, 3, 64)
--relu1_1 | shape=(1, 663, 1000, 64) | bias_shape=(64,)
--conv1_2 | shape=(1, 663, 1000, 64) | weights_shape=(3, 3, 64, 64)
--relu1_2 | shape=(1, 663, 1000, 64) | bias_shape=(64,)
--pool1   | shape=(1, 332, 500, 64)

LAYER GROUP 2 # convolutional layer group 2
# Two convolutional layers and one pooling layer follow
--conv2_1 | shape=(1, 332, 500, 128) | weights_shape=(3, 3, 64, 128)
--relu2_1 | shape=(1, 332, 500, 128) | bias_shape=(128,)
--conv2_2 | shape=(1, 332, 500, 128) | weights_shape=(3, 3, 128, 128)
--relu2_2 | shape=(1, 332, 500, 128) | bias_shape=(128,)
--pool2   | shape=(1, 166, 250, 128)

LAYER GROUP 3 # convolutional layer group 3
# Four convolutional layers and one pooling layer follow
--conv3_1 | shape=(1, 166, 250, 256) | weights_shape=(3, 3, 128, 256)
--relu3_1 | shape=(1, 166, 250, 256) | bias_shape=(256,)
--conv3_2 | shape=(1, 166, 250, 256) | weights_shape=(3, 3, 256, 256)
--relu3_2 | shape=(1, 166, 250, 256) | bias_shape=(256,)
--conv3_3 | shape=(1, 166, 250, 256) | weights_shape=(3, 3, 256, 256)
--relu3_3 | shape=(1, 166, 250, 256) | bias_shape=(256,)
--conv3_4 | shape=(1, 166, 250, 256) | weights_shape=(3, 3, 256, 256)
--relu3_4 | shape=(1, 166, 250, 256) | bias_shape=(256,)
--pool3   | shape=(1, 83, 125, 256)

LAYER GROUP 4 # convolutional layer group 4
# Four convolutional layers and one pooling layer follow
--conv4_1 | shape=(1, 83, 125, 512) | weights_shape=(3, 3, 256, 512)
--relu4_1 | shape=(1, 83, 125, 512) | bias_shape=(512,)
--conv4_2 | shape=(1, 83, 125, 512) | weights_shape=(3, 3, 512, 512)
--relu4_2 | shape=(1, 83, 125, 512) | bias_shape=(512,)
--conv4_3 | shape=(1, 83, 125, 512) | weights_shape=(3, 3, 512, 512)
--relu4_3 | shape=(1, 83, 125, 512) | bias_shape=(512,)
--conv4_4 | shape=(1, 83, 125, 512) | weights_shape=(3, 3, 512, 512)
--relu4_4 | shape=(1, 83, 125, 512) | bias_shape=(512,)
--pool4   | shape=(1, 42, 63, 512)

LAYER GROUP 5 # convolutional layer group 5
# Four convolutional layers and one pooling layer follow
--conv5_1 | shape=(1, 42, 63, 512) | weights_shape=(3, 3, 512, 512)
--relu5_1 | shape=(1, 42, 63, 512) | bias_shape=(512,)
--conv5_2 | shape=(1, 42, 63, 512) | weights_shape=(3, 3, 512, 512)
--relu5_2 | shape=(1, 42, 63, 512) | bias_shape=(512,)
--conv5_3 | shape=(1, 42, 63, 512) | weights_shape=(3, 3, 512, 512)
--relu5_3 | shape=(1, 42, 63, 512) | bias_shape=(512,)
--conv5_4 | shape=(1, 42, 63, 512) | weights_shape=(3, 3, 512, 512)
--relu5_4 | shape=(1, 42, 63, 512) | bias_shape=(512,)
--pool5   | shape=(1, 21, 32, 512)
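
The spatial sizes in the printout can be reproduced with a few lines of arithmetic. The sketch below assumes that every convolution preserves height and width and that each pooling layer halves them, rounding odd sizes up (which is what the 663 → 332 step suggests). The function name and the channel list are illustrative and not taken from the original code.

```python
import math

# Output channels of the five layer groups, read off the printout above.
GROUP_CHANNELS = [64, 128, 256, 512, 512]


def pooled_shapes(height, width):
    """Return the (H, W, C) shape after each of the five pooling layers.

    Assumption: 2x2 pooling with stride 2 that rounds odd sizes up,
    and convolutions that leave H and W unchanged.
    """
    shapes = []
    for channels in GROUP_CHANNELS:
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
        shapes.append((height, width, channels))
    return shapes


# The input in the printout is 663 x 1000 pixels.
for index, shape in enumerate(pooled_shapes(663, 1000), start=1):
    print(f"pool{index}: {shape}")
```

For a 663 x 1000 input this yields the same pool1 to pool5 sizes as the log, ending at 21 x 32 x 512.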

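VGG is itself a convolutional neural network (CNN). A CNN is made up of an input layer, convolutional layers, activation functions, pooling layers, and fully connected layers, i.e. INPUT (input layer) - CONV (convolutional layer) - RELU (activation function) - POOL (pooling layer) - FC (fully connected layer).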
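
To make the CONV - RELU - POOL part of that pipeline concrete, here is a minimal dependency-free sketch on a single-channel image. The helper names, the 4x4 input, and the identity kernel are all made up for illustration; a real network uses many channels and learned kernels, and the FC stage is left out.

```python
def conv2d_same(image, kernel):
    """3x3 convolution (cross-correlation) with zero padding and stride 1."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-1, 2):
                for dx in range(-1, 2):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as zero
                        total += image[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = total
    return out


def relu(fmap):
    """Replace every negative activation with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; partial windows at the border are kept."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            max(
                fmap[yy][xx]
                for yy in range(y, min(y + 2, h))
                for xx in range(x, min(x + 2, w))
            )
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


image = [[1, -2, 3, 0], [0, 5, -1, 2], [-3, 1, 0, 4], [2, 0, 1, -1]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # kernel that copies the input
pooled = max_pool_2x2(relu(conv2d_same(image, identity)))
print(pooled)  # [[5.0, 3.0], [2.0, 4.0]]
```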

VGG19 extends and modifies the convolution → pooling part.

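From the figure and the printout, VGG19 has five groups of convolutional layers, all using 3×3 kernels, followed by three fully connected (FC) layers.

That gives 19 weight layers in total: 16 convolutional layers and 3 fully connected layers. The 5 pooling layers (one per group) have no weights and are not counted.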
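
The "19" in the name can be checked directly against the printout. In this small sketch the per-group counts are read off the log above, and the three FC layers are taken from the text since the log does not list them; the variable names are illustrative.

```python
# Convolutional layers per group, as listed in the printout (conv1_1 ... conv5_4).
CONVS_PER_GROUP = [2, 2, 4, 4, 4]
NUM_FC_LAYERS = 3  # stated in the text; the printout stops at pool5

num_conv = sum(CONVS_PER_GROUP)
num_pool = len(CONVS_PER_GROUP)  # one pooling layer closes each group
num_weight_layers = num_conv + NUM_FC_LAYERS  # pooling layers carry no weights

print(f"conv={num_conv}, pool={num_pool}, weight layers={num_weight_layers}")
```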

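VGG advantages: VGGNet has a very simple structure; the whole network uses the same kernel size (3x3) and the same pooling size (2x2). A stack of several small-filter (3x3) convolutional layers works better than a single large-filter (5x5 or 7x7) convolutional layer, and the results confirm that steadily deepening the network can improve performance.
VGG disadvantages: VGG consumes more computing resources and uses more parameters (the 3x3 convolutions are not to blame for this), which leads to higher memory usage.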
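
One way to see why small stacked filters are attractive: with stride 1, two stacked 3x3 layers cover the same 5x5 receptive field as one 5x5 layer, and three cover 7x7, while using fewer weights. The sketch below compares the two under the assumption that input and output channel counts are equal and biases are ignored; the function names and the choice of 64 channels are illustrative.

```python
def stacked_3x3(num_layers, channels):
    """Receptive field and weight count of num_layers stacked 3x3 convs (stride 1)."""
    receptive_field = 1 + 2 * num_layers  # each 3x3 layer widens the field by 2
    weights = num_layers * 3 * 3 * channels * channels
    return receptive_field, weights


def single_kernel(size, channels):
    """Receptive field and weight count of one size x size conv layer."""
    return size, size * size * channels * channels


channels = 64
print(stacked_3x3(2, channels), single_kernel(5, channels))
print(stacked_3x3(3, channels), single_kernel(7, channels))
```

With 64 channels, two 3x3 layers need 73,728 weights against 102,400 for one 5x5 layer, and three 3x3 layers need 110,592 against 200,704 for one 7x7 layer. The stack also applies a nonlinearity after every layer.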

The vast majority of VGG's parameters come from the first fully connected layer, and VGG has three fully connected layers.
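
A rough parameter count illustrates how much the first fully connected layer dominates. The convolution shapes follow the printout above. The fully connected sizes are an assumption, since the printout does not show them: the standard ImageNet configuration with a 224x224x3 input (so pool5 is 7x7x512), two 4096-unit layers, and a 1000-class output.

```python
# (number of conv layers, output channels) for each of the five groups.
GROUPS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]

# Assumed FC layout: flattened 7x7x512 pool5 output -> 4096 -> 4096 -> 1000.
FC_SIZES = [7 * 7 * 512, 4096, 4096, 1000]


def conv_params(in_channels=3):
    """Weights plus biases of all 3x3 convolutional layers."""
    total = 0
    for num_layers, out_channels in GROUPS:
        for _ in range(num_layers):
            total += 3 * 3 * in_channels * out_channels + out_channels
            in_channels = out_channels
    return total


def fc_params():
    """Weights plus biases of each fully connected layer, in order."""
    return [
        fan_in * fan_out + fan_out
        for fan_in, fan_out in zip(FC_SIZES, FC_SIZES[1:])
    ]


conv_total = conv_params()
fc_each = fc_params()
total = conv_total + sum(fc_each)
print(f"conv layers: {conv_total:,}")
print(f"fc layers:   {fc_each}")
print(f"total:       {total:,}")
print(f"first FC share: {fc_each[0] / total:.1%}")
```

Under these assumptions the 16 convolutional layers together hold about 20 million parameters, while the first fully connected layer alone holds about 103 million of the roughly 144 million total.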

Reading the paper

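Original content image, denoted \vec{p}: the content image given as input.
Generated image, denoted \vec{x}: the image produced during the style transfer process.
Original style image, denoted \vec{a}: the style image given as input.
N_l: the number of feature maps (filters) in layer l.
M_l: the size of each feature map in layer l, i.e. its height times its width.
F^l: the matrix formed by all feature maps of the generated image \vec{x} in layer l, with F^l \in \mathbb{R}^{N_l \times M_l}.
F^l_{ij}: the activation of the i-th filter at position j in layer l for the generated image \vec{x}.
P^l_{ij}: the same as F^l_{ij}, but computed for the original content image \vec{p}.
Content loss: L_{content}(\vec{p},\vec{x},l) = \frac{1}{2}\sum\limits_{i,j}(F_{ij}^l-P_{ij}^l)^2
G_{ij}^l = \sum\limits_{k}F_{ik}^lF_{jk}^l: the Gram matrix of the generated image in layer l, i.e. the inner product of feature map i and feature map j.
A^l_{ij}: the same Gram matrix entry, computed for the original style image \vec{a}.
E_l=\frac{1}{4N_l^2M_l^2}\sum\limits_{i,j}(G_{ij}^l-A_{ij}^l)^2: the scaled squared error between the Gram matrices of the generated image and the original style image in layer l.
L_{style}(\vec{a},\vec{x})=\sum\limits_{l=0}^Lw_lE_l: the total style loss between the generated image \vec{x} and the original style image, a weighted sum over layers with weights w_l.
L_{total} = {\alpha}L_{content}(\vec{p},\vec{x})+{\beta}L_{style}(\vec{a},\vec{x}), where \alpha and \beta weight the content and style terms.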
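
The loss definitions above can be sketched in plain Python. Here each feature matrix is a list of N_l rows (one per filter) with M_l entries each. F is taken to hold the generated image's features and P the content image's features, and the style Gram matrix A is computed from the style image's features. The tiny 2x2 matrices and the values of \alpha, \beta, and w_l are made up for illustration. A real implementation would use an array library and features taken from VGG19 layers.

```python
def gram_matrix(features):
    """G[i][j] = sum_k F[i][k] * F[j][k] for an N_l x M_l feature matrix."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
        for row_i in features
    ]


def content_loss(F, P):
    """L_content = 1/2 * sum_ij (F_ij - P_ij)^2 for a single layer."""
    return 0.5 * sum(
        (f - p) ** 2 for row_f, row_p in zip(F, P) for f, p in zip(row_f, row_p)
    )


def style_layer_loss(F, style_features):
    """E_l = sum_ij (G_ij - A_ij)^2 / (4 * N_l^2 * M_l^2)."""
    n_l, m_l = len(F), len(F[0])
    G = gram_matrix(F)               # Gram matrix of the generated image
    A = gram_matrix(style_features)  # Gram matrix of the style image
    squared_error = sum(
        (g - a) ** 2 for row_g, row_a in zip(G, A) for g, a in zip(row_g, row_a)
    )
    return squared_error / (4 * n_l**2 * m_l**2)


def style_loss(F_layers, style_layers, weights):
    """L_style = sum_l w_l * E_l over the chosen layers."""
    return sum(
        w * style_layer_loss(F, S)
        for F, S, w in zip(F_layers, style_layers, weights)
    )


def total_loss(content, style, alpha, beta):
    """L_total = alpha * L_content + beta * L_style."""
    return alpha * content + beta * style


# Toy features: 2 filters (N_l = 2), 2 positions each (M_l = 2).
F = [[1.0, 2.0], [0.0, 1.0]]  # generated image
P = [[1.0, 0.0], [0.0, 1.0]]  # content image
S = [[1.0, 0.0], [0.0, 1.0]]  # style image

c = content_loss(F, P)
s = style_loss([F], [S], weights=[1.0])
print(c, s, total_loss(c, s, alpha=1.0, beta=1000.0))  # 2.0 0.375 377.0
```

In the actual method the generated image is updated by gradient descent on L_total; the sketch only evaluates the losses.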

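The backpropagation process is not written up here, because I do not fully understand that figure in the paper. To be filled in later.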

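This article is part of the Tencent Cloud self-media sync exposure program and is shared from the author's personal site/blog.

Originally published: 2020-04-142. In case of infringement, contact cloudcommunity@tencent.com for removal.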
