Reading Notes: "Dynamic Filter Networks"
Motivation
In a traditional convolutional layer, the learned filters stay fixed after training.
To address this, the paper proposes the Dynamic Filter Network, where filters are generated dynamically, conditioned on the input. The resulting architecture is more flexible thanks to its adaptive nature, yet avoids an excessive increase in the number of model parameters.
Architecture
The dynamic filter network (module) consists of two parts: a filter-generating network and a dynamic filtering layer, as shown in Figure 1.
Filter-Generating Network
Input dimensions: H * W * Ci, i.e., the dimensions of the input image.
Output dimensions: s * s * Co * n * d, i.e., the total dimensions of the generated filters, where s * s * Co is the size of each filter and n is the number of filters. When d = 1, this amounts to dynamic convolution; when d = H * W, it amounts to dynamic local convolution.
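The shapes above can be sketched with a toy filter-generating network. The generator here (global average pooling followed by a single linear map) is a hypothetical stand-in for illustration; the paper uses task-specific encoder architectures.

```python
import numpy as np

# Toy dimensions (assumptions for illustration)
H, W, Ci = 8, 8, 3   # input: height, width, input channels
s, Co, n = 3, 1, 4   # filter size s*s, output channels per filter, number of filters
d = 1                # d = 1 -> dynamic convolution; d = H*W -> dynamic local filtering

rng = np.random.default_rng(0)

# Fixed model parameters of the filter-generating network:
# a single linear map from the pooled input to all filter values.
W_gen = rng.standard_normal((Ci, s * s * Co * n * d))

def generate_filters(x):
    """Map an input image (H, W, Ci) to dynamically generated filters."""
    pooled = x.mean(axis=(0, 1))   # global average pool -> (Ci,)
    flat = pooled @ W_gen          # -> (s*s*Co*n*d,)
    return flat.reshape(s, s, Co, n, d)

x = rng.standard_normal((H, W, Ci))
filters = generate_filters(x)
print(filters.shape)  # (3, 3, 1, 4, 1)
```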
Dynamic Filtering Layer
Dynamic convolutional layer (d = 1)
This variant seems to be the more commonly used one. It is essentially the same as an ordinary convolutional layer, except that an ordinary layer's filter parameters are fixed, while the dynamic filters here are sample-specific.
This can be generalized further: the filter shape need not be square, and could even be a horizontal or vertical strip.
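A minimal numpy sketch of the d = 1 case, assuming stride 1 and "same" zero padding; the filters are taken as given here rather than produced by a generator network:

```python
import numpy as np

def dynamic_conv(x, filters):
    """Sample-specific convolution (cross-correlation).

    x:       (H, W, Ci) input
    filters: (n, s, s, Ci) filters generated for THIS sample
    returns: (H, W, n) output, "same" zero padding, stride 1
    """
    H, W, Ci = x.shape
    n, s, _, _ = filters.shape
    p = s // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros((H, W, n))
    for k in range(n):
        for i in range(H):
            for j in range(W):
                patch = xp[i:i + s, j:j + s, :]
                out[i, j, k] = np.sum(patch * filters[k])
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((5, 5, 2))

# A delta filter (1 at the centre of channel 0) just copies channel 0.
delta = np.zeros((1, 3, 3, 2))
delta[0, 1, 1, 0] = 1.0
y = dynamic_conv(x, delta)
print(np.allclose(y[:, :, 0], x[:, :, 0]))  # True
```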
Dynamic local filtering layer (d = H * W)
A separate set of filters is generated for every position of the input, so the filters are not only sample-specific but also position-specific. Each generated filter is applied to the region centred on its corresponding position.
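The position-specific case can be sketched the same way. For simplicity this sketch assumes a single-channel input and, again, takes the per-position filters as given:

```python
import numpy as np

def dynamic_local_filter(x, filters):
    """Position-specific filtering on a single-channel input.

    x:       (H, W) input
    filters: (H, W, s, s) one s*s filter per spatial position
    returns: (H, W); each output value uses the filter generated
             for that position, applied to the patch centred there
    """
    H, W = x.shape
    s = filters.shape[2]
    p = s // 2
    xp = np.pad(x, p)
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + s, j:j + s]
            out[i, j] = np.sum(patch * filters[i, j])
    return out

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 4))

# If every position gets the delta filter, the layer is the identity.
delta = np.zeros((4, 4, 3, 3))
delta[:, :, 1, 1] = 1.0
print(np.allclose(dynamic_local_filter(x, delta), x))  # True
```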
Key points
1. Make clear the distinction between model parameters and dynamically generated parameters
Model parameters comprise the layer parameters that are initialized in advance and the parameters of the filter-generating network. They are the same for all samples.
Dynamically generated parameters are the parameters that the filter-generating network produces on the fly, with no need for initialization. They differ from sample to sample, or from position to position.
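The distinction can be made concrete with a tiny hypothetical linear generator: its weight matrix is a fixed model parameter shared by all samples, while its output, the filters, changes per input.

```python
import numpy as np

rng = np.random.default_rng(3)

# Model parameter: fixed after training, shared across all samples.
W_gen = rng.standard_normal((3, 9))  # maps a 3-dim code to a 3x3 filter

def generate_filter(code):
    """Dynamically generated parameters: produced on the fly per input."""
    return (code @ W_gen).reshape(3, 3)

f1 = generate_filter(rng.standard_normal(3))
f2 = generate_filter(rng.standard_normal(3))

# The generator weights are shared; the generated filters are not.
print(np.allclose(f1, f2))  # False: filters are sample-specific
```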
2. The notion of "dynamic"
Using a neural network to generate the weights of some other structure is an idea that has apparently been around for quite a while.
Going further in this direction, there are transfer learning and knowledge distillation.
There is also the dynamic generation of convolutional filters described in this paper, as well as work that directly generates transformers and conv layers dynamically.