A guide to convolution arithmetic for deep learning: convolution + transposed convolution + dilated convolution
1.1 Discrete convolution (conv)
i (input), k (kernel_size), s (stride), p (padding), o (output)
1.2 Pooling
i, k, s, o
2. Convolution (conv)
2.1 no padding
o = ( i - k ) / s + 1 # division rounded down
2.2 padding
o = ( i - k + 2*p ) / s + 1, with p = k/2 rounded down (half padding) or p = k-1 (full padding) # division rounded down
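The two formulas above can be checked with a few lines of Python. This is only a sketch: `conv_output_size` is a name made up for illustration, not a library function, and it handles a single spatial axis.

```python
def conv_output_size(i, k, s=1, p=0):
    """Output length along one axis: floor((i - k + 2p) / s) + 1."""
    if i + 2 * p < k:
        raise ValueError("kernel does not fit in the padded input")
    # Integer division implements the "rounded down" part of the formula.
    return (i - k + 2 * p) // s + 1


print(conv_output_size(5, 3, s=2))            # no padding        -> 2
print(conv_output_size(5, 3, s=2, p=1))       # p = 1             -> 3
print(conv_output_size(5, 3, s=1, p=3 // 2))  # half padding, k=3 -> 5
print(conv_output_size(5, 3, s=1, p=3 - 1))   # full padding, k=3 -> 7
```

With s = 1 and odd k, half padding keeps the size unchanged (o = i), and full padding grows it to i + k - 1.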
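3. Transposed convolution (deconv / transposed conv / fractionally strided conv)
Transposed convolution is generally used in the decoding stage, mostly for upsampling. To reduce checkerboard artifacts, k is chosen to be even (e.g. with k=4, s=2, p=1 the feature map's height and width after deconv become 2x the original).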
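One way to see why the kernel size matters for checkerboard artifacts is to count, in 1-D, how many kernel taps land on each output position of an unpadded transposed conv. The helper `overlap_counts` below is a hypothetical name used only for this sketch.

```python
def overlap_counts(n_in, k, s):
    """Count how many kernel taps contribute to each output position
    of a 1-D transposed conv with no padding."""
    n_out = s * (n_in - 1) + k
    counts = [0] * n_out
    for x in range(n_in):      # each input element ...
        for t in range(k):     # ... writes to k consecutive outputs
            counts[x * s + t] += 1
    return counts


print(overlap_counts(4, 3, 2))  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
print(overlap_counts(4, 4, 2))  # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

With k=3, s=2 the interior alternates between 2 and 1 contributions, which is the uneven overlap that tends to show up as a checkerboard pattern. With k=4, s=2 every interior position receives 2 contributions. This suggests the underlying condition is k being a multiple of s, of which "even k with s=2" is one case.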
A guide to convolution arithmetic for deep learning uses k=3 as its example; in the actual computation the kernel W is unrolled into a matrix C:
Convolution: O = C * I (C: 4x16; I: 4x4, reshaped to 16x1; the product gives O: 4x1, reshaped to 2x2) 4x4 ---> 2x2
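Transposed convolution: I = C^T * O (hence the name "transposed convolution") (C^T: 16x4; O: 2x2, reshaped to 4x1; I: 16x1, reshaped to 4x4) 2x2 ---> 4x4. This restores the shape of I, not its values.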
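The matrix view can be reproduced in plain Python for the 4x4 input and 3x3 kernel case. This is a sketch under stated assumptions: stride 1, no padding, cross-correlation as used in deep learning frameworks, and arbitrary kernel values. The helper names (`unroll_kernel`, `matvec`, `transpose`) are made up for illustration.

```python
def unroll_kernel(w, n):
    """Build C such that C * flatten(input) equals the stride-1,
    unpadded convolution of an n x n input with the k x k kernel w."""
    k = len(w)
    m = n - k + 1                                # output side length
    C = [[0] * (n * n) for _ in range(m * m)]
    for oy in range(m):
        for ox in range(m):
            row = oy * m + ox                    # one row per output pixel
            for ky in range(k):
                for kx in range(k):
                    C[row][(oy + ky) * n + (ox + kx)] = w[ky][kx]
    return C


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


w = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # arbitrary example kernel
I = list(range(16))                     # 4x4 input flattened to 16x1
C = unroll_kernel(w, 4)                 # 4 x 16
O = matvec(C, I)                        # 4 x 1, i.e. 2x2
back = matvec(transpose(C), O)          # 16 x 1, i.e. 4x4

print(len(C), len(C[0]), len(O), len(back))  # 4 16 4 16
```

Multiplying by the transpose of C maps 4 values back to 16, so the 2x2 map returns to a 4x4 shape. The values in `back` differ from `I`, since the transpose of C is not its inverse.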
Top to bottom: conv: i, k, s, p, o
Bottom to top: deconv: i' = o, o' = i
3.1 no padding
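deconv: a deconv with s > 1 corresponds to a conv with stride 1/s < 1, which is why deconv is also called fractionally strided conv
conv: i, k, s, p=0, o: o = ( i - k ) / s + 1
corresponding deconv: i' = o, o' = s*(i' - 1) + k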
conv: i=5, k=3, s=2, p=0, o=(5-3)/2 + 1 = 2
corresponding deconv: i' = 2, o' = s*(i' - 1) + k = 2*(2-1) + 3 = 5
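The "fractionally strided" reading can be illustrated in 1-D: inserting s-1 zeros between the inputs and then running an ordinary stride-1 conv gives the same result as the direct transposed conv. The sketch below assumes no padding in the underlying conv; both function names are hypothetical.

```python
def transposed_conv1d(x, w, s):
    """Scatter form: each input value adds a scaled copy of the
    kernel to the output, shifted by s positions per input step."""
    k = len(w)
    out = [0] * (s * (len(x) - 1) + k)
    for n, xv in enumerate(x):
        for t, wv in enumerate(w):
            out[n * s + t] += xv * wv
    return out


def zero_insert_conv1d(x, w, s):
    """Equivalent form: put s-1 zeros between inputs, pad k-1 zeros
    on both sides, then slide the flipped kernel with stride 1."""
    k = len(w)
    stretched = []
    for n, xv in enumerate(x):
        stretched.append(xv)
        if n < len(x) - 1:
            stretched.extend([0] * (s - 1))
    padded = [0] * (k - 1) + stretched + [0] * (k - 1)
    flipped = w[::-1]
    return [sum(padded[j + t] * flipped[t] for t in range(k))
            for j in range(len(padded) - k + 1)]


x, w = [1, 2], [1, 10, 100]
print(transposed_conv1d(x, w, 2))   # [1, 10, 102, 20, 200]
print(zero_insert_conv1d(x, w, 2))  # [1, 10, 102, 20, 200]
```

The output length is 2*(2-1) + 3 = 5, matching o' = s*(i' - 1) + k for i' = 2, k = 3, s = 2.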
3.2 padding
conv: i, k, s, p, o: o = ( i - k + 2*p ) / s + 1
corresponding deconv: (1) when (i + 2p − k) is divisible by s: i' = o, o' = s*(i' - 1) + k - 2*p
conv: i=5, k=3, s=2, p=1, o=(5-3+ 2*1)/2 + 1 = 3
corresponding deconv: i' = 3, o' = s*(i' - 1) + k - 2*p = 2*(3-1) + 3 - 2*1 = 5
(2) when (i + 2p − k) is not divisible by s: o' = s*(i' - 1) + a + k - 2*p, a = (i + 2p − k) mod s
conv: i=6, k=3, s=2, p=1, o=(6-3+ 2*1)/2 + 1 = 3
corresponding deconv: i' = 3, a = 1, o' = s*(i' - 1) + a + k - 2*p = 2*(3-1) + 1 + 3 - 2*1 = 6
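Both padded cases can be checked with a small round trip: compute the conv output size, derive a, and confirm the deconv formula returns the original input size. The function names below are hypothetical and the sketch covers one spatial axis only.

```python
def conv_output_size(i, k, s=1, p=0):
    return (i + 2 * p - k) // s + 1


def deconv_output_size(i_prime, k, s=1, p=0, a=0):
    """o' = s*(i' - 1) + a + k - 2p, with a = (i + 2p - k) mod s."""
    return s * (i_prime - 1) + a + k - 2 * p


def round_trip(i, k, s, p):
    o = conv_output_size(i, k, s, p)
    a = (i + 2 * p - k) % s   # what the rounded-down division dropped
    return o, a, deconv_output_size(o, k, s, p, a)


print(round_trip(5, 3, 2, 1))  # (3, 0, 5)
print(round_trip(6, 3, 2, 1))  # (3, 1, 6)
```

Inputs of size 5 and 6 both convolve down to 3, so the deconv needs a to tell them apart. In PyTorch's `ConvTranspose2d`, the `output_padding` argument plays the role of a.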
4. Dilated convolution (atrous conv)
dilation rate d
dilated kernel size k_hat = k + (d-1)*(k − 1)
output size o = (i + 2p - k_hat) / s + 1 = (i + 2p - k - (d-1)*(k − 1)) / s + 1 # division rounded down
dilated conv : i = 7, k = 3, d = 2, s = 1 , p = 0
dilated kernel size k_hat = 3 + (2-1) * (3-1) = 5
o = (7 - 5 ) +1 = 3
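The dilated case reduces to the ordinary conv formula once k is replaced by the dilated kernel size. A minimal sketch, with hypothetical function names and a single spatial axis:

```python
def dilated_kernel_size(k, d):
    """A kernel with k taps spaced d apart spans k + (d-1)*(k-1) positions."""
    return k + (d - 1) * (k - 1)


def dilated_conv_output_size(i, k, d=1, s=1, p=0):
    return (i + 2 * p - dilated_kernel_size(k, d)) // s + 1


print(dilated_kernel_size(3, 2))               # 5
print(dilated_conv_output_size(7, 3, d=2))     # 3
print(dilated_conv_output_size(7, 3, d=1))     # 5 (d = 1 is an ordinary conv)
```

For i = 7, k = 3, d = 2, s = 1, p = 0 the dilated kernel spans 5 positions, which gives o = (7 - 5) + 1 = 3.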
Reference blog: https://blog.****.net/cdknight_happy/article/details/78898791
Reference article: A guide to convolution arithmetic for deep learning