[Paper Notes] (HEVC) "Multi-stage Attention Convolutional Neural Networks for HEVC In-Loop Filtering"

Reading notes on "Multi-stage Attention Convolutional Neural Networks for HEVC In-Loop Filtering"

Peng-Ren Lai, Jia-Shung Wang. Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan. {ex0487, jswangtaiwan}@gmail.com

abstract

The authors propose a Multi-stage Attention Convolutional Neural Network (MACNN).

Three characteristics:

A 3-stage design with various loss criteria.

The 3-stage design clarifies the function of each stage, effectively alleviating the burden of the network. (My reading: this is essentially a deeper network; the added nonlinear mappings let it be trained more precisely.)
A revised Inception module and a self-attention block.

These two structures give the model the ability to acquire global information with fewer layers.

MACNN achieves an average 6.5% BD-rate reduction compared to HEVC in the all-intra configuration.

motivation

Most existing work uses only MSE as the loss function. It is not robust to outliers and usually generates over-smoothed results [13].

[13] Hang Zhao, Orazio Gallo, Iuri Frosio, and Jan Kautz. “Loss Functions for Image Restoration With Neural Networks,” IEEE Transactions on Computational Imaging, Vol. 3, No. 1, pp. 47-57, 2017.

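That paper argues that the L2 norm correlates poorly with image quality as perceived by humans. Using L2 involves certain assumptions, namely that the impact of noise is independent of the local characteristics of the image, whereas the sensitivity of the human visual system to noise depends on local luminance, contrast, and structure. L2 is generally appropriate under white Gaussian noise.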
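To make the "not robust to outliers" remark concrete, here is a minimal sketch comparing L1 and L2 errors on toy data. The two error patterns are made-up illustrations, not data from the paper: both have the same total absolute error, but one concentrates it in a single pixel.

```python
def mse(a, b):
    """Mean squared error (L2 loss) between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def mae(a, b):
    """Mean absolute error (L1 loss) between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

target = [0.0] * 10
small_errors = [0.1] * 10        # a small error on every pixel
one_outlier = [0.0] * 9 + [1.0]  # one large error, same total absolute error

print("L1:", mae(small_errors, target), mae(one_outlier, target))
print("L2:", mse(small_errors, target), mse(one_outlier, target))
```

L1 scores the two patterns the same, while L2 penalizes the single outlier about ten times more, so an MSE-trained network is pulled strongly toward suppressing isolated large errors.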
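Earlier architectures simply stack residual blocks to build deep networks. Although this improves quality, too many layers make the model hard to train and bring more parameters and higher computational complexity. As a result, long-range dependencies are difficult to learn. (I am not sure what "long-range dependencies" means here.) Zhang et al. [7] proposed the self-attention GAN.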

[7] Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. “Self-Attention Generative Adversarial Networks,” arXiv preprint arXiv:1805.08318, 2018.

The paper uses self-attention to capture long-range dependencies, and the Inception module [6] so that fewer layers are needed.

network


Projection stage

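Partition information: CUmap and TUmap, both normalized to 0 and 1.

The partition information and the input are each processed by a 3x3 convolution and then concatenated into a 64x64x128 tensor. A 1x1 convolution then produces 64x64x64 fused features; the 1x1 convolution can learn to reduce the number of feature maps while retaining the information. In addition, because of the relationship between the feature maps of the CU map and those of the input, the 1x1 convolution after concatenation can emphasize CU and TU boundaries.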
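The fusion step can be sketched in NumPy to check the tensor shapes. This is only an illustration under assumptions: the two 3x3 convolution outputs are replaced by random stand-in arrays, each branch is assumed to contribute 64 channels (so that the concatenation has 128), and the weights are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """1x1 convolution: the same linear map over channels at every pixel.

    x: (H, W, C_in), w: (C_in, C_out) -> (H, W, C_out)
    """
    return x @ w

H = W = 64
feat_input = rng.standard_normal((H, W, 64))  # stand-in for 3x3-conv features of the input
feat_part = rng.standard_normal((H, W, 64))   # stand-in for 3x3-conv features of CUmap/TUmap

concat = np.concatenate([feat_input, feat_part], axis=-1)  # 64x64x128
w_fuse = rng.standard_normal((128, 64)) * 0.1              # hypothetical 1x1 weights
fused = conv1x1(concat, w_fuse)                            # 64x64x64

print(concat.shape, fused.shape)
```

Because each output channel mixes image channels and partition channels at the same pixel, the 1x1 convolution is where the network can learn to weight features that coincide with CU/TU boundaries.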

Deblocking stage

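Given the effectiveness of VRCNN, a variable-size filter design is adopted. The Inception block is modified, and PReLU is used.

Two Inception blocks are used. The output of each block is fed both to the next stage and to a 1x1 convolution, which reconstructs an image from that layer's feature maps.

Each layer's reconstructed output uses MSE as its loss, and the final loss is a combination of the per-layer losses. This structure is effectively like training the front part of the network first, then training a small network on its output, and so on.

(PS: I once had a similar idea of training networks separately, with the output of the first network serving as the training set for the second. This paper combines that separate training into one joint process.)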
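The combination of per-layer losses can be sketched as a weighted sum of MSE terms, one per intermediate reconstruction. The weights and the toy values below are hypothetical; these notes do not record how the paper weights the terms.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def combined_loss(stage_outputs, target, weights):
    """Weighted sum of the MSE of every intermediate reconstruction."""
    return sum(w * mse(out, target) for w, out in zip(weights, stage_outputs))

target = [1.0, 1.0]
stage_outputs = [
    [0.5, 0.5],  # reconstruction after the first Inception block (toy values)
    [0.9, 1.0],  # reconstruction after the second Inception block (toy values)
]
weights = [0.5, 1.0]  # hypothetical weights, not taken from the paper

total = combined_loss(stage_outputs, target, weights)
print(total)
```

Since every term is computed against the same target and summed into one scalar, a single backward pass supervises all blocks at once, which is the joint version of training the parts one after another.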

Refinement stage

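A self-attention block and three Inception blocks are used to further reduce artifacts. The self-attention module can model the correlation between any two positions of the feature map, regardless of their spatial distance.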
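A minimal NumPy sketch of a SAGAN-style self-attention block [7] shows why spatial distance does not matter: every position attends to every other position through an N x N attention map. The feature sizes, the C/8 query/key width, and the random weights are assumptions for illustration, and the block in the paper may differ in detail.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v, gamma=1.0):
    """x: (N, C) features, one row per spatial position (N = H * W)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T                                     # (N, N): every pair of positions
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)        # each row sums to 1
    out = gamma * (attn @ v) + x                         # residual connection
    return out, attn

rng = np.random.default_rng(0)
H, W, C = 8, 8, 16                              # toy sizes, not the paper's
x = rng.standard_normal((H * W, C))
w_q = rng.standard_normal((C, C // 8)) * 0.1    # hypothetical projection weights
w_k = rng.standard_normal((C, C // 8)) * 0.1
w_v = rng.standard_normal((C, C)) * 0.1

out, attn = self_attention(x, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Row i of `attn` holds the weights that position i assigns to all positions of the patch, so a single block already has a receptive field covering the whole input.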

Loss function

Lprt measures the loss due to blocking artifacts, i.e., the loss at CU and TU boundaries.

deblocking stage loss function：

The Sobel loss computes an L1 loss on the edge regions of the image, to account for ringing and blurring artifacts.

G : Sobel filter
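Since the formula image is missing from these notes, the following is only one plausible reading of a Sobel loss: the L1 distance between the Sobel responses of the prediction and of the target. The function names and the toy images are illustrative, and the exact form used in the paper may differ.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(img, kernel):
    """Valid (no padding) cross-correlation of a 2-D image with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def sobel_l1_loss(pred, target):
    """Mean absolute difference between the Sobel responses G(pred) and G(target)."""
    return sum(
        np.abs(filter3x3(pred, k) - filter3x3(target, k)).mean()
        for k in (SOBEL_X, SOBEL_Y)
    )

target = np.zeros((8, 8))
target[:, 4:] = 1.0        # sharp vertical edge
blurred = target.copy()
blurred[:, 3] = 0.33       # the same edge, smeared over two more columns
blurred[:, 4] = 0.67

print(sobel_l1_loss(target, target), sobel_l1_loss(blurred, target))
```

A blurred edge is penalized, while a constant brightness offset is not, because the Sobel kernels sum to zero. That is why such a term complements MSE instead of replacing it.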


advantages

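Three stages: artifacts are reduced step by step, and the function of each stage is clearer.
Inception block: searching representations of variable-size blocks of quantization errors by variable filter sizes.
Self-attention block: captures long-range dependencies. The self-attention module produces feature maps with a global view in relatively shallow layers. This has two benefits: sufficiently global information is obtained with fewer layers, and the self-attention module introduces multi-scale information into the network.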

experiment

Based on HM16.0, AI (all-intra) mode.

Training data: CPIH-Intra database.

Training patches are 64x64, with the Adam optimizer and a batch size of 16.


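The weaker results are attributed to a different training set; the model is data dependent.

Results on Class C and Class D are better because, at low resolution, one CTU contains more object information, so the long-range dependencies captured by the self-attention mechanism (a large receptive field?) are more meaningful.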

conclude

The paper proposes a shallow network that removes noise stage by stage using a combination of multiple losses.
