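Summary of CVPR 2020 Super-Resolution Papers (Part 1)
1. Investigating Loss Functions for Extreme Super-Resolution
Runner-up solution of the NTIRE 2020 extreme super-resolution challenge, from CIPLab. It ranked second on LPIPS and first on PI; because the official evaluation metric was LPIPS, it finished as runner-up.
Previous super-resolution methods mostly target ×4 upscaling; only a small number of works address ×16.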
problem
The general approach for perceptual ×4 SR is to use a GAN with a VGG-based perceptual loss. However, we found that it creates inconsistent details for perceptual ×16 SR.
solution
Use GAN with LPIPS [23] loss for perceptual extreme SR.
Also use a U-Net-structured discriminator [14] to consider both the global and local context of an input image.
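To make the "global and local context" idea concrete, here is a minimal sketch of how a U-Net-style discriminator loss could be assembled. It assumes the discriminator returns one image-level logit and one per-pixel logit map; the function names and the equal weighting of the two terms are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def bce_with_logits(logits, target):
    """Numerically stable binary cross-entropy on raw logits, averaged."""
    logits = np.asarray(logits, dtype=np.float64)
    # Stable form: max(x, 0) - x * t + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    return float(loss.mean())

def unet_discriminator_loss(global_logit, pixel_logits, is_real):
    """Sum of an image-level (global) term and a per-pixel (local) term.

    global_logit: scalar logit (assumed to come from the encoder head).
    pixel_logits: (H, W) logit map (assumed to come from the decoder head).
    """
    target = 1.0 if is_real else 0.0
    return bce_with_logits(global_logit, target) + bce_with_logits(pixel_logits, target)
```

The per-pixel term is what lets the discriminator give spatially localized feedback, while the global term judges the image as a whole.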
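2. Perceptual Extreme Super Resolution Network with Receptive Field Block
RFB-ESRGAN: the winning solution of NTIRE 2020, a super-resolution network with receptive field blocks. Like the previous paper, it also targets ×16 super-resolution.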
problem
Perceptual Extreme Super-Resolution for single image is extremely difficult, because the texture details of different images vary greatly.
NTIRE 2020 Perceptual Extreme Super-Resolution Challenge:
Difficulties:
- Develop a model that can effectively recover the finer details and textures of a low-resolution image, and make the results both photo-realistic and of high perceptual quality.
- Minimize time complexity as much as possible while keeping satisfactory results.
solution
Use multi-scale Receptive Fields Block (RFB) in the generative network to restore the finer details and textures of the super-resolution image.
- RFB can extract different scale features from previous feature map, which means it can extract the coarse and fine features from input LR images.
- To reduce time complexity and still maintain satisfactory performance, RFB uses several small kernels instead of large kernels, and we alternately use different upsampling methods in the upsampling stage of the generative network.
- In the testing phase, we use model fusion to improve the robustness and stability of the model to different test images.
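The small-kernel point in the list above can be checked with simple arithmetic. The sketch below is a generic calculation for stride-1, undilated convolutions, not the exact RFB layout (which is not given in these notes); the helper names are made up for illustration.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1, undilated convolutions applied in sequence."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

def conv_weights(kernel_size, channels):
    """Weight count of one square conv layer with equal in/out channels (no bias)."""
    return kernel_size * kernel_size * channels * channels

# Two stacked 3x3 layers see the same 5x5 window as a single 5x5 layer ...
print(stacked_receptive_field([3, 3]))
# ... but need fewer weights (18 * C^2 versus 25 * C^2).
print(2 * conv_weights(3, 64), conv_weights(5, 64))
```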
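The authors argue that the main strength of RFB is that it can extract very fine features, which is very important for image reconstruction. The Receptive Fields Block (RFB) was introduced in reference [21].
Upsampling phase: Nearest Neighborhood Interpolation (NNI) and Sub-pixel Convolution (SPC) [23] are used alternately. Using them alternately improves the information communication between space and depth.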
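A rough NumPy sketch of the two upsampling operators and of alternating them over four ×2 stages to reach ×16. The stage order (NNI first) is an assumption, and the learned convolutions that would normally expand the channels before each sub-pixel step are replaced by a plain channel repeat, so this only illustrates the data movement, not the trained network.

```python
import numpy as np

def nearest_upsample(x, r=2):
    """NNI: (C, H, W) -> (C, H*r, W*r) by repeating each pixel."""
    return x.repeat(r, axis=1).repeat(r, axis=2)

def pixel_shuffle(x, r=2):
    """SPC rearrangement: (C*r*r, H, W) -> (C, H*r, W*r), moving depth into space."""
    c, h, w = x.shape
    c_out = c // (r * r)
    x = x.reshape(c_out, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c_out, h, r, w, r)
    return x.reshape(c_out, h * r, w * r)

def upsample_x16(x):
    """Four x2 stages, alternating NNI and SPC (assumed order)."""
    for stage in range(4):
        if stage % 2 == 0:
            x = nearest_upsample(x)
        else:
            # Placeholder for a learned conv that expands channels by 4.
            x = pixel_shuffle(np.repeat(x, 4, axis=0))
    return x

lr = np.random.rand(3, 8, 8)
print(upsample_x16(lr).shape)
```

NNI only duplicates values in space, while SPC converts channel depth into spatial resolution, which is one way to read the "communication between space and depth" remark.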
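3. Residual Feature Aggregation Network for Image Super-Resolution
RFANet: achieves better super-resolution results than RCAN by optimizing the network structure. The Discussions section of the paper distinguishes it from MemNet and RDN; the main differences are:
1. Residual Feature Aggregation
2. An added Enhanced Spatial Attention block
The contribution feels somewhat thin, but the paper is useful as a reference.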
problem
As the network depth grows, the residual features gradually focus on different aspects of the input image, which is very useful for reconstructing the spatial details. However, existing methods neglect to fully utilize the hierarchical features on the residual branches.
solution
Propose a residual feature aggregation (RFA) framework, which aggregates the local residual features for more powerful feature representation.
To maximize the power of the RFA framework, we further propose an enhanced spatial attention (ESA) block to make the residual features more focused on critical spatial contents.
ESA is more effective than the plain one in reference [10].
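A simplified sketch of the aggregation idea: collect the residual-branch output of each block, concatenate them along the channel axis, fuse with a 1×1 convolution, and add the result back to the module input. The branch functions and fusion weights are hypothetical placeholders, the ESA block is omitted, and treating every block identically is a simplifying assumption rather than the paper's exact wiring.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: mix channels at every pixel.

    x: (C_in, H, W), weight: (C_out, C_in).
    """
    return np.einsum("oc,chw->ohw", weight, x)

def rfa_module(x, residual_branches, fuse_weight):
    """Aggregate the residual features of several blocks.

    residual_branches: list of callables mapping (C, H, W) -> (C, H, W).
    fuse_weight: (C, C * len(residual_branches)) weights of the fusing 1x1 conv.
    """
    feats = []
    h = x
    for branch in residual_branches:
        r = branch(h)    # residual feature of this block
        feats.append(r)  # kept for aggregation instead of being discarded
        h = h + r        # usual skip connection feeding the next block
    fused = conv1x1(np.concatenate(feats, axis=0), fuse_weight)
    return x + fused

x = np.ones((2, 4, 4))
branches = [lambda h: 0.1 * h] * 3
out = rfa_module(x, branches, np.full((2, 6), 0.5))
print(out.shape)
```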
4. Real-World Super-Resolution via Kernel Estimation and Noise Injection
Winning solution on both tracks of the NTIRE 2020 Challenge on Real-World Super-Resolution (NTIRE2020-RWSR).
problem
Existing methods always fail in real-world image super-resolution, since most of them adopt simple bicubic downsampling from high-quality images to construct Low-Resolution (LR) and High-Resolution (HR) pairs for training which may lose track of frequency-related details.
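Previous methods obtain LR images by bicubic downsampling, which is unreasonable: the resulting LR images and the original HR images lie in different domains. Because of this domain gap, super-resolving real-world images gives poor results.
Therefore, the core problem of real-world super-resolution is to introduce an accurate degradation model.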
solution
Focus on designing a novel degradation framework for real-world images by estimating various blur kernels as well as real noise distributions. Based on our novel degradation framework, we can acquire LR images sharing a common domain with real-world images. Then, we propose a real-world super-resolution model aiming at better perception.
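As a rough illustration of such a degradation pipeline, the sketch below blurs an HR image with a kernel, subsamples it, and adds a noise patch. Here the kernel and the noise patch are simply passed in by the caller as stand-ins for the estimated blur kernels and collected real noise; the function names and the single-channel setting are assumptions made for brevity.

```python
import numpy as np

def blur(img, kernel):
    """Same-size 2D filtering of a grayscale image with edge padding.

    Implemented as correlation (no kernel flip), which is equivalent to
    convolution for symmetric kernels. Assumes an odd-sized kernel.
    """
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def degrade(hr, kernel, noise_patch, scale=4):
    """LR = (HR blurred by kernel), subsampled by `scale`, plus injected noise."""
    lr = blur(hr, kernel)[::scale, ::scale]
    # noise_patch must be at least as large as the LR image.
    return lr + noise_patch[:lr.shape[0], :lr.shape[1]]

hr = np.random.rand(32, 32)
kernel = np.full((3, 3), 1.0 / 9.0)    # placeholder for an estimated kernel
noise = 0.01 * np.random.randn(8, 8)   # placeholder for a real noise patch
print(degrade(hr, kernel, noise).shape)
```

Training pairs built this way are meant to place the LR inputs closer to the real-world image domain than plain bicubic downsampling does.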
discriminator:
Observing that ESRGAN introduces many artifacts, the authors use a patch discriminator [17,50] in place of the VGG-128 discriminator in ESRGAN.
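5. PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
Self-supervised; GAN-based; upscales by 64× (the highest factor so far). The LR versions of generated HR images are compared with the original LR image to find the closest match, which then identifies the corresponding HR image.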
problem
In previous approaches, which have generally been supervised, the training objective typically measures a pixel-wise average distance between the super-resolved (SR) and HR images. Optimizing such metrics often leads to blurring, especially in high variance (detailed) regions.
solution
Present a novel super-resolution algorithm addressing this problem, PULSE (Photo Upsampling via Latent Space Exploration), which generates high-resolution, realistic images at resolutions previously unseen in the literature.
Instead of starting with the LR image and slowly adding detail, PULSE traverses the high-resolution natural image manifold, searching for images that downscale to the original LR image. This is formalized through the “down-scaling loss,” which guides exploration through the latent space of a generative model.
By leveraging properties of high-dimensional Gaussians, we restrict the search space to guarantee that our outputs are realistic.
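The search described above can be sketched on a toy problem. In this sketch the generator is a hypothetical placeholder that just reshapes the latent vector into an image (the paper uses a pretrained generative model), downscaling is average pooling, and the search is plain projected gradient descent that rescales the latent back to norm sqrt(d) after every step, since a d-dimensional standard Gaussian concentrates near that sphere. All names and hyperparameters are assumptions.

```python
import numpy as np

def downscale(img, factor):
    """Average-pool an (H, W) image by an integer factor."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def downscaling_loss(hr, lr, factor):
    """Mean squared difference between the downscaled HR candidate and the LR input."""
    return float(np.mean((downscale(hr, factor) - lr) ** 2))

def project_to_sphere(z):
    """Rescale z to norm sqrt(d), restricting the search to the sphere."""
    return z * np.sqrt(z.size) / np.linalg.norm(z)

def toy_generator(z):
    """Placeholder generator: reshape the latent into a square image."""
    side = int(np.sqrt(z.size))
    return z.reshape(side, side)

def search_latent(lr, z0, factor, steps=100, step_size=8.0):
    """Projected gradient descent on the downscaling loss."""
    z = project_to_sphere(z0)
    for _ in range(steps):
        residual = downscale(toy_generator(z), factor) - lr
        # Gradient of the loss w.r.t. each HR pixel; valid because the toy
        # generator is the identity on pixel values.
        grad = np.kron(residual, np.ones((factor, factor)))
        grad *= 2.0 / (residual.size * factor ** 2)
        z = project_to_sphere(z - step_size * grad.ravel())
    return z

lr = np.ones((4, 4))
z0 = np.concatenate([np.ones(32), np.zeros(32)])
z = search_latent(lr, z0, factor=2)
print(downscaling_loss(toy_generator(z), lr, 2))
```

With a real generator the gradient would come from backpropagation through the network; only the loss and the sphere constraint carry over from this toy version.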
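The paper's comments on other methods:
SRGAN: a solution based on MSE and GAN; the loss is MSE-based with additional loss terms added. In face super-resolution, FSRGAN can be grouped with it. It is trained in a fully supervised manner and is not an unsupervised generative model.
StyleGAN: provides a rich latent space that can express different features, especially facial features.
For the downscaling loss, the paper refers to [2] and [25].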