《A Multilayer Fusion Light-Head Detector for SAR Ship Detection》

Published in Sensors on 5 March 2019. Inspired by Li, Z.; Peng, C.; Yu, G.; Zhang, X.; Deng, Y.; Sun, J. Light-Head R-CNN: In Defense of Two-Stage Object Detector. arXiv 2017, arXiv:1711.07264, the paper proposes a multilayer fusion light-head detector for multiscale target detection. It is a two-stage detector with three parts: a backbone network (ResNet-101), a region proposal subnetwork (fuses features to detect multiscale SAR ships), and a light-head detection subnetwork (adopts the light-head design with large-kernel separable convolution and a position-sensitive pooling layer to improve detection speed).
Abstract:

Recently, with the excellent ability of feature representation, deep neural networks such as faster region based convolution neural network (FRCN) have shown great performance in object detection tasks.
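Drawbacks of FRCN: its fixed receptive field struggles with multiscale ship targets, and performance drops when the targets are small; as a two-stage detector it is computationally heavy and slow; against complex backgrounds, the imbalance between easy and hard samples leads to a high false alarm rate.
The paper designs a multilayer fusion light-head detector (MFLHD). Instead of using a single feature map, it combines shallow high-resolution features with deep semantic features to generate region proposals, i.e. feature fusion.
It uses large-kernel separable convolution and position-sensitive pooling to improve detection speed.
It adopts focal loss to reduce false alarms.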

A few sentences from the paper are worth learning from:
Due to the different characteristics of aerial view [22], the variable size of objects, and complex background scenes, directly applying deep learning detection methods cannot exhibit good performance in SAR ship detection.

The depth of CNNs is very important to improve the performance of feature representation. However, with increasing depth, the network is more difficult to train for the reason of parameters explosion and gradient vanishing.

The overall network is as follows: [Figure: overall architecture of the MFLHD network]

2.1. Backbone Network

This part describes the structure of ResNet-101, its advantages, and why it was chosen as the base network.

2.2. RPN Subnetwork

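This is the first stage of the two-stage network. The section briefly reviews how candidate region proposal generation has developed, and explains the principle and advantages of the RPN.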

Multilayer Fusion

The section first states a requirement for the detector: "A good detector should be able to detect objects with a large range of scale."
Then, inspired by references
27. Cai, Z.; Fan, Q.; Feris, R.S.; Vasconcelos, N. A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2016. [CrossRef]
28. Cui, L. MDSSD: Multi-scale Deconvolutional Single Shot Detector for Small Objects. arXiv 2018, arXiv:1805.07009.
the authors combine high-level and low-level layers to obtain a fusion layer. The structure is as follows:
[Figure: structure of the multilayer fusion]
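To make the idea of combining a shallow high-resolution layer with a deep semantic layer concrete, here is a minimal sketch. It is not the paper's implementation: these notes do not record how the paper resizes or merges the layers, so nearest-neighbour upsampling and element-wise addition are assumptions, the maps are single-channel for brevity, and the names `upsample_nearest` and `fuse` are hypothetical.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map stored as a list of rows."""
    out = []
    for row in fmap:
        # Repeat every value `factor` times horizontally...
        wide = [v for v in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times vertically.
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(shallow, deep):
    """Fuse a high-resolution shallow map with a low-resolution deep map.

    Assumption: the deep map is upsampled to the shallow map's size and the
    two are added element-wise.
    """
    factor = len(shallow) // len(deep)
    if len(deep) * factor != len(shallow) or len(deep[0]) * factor != len(shallow[0]):
        raise ValueError("shallow size must be an integer multiple of deep size")
    up = upsample_nearest(deep, factor)
    return [[s + d for s, d in zip(srow, drow)] for srow, drow in zip(shallow, up)]


# Toy example: a 4x4 shallow map and a 2x2 deep map.
shallow = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
deep = [[10, 20],
        [30, 40]]
fused = fuse(shallow, deep)
for row in fused:
    print(row)
```

The fused map keeps the spatial resolution of the shallow layer, which is what matters for small ships, while every position also carries the response of the deep layer that covers it.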

Region Proposal Network

This appears to be the same as in Faster R-CNN.

Loss Function

The commonly used loss, with no changes.

2.3. Detection Subnetwork

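This is the second stage of the two-stage network.
The section first introduces FRCN and R-FCN. It points out the drawback of the fully connected layers in FRCN: the large number of parameters increases the computation. R-FCN uses a 1×1 convolution to increase the number of channels and applies a position-sensitive RoI pooling (PSRoI pooling) layer to pool each RoI. R-FCN is itself an improved variant of Faster R-CNN.
After introducing FRCN and R-FCN, the paper presents its own improvement.
It first explains why the improvement targets this part:
Generally speaking, there are several approaches to simplify the model complexity such as reducing the number of channels and reducing the number of layers. In the proposed method, we take advantage of the above two methods.
It then describes the improvement itself: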

Firstly, we replace plain convolution with a large-kernel separable convolution to produce a "thin" feature map. The number of channels, different from the RFCN subnetwork, depending on the number of classes, is a small fixed value.
Then, we pool along each RoI and average vote the final prediction.
Finally, a cheap single fully connected layer is attached to the pooling layer, which exploits the feature representation for classification and regression.
There is no diagram of the overall structure after the improvement, so it is not entirely clear how this is done in practice.
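The remark about the "thin" feature map can be checked with a little arithmetic. The sketch below compares the channel count of R-FCN's position-sensitive score maps, which grows with the number of classes, against a fixed-width map. The values `p = 7` and `alpha = 10` are the ones used in Light-Head R-CNN and are assumptions here, since these notes do not record the numbers used in this paper; the function names are hypothetical.

```python
def rfcn_score_map_channels(num_classes, p):
    """R-FCN: one p x p group of score maps per class, plus background."""
    return p * p * (num_classes + 1)


def light_head_channels(alpha, p):
    """Light-head design: a small fixed width, independent of class count."""
    return alpha * p * p


p = 7        # pooling grid size (assumed, as in R-FCN / Light-Head R-CNN)
alpha = 10   # fixed width factor (assumed, as in Light-Head R-CNN)

for num_classes in (20, 80):
    print(num_classes, "classes:",
          "R-FCN", rfcn_score_map_channels(num_classes, p),
          "vs light-head", light_head_channels(alpha, p))
```

Because the light-head map no longer has one channel group per class, it cannot be voted directly into class scores, which is why a single fully connected layer is attached after the pooling.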

2.3.1. Large-Kernel Separable Convolution

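The paper says the separable convolution is attached to the fusion layer, but what exactly does this module contain?
The point of the large-kernel separable convolution: "this operation can keep the receptive field and save the computational budget as n grows."
The input here is the feature map.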
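The quoted claim about saving computation as n grows can be illustrated by counting weights. The sketch below compares a plain n×n convolution with a large separable convolution built from an n×1 convolution followed by a 1×n convolution. The two-branch layout and the values n = 15, 64 middle channels and 490 output channels are taken from Light-Head R-CNN and are assumptions here; the input width of 2048 is likewise only illustrative, because these notes do not record the channel count of the fusion layer. Biases are ignored.

```python
def plain_conv_weights(n, c_in, c_out):
    """Weights of a plain n x n convolution."""
    return n * n * c_in * c_out


def separable_branch_weights(n, c_in, c_mid, c_out):
    """Weights of an n x 1 conv (c_in -> c_mid) followed by a 1 x n conv (c_mid -> c_out)."""
    return n * c_in * c_mid + n * c_mid * c_out


def large_separable_weights(n, c_in, c_mid, c_out):
    """Two parallel branches (n x 1 then 1 x n, and 1 x n then n x 1), summed.

    Assumption: the two-branch layout of Light-Head R-CNN.
    """
    return 2 * separable_branch_weights(n, c_in, c_mid, c_out)


c_in, c_mid, c_out = 2048, 64, 490   # illustrative channel counts (assumed)
for n in (3, 7, 15):
    plain = plain_conv_weights(n, c_in, c_out)
    separable = large_separable_weights(n, c_in, c_mid, c_out)
    print(f"n={n:2d}  plain={plain:>11,d}  separable={separable:>9,d}  "
          f"ratio={plain / separable:.1f}")
```

Each branch still covers an n×n window of the input, so the receptive field is kept, while the weight count grows linearly in n instead of quadratically.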

2.3.2. Position-Sensitive RoI-Pooling

This section explains why it is used: "Position-sensitive score maps were proposed to address a dilemma between translation-invariance in the classification stage and translation-variance in the detection stage."
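As a study aid, here is a minimal sketch of position-sensitive RoI pooling in the R-FCN style, followed by the average vote. It is a simplification under stated assumptions: the RoI is given in integer feature-map coordinates with exclusive upper bounds, the channels are ordered as `c * k * k + i * k + j`, and the names `psroi_pool` and `vote` are hypothetical.

```python
import math


def psroi_pool(score_maps, roi, k, num_outputs):
    """Position-sensitive RoI pooling.

    score_maps: list of k*k*num_outputs channels, each a 2-D list (H x W).
    roi: (x0, y0, x1, y1) in feature-map cells, upper bounds exclusive.
    Bin (i, j) of output c is average-pooled only from channel c*k*k + i*k + j.
    """
    x0, y0, x1, y1 = roi
    bin_h = (y1 - y0) / k
    bin_w = (x1 - x0) / k
    pooled = []
    for c in range(num_outputs):
        grid = []
        for i in range(k):
            ys = range(y0 + math.floor(i * bin_h), y0 + math.ceil((i + 1) * bin_h))
            row = []
            for j in range(k):
                xs = range(x0 + math.floor(j * bin_w), x0 + math.ceil((j + 1) * bin_w))
                channel = score_maps[c * k * k + i * k + j]
                values = [channel[y][x] for y in ys for x in xs]
                row.append(sum(values) / len(values))
            grid.append(row)
        pooled.append(grid)
    return pooled


def vote(pooled):
    """Average the k x k bins of each output into a single score."""
    return [sum(sum(row) for row in grid) / (len(grid) * len(grid[0]))
            for grid in pooled]


# Toy example: k = 2, one output, four 4x4 channels filled with 1, 2, 3, 4.
k = 2
score_maps = [[[float(m + 1)] * 4 for _ in range(4)] for m in range(k * k)]
pooled = psroi_pool(score_maps, roi=(1, 1, 3, 3), k=k, num_outputs=1)
scores = vote(pooled)
print(pooled[0], scores)
```

Each bin reads from its own dedicated channel, so a channel only responds to one relative position inside the object (top-left, top-right, and so on). That is how the score maps encode position while the convolutional features remain shared across the whole image.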

A note for later study: the underlying theory of large-kernel separable convolution and position-sensitive pooling.

Experiments and Results

3.1. Experimental Dataset and Settings

3.1.1 Evaluation Indicators

3.1.2 Evaluation Indicators

Ablation Study

These experiments mainly verify that the choice of base network and parameters is reasonable.

3.2.1. The Influence of Backbone Network

3.2.2. The Influence of Multilayer Fusion

Layer selection has a great impact on the performance of the detection system.
The experiments show which layers should be fused to obtain the best results.

3.2.3. The Influence of Parameter γ in Focal Loss
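To see what γ does, here is a minimal sketch of the binary focal loss in its standard form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The defaults γ = 2 and α = 0.25 are the commonly cited ones from the original focal loss paper and are assumptions here; the values tested in this paper are not recorded in these notes.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample.

    p: predicted probability of the positive class (0 < p < 1).
    y: ground-truth label, 1 for ship and 0 for background.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Compare an easy positive (p = 0.95) with a hard positive (p = 0.30).
for gamma in (0.0, 1.0, 2.0, 5.0):
    easy = focal_loss(0.95, 1, gamma=gamma)
    hard = focal_loss(0.30, 1, gamma=gamma)
    print(f"gamma={gamma}: easy={easy:.6f} hard={hard:.6f} hard/easy={hard / easy:.1f}")
```

With γ = 0 the expression reduces to α-weighted cross-entropy. As γ increases, the loss of well-classified samples shrinks much faster than that of hard ones, so the many easy background samples contribute less to the gradient.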

3.3. Comparison with Other Methods

In general, the proposed method greatly improves the detection accuracy without losing too much detection speed.

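The strengths of this paper are clear. Compared with CFAR, it handles inshore ships better. Compared with one-stage detectors, it reaches higher accuracy with little loss in speed. Compared with the unmodified two-stage FRCN, it is noticeably faster.
What are its weaknesses or limitations?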