Faster R-CNN Reading Notes


Proposals are the test-time computational bottleneck in state-of-the-art detection systems.
Fast R-CNN runs in nearly real time at test time, so proposal extraction has become the time bottleneck of the detection system.


Computing proposals with a deep convolutional neural network leads to an elegant and effective solution where proposal computation is nearly cost-free given the detection network's computation. We introduce novel Region Proposal Networks (RPNs) that share convolutional layers with state-of-the-art object detection networks.

Faster R-CNN Architecture

The entire system is a single, unified network for object detection.
A Region Proposal Network (RPN) takes an image (of any size) as input and outputs a set of rectangular object proposals, each with an objectness score.
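An important concept in the paper is the anchor, a reference box for proposals. Put simply, an image fed into the network passes through several convolutional layers to produce an n*n feature map; 9 anchors are defined at every point of that feature map, and scaling them back to the input image gives the positions of the reference boxes.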
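
The anchor layout described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `generate_anchors`, the stride of 16, the three scales (128, 256, 512) and the three aspect ratios (1:2, 1:1, 2:1) are assumptions taken from the paper's default settings, and centring each anchor on the middle of its feature-map cell is a simplification.

```python
import math

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) tuples in input-image coordinates.

    Assumed defaults: 3 scales x 3 ratios = 9 anchors per feature-map point.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Map the feature-map cell back to a centre point on the image.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # ratio is height / width; the area stays at scale ** 2.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors

# A 2 x 3 feature map yields 2 * 3 * 9 anchors.
anchors = generate_anchors(2, 3)
print(len(anchors))
```

Anchors near the border extend beyond the image; how to treat them (clip or ignore) is a separate choice that this sketch leaves out.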

  • Loss Function
    We assign a positive label to two kinds of anchors: (i) the anchor/anchors with the highest Intersection-over-Union (IoU) overlap with a ground-truth box, or (ii) an anchor that has an IoU overlap higher than 0.7 with any ground-truth box.
    We assign a negative label to a non-positive anchor if its IoU ratio is lower than 0.3 for all ground-truth boxes.
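    Positive samples: (1) anchors with the highest IoU with a ground-truth box; (2) anchors whose IoU with any ground-truth box exceeds 0.7.
    Negative samples: anchors whose IoU with every ground-truth box is below 0.3.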
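
As a rough illustration of the labelling rules above, the following sketch computes IoU for axis-aligned boxes and assigns 1 (positive), 0 (negative) or -1 (neither) to each anchor. The names `iou` and `assign_labels`, the (x1, y1, x2, y2) box format and the -1 "ignore" convention are assumptions made for this example; anchors that are neither positive nor negative are simply left out of training.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_labels(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 positive, 0 negative, -1 ignored."""
    overlaps = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = [-1] * len(anchors)
    for i, row in enumerate(overlaps):
        best = max(row, default=0.0)
        if best < neg_thresh:
            labels[i] = 0          # below 0.3 for all ground-truth boxes
        if best > pos_thresh:
            labels[i] = 1          # rule (ii): above 0.7 for some box
    # Rule (i): for each ground-truth box, the anchor(s) with the highest IoU.
    for j in range(len(gt_boxes)):
        col_max = max((overlaps[i][j] for i in range(len(anchors))),
                      default=0.0)
        if col_max > 0:
            for i in range(len(anchors)):
                if overlaps[i][j] == col_max:
                    labels[i] = 1
    return labels

gt = [(0, 0, 10, 10)]
print(assign_labels([(0, 0, 10, 5), (100, 100, 110, 110)], gt))
```

In the usage line, the first anchor only reaches an IoU of 0.5 but is still labelled positive by rule (i), which is why that rule exists: it guarantees each ground-truth box at least one positive anchor.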

  • Training RPNs
    It is possible to optimize for the loss functions of all anchors, but this will bias towards negative samples as they dominate. Instead, we randomly sample 256 anchors in an image to compute the loss function of a mini-batch, where the sampled positive and negative anchors have a ratio of up to 1:1. If there are fewer than 128 positive samples in an image, we pad the mini-batch with negative ones.
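
The sampling rule above might look like the sketch below. It assumes labels in the form 1 (positive), 0 (negative) and -1 (ignored); the function name `sample_minibatch` and its parameters are hypothetical, chosen only for this illustration.

```python
import random

def sample_minibatch(labels, batch_size=256, max_pos_fraction=0.5, rng=None):
    """Pick anchor indices for one image's RPN mini-batch.

    At most half of the batch is positive; if positives are scarce,
    the remaining slots are filled with negatives.
    """
    rng = rng or random.Random()
    pos = [i for i, label in enumerate(labels) if label == 1]
    neg = [i for i, label in enumerate(labels) if label == 0]
    num_pos = min(len(pos), int(batch_size * max_pos_fraction))
    num_neg = min(len(neg), batch_size - num_pos)
    return rng.sample(pos, num_pos), rng.sample(neg, num_neg)

# 10 positives, 1000 negatives, 50 ignored anchors.
labels = [1] * 10 + [0] * 1000 + [-1] * 50
pos_idx, neg_idx = sample_minibatch(labels, rng=random.Random(0))
print(len(pos_idx), len(neg_idx))
```

With only 10 positives available, the batch holds all 10 and is padded with 246 negatives, so the 1:1 ratio acts as an upper bound on positives rather than a fixed split.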

  • Sharing Features for RPN and Fast R-CNN