Deep Learning Study Notes (Part 1)
1. Object Detection
1.1 Two-stage detectors
1. Fast R-CNN: backbone + SS (Selective Search) + RoIPooling + non-global FCs
2. Fast R-CNN => Faster R-CNN:
(1) SS ···> RPN
(2) non-global FCs ···> global FCs
3. FCN + Faster R-CNN => R-FCN:
(1) RoIPooling ···> PSRoIPooling
(2) FCs ···> Conv
4. R-FCN => Light-Head R-CNN:
(1) k*k*(c+1) ···> k*k*10 (channels of the feature maps fed to PSRoIPooling)
(2) Conv ···> FCs
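To put the channel reduction in item 4 into numbers, here is a minimal sketch. It assumes k = 7 bins per side and c = 80 classes (a COCO-style setting); neither value appears in these notes, and the function names are made up for illustration.

```python
def rfcn_score_map_channels(k: int, num_classes: int) -> int:
    """Channels of the R-FCN position-sensitive score maps: k*k*(c+1)."""
    return k * k * (num_classes + 1)  # +1 for the background class


def light_head_channels(k: int, thin: int = 10) -> int:
    """Channels of the thin Light-Head feature maps: k*k*10."""
    return k * k * thin


k, c = 7, 80  # assumed values, not taken from the notes
print(rfcn_score_map_channels(k, c))  # 3969
print(light_head_channels(k))         # 490
```

With these assumed values the feature map in front of PSRoIPooling shrinks by roughly a factor of eight, and its size no longer depends on the number of classes.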
1.2 One-stage detectors (YOLO)
YOLOv1
GoogLeNet-style backbone + FCs => output: 7x7x[2x(4+1)+20] (7x7 grid, 2 boxes per cell, 20 classes)
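The YOLOv1 output size above can be checked with a few lines of arithmetic. This is only a sketch; the function and parameter names are illustrative.

```python
def yolo_v1_output_shape(grid: int = 7, boxes: int = 2, classes: int = 20):
    """Return the (S, S, B*(4+1)+C) shape of the YOLOv1 prediction tensor."""
    per_cell = boxes * (4 + 1) + classes  # 4 box coords + 1 confidence per box
    return (grid, grid, per_cell)


shape = yolo_v1_output_shape()
total = shape[0] * shape[1] * shape[2]
print(shape, total)  # (7, 7, 30) 1470
```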
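v1 ···> v2
(1) Anchors obtained by k-means clustering of the training boxes
(2) Box regression: + sigmoid (keeps the predicted box center inside its grid cell)
(3) Multi-scale training
(4) Passthrough layer
(5) Darknet-19
=> Output size: 13x13x[5x(4+1+20)]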
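Item (2) can be illustrated with the box decoding used in YOLOv2, where a sigmoid bounds the center offset and the anchor size is scaled exponentially. The sketch below works in grid-cell units; the function names and the sample anchor are assumptions for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, anchor):
    """Decode raw outputs (tx, ty, tw, th) into a box in grid-cell units.

    cell   = (cx, cy): top-left corner of the grid cell
    anchor = (pw, ph): anchor width and height (e.g. from k-means)
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = anchor
    bx = cx + sigmoid(tx)   # center stays inside the cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)  # anchor scaled by an exponential factor
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# Zero raw outputs give the cell center and the unmodified anchor.
print(decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=(3.0, 2.0)))
# (6.5, 6.5, 3.0, 2.0)
```

Because the sigmoid output lies between 0 and 1, each cell can only place box centers within itself, which is what makes training more stable than unbounded offsets.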
v2 ···> v3
(1) Darknet-53
(2) FPN: add ···> concat
=> Number of boxes: 3x(13x13+26x26+52x52)
=> Output size: number of boxes x (4+1+80)
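The two YOLOv3 counts above work out as follows. The grid sizes 13, 26 and 52 correspond to a 416x416 input at strides 32, 16 and 8; the function name is illustrative.

```python
def yolo_v3_counts(grids=(13, 26, 52), anchors_per_cell=3, classes=80):
    """Return (number of boxes, number of output values) for YOLOv3."""
    num_boxes = anchors_per_cell * sum(g * g for g in grids)
    per_box = 4 + 1 + classes  # box coords + objectness + class scores
    return num_boxes, num_boxes * per_box


print(yolo_v3_counts())  # (10647, 904995)
```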
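2. Text Detection
2.1 CTPN
backbone + BiLSTM + FCs => output
N*C*H*W ···> N*9C*H*W (3x3 sliding window) ···> (NH)*W*9C (reshape) ···> N*256*H*W (BiLSTM, reshaped back)
(1) The output has three parts:
bbox: 2*k (only y and h are predicted; w is fixed at 16)
score: 2*k
side-refinement: k (horizontal offset)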
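The CTPN shape chain and output sizes above can be traced with a small helper that tracks sizes only. The values used here are assumptions for illustration: C = 512 (a VGG16-style backbone), 128 hidden units per LSTM direction (which gives the 256 channels), k = 10 anchors, and an arbitrary 38x50 feature map.

```python
def ctpn_shapes(n, c, h, w, hidden=128, k=10):
    """Trace tensor shapes through the CTPN head (sizes only, no weights)."""
    shapes = {
        "backbone": (n, c, h, w),
        # every position gathers its 3x3 neighbourhood: C -> 9C channels
        "sliding_window": (n, 9 * c, h, w),
        # each feature-map row becomes one sequence of length W
        "lstm_input": (n * h, w, 9 * c),
        # a bidirectional LSTM concatenates both directions: 2 * hidden
        "lstm_output": (n, 2 * hidden, h, w),
    }
    heads = {"bbox": 2 * k, "score": 2 * k, "side_refinement": k}
    return shapes, heads


shapes, heads = ctpn_shapes(1, 512, 38, 50)
print(shapes["lstm_input"])   # (38, 50, 4608)
print(shapes["lstm_output"])  # (1, 256, 38, 50)
print(heads)                  # {'bbox': 20, 'score': 20, 'side_refinement': 10}
```

Treating each row as its own sequence is what lets the BiLSTM pass context horizontally along a text line.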
2.2 EAST
FPN (PVANet) + FCN
(1) The output has two parts:
score map
bbox (4) + angle, or bbox (8)
(2) Two prediction formats: rotated rectangle (RBOX) or quadrangle (QUAD)
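For the rotated-rectangle format, each pixel predicts four distances to the box edges plus one angle. The sketch below shows one way to turn such a prediction into corner points. The ordering of the distances, the rotation direction, and the function name are assumptions here; actual implementations differ in these conventions.

```python
import math


def rbox_to_corners(x, y, top, right, bottom, left, angle):
    """Return the four corners of a rotated box predicted at pixel (x, y).

    Assumed convention: the distances are measured in the box's own frame,
    and `angle` (radians) rotates that frame about the pixel.
    """
    local = [(-left, -top), (right, -top), (right, bottom), (-left, bottom)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (x + dx * cos_a - dy * sin_a, y + dx * sin_a + dy * cos_a)
        for dx, dy in local
    ]


# With angle 0 the result is an axis-aligned rectangle around the pixel.
print(rbox_to_corners(10, 20, top=2, right=6, bottom=3, left=4, angle=0.0))
# [(6.0, 18.0), (16.0, 18.0), (16.0, 23.0), (6.0, 23.0)]
```

The quadrangle format needs no such step, since its 8 values are already the coordinate offsets of the four corners.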
2.3 CRNN
backbone + BiLSTM + CTC
(1) Input height is fixed at 32, and the windows of pooling layers 3 and 4 change from 2x2 to 1x2
=> 32xW ···> 1xW/4
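The size reduction above can be followed step by step. This is a simplified sketch: it assumes the first two poolings halve both sides, the next two halve only the height, and a final unpadded 2x2 convolution collapses the remaining height from 2 to 1. It ignores padding, so the width in a real implementation may differ by a column or so.

```python
def crnn_feature_size(width: int, height: int = 32):
    """Approximate (height, width) of the CRNN feature map before the BiLSTM."""
    h, w = height, width
    for _ in range(2):   # pooling 1 and 2: 2x2 windows halve both sides
        h, w = h // 2, w // 2
    for _ in range(2):   # pooling 3 and 4: 1x2 windows halve the height only
        h = h // 2
    h -= 1               # assumed final 2x2 convolution without padding: 2 -> 1
    return h, w


print(crnn_feature_size(100))  # (1, 25)
```

Keeping the width at W/4 instead of W/16 leaves more time steps for the sequence model, which matters for narrow characters.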
3. Word Embedding
3.1 Frequency-based
(1) Count vector
(2) TF-IDF
(3) Co-occurrence matrix
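The three frequency-based representations can be sketched on a toy corpus. The corpus, the window size of 1, and the TF-IDF variant (term frequency times log(N / document frequency)) are assumptions for illustration; libraries often use smoothed variants.

```python
import math
from collections import Counter

docs = [
    "the cat sat".split(),
    "the dog sat".split(),
    "the cat ran".split(),
]
vocab = sorted({w for d in docs for w in d})  # ['cat', 'dog', 'ran', 'sat', 'the']

# (1) Count vectors: one row per document, one column per vocabulary word.
counts = [[Counter(d)[w] for w in vocab] for d in docs]

# (2) TF-IDF: term frequency times log(N / document frequency).
n_docs = len(docs)
df = {w: sum(w in d for d in docs) for w in vocab}
tfidf = [
    [row[i] / len(d) * math.log(n_docs / df[w]) for i, w in enumerate(vocab)]
    for row, d in zip(counts, docs)
]

# (3) Co-occurrence matrix: how often two words appear within `window` tokens.
window = 1
cooc = Counter()
for d in docs:
    for i, w in enumerate(d):
        for j in range(max(0, i - window), min(len(d), i + window + 1)):
            if j != i:
                cooc[(w, d[j])] += 1

print(counts[0])             # [1, 0, 0, 1, 1]
print(tfidf[0][4])           # 0.0, because "the" occurs in every document
print(cooc[("the", "cat")])  # 2
```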
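3.2 Prediction-based
(1) Skip-gram: predicts the surrounding context words from the center word
(2) CBOW: predicts the center word from the surrounding context words; the context word vectors are summed and divided by N (averaged)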
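The difference between the two models shows up in how training samples are built. The sketch below generates both kinds of samples from one sentence and computes the averaged CBOW input. The window size, the toy embeddings, and the function names are made up for illustration.

```python
def training_pairs(tokens, window=1):
    """Build skip-gram and CBOW training samples from one sentence."""
    skip_gram, cbow = [], []
    for i, center in enumerate(tokens):
        context = [
            tokens[j]
            for j in range(max(0, i - window), min(len(tokens), i + window + 1))
            if j != i
        ]
        skip_gram += [(center, c) for c in context]  # center -> each context word
        cbow.append((context, center))               # all context words -> center
    return skip_gram, cbow


def cbow_hidden(context, embeddings):
    """Average the context word vectors: sum them, then divide by N."""
    n = len(context)
    dim = len(next(iter(embeddings.values())))
    return [sum(embeddings[w][d] for w in context) / n for d in range(dim)]


# Toy 2-dimensional embeddings, made up for illustration.
emb = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "sat": [1.0, 1.0]}
sg, cb = training_pairs(["the", "cat", "sat"])
print(sg)  # [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
print(cbow_hidden(cb[1][0], emb))  # [1.0, 0.5]
```

Skip-gram produces one sample per (center, context) pair, while CBOW produces one sample per position, which is why CBOW tends to train faster on the same text.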