Precise detection of Chinese characters in historical documents with deep reinforcement learning

Published in Pattern Recognition, 2020

Here, we use deep reinforcement learning for precise character detection, fitting tight bounding boxes around the Chinese characters in historical documents. An agent is trained to learn a control policy that fine-tunes a bounding box step by step, formulated as a Markov Decision Process.
We introduce a novel fully convolutional network with position-sensitive Region-of-Interest (RoI) pooling (FCPN). The network accepts character patches of arbitrary size as input and fuses position information into the features of the actions. In addition, we propose a dense reward function (DRF) that rewards the agent according to the action taken and the current environment state, which improves the agent's decision-making ability.
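
To make the idea of a dense, per-step reward concrete, here is a minimal sketch. The exact form of the DRF is not given in this summary, so the `dense_reward` function below is only an assumption: it uses the change in IoU with the ground-truth box after each action, which is a common choice in box-refinement reinforcement learning. The function names and the `(x1, y1, x2, y2)` box format are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def dense_reward(prev_box, new_box, gt_box):
    """Assumed per-step reward (not the paper's DRF): change in IoU with the ground truth."""
    return iou(new_box, gt_box) - iou(prev_box, gt_box)
```

With this stand-in, an action that tightens the box toward the ground truth gets a positive reward at that step and an action that loosens it gets a negative one, so the agent receives feedback after every adjustment rather than only at the end of an episode.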

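4) We combine the advantages of Dueling DQN [11], Double DQN [12], and prioritized experience replay [13] to train the agent with a simple yet effective DQN variant. Our precise detection method outperforms state-of-the-art methods on both the TKH and MTH datasets, with a significant improvement under the IoU 0.8 criterion.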
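
As a reminder of what the Double DQN component contributes, the sketch below computes the standard Double DQN training target: the online network picks the next action and the target network evaluates it. This is the generic formulation, not code from the paper. The function name, the list-based Q-values, and the discount factor `gamma=0.9` are assumptions for illustration.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.9):
    """Standard Double DQN target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    gamma: discount factor (0.9 is an assumed value, not taken from the paper).
    """
    if done:
        # No bootstrapping once the episode has ended (e.g. after a stop action).
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation is what reduces the overestimation bias of plain DQN, where a single network does both.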

Method:
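As shown in the blue dashed part of the figure, the network takes the coarsely detected character region of size w × h as input. The backbone feature extractor consists of two residual blocks, each made up of three convolutional layers. Inspired by the Dueling network [11], two streams are attached to the end of the backbone output. They estimate the state value and the advantage of each action separately, using position-sensitive RoI pooling [21] to integrate the position information of the actions.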
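Figure: Detailed structure of the proposed fully convolutional network with position-sensitive RoI pooling. k, s, and p are the kernel, stride, and padding sizes; w, h, c, s, and g in the red rounded rectangles denote the pooling width, height, output channels, spatial size, and group size.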
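
The two streams described above produce a state value and per-action advantages, which still have to be merged into Q-values. The sketch below shows the usual aggregation from the Dueling network literature, with the mean advantage subtracted. Whether FCPN uses exactly this form is not stated in this summary, so treat it as an assumption. The function name is hypothetical.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    # Subtracting the mean keeps V and A identifiable.
    return [state_value + adv - mean_adv for adv in advantages]


# Five actions: four movement directions plus stop.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0, 0.0, 0.0])
best_action = max(range(len(q_values)), key=lambda a: q_values[a])
```

The agent then acts greedily on the resulting Q-values, here choosing the action with index `best_action`.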
Figure: The arrows in each window indicate the direction of movement. The fifth action means stop.
Figure: Improvement in F-measure.
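In effect, this method plays a role similar to a loss function: it takes the coarse detection results of existing methods as input and uses deep reinforcement learning to fine-tune the box coordinates so that they move closer to the ground truth.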