Matching the Blanks: Distributional Similarity for Relation Learning

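There are two main contributions:

Controlled experiments: show that for RE, BERT works best with entity markers on the input side and the entity start state on the output side
Training method: proposes a self-supervised training task similar to the original BERT's, and builds a corresponding dataset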

1. introduction

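The paper divides current RE approaches into three categories:

(Distantly) supervised: the model learns a mapping
Surface form: a shallow textual form stands in for a relation, which is essentially a tagging-based approach
Universal schema

Among these, supervised and DS methods learn embeddings to obtain representations, which has the drawback of a constrained embedding space. The drawback shared by all of these methods is that they need labels or external knowledge.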


This paper uses BERT together with a self-supervised task, matching the blanks, to train the extractor.

2. method

2.1 Controlled experiments

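input:

Standard [CLS]: the entities in the input are not marked explicitly, so when there are more than two entities the model cannot tell which two to focus on
Position embedding: concatenate position information onto the embeddings of the two entities
Entity marker tokens: add special tokens at the start and end of each of the two entities to locate them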
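To make the entity marker option concrete, here is a minimal sketch of wrapping two entity spans in marker tokens. The marker names `[E1]`, `[/E1]`, `[E2]`, `[/E2]`, the function name, and the (start, end) span convention with an exclusive end are assumptions for illustration, not taken from the paper's code.

```python
def add_entity_markers(tokens, e1_span, e2_span):
    """Wrap two non-overlapping entity spans in marker tokens.

    Spans are (start, end) token indices with end exclusive.
    Marker names are hypothetical placeholders.
    """
    (s1, t1), (s2, t2) = e1_span, e2_span
    out = []
    for i, tok in enumerate(tokens):
        # Emit closing markers before opening ones so adjacent spans nest correctly.
        if i == t1:
            out.append("[/E1]")
        if i == t2:
            out.append("[/E2]")
        if i == s1:
            out.append("[E1]")
        if i == s2:
            out.append("[E2]")
        out.append(tok)
    # A span that ends at the last token closes after the loop.
    if t1 == len(tokens):
        out.append("[/E1]")
    if t2 == len(tokens):
        out.append("[/E2]")
    return out


tokens = ["Alice", "works", "for", "Acme", "Corp", "."]
marked = add_entity_markers(tokens, (0, 1), (3, 5))
```

With a real tokenizer the marker strings would also have to be registered as special tokens so they are not split into word pieces.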

output:

Standard [CLS]: output the sentence-level embedding
Entity mention pooling: pool over the word embeddings of each of the two entities to get an "entity embedding", then concatenate the two entity embeddings
Entity start state: if the input uses entity marker tokens, concatenate the embeddings of the start marker token of each entity
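The two entity-aware output options can be sketched on plain Python lists, with `hidden` standing in for the encoder's per-token output vectors. The marker names `[E1]` and `[E2]`, the function names, and the choice of max pooling are assumptions for illustration only.

```python
def entity_start_state(tokens, hidden):
    """Concatenate the hidden vectors at the two entity start markers."""
    i1 = tokens.index("[E1]")  # hypothetical start marker of entity 1
    i2 = tokens.index("[E2]")  # hypothetical start marker of entity 2
    return hidden[i1] + hidden[i2]  # list concatenation, not addition


def mention_pooling(hidden, e1_span, e2_span):
    """Max-pool each entity span (end exclusive), then concatenate."""
    def max_pool(span):
        rows = hidden[span[0]:span[1]]
        return [max(col) for col in zip(*rows)]
    return max_pool(e1_span) + max_pool(e2_span)


tokens = ["[E1]", "a", "[/E1]", "b", "[E2]", "c", "[/E2]"]
hidden = [[i, 10 * i] for i in range(len(tokens))]  # toy 2-dim vectors
start_repr = entity_start_state(tokens, hidden)
pooled_repr = mention_pooling(hidden, (1, 2), (5, 6))
```

Either way the relation representation has twice the hidden size, one half per entity.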


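Under the same experimental setting, combining entity marker tokens with the entity start state lets BERT do the task best (BERT can then tell the entities apart and is trained toward learning entity representations, which suits RE; in short, it makes up for plain BERT's lack of explicit entity information).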

2.2 matching the blanks training

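The goal is simple: binary classification. For every pair of relation statements in the corpus, if the two express the same relation (positive), the model is trained to predict them as closer; negative pairs are pushed the other way. The loss is the pairwise binary classification loss defined in the paper (personally I think there is a typo in it? Shouldn't it be P(l=0|r,r')?).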
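A minimal sketch of such a pairwise objective, assuming the match probability is the sigmoid of the dot product between the two relation representations and the loss is ordinary binary cross-entropy. This is one common reading, not a transcription of the paper's equation; the function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def p_same(u, v):
    """Assumed form: P(l=1 | r, r') = sigmoid(f(r) . f(r'))."""
    return 1.0 / (1.0 + math.exp(-dot(u, v)))


def pair_loss(u, v, label):
    """Binary cross-entropy for one pair; label is 1 (same) or 0 (different)."""
    p = p_same(u, v)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


r = [1.0, 0.0]
loss_if_positive = pair_loss(r, r, 1)
loss_if_negative = pair_loss(r, r, 0)
```

Two identical representations are cheap to label as positive and expensive to label as negative, which is the "pull together, push apart" behaviour described above.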

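But this loss alone cannot train the model well: it can be minimized by an entity linking system (my understanding: a shallow surface cue), that is, by simply matching the entities. So, following BERT's MLM, each entity in a relation statement is masked with probability alpha, which keeps the model from over-relying on entity linking and ignoring the relation information.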
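The masking step can be sketched as follows. The `[BLANK]` token name, the function name, and the span convention (non-overlapping, end exclusive) are assumptions for illustration; each entity is blanked independently with probability `alpha`.

```python
import random


def blank_entities(tokens, e1_span, e2_span, alpha, rng=random):
    """Replace each entity span with one [BLANK] token with probability alpha."""
    out = []
    cursor = 0
    for start, end in sorted([e1_span, e2_span]):
        out.extend(tokens[cursor:start])  # copy context before the entity
        if rng.random() < alpha:
            out.append("[BLANK]")  # hide the entity's surface form
        else:
            out.extend(tokens[start:end])  # keep the entity as written
        cursor = end
    out.extend(tokens[cursor:])  # copy the remaining context
    return out


tokens = ["Alice", "works", "for", "Acme", "Corp"]
all_blanked = blank_entities(tokens, (0, 1), (3, 5), alpha=1.0)
untouched = blank_entities(tokens, (0, 1), (3, 5), alpha=0.0)
```

With `alpha` strictly between 0 and 1 the model sees a mix of named and blanked entities, so it cannot rely on entity names alone.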

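To this end, a training set for the matching the blanks task is built. It mainly exploits the fact that in a large entity-linked text corpus (English Wikipedia in the paper), a given pair of entities is very likely to recur with the same relation.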
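A rough sketch of how such pairs could be derived from entity-linked statements: two statements form a positive pair when they mention the same two entities, and a negative pair otherwise. The tuple layout and function name are hypothetical, and the exhaustive pairing is for illustration only; a real setup would sample pairs rather than enumerate them.

```python
from itertools import combinations


def build_pairs(statements):
    """statements: list of (text, e1_id, e2_id) tuples.

    Returns (positives, negatives) as lists of index pairs.
    """
    positives, negatives = [], []
    for (i, a), (j, b) in combinations(enumerate(statements), 2):
        same_entities = a[1] == b[1] and a[2] == b[2]
        (positives if same_entities else negatives).append((i, j))
    return positives, negatives


statements = [
    ("s0", "A", "B"),
    ("s1", "A", "B"),
    ("s2", "A", "C"),
]
positives, negatives = build_pairs(statements)
```

The labels are noisy by construction, since the same entity pair can occasionally express different relations.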

[Table: main results from the paper]
