Paper reading: Text-to-Text Pre-Training for Data-to-Text Tasks

Overview:

This paper is quite simple in content: the authors fine-tune the recently released T5 model [1] on data-to-text datasets and find that it outperforms the existing state of the art on all three benchmarks, WebNLG, MultiWoz, and ToTTo. They suggest that a pre-trained model alone, without the planning-based generation [2][3], lexical constraints, or copy mechanisms currently popular in text generation, may already be enough to produce strong generation results.

Main content:

Pre-training:

T5 variants evaluated: Small (60 million parameters), Base (220 million), Large (770 million), and 3B (3 billion).

Fine-tuning: 5K steps for MultiWoz and WebNLG; 10K steps for ToTTo.
All model parameters are updated during fine-tuning.
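As a concrete illustration, here is a minimal sketch of what this fine-tuning recipe might look like with the Hugging Face transformers T5 implementation. The checkpoint name, the use of Adafactor, the toy linearized example, and the single-example "batch" are my own assumptions for the sketch, not details taken from the paper.

```python
# Minimal fine-tuning sketch (assumptions: Hugging Face transformers, Adafactor,
# a toy linearized example). Not the authors' actual training code.
from transformers import T5ForConditionalGeneration, T5TokenizerFast, Adafactor

model = T5ForConditionalGeneration.from_pretrained("t5-base")   # Base: 220M parameters
tokenizer = T5TokenizerFast.from_pretrained("t5-base")

# Constant learning rate of 0.001, as reported in the paper.
optimizer = Adafactor(model.parameters(), lr=1e-3,
                      scale_parameter=False, relative_step=False)

# Toy linearized data-to-text pair; the exact linearization format is an assumption.
source = "<S> Alan Bean <P> occupation <O> astronaut"
target = "Alan Bean worked as an astronaut."

batch = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

model.train()
for step in range(5000):   # 5K steps for MultiWoz/WebNLG, 10K for ToTTo
    # In practice you would iterate over the real training set;
    # this loop just reuses one toy example to show the update.
    loss = model(**batch, labels=labels).loss   # all parameters are updated
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```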

Experiments:

Experimental setup:

T5 vocabulary: 32,000 SentencePiece tokens
Learning rate: 0.001
Decoding: greedy search
Metric: BLEU, computed with sacrebleu [4]
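Greedy search corresponds to decoding with beam size 1 and no sampling; the BLEU computation itself is a one-liner with sacrebleu. A small standalone sketch (the hypothesis and reference strings below are placeholders, not outputs from the paper):

```python
# Standalone sketch of the BLEU computation with sacrebleu [4].
# Hypothesis/reference strings are placeholders, not model outputs.
import sacrebleu

hypotheses = ["Alan Bean worked as an astronaut."]
# sacrebleu expects a list of reference streams, each with one entry per hypothesis.
references = [["Alan Bean served as an astronaut."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```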

Datasets (a toy linearization sketch follows the list):

1. MultiWoz: task-oriented dialogue
2. ToTTo: table-to-text
3. WebNLG: graph-to-text
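Before T5 can consume any of these inputs, the structured data has to be flattened into a plain source string. The formats below are toy illustrations of what such a linearization could look like; they are my own assumptions, not the exact schemes used in the paper.

```python
# Toy linearizations of the three kinds of structured input into flat strings.
# Tag names and formats are illustrative assumptions, not the paper's exact scheme.

# WebNLG: a small knowledge graph given as (subject, predicate, object) triples.
webnlg_triples = [
    ("Alan Bean", "occupation", "astronaut"),
    ("Alan Bean", "birthPlace", "Wheeler, Texas"),
]
webnlg_input = " ".join(f"<S> {s} <P> {p} <O> {o}" for s, p, o in webnlg_triples)

# ToTTo: highlighted table cells plus the page and section titles.
totto_input = ("<page> Alan Bean <section> Space career "
               "<cell> Apollo 12 <header> Missions")

# MultiWoz: a dialogue act (intent with slot-value pairs) to be verbalized.
multiwoz_input = "Inform ( name = Pizza Hut ; area = centre )"

for name, src in [("WebNLG", webnlg_input), ("ToTTo", totto_input),
                  ("MultiWoz", multiwoz_input)]:
    print(f"{name}: {src}")
```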

Experimental results:

This looks like an unfinished version of the paper, but the baselines it compares against are all very recent, so it is worth a look; bookmarking it here.

References:

[1] Exploring the limits of transfer learning with a unified text-to-text transformer. (T5)
[2] Neural data-to-text generation: A comparison between pipeline and end-to-end architectures.
[3] Step-by-step: Separating planning from realization in neural data-to-text generation.
[4] A call for clarity in reporting BLEU scores. (sacrebleu)
[5] Neural data-to-text generation: A comparison between pipeline and end-to-end architectures. (Pipeline-Transformer)
[6] Bridging the structural gap between encoding and decoding for data-to-text generation. (GNN + LSTM + attention + copy)
[7] Data-to-text generation with content selection and planning. (Content Planner)
[8] Get to the point: Summarization with pointer-generator networks. (Pointer-Generator; LSTM + attention + copy)
[9] Leveraging pre-trained checkpoints for sequence generation tasks. (BERT-to-BERT)
[10] Handling divergent reference texts when evaluating table-to-text generation. (PARENT metric)
[11] Semantically conditioned dialog response generation via hierarchical disentangled self-attention. (HDSA)
[12] Few-shot natural language generation for task-oriented dialog. (SC-GPT)