EMNLP 2020 | Recent Must-Read Papers on Text Generation



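Introduction: Text generation is a branch of natural language processing that produces natural language from machine representations such as knowledge bases or logical forms.
The AMiner EMNLP 2020 word cloud and the accepted papers show that text generation also saw a good deal of notable work at this conference. Below we look at some of the papers on the topic.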
1. Paper title: POINTER: Constrained Text Generation via Insertion-based Generative Pre-training
Authors: Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan
Real-world editorial assistant applications must often generate text under specified lexical constraints, for example, converting a meeting note with key phrases into a concrete meeting summary, recasting a user-input search query as a fluent sentence, generating a conversational response using grounding facts, or creating a story from a pre-specified set of keywords.
The authors have presented POINTER, a simple yet powerful approach to generating text from a given set of lexical constraints in a non-autoregressive manner.
The proposed method leverages a large-scale pre-trained model to generate text in a progressive manner using an insertion-based Transformer.
Both automatic and human evaluation demonstrate the effectiveness of POINTER and its potential in constrained text generation.
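The progressive, insertion-based idea can be illustrated with a toy sketch. It starts from the lexical constraints and, in each round, fills every gap between adjacent tokens at once until no gap requests a new token. The lookup table TOY_INSERTIONS and the function toy_predict are hypothetical stand-ins for the pre-trained insertion Transformer, and the function names are illustrative only; this is not the authors' implementation.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for a trained insertion model: maps a (left, right)
# token pair to the token to insert between them. A missing pair means
# "insert nothing here".
TOY_INSERTIONS = {
    ("<s>", "meeting"): "the",
    ("meeting", "budget"): "covered",
    ("budget", "</s>"): "plan",
    ("covered", "budget"): "the",
}


def toy_predict(left: str, right: str) -> Optional[str]:
    """Return the token to insert between left and right, or None."""
    return TOY_INSERTIONS.get((left, right))


def insertion_generate(
    keywords: List[str],
    predict: Callable[[str, str], Optional[str]],
    max_rounds: int = 10,
) -> List[str]:
    """Grow a sentence around the keywords by repeated gap filling."""
    tokens = ["<s>"] + list(keywords) + ["</s>"]
    for _ in range(max_rounds):
        new_tokens = [tokens[0]]
        inserted = False
        # Every gap is decided from the same snapshot of the sequence,
        # so one round fills all gaps together rather than left to right.
        for left, right in zip(tokens, tokens[1:]):
            candidate = predict(left, right)
            if candidate is not None:
                new_tokens.append(candidate)
                inserted = True
            new_tokens.append(right)
        tokens = new_tokens
        if not inserted:  # no gap asked for a token: generation is done
            break
    return tokens[1:-1]  # drop the boundary markers


result = insertion_generate(["meeting", "budget"], toy_predict)
print(" ".join(result))  # the meeting covered the budget plan
```

Because tokens are only ever inserted, the original keywords are guaranteed to appear in the output in their original order, which is what makes this style of decoding a natural fit for hard lexical constraints.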
2. Paper title: Improving Text Generation with Student-Forcing Optimal Transport
Authors: Jianqiao Li, Chunyuan Li, Guoyin Wang, Hao Fu, Yuhchen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang
Natural language generation is an essential component of many NLP applications, such as machine translation, image captioning, text summarization, dialogue systems, and machine comprehension.
Generating human-like natural language is typically cast as predicting a sequence of consecutive words in a recurrent manner.
The authors introduce Student-Forcing Optimal Transport (SFOT) to mitigate exposure bias in text generation. The proposed model captures positional and contextual information of word tokens in the optimal transport (OT) matching.
Experiments on neural machine translation, text summarization, and text generation have demonstrated the effectiveness of the SFOT algorithm, yielding improved performance over strong baselines on these tasks.
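To make the OT matching step more concrete, here is a minimal sketch that computes an entropy-regularized optimal transport distance between two small sets of token embeddings. Sinkhorn iterations are swapped in here as a standard solver for illustration; the solver, the cosine cost, the uniform token weights, and all function names are assumptions and may differ from what the paper uses.

```python
import math
from typing import List

Vector = List[float]


def cosine_cost(x: Vector, y: Vector) -> float:
    """Cost of matching two embeddings: 1 minus their cosine similarity."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def sinkhorn_plan(cost: List[List[float]], epsilon: float = 0.1,
                  iterations: int = 200) -> List[List[float]]:
    """Approximate transport plan between uniform marginals (assumed)."""
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    row_mass = [1.0 / n] * n
    col_mass = [1.0 / m] * m
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def ot_distance(xs: List[Vector], ys: List[Vector]) -> float:
    """Total cost of moving the embeddings xs onto the embeddings ys."""
    cost = [[cosine_cost(x, y) for y in ys] for x in xs]
    plan = sinkhorn_plan(cost)
    return sum(plan[i][j] * cost[i][j]
               for i in range(len(xs)) for j in range(len(ys)))


# Hypothetical 2-d "embeddings" for a generated and a reference sequence.
generated = [[1.0, 0.0], [0.0, 1.0]]
reference_same = [[0.0, 1.0], [1.0, 0.0]]   # same tokens, reordered
reference_other = [[1.0, 1.0], [1.0, 1.0]]  # different tokens

d_same = ot_distance(generated, reference_same)
d_other = ot_distance(generated, reference_other)
print(d_same < d_other)  # True
```

In a training setup, a distance of this kind would be added to the usual likelihood loss as a sequence-level term. Note that this plain version ignores token order entirely, whereas the paper states that its matching also takes positional and contextual information into account.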
3. Paper title: Plug and Play Autoencoders for Conditional Text Generation
Authors: Florian Mai, Nikolaos Pappas, Ivan Montero, Noah A. Smith, James Henderson
Conditional text generation encompasses a large number of natural language processing tasks such as text simplification, summarization, machine translation and style transfer.
The authors present Emb2Emb, a framework that reduces conditional text generation tasks to learning in the embedding space of a pretrained autoencoder.
The authors propose an adversarial method and a neural architecture that are crucial to the framework's success because they keep learning on the manifold of the autoencoder.
Since the framework can be used with any pretrained autoencoder, it will benefit from large-scale pretraining in future research.
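The plug-and-play idea can be sketched with a toy example: the encoder and decoder stay frozen, and only a mapping between embeddings is trained. Everything below is a hypothetical illustration. The EMBEDDINGS table stands in for a pretrained autoencoder, the mapping is a single learned offset vector rather than a neural network, and the adversarial term that keeps outputs on the autoencoder's manifold is omitted.

```python
from typing import List, Tuple

# Hypothetical frozen "autoencoder": a fixed table of sentence embeddings.
# Dimension 0 loosely encodes sentiment, dimension 1 encodes the topic.
EMBEDDINGS = {
    "the film was bad": [-1.0, 0.5],
    "the film was good": [1.0, 0.5],
    "the food was bad": [-1.0, -0.5],
    "the food was good": [1.0, -0.5],
}


def encode(sentence: str) -> List[float]:
    """Frozen encoder: look up the sentence embedding."""
    return list(EMBEDDINGS[sentence])


def decode(z: List[float]) -> str:
    """Frozen decoder: return the sentence whose embedding is nearest to z."""
    def squared_distance(sentence: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(EMBEDDINGS[sentence], z))
    return min(EMBEDDINGS, key=squared_distance)


def train_offset(pairs: List[Tuple[str, str]], lr: float = 0.1,
                 steps: int = 200) -> List[float]:
    """Fit the mapping z -> z + offset by gradient descent on squared error.

    Only the offset is updated; encode and decode are never changed.
    """
    dim = len(encode(pairs[0][0]))
    offset = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for source, target in pairs:
            z_src, z_tgt = encode(source), encode(target)
            for k in range(dim):
                grad[k] += 2.0 * (z_src[k] + offset[k] - z_tgt[k]) / len(pairs)
        offset = [o - lr * g for o, g in zip(offset, grad)]
    return offset


def transfer(sentence: str, offset: List[float]) -> str:
    """Encode, apply the learned mapping, then decode."""
    z = encode(sentence)
    return decode([a + b for a, b in zip(z, offset)])


# Train on one pair, then apply the mapping to an unseen sentence.
offset = train_offset([("the film was bad", "the film was good")])
print(transfer("the food was bad", offset))  # the food was good
```

The point of the design is that the task-specific part (here, the offset) never touches text directly, so the same training loop could in principle be reused with a different pretrained autoencoder by swapping encode and decode.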
4. Paper title: CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation
Authors: Tianlu Wang, Xuezhi Wang, Yao Qin, Ben Packer, Kang Li, Jilin Chen, Alex Beutel, Ed Chi
It has been shown that NLP models are often sensitive to random initialization, out-of-distribution data, and adversarially generated attacks.
The authors propose a controlled adversarial text generation model that can generate more diverse and fluent adversarial texts.
In the current version, generation is controlled by a few pre-specified attributes that are label-irrelevant by definition.
One benefit of the framework is that it is flexible enough to incorporate multiple task-irrelevant attributes and the optimization allows the model to figure out which attributes are more susceptible to attacks.
5. Paper title: PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation
Authors: Xinyu Hua, Lu Wang
Large pre-trained language models are the cornerstone of many state-of-the-art models in various natural language understanding and generation tasks, yet they are far from perfect.
The authors present a novel content-controlled generation framework that adds content planning to large pretrained Transformers without modifying model architecture.
A BERT-based planning model is first designed to assign and position key phrases into different sentences.
The authors also investigate an iterative refinement algorithm that works with sequence-to-sequence models to improve generation quality through flexible editing.
