EMNLP2020 | Recent Must-Read Papers on Text Generation

The AMiner platform was developed by the Department of Computer Science at Tsinghua University and is fully independent Chinese intellectual property. The platform hosts a scientific knowledge graph of more than 230 million academic papers/patents and 136 million researchers, and provides professional scientific-intelligence services such as researcher evaluation, expert finding, intelligent reviewer assignment, and academic maps. Online since 2006, the system has attracted visits from more than 10 million unique IPs across 220 countries/regions, with 2.3 million data downloads and over 11 million visits per year, making it an important data and experimental platform for research on academic search and social network mining.

AMiner平台:https://www.aminer.cn

Introduction: Text generation is a subfield of natural language processing that produces natural language from machine representations such as knowledge bases or logical forms.
A text generation system can be viewed as a translator that converts data into natural-language statements. Producing the final language differs from compiling a program, however, because natural language offers many ways to express the same content. Text generation can be seen as the inverse of natural language understanding: an NLU system must resolve the meaning of an input sentence to produce a machine representation, while a text generation system must decide how to turn concepts into language.
At present, constrained or attribute-controlled text generation is the problem researchers focus on most.
Judging from the AMiner EMNLP 2020 word cloud and paper list, Text Generation again features plenty of notable work at this conference. Below, let's look at selected papers on the Text Generation topic.

1. Paper: POINTER: Constrained Text Generation via Insertion-based Generative Pre-training
Link: https://www.aminer.cn/pub/5eb78919da5629cf24430377?conf=emnlp2020
Authors: Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan
Summary:
Real-world editorial-assistant applications often need to generate text under specified lexical constraints: for example, converting meeting notes with key phrases into a concrete meeting summary, recasting a user's search query as a fluent sentence, generating a conversational response from grounding facts, or creating a story from a pre-specified set of keywords.
The authors present POINTER, a simple yet powerful approach to generating text from a given set of lexical constraints in a non-autoregressive manner.
The proposed method leverages a large-scale pre-trained model to generate text in a progressive manner using an insertion-based Transformer.
Both automatic and human evaluation demonstrate the effectiveness of POINTER and its potential in constrained text generation.
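To make the insertion-based scheme concrete, here is a minimal Python sketch of progressive, non-autoregressive generation in the spirit of POINTER. The `predict_insertions` interface is a hypothetical stand-in for the paper's insertion Transformer: for each gap between adjacent tokens it returns either a token to insert or None (standing in for the "no-insertion" token).

```python
from typing import Callable, List, Optional

def progressive_generate(
    keywords: List[str],
    predict_insertions: Callable[[List[str]], List[Optional[str]]],
    max_rounds: int = 10,
) -> List[str]:
    """Iteratively expand a keyword skeleton into a full sentence.

    Each round the model proposes one token for every gap between the
    current tokens (len(tokens) + 1 gaps); generation stops when every
    gap predicts "no insertion". Because the sequence can roughly double
    per round, generation takes O(log n) rounds rather than n steps.
    """
    tokens = list(keywords)
    for _ in range(max_rounds):
        proposals = predict_insertions(tokens)  # one proposal per gap
        if all(p is None for p in proposals):
            break  # converged: no gap wants a new token
        new_tokens: List[str] = []
        for i, tok in enumerate(tokens):
            if proposals[i] is not None:
                new_tokens.append(proposals[i])  # insert before token i
            new_tokens.append(tok)
        if proposals[len(tokens)] is not None:
            new_tokens.append(proposals[len(tokens)])  # trailing gap
        tokens = new_tokens
    return tokens
```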

2. Paper: Improving Text Generation with Student-Forcing Optimal Transport
Link: https://www.aminer.cn/pub/5f7fe6d80205f07f68973220?conf=emnlp2020
Authors: Jianqiao Li, Chunyuan Li, Guoyin Wang, Hao Fu, Yuhchen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang
Summary:
Natural language generation is an essential component of many NLP applications, such as machine translation, image captioning, text summarization, dialogue systems, and machine comprehension.
Generating human-like natural language is typically cast as predicting a sequence of consecutive words in a recurrent manner.
The authors introduce SFOT (Student-Forcing Optimal Transport) to mitigate exposure bias in text generation. The proposed model captures both positional and contextual information of word tokens in the OT matching.
Experiments on neural machine translation, text summarization, and text generation have demonstrated the effectiveness of the SFOT algorithm, yielding improved performance over strong baselines on these tasks.
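As a rough illustration of the OT matching at the heart of SFOT, the NumPy sketch below computes an entropy-regularized (Sinkhorn) optimal-transport cost between generated and reference token embeddings, which compares sequences softly rather than position-by-position. The cosine cost and the hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sinkhorn_ot_cost(gen_emb, ref_emb, eps=0.1, n_iters=50):
    """Entropy-regularized OT cost between two sets of token embeddings.

    gen_emb: (m, d) embeddings of generated tokens
    ref_emb: (n, d) embeddings of reference tokens
    """
    # Cosine cost: small when a generated token matches some reference token.
    gen_n = gen_emb / np.linalg.norm(gen_emb, axis=1, keepdims=True)
    ref_n = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    C = 1.0 - gen_n @ ref_n.T                        # (m, n) cost matrix

    m, n = C.shape
    a, b = np.full(m, 1.0 / m), np.full(n, 1.0 / n)  # uniform token masses
    K = np.exp(-C / eps)                             # Gibbs kernel
    u = np.ones(m)
    for _ in range(n_iters):                         # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                  # transport plan
    return float((P * C).sum())                      # soft matching cost
```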

3. Paper: Plug and Play Autoencoders for Conditional Text Generation
Link: https://www.aminer.cn/pub/5f7fe6d80205f07f689732b3?conf=emnlp2020
Authors: Florian Mai, Nikolaos Pappas, Ivan Montero, Noah A. Smith, James Henderson
Summary:
Conditional text generation encompasses a large number of natural language processing tasks such as text simplification, summarization, machine translation and style transfer.
The authors present Emb2Emb, a framework that reduces conditional text generation tasks to learning in the embedding space of a pretrained autoencoder.
The authors propose an adversarial method and a neural architecture that are crucial to the method's success, keeping learning on the manifold of the autoencoder.
Since the framework can be used with any pretrained autoencoder, it will benefit from large-scale pretraining in future research.
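As a sketch of the general recipe, the PyTorch snippet below learns only a mapping between the frozen autoencoder's sentence embeddings, with an adversarial term that pushes mapped embeddings back toward the autoencoder's manifold. The mapping architecture, the `discriminator`, and the weight `lam` are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class Emb2EmbMapping(nn.Module):
    """Maps source-sentence embeddings to target-sentence embeddings."""
    def __init__(self, dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, z):
        return self.net(z)

def training_step(mapping, discriminator, z_src, z_tgt, lam=0.1):
    """Task loss in embedding space plus an adversarial manifold term.

    z_src / z_tgt: embeddings of source / target sentences produced by the
    frozen, pretrained autoencoder's encoder. The discriminator (trained
    elsewhere) scores how "real" an embedding looks; fooling it keeps the
    mapped embeddings on the autoencoder manifold, so the frozen decoder
    can still turn them into fluent text.
    """
    z_pred = mapping(z_src)
    task_loss = nn.functional.mse_loss(z_pred, z_tgt)
    manifold_loss = -torch.log(torch.sigmoid(discriminator(z_pred))).mean()
    return task_loss + lam * manifold_loss
```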

4. Paper: CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation
Link: https://www.aminer.cn/pub/5f7d8b6091e011346ad27d3b?conf=emnlp2020
Authors: Tianlu Wang, Xuezhi Wang, Yao Qin, Ben Packer, Kang Li, Jilin Chen, Alex Beutel, Ed Chi
Summary:
It has been shown that NLP models are often sensitive to random initialization, out-of-distribution data, and adversarially generated attacks.
The authors propose a controlled adversarial text generation model that can generate more diverse and fluent adversarial texts.
Generation is controlled by a few pre-specified attributes that are label-irrelevant by definition.
One benefit of the framework is that it is flexible enough to incorporate multiple task-irrelevant attributes, and the optimization lets the model figure out which attributes are most susceptible to attacks.
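A minimal sketch of that search loop is given below; `controlled_generate` (rewriting a sentence under a given task-irrelevant attribute, such as a product category) and `classifier` are hypothetical stand-ins for the paper's trained components.

```python
from typing import Callable, List

def find_adversarial_rewrites(
    sentence: str,
    label: int,
    attributes: List[str],
    controlled_generate: Callable[[str, str], str],
    classifier: Callable[[str], int],
) -> List[str]:
    """Collect fluent rewrites that vary only label-irrelevant attributes
    yet flip the classifier, exposing the attributes it is sensitive to."""
    adversarial = []
    for attr in attributes:
        candidate = controlled_generate(sentence, attr)  # controlled rewrite
        if classifier(candidate) != label:               # prediction flipped
            adversarial.append(candidate)                # found an attack
    return adversarial
```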

5. Paper: PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation
Link: https://www.aminer.cn/pub/5f7d893591e011346ad27d16?conf=emnlp2020
Authors: Xinyu Hua, Lu Wang
Summary:
Large pre-trained language models are the cornerstone of many state-of-the-art models in various natural language understanding and generation tasks, yet they are far from perfect.
The authors present a novel content-controlled generation framework that adds content planning to large pretrained Transformers without modifying model architecture.
A BERT-based planning model is first designed to assign and position key phrases into different sentences.
The authors investigate an iterative refinement algorithm that works with sequence-to-sequence models to improve generation quality through flexible editing.
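The plan-then-refine loop might look roughly like the sketch below; the `planner` and `seq2seq` interfaces, the number of refinement rounds, and the low-confidence re-masking heuristic are illustrative assumptions rather than the paper's exact procedure.

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"

def plan_and_refine(
    keyphrases: List[str],
    planner: Callable[[List[str]], List[str]],
    seq2seq: Callable[[List[str]], Tuple[List[str], List[float]]],
    n_rounds: int = 3,
    mask_ratio: float = 0.2,
) -> List[str]:
    """Place key phrases into a masked template, draft a full text, then
    repeatedly re-mask the least-confident tokens and regenerate them."""
    template = planner(keyphrases)       # key phrases positioned among MASKs
    tokens, conf = seq2seq(template)     # first complete draft + confidences
    for _ in range(n_rounds):
        k = max(1, int(len(tokens) * mask_ratio))
        worst = sorted(range(len(tokens)), key=lambda i: conf[i])[:k]
        draft = [MASK if i in worst else t for i, t in enumerate(tokens)]
        tokens, conf = seq2seq(draft)    # refine only the masked positions
    return tokens
```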

To learn about more EMNLP 2020 papers, follow our official account or use the link to go straight to the EMNLP2020 topic page, where the most cutting-edge research directions and the most comprehensive paper data are waiting for you~