Target-Side Data Augmentation for Sequence Generation


Please cite:
@inproceedings{xie2022targetside,
  title={Target-Side Data Augmentation for Sequence Generation},
  author={Shufang Xie and Ang Lv and Yingce Xia and Lijun Wu and Tao Qin and Rui Yan and Tie-Yan Liu},
  booktitle={International Conference on Learning Representations},
  year={2022}
}


Autoregressive sequence generation, a prevalent task in machine learning and natural language processing, generates each target token conditioned on both the source input and previously generated target tokens. Previous data augmentation methods, which have been shown to be effective for this task, mainly enhance the source inputs (e.g., injecting noise into the source sequence via random swapping or masking, back translation, etc.) while overlooking target-side augmentation. In this work, we propose a target-side augmentation method for sequence generation. During training, we use the decoder's output probability distributions as soft indicators, which are multiplied with the target token embeddings to build pseudo tokens. These soft pseudo tokens are then used as target tokens to enhance training. We conduct comprehensive experiments on various sequence generation tasks, including dialog generation, machine translation, and abstractive summarization. Without using any extra labeled data or introducing additional model parameters, our method significantly outperforms strong baselines. The code is available at
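The core operation described above — turning the decoder's probability distribution into a soft pseudo token by mixing the embedding table — can be sketched as follows. This is a minimal illustration in NumPy, not the repository's actual implementation; the shapes, the softmax over logits, and the variable names are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, seq_len = 100, 16, 5  # hypothetical sizes

# Token embedding table, as used by the decoder.
embed_table = rng.standard_normal((vocab_size, d_model))

# Decoder logits at each target position (stand-in for real model output).
logits = rng.standard_normal((seq_len, vocab_size))

# Softmax over the vocabulary: the "soft indicators" from the abstract.
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)

# Soft pseudo tokens: probability-weighted mixture of token embeddings,
# i.e. the distribution multiplied with the embedding table.
pseudo_tokens = probs @ embed_table  # shape: (seq_len, d_model)
```

Each row of `pseudo_tokens` is a convex combination of all token embeddings rather than a single hard embedding, which is what allows these pseudo tokens to be fed back as augmented targets during training without adding any parameters.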