TY  - JOUR
ID  - SisLab4604
UR  - https://aclanthology.org/2020.paclic-1.24
A1  - Nguyen, Minh Thuan
A1  - Nguyen, Phuong Thai
A1  - Nguyen, Van Vinh
A1  - Nguyen Hoang, Minh Cong
Y1  - 2020/10//
N2  - Research on providing machine translation systems for unseen language pairs is gaining increasing attention in recent years. However, the quality of their systems is poor for most language pairs, especially for less-common pairs such as Khmer-Vietnamese. In this paper, we show a simple iterative training-generating-filtering-training process that utilizes all available pivot parallel data to generate synthetic data for unseen directions. In addition, we propose a filtering method based on word alignments and the longest parallel phrase to filter out noise sentence pairs in the synthetic data. Experiment results on zero-shot Khmer→Vietnamese and Indonesian→Vietnamese directions show that our proposed model outperforms some strong baselines and achieves a promising result under the zero-resource condition on ALT benchmarks. Besides, the results also indicate that our model can easily improve their quality with a small amount of real parallel data.
PB  - Association for Computational Linguistics
JF  - Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation
TI  - Iterative Multilingual Neural Machine Translation for Less-Common and Zero-Resource Language Pairs
SP  - 207
AV  - public
EP  - 215
ER  -