VNU-UET Repository

Enhancing the quality of Phrase-table in Statistical Machine Translation for Less-Common and Low-Resource Languages

Nguyen, Minh Thuan and Bui, Van Tan and Vu, Huy Hien and Nguyen, Phuong Thai and Luong, Chi Mai (2018) Enhancing the quality of Phrase-table in Statistical Machine Translation for Less-Common and Low-Resource Languages. In: International Association of Logopedics and Phoniatrics.


Abstract

The phrase-table plays an important role in a traditional phrase-based statistical machine translation (SMT) system. During translation, a phrase-based SMT system relies heavily on the phrase-table to generate its output. In this paper, we propose two methods for enhancing the quality of the phrase-table. The first method recomputes the phrase-table weights using the similarity of vector representations. The second method enriches the phrase-table by integrating new phrase pairs from an extended dictionary and from projections of word vector representations onto the target language space. Our methods yield improvements of up to 0.21 and 0.44 BLEU points on in-domain and cross-domain (Asian Language Treebank, ALT) English-Vietnamese datasets, respectively.
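
As an illustration of the first method only, the following minimal Python sketch scores a phrase pair by the cosine similarity of its vector representations. The abstract does not say how phrase vectors are built or how source vectors are mapped into the target space, so averaging word vectors and applying a linear projection matrix are assumptions made here. All names (`phrase_vector`, `project`, `similarity_feature`) and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def phrase_vector(phrase, embeddings):
    """Average the vectors of in-vocabulary tokens (an assumed composition rule)."""
    vecs = [embeddings[w] for w in phrase.split() if w in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def project(vec, matrix):
    """Map a source-space vector into the target space with an assumed linear map."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def similarity_feature(src_phrase, tgt_phrase, src_emb, tgt_emb, matrix):
    """Similarity score for one phrase pair; 0.0 when either side has no known words."""
    s = phrase_vector(src_phrase, src_emb)
    t = phrase_vector(tgt_phrase, tgt_emb)
    if s is None or t is None:
        return 0.0
    return cosine(project(s, matrix), t)

# Toy two-dimensional embeddings, invented purely for illustration.
SRC_EMB = {"good": [1.0, 0.0], "morning": [0.0, 1.0]}
TGT_EMB = {"chào": [1.0, 0.0], "buổi": [0.0, 1.0], "sáng": [0.0, 1.0]}
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # stands in for a learned projection matrix

if __name__ == "__main__":
    for src, tgt in [("good", "chào"), ("good", "sáng"), ("good morning", "chào buổi sáng")]:
        score = similarity_feature(src, tgt, SRC_EMB, TGT_EMB, IDENTITY)
        print(f"{src} ||| {tgt} ||| {score:.3f}")
```

In a real system, a score of this kind could be added to, or combined with, the existing translation weights of each phrase-table entry; how the paper actually combines them is not stated in the abstract.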

Item Type: Conference or Workshop Item (Paper)
Subjects: Information Technology (IT)
Divisions: Faculty of Information Technology (FIT)
Depositing User: Nguyễn Phương Thái
Date Deposited: 09 Jan 2019 13:56
Last Modified: 09 Jan 2019 13:56
URI: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3413
