VNU-UET Repository

Statistical Machine Translation For Vietnamese Grammatical Error Correction

Nguyen Binh, Nguyen and Nguyen Van, Vinh (2018) Statistical Machine Translation For Vietnamese Grammatical Error Correction. Technical Report. Conference. (Unpublished)


Nowadays, alongside advances in Natural Language Processing, much research uses Statistical Machine Translation (SMT) for grammatical error correction. However, little of this work can be applied to Vietnamese. Our purpose is therefore to implement grammatical error correction for Vietnamese. The problem can be stated simply: given an erroneous sentence as input, the model produces the corrected sentence as output. In this research, we focus on applying SMT to Vietnamese, as one machine learning approach to the grammatical error correction problem. First, we try to compile a list of the kinds of errors found in Vietnamese. We then aim to correct simple errors, such as spelling errors, and develop the system step by step to handle and correct more complex errors. The model needs a large amount of data to train, so we collect as many Vietnamese sentences as possible and introduce errors into them to create parallel data. The data are divided into three parts, used for training, tuning, and testing, respectively. In the end, the model achieved some results, with sentences containing spelling mistakes corrected better than others. The results are not yet very good, but they show that SMT can be applied to the grammatical error correction problem.
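The data preparation described in the abstract (turning correct sentences into erroneous ones to form parallel data, then dividing the data into three parts) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the report: the two error types (dropped diacritics and swapped adjacent characters), the `error_rate` parameter, the 80/10/10 split ratios, and all function names are hypothetical.

```python
import random
import unicodedata


def strip_diacritics(word):
    """Drop combining marks, e.g. 'tiếng' -> 'tieng' (assumed error type)."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def swap_adjacent(word, rng):
    """Swap two neighbouring characters to imitate a typing slip (assumed error type)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def corrupt(sentence, rng, error_rate=0.3):
    """Corrupt each word independently with probability error_rate (hypothetical value)."""
    noisy = []
    for word in sentence.split():
        if rng.random() < error_rate:
            if rng.random() < 0.5:
                word = strip_diacritics(word)
            else:
                word = swap_adjacent(word, rng)
        noisy.append(word)
    return " ".join(noisy)


def make_parallel(sentences, seed=0, error_rate=0.3):
    """Return (erroneous source, correct target) pairs."""
    rng = random.Random(seed)
    return [(corrupt(s, rng, error_rate), s) for s in sentences]


def split_three_ways(pairs, seed=0, train=0.8, tune=0.1):
    """Shuffle and split into training, tuning, and test sets (ratios are assumptions)."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_tune = int(len(shuffled) * tune)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_tune],
        shuffled[n_train + n_tune:],
    )
```

In a setup like this, the erroneous side would play the role of the source language and the correct side the target language when training an SMT system, with the tuning set used for weight optimisation and the test set held out for evaluation.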

Item Type: Technical Report (Technical Report)
Subjects: Information Technology (IT)
Divisions: Faculty of Information Technology (FIT)
Depositing User: Nguyễn V
Date Deposited: 20 Dec 2018 01:45
Last Modified: 20 Dec 2018 01:45
