TY  - CONF
ID  - SisLab1534
UR  - https://eprints.uet.vnu.edu.vn/eprints/id/eprint/1534/
A1  - Nguyen, Ngoc Khuong
A1  - Pham, Duc Hong
A1  - Le, Anh Cuong
A1  - Pham, Hong Thai
Y1  - 2016/03/26/
N2  - As is well known, part-of-speech (POS) tagging is a basic and central problem in natural language processing. The quality of a POS tagger influences several downstream modules of a natural language understanding pipeline, including information extraction and retrieval, machine translation, partial parsing, and word sense disambiguation. In recent years, there has been growing interest in data-driven machine learning methods for POS disambiguation, with results very close to the state of the art for this problem. Improving the performance of these methods further remains a challenge for researchers. In this paper, we use one of the best machine learning methods (the Maximum Entropy Model, MEM) as a baseline for automatic part-of-speech annotation, then improve its accuracy using Transformation Based Learning (TBL). Our approach is based on an incremental knowledge acquisition method in which rules are stored in an exception structure and new rules are selected only when they correct errors with a positive effect on the corpus. Experimental results on the English Penn Treebank show that our method substantially improves on the baseline model, reaching state-of-the-art accuracy (97.14%). In particular, we also run experiments on the Vietnamese Viet TreeBank corpus, and the results show that the proposed Vietnamese POS tagging system outperforms other state-of-the-art Vietnamese taggers, with 93.50% overall accuracy.
TI  - Improving Maximum Entropy Part-of-Speech Tagger with Transformation Based Learning Model
M2  - Hanoi
AV  - public
T2  - SW4PHD: the 2016 Scientific Workshop for PhD Students
ER  -