VNU-UET Repository

Using Content-Based Features for Author Profiling of Vietnamese Forum Posts

Duong, Tran Duc and Pham, Bao Son and Tan, Hanh (2016) Using Content-Based Features for Author Profiling of Vietnamese Forum Posts. In: Recent Developments in Intelligent Information and Database Systems. Springer International Publishing, pp. 287-296.

Full text not available from this repository.

Abstract

This paper reports the results of an author profiling task on Vietnamese forum posts: identifying personal traits of the author, such as gender, age, occupation, and location, using content-based features. Experiments were conducted on data sets collected from various Vietnamese-language forums and compared the performance of different feature types, including stylometric features (lexical, syntactic, and structural) and content-based features (the most important words). Three learning methods were tested: Decision Tree, Bayes Network, and Support Vector Machine (SVM); the SVM achieved the best results. The results show that these kinds of features work well on short, free-style messages such as forum posts, and that content-based features yielded much better results than stylometric features.
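The distinction between the two feature families in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the particular stylometric measures, and the use of document frequency as a stand-in for "the most important words" are all assumptions made for illustration, since the record does not say how the words were selected.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    # Normalize to NFC so accented Vietnamese letters are single code points,
    # then lowercase and split on runs of word characters.
    text = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+", text.lower())


def stylometric_features(post):
    # Hypothetical examples of lexical and structural measures;
    # the paper's actual feature set is not given in this record.
    tokens = tokenize(post)
    n_tokens = len(tokens)
    return {
        "n_chars": len(post),
        "n_tokens": n_tokens,
        "avg_token_len": sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0,
        "n_punct": sum(1 for ch in post if ch in ".,;:!?"),
        "n_lines": len(post.splitlines()),
    }


def build_vocabulary(posts, k):
    # Assumption: rank words by document frequency (number of posts
    # containing the word) and keep the top k. Ties are broken
    # alphabetically so the result is deterministic.
    doc_freq = Counter()
    for post in posts:
        doc_freq.update(set(tokenize(post)))
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    return ranked[:k]


def content_features(post, vocabulary):
    # One count per vocabulary word, in vocabulary order.
    counts = Counter(tokenize(post))
    return [counts[w] for w in vocabulary]
```

Either feature representation could then be passed to a classifier such as an SVM; that step is omitted here because it depends on the library chosen.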

Item Type: Book Section
Subjects: Information Technology (IT)
Divisions: Faculty of Information Technology (FIT)
Depositing User: Phạm Bảo Sơn
Date Deposited: 16 Dec 2016 05:13
Last Modified: 16 Dec 2016 05:13
URI: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2257
