VNU-UET Repository

Using Content-Based Features for Author Profiling of Vietnamese Forum Posts

Tran Duc Duong and Bao Son Pham and Hanh Tan (2016) Using Content-Based Features for Author Profiling of Vietnamese Forum Posts. In: Recent Developments in Intelligent Information and Database Systems. Springer International Publishing, pp. 287-296.

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1007/978-3-319-31277-4_25

Abstract

This paper reports the results of an author profiling task for Vietnamese forum posts, which aims to identify personal traits of the author, such as gender, age, occupation, and location, using content-based features. Experiments were conducted on data sets collected from various Vietnamese forums and compared the performance of different feature types, including stylometric features (lexical, syntactic, and structural) and content-based features (the most important words). Three learning methods were tested: Decision Tree, Bayes Network, and Support Vector Machine (SVM); the SVM achieved the best results. The results show that these features work well on short, free-style messages such as forum posts, with content-based features yielding much better results than stylometric features.
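
To make the distinction between the two feature families concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "most important words" can be approximated by raw corpus frequency (the record does not state how the paper ranks words), uses plain whitespace tokenization rather than a Vietnamese word segmenter, and picks three simple stylometric measurements for illustration. All function names and the sample posts are hypothetical.

```python
from collections import Counter


def tokenize(post):
    # Assumption: lowercase + whitespace split. A real Vietnamese pipeline
    # would likely use a dedicated word segmenter instead.
    return post.lower().split()


def top_words(posts, k):
    # Assumption: "importance" is approximated by total corpus frequency.
    # Ties are broken alphabetically so the result is deterministic.
    counts = Counter(word for post in posts for word in tokenize(post))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def content_features(post, vocabulary):
    # Content-based features: how often each selected word occurs in the post.
    counts = Counter(tokenize(post))
    return [counts[word] for word in vocabulary]


def stylometric_features(post):
    # Illustrative stylometric features: character count, token count,
    # and average token length.
    tokens = tokenize(post)
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens) / n if n else 0.0
    return [len(post), n, avg_len]


# Invented sample posts (unaccented for simplicity), not data from the paper.
posts = ["toi thich bong da", "toi thich nau an", "bong da hom nay"]
vocab = top_words(posts, 3)
```

Either feature vector (or their concatenation) could then be passed to a classifier such as an SVM, one of the three methods the abstract mentions.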

Item Type: Book Section
Subjects: Information Technology (IT)
Divisions: Faculty of Information Technology (FIT)
ID Code: 2257
Deposited By: Phạm Bảo Sơn
Deposited On: 16 Dec 2016 05:13
Last Modified: 16 Dec 2016 05:13
