eprintid: 2257
rev_number: 10
eprint_status: archive
userid: 339
dir: disk0/00/00/22/57
datestamp: 2016-12-16 05:13:10
lastmod: 2016-12-16 05:13:10
status_changed: 2016-12-16 05:13:10
type: book_section
metadata_visibility: show
creators_name: Duong, Tran Duc
creators_name: Pham, Bao Son
creators_name: Tan, Hanh
creators_id: sonpb@vnu.edu.vn
title: Using Content-Based Features for Author Profiling of Vietnamese Forum Posts
ispublished: pub
subjects: IT
divisions: fac_fit
abstract: This paper reports the results of an author profiling task for Vietnamese forum posts, which aims to identify personal traits of the author, such as gender, age, occupation, and location, using content-based features. Experiments compared the performance of different feature types, including stylometric features (lexical, syntactic, and structural) and content-based features (the most important words), on data sets we collected from various Vietnamese-language forums. Three learning methods were tested: Decision Tree, Bayes Network, and Support Vector Machine (SVM); SVM achieved the best results. The results show that these features work well on short, free-style messages such as forum posts, and that content-based features yielded much better results than stylometric features.
date: 2016-02-27
date_type: published
publisher: Springer International Publishing
official_url: http://dx.doi.org/10.1007/978-3-319-31277-4_25
full_text_status: none
publication: Recent Developments in Intelligent Information and Database Systems
volume: 642
pagerange: 287-296
refereed: TRUE
book_title: Recent Developments in Intelligent Information and Database Systems
citation: Duong, Tran Duc and Pham, Bao Son and Tan, Hanh (2016) Using Content-Based Features for Author Profiling of Vietnamese Forum Posts. In: Recent Developments in Intelligent Information and Database Systems. Springer International Publishing, pp. 287-296.