Pham, Thi Ngan and Nguyen, Van Quang and Tran, Van Hien and Nguyen, Tri Thanh and Ha, Quang Thuy (2017) A semi-supervised multi-label classification framework with feature reduction and enrichment. Journal of Information and Telecommunication.
Full text not available from this repository.

Abstract
Multi-label classification has drawn much attention thanks to its usefulness and omnipresence in real-world applications in which objects may be characterized by more than one label, rather than by a single label as in the traditional approach. Obtaining multi-label examples is costly and time-consuming; therefore, a semi-supervised learning approach should be considered to take advantage of both labeled and unlabeled data. In this work, we propose a semi-supervised multi-label classification algorithm that combines two ideas: exploiting the specific features of the prominent class label(s) chosen by a greedy approach, as an extension of the LIFT algorithm, and the unlabeled-data consumption mechanism of the TESC algorithm. We also build a semi-supervised multi-label classification application framework for Vietnamese texts with several feature enrichment steps, including a) a stage of enriching features by adding hidden topic features; b) a stage of dimensionality reduction for removing irrelevant features. Experimental results on a dataset of hotel reviews (for tourism) indicate that a reasonable amount of unlabeled data helps to increase the F1 score. Interestingly, with a small amount of labeled data, our algorithm can reach performance comparable to that obtained with a larger amount of labeled data.
Item Type: | Article |
---|---|
Subjects: | Information Technology (IT) |
Divisions: | Faculty of Information Technology (FIT) |
Depositing User: | Ass. Prof. Tri-Thanh NGUYEN |
Date Deposited: | 14 Jun 2017 08:59 |
Last Modified: | 14 Jun 2017 08:59 |
URI: | http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2504 |