eprintid: 4077
rev_number: 8
eprint_status: archive
userid: 288
dir: disk0/00/00/40/77
datestamp: 2020-10-09 07:11:16
lastmod: 2020-10-09 07:11:16
status_changed: 2020-10-09 07:11:16
type: article
metadata_visibility: show
creators_name: Ngo, Kien Tuan
creators_name: Vo, Dinh Hieu
creators_name: Bui, Ngoc Thang
creators_name: Pham, Le Viet Anh
creators_name: Pham, Khanh Ly
creators_name: Phan, Hai
creators_id: hieuvd@vnu.edu.vn
creators_id: thangbn@vnu.edu.vn
title: On Rectifying the Mapping between Articles and Institutions in Bibliometric Databases
ispublished: pub
subjects: IT
divisions: fac_fit
abstract: Today, bibliometric databases are indispensable sources for researchers and research institutions. The main role of these databases is to find research articles and estimate the performance of researchers and institutions. Regarding the evaluation of the research performance of an organization, the accuracy in determining institutions of authors of articles is decisive. However, current popular bibliometric databases such as Scopus and Web of Science have not addressed this point efficiently. To this end, we propose an approach to revise the authors’ affiliation information of articles in bibliometric databases. We build a model to classify articles to institutions with high accuracy by assembling the bag of words and n-grams techniques for extracting features of affiliation strings. After that, these features are weighted to determine their importance to each institution. Affiliation strings of articles are transformed into the new feature space by integrating weights of features and local characteristics of words and phrases contributing to the sequences. Finally, on the feature space, the support vector classifier method is applied to learn a predictive model. Our experimental result shows that the proposed model’s accuracy is about 99.1%.
date: 2020
publisher: VNU
full_text_status: none
publication: VNU Journal of Science: Computer Science and Communication Engineering
refereed: TRUE
issn: 2588-1086
citation: Ngo, Kien Tuan and Vo, Dinh Hieu and Bui, Ngoc Thang and Pham, Le Viet Anh and Pham, Khanh Ly and Phan, Hai (2020) On Rectifying the Mapping between Articles and Institutions in Bibliometric Databases. VNU Journal of Science: Computer Science and Communication Engineering. ISSN 2588-1086