relation: https://eprints.uet.vnu.edu.vn/eprints/id/eprint/4077/
title: On Rectifying the Mapping between Articles and Institutions in Bibliometric Databases
creator: Ngo, Kien Tuan
creator: Vo, Dinh Hieu
creator: Bui, Ngoc Thang
creator: Pham, Le Viet Anh
creator: Pham, Khanh Ly
creator: Phan, Hai
subject: Information Technology (IT)
description: Today, bibliometric databases are indispensable sources for researchers and research institutions. Their main role is to help find research articles and to estimate the performance of researchers and institutions. When evaluating the research performance of an organization, accurately determining the institutions of an article's authors is decisive. However, popular bibliometric databases such as Scopus and Web of Science have not addressed this point efficiently. To this end, we propose an approach to revise the authors' affiliation information of articles in bibliometric databases. We build a model that classifies articles to institutions with high accuracy by combining bag-of-words and n-gram techniques to extract features from affiliation strings. These features are then weighted to determine their importance to each institution. Affiliation strings are transformed into a new feature space by integrating the feature weights with the local characteristics of the words and phrases that make up each string. Finally, a support vector classifier is trained on this feature space to learn a predictive model. Our experimental results show that the proposed model's accuracy is about 99.1%.
publisher: VNU
date: 2020
type: Article
type: PeerReviewed
identifier: Ngo, Kien Tuan and Vo, Dinh Hieu and Bui, Ngoc Thang and Pham, Le Viet Anh and Pham, Khanh Ly and Phan, Hai (2020) On Rectifying the Mapping between Articles and Institutions in Bibliometric Databases. VNU Journal of Science: Computer Science and Communication Engineering. ISSN 2588-1086
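
The feature extraction and weighting steps named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give their formulas, so the function names (`extract_features`, `feature_weights`), the bigram limit, the TF-IDF-style weight, and the sample affiliation strings are all assumptions made for illustration. The support vector classifier step is not shown; it would be trained on vectors built from these weights.

```python
import math
import re
from collections import Counter, defaultdict


def extract_features(affiliation, max_n=2):
    """Return word unigrams (bag of words) plus word n-grams up to max_n."""
    tokens = re.findall(r"\w+", affiliation.lower())
    features = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features


def feature_weights(samples, max_n=2):
    """Weight each feature per institution (assumed TF-IDF-style score).

    samples: list of (affiliation_string, institution_label) pairs.
    Returns {institution: {feature: weight}}.
    """
    doc_count = Counter()             # affiliation strings seen per institution
    feat_count = defaultdict(Counter)  # strings containing a feature, per institution
    for text, label in samples:
        doc_count[label] += 1
        feat_count[label].update(set(extract_features(text, max_n)))

    # Number of institutions in which each feature occurs at least once.
    n_inst = len(feat_count)
    inst_freq = Counter()
    for counts in feat_count.values():
        inst_freq.update(counts.keys())

    weights = {}
    for label, counts in feat_count.items():
        weights[label] = {
            # Common within the institution, rare across institutions -> high weight.
            feat: (c / doc_count[label]) * (math.log(n_inst / inst_freq[feat]) + 1.0)
            for feat, c in counts.items()
        }
    return weights


# Hypothetical training pairs, invented for this example.
samples = [
    ("VNU University of Engineering and Technology, Hanoi", "UET"),
    ("Univ. of Engineering and Technology, Vietnam National University", "UET"),
    ("Hanoi University of Science and Technology", "HUST"),
]
weights = feature_weights(samples)
```

Under this weighting, a token such as "engineering", which occurs only in the strings labeled UET, scores higher for that institution than "technology", which occurs for both labels.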