%0 Journal Article
%@ 2588-1086
%A Ngo, Kien Tuan
%A Vo, Dinh Hieu
%A Bui, Ngoc Thang
%A Pham, Le Viet Anh
%A Pham, Khanh Ly
%A Phan, Hai
%D 2020
%F SisLab:4077
%I VNU
%J VNU Journal of Science: Computer Science and Communication Engineering
%T On Rectifying the Mapping between Articles and Institutions in Bibliometric Databases
%U https://eprints.uet.vnu.edu.vn/eprints/id/eprint/4077/
%X Today, bibliometric databases are indispensable sources for researchers and research institutions. The main role of these databases is to find research articles and estimate the performance of researchers and institutions. When evaluating the research performance of an organization, accurately determining the institutions of an article's authors is decisive. However, popular bibliometric databases such as Scopus and Web of Science have not addressed this point efficiently. To this end, we propose an approach to revise the authors’ affiliation information of articles in bibliometric databases. We build a model that classifies articles to institutions with high accuracy by combining bag-of-words and n-gram techniques to extract features from affiliation strings. These features are then weighted to determine their importance to each institution. Affiliation strings of articles are transformed into the new feature space by integrating the weights of features and the local characteristics of the words and phrases that make up the sequences. Finally, on this feature space, a support vector classifier is applied to learn a predictive model. Our experimental results show that the proposed model achieves an accuracy of about 99.1%.
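
The feature-extraction and weighting steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the weighting formula or how the feature space is built, so everything below is an assumption made for illustration: the function names (`extract_features`, `institution_weights`, `to_vector`), the weight (the share of affiliation strings containing a feature that belong to a given institution), the one-dimension-per-institution vector, and the sample affiliation strings and labels. It is not the authors' implementation.

```python
import re
from collections import defaultdict


def extract_features(affiliation, max_n=2):
    """Bag of words plus word n-grams (up to max_n) of an affiliation string."""
    tokens = re.findall(r"\w+", affiliation.lower())
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


def institution_weights(labeled):
    """Assumed weighting: for each feature, the fraction of the strings
    containing it that belong to each institution."""
    counts = defaultdict(lambda: defaultdict(int))
    for affiliation, institution in labeled:
        for feat in set(extract_features(affiliation)):
            counts[feat][institution] += 1
    weights = {}
    for feat, per_inst in counts.items():
        total = sum(per_inst.values())
        weights[feat] = {inst: c / total for inst, c in per_inst.items()}
    return weights


def to_vector(affiliation, weights, institutions):
    """Assumed feature space: one dimension per institution, holding the
    mean weight of the string's features (unseen features count as 0)."""
    feats = extract_features(affiliation)
    if not feats:
        return [0.0] * len(institutions)
    return [
        sum(weights.get(f, {}).get(inst, 0.0) for f in feats) / len(feats)
        for inst in institutions
    ]


# Hypothetical training data: (affiliation string, institution label).
labeled = [
    ("University of Engineering and Technology, Vietnam National University, Hanoi", "VNU-UET"),
    ("VNU University of Engineering and Technology, Hanoi, Vietnam", "VNU-UET"),
    ("Hanoi University of Science and Technology, Hanoi, Vietnam", "HUST"),
    ("School of ICT, Hanoi University of Science and Technology", "HUST"),
]
institutions = ["HUST", "VNU-UET"]
weights = institution_weights(labeled)
vec = to_vector("Univ. of Engineering and Technology, VNU, Hanoi", weights, institutions)
```

In a full pipeline, vectors like `vec` would be passed to a support vector classifier (for example, scikit-learn's `sklearn.svm.SVC`) for training and prediction; that step is omitted here to keep the sketch dependency-free.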