eprintid: 4204
rev_number: 6
eprint_status: archive
userid: 327
dir: disk0/00/00/42/04
datestamp: 2020-12-09 03:20:02
lastmod: 2020-12-09 03:20:02
status_changed: 2020-12-09 03:20:02
type: article
metadata_visibility: show
creators_name: Pham, Thi Quynh Trang
creators_name: Bui, Manh Thang
creators_name: Dang, Thanh Hai
creators_id: trangptq@vnu.edu.vn
creators_id: hai.dang@vnu.edu.vn
title: Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature
ispublished: pub
subjects: IT
divisions: fac_fit
abstract: According to a 2009 study, chemical compounds (drugs) and diseases are among the keywords most frequently searched on the PubMed database of biomedical literature by biomedical researchers worldwide. Working with PubMed is essential for researchers to gain insight into drugs’ side effects (chemical-induced disease relations, CDR), which is crucial for drug safety and toxicity. It is, however, a heavy burden for them, as PubMed is a huge and rapidly growing database of unstructured text (currently ~28 million scientific articles, with approximately two deposited per minute). As a result, biomedical text mining has empirically demonstrated its great importance to biomedical research communities. Biomedical text has its own distinct and challenging properties, attracting much attention from natural language processing communities. A large-scale study in 2018 showed that, for biLSTM, feeding information into independent multiple-input layers outperforms concatenating it into a single input layer, yielding better performance than state-of-the-art CDR classification models. This paper demonstrates that for a CNN the opposite holds: concatenation is better for CDR classification. To this end, we develop a CNN-based model whose multiple inputs are concatenated for CDR classification.
Experimental results on the benchmark dataset show that it outperforms other recent state-of-the-art CDR classification models.
date: 2020-05
date_type: published
publisher: VNU
full_text_status: none
publication: VNU Journal of Science: Computer Science and Communication Engineering
volume: 36
number: 1
pagerange: 11-16
refereed: TRUE
issn: 2588-1086
related_url_url: https://jcsce.vnu.edu.vn/index.php/jcsce/article/view/237
citation: Pham, Thi Quynh Trang and Bui, Manh Thang and Dang, Thanh Hai (2020) Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature. VNU Journal of Science: Computer Science and Communication Engineering, 36 (1). pp. 11-16. ISSN 2588-1086