TY  - JOUR
ID  - SisLab4204
UR  - https://eprints.uet.vnu.edu.vn/eprints/id/eprint/4204/
IS  - 1
A1  - Pham, Thi Quynh Trang
A1  - Bui, Manh Thang
A1  - Dang, Thanh Hai
Y1  - 2020/05//
N2  - Chemical compounds (drugs) and diseases are among the top searched keywords on the PubMed database of biomedical literature by biomedical researchers all over the world (according to a study in 2009). Working with PubMed is essential for researchers to gain insight into drugs' side effects (chemical-induced disease relations, CDR), which is essential for drug safety and toxicity. It is, however, a heavy burden for them, as PubMed is a huge database of unstructured text that is growing rapidly (~28 million scientific articles currently, with approximately two deposited per minute). As a result, biomedical text mining has empirically demonstrated its great value to biomedical research communities. Biomedical text has its own distinct, challenging properties, attracting much attention from natural language processing communities. A recent large-scale study (2018) showed that, for a biLSTM, incorporating information into independent multiple-input layers outperforms concatenating it into a single input layer, producing better performance than state-of-the-art CDR classification models. This paper demonstrates that the opposite holds for a CNN: concatenation is better for CDR classification. To this end, we develop a CNN-based model with multiple inputs concatenated for CDR classification. Experimental results on the benchmark dataset demonstrate that it outperforms other recent state-of-the-art CDR classification models.
PB  - VNU
JF  - VNU Journal of Science: Computer Science and Communication Engineering
VL  - 36
SN  - 2588-1086
TI  - Single Concatenated Input is Better than Indenpendent Multiple-input for CNNs to Predict Chemical-induced Disease Relation from Literature
SP  - 11
AV  - none
EP  - 16
ER  -