Le, Hoang Quynh and Tran, Mai Vu and Dang, Thanh Hai and Ha, Quang Thuy and Collier, Nigel (2016) Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction. Database, 2016. baw102. ISSN 1758-0463
There is a more recent version of this item available.
Abstract
The BioCreative V chemical-disease relation (CDR) track was proposed to accelerate progress in text mining for the integrative understanding of chemical substances, diseases and their relations. In this article, we describe an extension of the UET-CAM system for mining chemical-disease relations from text data, whose performance was ranked 4th among the 18 participating systems by the BioCreative CDR track committee. In the Disease Named Entity Recognition and Normalization (DNER) phase, our system employs joint learning with a perceptron-based named entity recognizer (NER) and a back-off model with Semantic Supervised Indexing (SSI) and Skip-gram for named entity normalization (NEN). Crucially, to solve the chemical-induced disease (CID) sub-task, we propose a pipeline that includes a coreference resolution module and an SVM intra-sentence relation extraction model. The former module uses a multi-pass sieve to identify inter-sentence references to entities, while the latter is trained on both the CDR data and our silverCID corpus with a rich feature set. SilverCID is a silver-standard corpus of more than 50 thousand sentences, built automatically from the CTD database to provide evidence for CID relation extraction. We critically evaluated our method on the CDR test set to clarify the contribution of each system component. Results show an F1 of 82.44 for the DNER task and a best F1 of 58.90 on the CID task. The comparisons also demonstrate the significant contribution of the multi-pass sieve coreference resolution method and the silverCID corpus.
Item Type: | Article |
---|---|
Subjects: | Information Technology (IT) ISI-indexed journals |
Divisions: | Faculty of Information Technology (FIT) |
Depositing User: | Hà Quang Thụy |
Date Deposited: | 07 Dec 2016 08:24 |
Last Modified: | 07 Dec 2016 08:26 |
URI: | http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2001 |