eprintid: 3170
rev_number: 12
eprint_status: archive
userid: 370
dir: disk0/00/00/31/70
datestamp: 2018-12-12 06:35:47
lastmod: 2018-12-12 06:35:47
status_changed: 2018-12-12 06:35:47
type: conference_item
metadata_visibility: show
creators_name: Ho, Thi Nga
creators_name: Can, Duy Cat
creators_name: Chng, Eng Siong
creators_id: ngaht@ntu.edu.sg
creators_id: catcd@vnu.edu.vn
creators_id: ASESChng@ntu.edu.sg
title: An Investigation of Word Embeddings with Deep Bidirectional LSTM for Sentence Unit Detection in Automatic Speech Transcription
ispublished: inpress
subjects: IT
divisions: fac_fit
abstract: This work investigates the effectiveness of word-based and sub-word-based embedding representations as input to a deep bidirectional Long Short-Term Memory (LSTM) network for Sentence Unit Detection (SUD) in Automatic Speech Recognition (ASR) transcription. Our experimental results show that sub-word-based embeddings can significantly improve SUD performance when only limited text is available to train both the word embedding and the SUD model. The SUD model using the sub-word-based embedding gains up to 2.07% absolute improvement in F1-score compared to the best model trained with the word-based embedding. When tested under a domain-mismatch condition, the SUD model with the sub-word-based embedding trained on in-domain data gives improvements of approximately 2% and 1% over the best model using an out-of-domain embedding, on reference transcription and on ASR transcription with a 29.5% Word Error Rate, respectively.
date: 2018-11
date_type: published
contact_email: catcd@vnu.edu.vn
full_text_status: restricted
pres_type: paper
event_title: International Conference on Asian Language Processing (IALP 2018)
event_location: Bandung, Indonesia
event_dates: 15-18 November, 2018
event_type: conference
refereed: TRUE
citation: Ho, Thi Nga and Can, Duy Cat and Chng, Eng Siong (2018) An Investigation of Word Embeddings with Deep Bidirectional LSTM for Sentence Unit Detection in Automatic Speech Transcription. In: International Conference on Asian Language Processing (IALP 2018), 15-18 November, 2018, Bandung, Indonesia. (In Press)
document_url: https://eprints.uet.vnu.edu.vn/eprints/id/eprint/3170/1/paper92.pdf