eprintid: 2912
rev_number: 15
eprint_status: archive
userid: 378
dir: disk0/00/00/29/12
datestamp: 2018-01-12 02:00:47
lastmod: 2018-01-12 02:00:58
status_changed: 2018-01-12 02:00:58
type: monograph
metadata_visibility: show
creators_name: Nguyen, Hong-Thinh
creators_id: hongthinh.nguyen@vnu.edu.vn
title: RNN on Machine Reading Comprehension Bi-Directional Attention Flow model
ispublished: pub
subjects: ECE
divisions: fac_fet
keywords: RNN, Natural Language Processing
abstract: Although end-to-end deep neural networks have gained popularity in recent years and have succeeded in several Natural Language Processing tasks, reading comprehension remains a challenging one. In this report, we present in detail the popular Bi-Directional Attention Flow model, which represents the context at different levels of granularity and combines attention in both the context-to-query and query-to-context directions. All necessary background on general Recurrent Neural Networks is also covered.
date: 2017-12-15
date_type: completed
publisher: University of Engineering and Technology
contact_email: hongthinh.nguyen@vnu.edu.vn
full_text_status: public
monograph_type: technical_report
place_of_pub: University of Engineering and Technology
pages: 17
institution: Signal and System Laboratory
department: Faculty of Electrical Engineering
referencetext:
[1] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate". In: CoRR abs/1409.0473 (2014). arXiv: 1409.0473. url: http://arxiv.org/abs/1409.0473.
[2] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. "Learning long-term dependencies with gradient descent is difficult". In: IEEE Transactions on Neural Networks 5.2 (1994), pp. 157-166.
[3] Kyunghyun Cho et al. "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". In: CoRR abs/1406.1078 (2014). arXiv: 1406.1078. url: http://arxiv.org/abs/1406.1078.
[4] Junyoung Chung et al. "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". In: CoRR abs/1412.3555 (2014). arXiv: 1412.3555. url: http://arxiv.org/abs/1412.3555.
[5] Sepp Hochreiter and Jürgen Schmidhuber. "Long short-term memory". In: Neural Computation 9.8 (1997), pp. 1735-1780.
[6] Xuezhe Ma and Eduard H. Hovy. "End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF". In: CoRR abs/1603.01354 (2016). arXiv: 1603.01354. url: http://arxiv.org/abs/1603.01354.
[7] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. "GloVe: Global Vectors for Word Representation". In: Empirical Methods in Natural Language Processing (EMNLP). 2014, pp. 1532-1543. url: http://www.aclweb.org/anthology/D14-1162.
[8] Pranav Rajpurkar et al. "SQuAD: 100,000+ Questions for Machine Comprehension of Text". In: CoRR abs/1606.05250 (2016). arXiv: 1606.05250. url: http://arxiv.org/abs/1606.05250.
[9] Min Joon Seo et al. "Bidirectional Attention Flow for Machine Comprehension". In: CoRR abs/1611.01603 (2016). arXiv: 1611.01603. url: http://arxiv.org/abs/1611.01603.
[10] Rupesh Kumar Srivastava, Klaus Greff, and Jürgen Schmidhuber. "Highway networks". In: arXiv preprint arXiv:1505.00387 (2015).
citation: Nguyen, Hong-Thinh (2017) RNN on Machine Reading Comprehension Bi-Directional Attention Flow model. Technical Report. University of Engineering and Technology, University of Engineering and Technology.
document_url: https://eprints.uet.vnu.edu.vn/eprints/id/eprint/2912/1/technical%20report.pdf
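
The abstract above names the two attention directions that the Bi-Directional Attention Flow model combines. Below is a minimal NumPy sketch of that attention-flow layer, following the description in Seo et al. [9]; the function name, the toy shapes, and the weight vector w_s are illustrative assumptions (in the original model, H and U are contextual BiLSTM encodings of the context and query).

# Minimal sketch of BiDAF's attention-flow layer (after Seo et al. [9]).
# Names, shapes, and the weight w_s are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U, w_s):
    """H: (T, d) context encodings; U: (J, d) query encodings;
    w_s: (3d,) similarity weights. Returns G: (T, 4d)."""
    T, _ = H.shape
    J, _ = U.shape
    # Similarity matrix: S[t, j] = w_s . [h; u; h*u]
    S = np.empty((T, J))
    for t in range(T):
        for j in range(J):
            S[t, j] = w_s @ np.concatenate([H[t], U[j], H[t] * U[j]])
    # Context-to-query (C2Q): an attended query vector for each context word
    A = softmax(S, axis=1)               # (T, J)
    U_tilde = A @ U                      # (T, d)
    # Query-to-context (Q2C): weight context words by their best query match
    b = softmax(S.max(axis=1))           # (T,)
    h_tilde = b @ H                      # (d,)
    H_tilde = np.tile(h_tilde, (T, 1))   # (T, d), tiled across context
    # Merge both directions into a query-aware context representation G
    return np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)

# Toy usage with random encodings
rng = np.random.default_rng(0)
T, J, d = 5, 3, 4
G = bidaf_attention(rng.normal(size=(T, d)),
                    rng.normal(size=(J, d)),
                    rng.normal(size=3 * d))
print(G.shape)  # (5, 16)

Note the asymmetry between the two directions: C2Q produces a distinct attended query vector per context position, while Q2C produces a single context summary that is tiled across all positions before the four views are concatenated into G.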