eprintid: 2912
rev_number: 15
eprint_status: archive
userid: 378
dir: disk0/00/00/29/12
datestamp: 2018-01-12 02:00:47
lastmod: 2018-01-12 02:00:58
status_changed: 2018-01-12 02:00:58
type: monograph
metadata_visibility: show
creators_name: Nguyen, Hong-Thinh
creators_id: hongthinh.nguyen@vnu.edu.vn
title: RNN on Machine Reading Comprehension Bi-Directional Attention Flow model
ispublished: pub
subjects: ECE
divisions: fac_fet
keywords: RNN, Natural Language Processing
abstract: Although end-to-end deep neural networks have gained popularity in recent years and have succeeded in several Natural Language Processing tasks, reading comprehension remains a challenging one. In this report, we present in detail the popular Bi-Directional Attention Flow model, which represents the context at different levels of granularity and combines attention in both the context-to-query and query-to-context directions. All necessary background on general Recurrent Neural Networks is also covered.
date: 2017-12-15
date_type: completed
publisher: University of Engineering and Technology
contact_email: hongthinh.nguyen@vnu.edu.vn
full_text_status: public
monograph_type: technical_report
place_of_pub: University of Engineering and Technology
pages: 17
institution: Signal and System Laboratory
department: Faculty of Electrical Engineering
referencetext:
[1] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate". In: CoRR abs/1409.0473 (2014). arXiv: 1409.0473. url: http://arxiv.org/abs/1409.0473.
[2] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. "Learning long-term dependencies with gradient descent is difficult". In: IEEE Transactions on Neural Networks 5.2 (1994), pp. 157-166.
[3] Kyunghyun Cho et al. "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". In: CoRR abs/1406.1078 (2014). arXiv: 1406.1078. url: http://arxiv.org/abs/1406.1078.
[4] Junyoung Chung et al. "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". In: CoRR abs/1412.3555 (2014). arXiv: 1412.3555. url: http://arxiv.org/abs/1412.3555.
[5] Sepp Hochreiter and Jürgen Schmidhuber. "Long short-term memory". In: Neural Computation 9.8 (1997), pp. 1735-1780.
[6] Xuezhe Ma and Eduard H. Hovy. "End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF". In: CoRR abs/1603.01354 (2016). arXiv: 1603.01354. url: http://arxiv.org/abs/1603.01354.
[7] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. "GloVe: Global Vectors for Word Representation". In: Empirical Methods in Natural Language Processing (EMNLP). 2014, pp. 1532-1543. url: http://www.aclweb.org/anthology/D14-1162.
[8] Pranav Rajpurkar et al. "SQuAD: 100,000+ Questions for Machine Comprehension of Text". In: CoRR abs/1606.05250 (2016). arXiv: 1606.05250. url: http://arxiv.org/abs/1606.05250.
[9] Min Joon Seo et al. "Bidirectional Attention Flow for Machine Comprehension". In: CoRR abs/1611.01603 (2016). arXiv: 1611.01603. url: http://arxiv.org/abs/1611.01603.
[10] Rupesh Kumar Srivastava, Klaus Greff, and Jürgen Schmidhuber. "Highway networks". In: arXiv preprint arXiv:1505.00387 (2015).
citation: Nguyen, Hong-Thinh (2017) RNN on Machine Reading Comprehension Bi-Directional Attention Flow model. Technical Report. University of Engineering and Technology, University of Engineering and Technology.
document_url: https://eprints.uet.vnu.edu.vn/eprints/id/eprint/2912/1/technical%20report.pdf
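
The abstract above names the two attention directions that the Bi-Directional Attention Flow model combines. Below is a minimal NumPy sketch of that attention-flow layer, following the description in Seo et al. [9]; the function name, the toy shapes, and the weight vector w_s are illustrative assumptions (in the original model, H and U are contextual BiLSTM encodings of the context and query).

# Minimal sketch of BiDAF's attention-flow layer (after Seo et al. [9]).
# Names, shapes, and the weight w_s are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U, w_s):
    """H: (T, d) context encodings; U: (J, d) query encodings;
    w_s: (3d,) similarity weights. Returns G: (T, 4d)."""
    T, _ = H.shape
    J, _ = U.shape
    # Similarity matrix: S[t, j] = w_s . [h; u; h*u]
    S = np.empty((T, J))
    for t in range(T):
        for j in range(J):
            S[t, j] = w_s @ np.concatenate([H[t], U[j], H[t] * U[j]])
    # Context-to-query (C2Q): an attended query vector for each context word
    A = softmax(S, axis=1)               # (T, J)
    U_tilde = A @ U                      # (T, d)
    # Query-to-context (Q2C): weight context words by their best query match
    b = softmax(S.max(axis=1))           # (T,)
    h_tilde = b @ H                      # (d,)
    H_tilde = np.tile(h_tilde, (T, 1))   # (T, d), tiled across context
    # Merge both directions into a query-aware context representation G
    return np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)

# Toy usage with random encodings
rng = np.random.default_rng(0)
T, J, d = 5, 3, 4
G = bidaf_attention(rng.normal(size=(T, d)),
                    rng.normal(size=(J, d)),
                    rng.normal(size=3 * d))
print(G.shape)  # (5, 16)

Note the asymmetry between the two directions: C2Q produces a distinct attended query vector per context position, while Q2C produces a single context summary that is tiled across all positions before the four views are concatenated into G.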