eprintid: 70
rev_number: 10
eprint_status: archive
userid: 15
dir: disk0/00/00/00/70
datestamp: 2017-06-16 03:48:26
lastmod: 2017-06-16 03:48:26
status_changed: 2017-06-16 03:48:26
type: conference_item
metadata_visibility: show
creators_name: Le, Van Dat
creators_name: Dang, Cao Cuong
creators_name: Le, Si Quang
creators_name: Le, Sy Vinh
creators_id: vinhls@vnu.edu.vn
title: A Fast and Efficient Method for Estimating Amino Acid Substitution Models
ispublished: pub
subjects: IT
divisions: fac_fit
keywords: amino acid substitution models, biology computing, evolution (biological), evolutionary information, genetics, maximum likelihood approaches, maximum likelihood estimation, phylogenetics trees, protein phylogenetics analysis, protein sequence alignment, Proteins, trees (mathematics)
abstract: Amino acid substitution models (matrices) play an important role in protein phylogenetic analysis and protein sequence alignment. Different approaches have been proposed to estimate amino acid substitution matrices since the time of Dayhoff in 1972. Currently, maximum likelihood approaches are widely used to estimate popular matrices such as WAG, LG, and FLU. Although maximum likelihood approaches produce high-quality matrices, they are slow and not applicable to very large datasets. The most time-consuming step in estimating matrices is building phylogenetic trees from protein alignments. In this paper, we propose new methods that overcome this obstacle by splitting large alignments into small ones that still contain enough evolutionary information for estimating matrices. Experiments with both Pfam and FLU datasets showed that the proposed methods were about three to nine times faster than the best current method, while the quality of the estimated matrices was nearly the same. Thus, our methods will enable researchers to estimate matrices from very large datasets.
date: 2011-10
date_type: published
full_text_status: none
pres_type: speech
pagerange: 85-91
event_title: 2011 Third International Conference on Knowledge and Systems Engineering (KSE)
event_dates: 2011
event_type: conference
refereed: TRUE
citation: Le, Van Dat and Dang, Cao Cuong and Le, Si Quang and Le, Sy Vinh (2011) A Fast and Efficient Method for Estimating Amino Acid Substitution Models. In: 2011 Third International Conference on Knowledge and Systems Engineering (KSE), 2011.