eprintid: 4533
rev_number: 6
eprint_status: archive
userid: 15
dir: disk0/00/00/45/33
datestamp: 2021-06-28 02:33:18
lastmod: 2021-06-28 02:33:18
status_changed: 2021-06-28 02:33:18
type: article
succeeds: 4531
metadata_visibility: show
creators_name: Minh, Bui Quang
creators_name: Dang, Cao Cuong
creators_name: Vinh, Le Sy
creators_name: Lanfear, Robert
creators_id: m.bui@anu.edu.au
creators_id: cuongdc@vnu.edu.vn
creators_id: vinhls@vnu.edu.vn
creators_id: rob.lanfear@anu.edu.au
title: QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution
ispublished: pub
subjects: IT
subjects: isi
divisions: fac_fit
abstract: Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time-reversible Q matrix from a large protein data set consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration of rate mixture models among sites. We provide QMaker as a user-friendly function in the IQ-TREE software package (http://www.iqtree.org) supporting the use of multiple CPU cores so that biologists can easily estimate amino acid substitution models from their own protein alignments. We used QMaker to estimate new empirical general amino acid substitution models from the current Pfam database as well as five clade-specific models for mammals, birds, insects, yeasts, and plants. Our results show that the new models considerably improve the fit between model and data and in some cases influence the inference of phylogenetic tree topologies. [Amino acid replacement matrices; amino acid substitution models; maximum likelihood estimation; phylogenetic inferences.]
date: 2021-02-22
date_type: published
publisher: Oxford University Press
official_url: http://dx.doi.org/10.1093/sysbio/syab010
id_number: 10.1093/sysbio/syab010
full_text_status: public
publication: Systematic Biology
refereed: TRUE
issn: 1063-5157
citation: Minh, Bui Quang and Dang, Cao Cuong and Vinh, Le Sy and Lanfear, Robert (2021) QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution. Systematic Biology. ISSN 1063-5157
document_url: https://eprints.uet.vnu.edu.vn/eprints/id/eprint/4533/1/syab010.pdf