Title | Ultra-large alignments using phylogeny-aware profiles |
---|---|
Published in | Genome Biology, June 2015 |
DOI | 10.1186/s13059-015-0688-z |
Pubmed ID | |
Authors | Nam-phuong D. Nguyen, Siavash Mirarab, Keerthana Kumar, Tandy Warnow |
Abstract | Many biological questions, including the estimation of deep evolutionary histories and the detection of remote homology between protein sequences, rely upon multiple sequence alignments and phylogenetic trees of large datasets. However, accurate large-scale multiple sequence alignment is very difficult, especially when the dataset contains fragmentary sequences. We present UPP, a multiple sequence alignment method that uses a new machine learning technique - the Ensemble of Hidden Markov Models - that we propose here. UPP produces highly accurate alignments for both nucleotide and amino acid sequences, even on ultra-large datasets or datasets containing fragmentary sequences. UPP is available at https://github.com/smirarab/sepp. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 7 | 25% |
France | 2 | 7% |
United Kingdom | 2 | 7% |
Uruguay | 1 | 4% |
Canada | 1 | 4% |
China | 1 | 4% |
Germany | 1 | 4% |
Netherlands | 1 | 4% |
Norway | 1 | 4% |
Other | 0 | 0% |
Unknown | 11 | 39% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 14 | 50% |
Scientists | 13 | 46% |
Science communicators (journalists, bloggers, editors) | 1 | 4% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 6 | 5% |
Brazil | 2 | 2% |
Netherlands | 1 | <1% |
New Zealand | 1 | <1% |
Unknown | 117 | 92% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 30 | 24% |
Researcher | 27 | 21% |
Student > Master | 12 | 9% |
Student > Bachelor | 9 | 7% |
Professor | 8 | 6% |
Other | 18 | 14% |
Unknown | 23 | 18% |

Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 51 | 40% |
Biochemistry, Genetics and Molecular Biology | 23 | 18% |
Computer Science | 15 | 12% |
Immunology and Microbiology | 2 | 2% |
Chemistry | 2 | 2% |
Other | 6 | 5% |
Unknown | 28 | 22% |