Title | Estimating evolutionary distances between genomic sequences from spaced-word matches |
---|---|
Published in | Algorithms for Molecular Biology, February 2015 |
DOI | 10.1186/s13015-015-0032-x |
Pubmed ID | |
Authors | Burkhard Morgenstern, Bingyao Zhu, Sebastian Horwege, Chris André Leimeister |
Abstract | Alignment-free methods are increasingly used to calculate evolutionary distances between DNA and protein sequences as a basis of phylogeny reconstruction. Most of these methods, however, use heuristic distance functions that are not based on any explicit model of molecular evolution. Herein, we propose a simple estimator d_N of the evolutionary distance between two DNA sequences that is calculated from the number N of (spaced) word matches between them. We show that this distance function is more accurate than other distance measures that are used by alignment-free methods. In addition, we calculate the variance of the normalized number N of (spaced) word matches. We show that the variance of N is smaller for spaced words than for contiguous words, and that the variance is further reduced if our spaced-words approach is used with multiple patterns of 'match positions' and 'don't care positions'. Our software is available online and as downloadable source code at: http://spaced.gobics.de/. |
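To make the quantity N from the abstract concrete, the following is a minimal sketch of counting spaced-word matches between two DNA sequences for a single pattern. It is an illustration only, not the authors' software: the function names `spaced_words` and `count_spaced_word_matches`, and the encoding of a pattern as a string of `1` (match position) and `0` (don't care position), are assumptions made for this example. It stops at the raw count and does not attempt to reproduce the estimator d_N.

```python
from collections import Counter


def spaced_words(seq, pattern):
    """Yield the spaced word at every window of `seq`.

    `pattern` is a string of '1' (match position) and '0' (don't care
    position); only characters at match positions are kept.
    This encoding is an assumption made for illustration.
    """
    match_idx = [i for i, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    for start in range(len(seq) - span + 1):
        yield "".join(seq[start + i] for i in match_idx)


def count_spaced_word_matches(seq1, seq2, pattern):
    """Count position pairs (i, j) whose spaced words are identical."""
    counts1 = Counter(spaced_words(seq1, pattern))
    counts2 = Counter(spaced_words(seq2, pattern))
    # Each occurrence in seq1 pairs with each occurrence in seq2.
    return sum(n * counts2[word] for word, n in counts1.items())


if __name__ == "__main__":
    a = "ACGTACGT"
    b = "ACCTACGA"
    print(count_spaced_word_matches(a, b, "1101"))  # spaced pattern
    print(count_spaced_word_matches(a, b, "1111"))  # contiguous words
```

A contiguous word is the special case of a pattern with no don't care positions, so the same function covers both settings compared in the abstract. Using several patterns would amount to repeating the count for each pattern and combining the results.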
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 2 | 100% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 1 | 50% |
Scientists | 1 | 50% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 1 | 2% |
Germany | 1 | 2% |
France | 1 | 2% |
Unknown | 40 | 93% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 20 | 47% |
Researcher | 6 | 14% |
Professor > Associate Professor | 3 | 7% |
Professor | 2 | 5% |
Student > Bachelor | 2 | 5% |
Other | 3 | 7% |
Unknown | 7 | 16% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 18 | 42% |
Biochemistry, Genetics and Molecular Biology | 8 | 19% |
Agricultural and Biological Sciences | 6 | 14% |
Engineering | 2 | 5% |
Psychology | 1 | 2% |
Other | 0 | 0% |
Unknown | 8 | 19% |