Title | De novo likelihood-based measures for comparing genome assemblies |
---|---|
Published in | BMC Research Notes, August 2013 |
DOI | 10.1186/1756-0500-6-334 |
Pubmed ID | |
Authors | Mohammadreza Ghodsi, Christopher M Hill, Irina Astrovskaya, Henry Lin, Dan D Sommer, Sergey Koren, Mihai Pop |
Abstract | The current revolution in genomics has been made possible by software tools called genome assemblers, which stitch together DNA fragments "read" by sequencing machines into complete or nearly complete genome sequences. Despite decades of research in this field and the development of dozens of genome assemblers, assessing and comparing the quality of assembled genome sequences still relies on the availability of independently determined standards, such as manually curated genome sequences, or independently produced mapping data. These "gold standards" can be expensive to produce and may only cover a small fraction of the genome, which limits their applicability to newly generated genome sequences. Here we introduce a de novo probabilistic measure of assembly quality which allows for an objective comparison of multiple assemblies generated from the same set of reads. We define the quality of a sequence produced by an assembler as the conditional probability of observing the sequenced reads from the assembled sequence. A key property of our metric is that the true genome sequence maximizes the score, unlike other commonly used metrics. |
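
The abstract defines assembly quality as the conditional probability of the sequenced reads given the assembled sequence. The following is a minimal illustrative sketch of that idea under strong simplifying assumptions that are not taken from the paper: reads are error-free, each read is sampled uniformly from the start positions of the assembly, reads are independent, and a read that does not occur in the assembly receives a small floor probability. The function names (`count_occurrences`, `read_probability`, `log_likelihood`) and the `floor` parameter are hypothetical and chosen only for this example.

```python
import math


def count_occurrences(assembly: str, read: str) -> int:
    """Count exact occurrences of `read` in `assembly`, overlaps included."""
    count = 0
    start = assembly.find(read)
    while start != -1:
        count += 1
        start = assembly.find(read, start + 1)
    return count


def read_probability(assembly: str, read: str, floor: float = 1e-9) -> float:
    """Toy estimate of P(read | assembly) under uniform, error-free sampling.

    `floor` is an assumed stand-in for sequencing error: it keeps the
    probability of an unplaceable read above zero.
    """
    positions = len(assembly) - len(read) + 1
    if positions <= 0:
        return floor
    return max(count_occurrences(assembly, read) / positions, floor)


def log_likelihood(assembly: str, reads: list[str], floor: float = 1e-9) -> float:
    """Sum of per-read log-probabilities, treating reads as independent."""
    return sum(math.log(read_probability(assembly, r, floor)) for r in reads)


# Hypothetical toy data: one candidate contains every read, the other
# has lost a base and can no longer explain the read "GTTG".
reads = ["ACGT", "GTTG", "TGCA"]
candidate_a = "ACGTTGCA"
candidate_b = "ACGTGCA"

score_a = log_likelihood(candidate_a, reads)
score_b = log_likelihood(candidate_b, reads)
print(f"candidate_a: {score_a:.3f}")
print(f"candidate_b: {score_b:.3f}")
```

In this toy setting the candidate that explains all reads scores higher, because the one unplaceable read in the other candidate contributes the floor probability. Scores computed this way are only comparable across assemblies built from the same read set, which matches the use case the abstract describes.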
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 10 | 31% |
United Kingdom | 4 | 13% |
Sweden | 1 | 3% |
Cameroon | 1 | 3% |
Germany | 1 | 3% |
Norway | 1 | 3% |
Chile | 1 | 3% |
Mexico | 1 | 3% |
Brazil | 1 | 3% |
Other | 2 | 6% |
Unknown | 9 | 28% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 19 | 59% |
Members of the public | 12 | 38% |
Science communicators (journalists, bloggers, editors) | 1 | 3% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 6 | 5% |
Norway | 2 | 2% |
Germany | 2 | 2% |
Italy | 1 | <1% |
Brazil | 1 | <1% |
United Kingdom | 1 | <1% |
Sweden | 1 | <1% |
Japan | 1 | <1% |
New Zealand | 1 | <1% |
Other | 0 | 0% |
Unknown | 102 | 86% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 36 | 31% |
Student > Ph. D. Student | 29 | 25% |
Student > Master | 13 | 11% |
Student > Bachelor | 8 | 7% |
Student > Postgraduate | 7 | 6% |
Other | 18 | 15% |
Unknown | 7 | 6% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 63 | 53% |
Computer Science | 23 | 19% |
Biochemistry, Genetics and Molecular Biology | 15 | 13% |
Mathematics | 3 | 3% |
Earth and Planetary Sciences | 2 | 2% |
Other | 3 | 3% |
Unknown | 9 | 8% |