
Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches

Overview of attention for article published in BMC Genomics, August 2016

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • Good Attention Score compared to outputs of the same age (73rd percentile)
  • Good Attention Score compared to outputs of the same age and source (79th percentile)

Mentioned by

9 X users

Readers on

157 Mendeley
1 CiteULike
Title
Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches
Published in
BMC Genomics, August 2016
DOI 10.1186/s12864-016-2895-8
Authors

Yesesri Cherukuri, Sarath Chandra Janga

Abstract

Improved DNA sequencing methods have transformed the field of genomics over the last decade. This became possible through the development of inexpensive short-read sequencing technologies, which have now resulted in three generations of sequencing platforms. More recently, a fourth generation of Nanopore-based single-molecule sequencing technology was developed, based on the MinION® sequencer, which is portable, inexpensive and fast. It is capable of generating reads longer than 100 kb. Although it has many specific advantages, the two major limitations of MinION reads are high error rates and the need to develop downstream pipelines. Algorithms for error correction have already emerged, while the development of pipelines is still at a nascent stage. In this study, we benchmarked available assembly algorithms to find an appropriate framework that can efficiently assemble Nanopore-sequenced reads. To this end, we used genome-scale Nanopore-sequenced datasets available for the E. coli and yeast genomes. To comprehensively evaluate multiple algorithmic frameworks, we included assemblers based on de Bruijn graph (Velvet and ABySS), Overlap Layout Consensus (OLC) (Celera) and greedy extension (SSAKE) approaches. We analyzed the quality and accuracy of the assemblies as well as the computational performance of each assembler included in our benchmark. Our analysis revealed that the OLC-based algorithm, Celera, could generate a high-quality assembly with ten times higher N50 and mean contig values as well as one-fifth the total number of contigs compared to the other tools. Celera was also found to exhibit an average genome coverage of 12 % in the E. coli dataset and 70 % in the yeast dataset, as well as relatively shorter run times.
In contrast, the de Bruijn graph-based assemblers Velvet and ABySS generated assemblies of moderate quality, in less time when there was no limitation on memory allocation, while the greedy extension-based algorithm SSAKE generated an assembly of very poor quality but with genome coverage of 90 % on the yeast dataset. OLC can be considered a favorable algorithmic framework for the development of assembler tools for Nanopore-based data, followed by de Bruijn-based algorithms, as they take less or similar run time compared with OLC-based algorithms to generate an assembly, irrespective of the memory allocated for the task. However, a few improvements must be made to the existing de Bruijn implementations in order to generate an assembly of reasonable quality. Our findings should help stimulate the development of novel assemblers for handling Nanopore sequence data.
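
The abstract compares assemblies by N50, mean contig length and total number of contigs. As a minimal sketch, assuming the usual definition of N50 (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length), these summary values could be computed as below. The function names `n50` and `summarize` and the toy contig lengths are hypothetical illustrations, not the authors' evaluation pipeline or data.

```python
from statistics import mean


def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bases)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def summarize(contig_lengths):
    """Collect the contig statistics named in the abstract."""
    return {
        "num_contigs": len(contig_lengths),
        "total_bases": sum(contig_lengths),
        "mean_contig": mean(contig_lengths),
        "n50": n50(contig_lengths),
    }


if __name__ == "__main__":
    # Toy lengths (hypothetical): total is 290, so N50 is reached at 80 + 70 = 150.
    toy = [80, 70, 50, 40, 30, 20]
    print(summarize(toy))
```

On the toy input the N50 is 70, because the two longest contigs together cover at least half of the 290 bases. A higher N50 with fewer contigs, as reported for Celera, indicates a less fragmented assembly.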

X Demographics

The data shown below were collected from the profiles of 9 X users who shared this research output. Click here to find out more about how the information was compiled.

Mendeley readers

The data shown below were compiled from readership statistics for 157 Mendeley readers of this research output. Click here to see the associated Mendeley record.

Geographical breakdown

Country Count As %
Brazil 3 2%
Netherlands 1 <1%
Germany 1 <1%
Sweden 1 <1%
United States 1 <1%
Unknown 150 96%

Demographic breakdown

Readers by professional status Count As %
Student > Ph. D. Student 36 23%
Student > Master 32 20%
Researcher 30 19%
Student > Bachelor 15 10%
Student > Doctoral Student 8 5%
Other 21 13%
Unknown 15 10%
Readers by discipline Count As %
Biochemistry, Genetics and Molecular Biology 48 31%
Agricultural and Biological Sciences 46 29%
Computer Science 20 13%
Engineering 6 4%
Medicine and Dentistry 3 2%
Other 10 6%
Unknown 24 15%

Attention Score in Context

This research output has an Altmetric Attention Score of 6. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 31 August 2016.
All research outputs: #5,743,586 of 23,498,099 outputs
Outputs from BMC Genomics: #2,285 of 10,787 outputs
Outputs of similar age: #89,496 of 345,863 outputs
Outputs of similar age from BMC Genomics: #54 of 273 outputs
Altmetric has tracked 23,498,099 research outputs across all sources so far. Compared to these, this one has done well and is in the 75th percentile: it's in the top 25% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 10,787 research outputs from this source. They receive a mean Attention Score of 4.7. This one has done well, scoring higher than 78% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 345,863 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 73% of its contemporaries.
We're also able to compare this research output to 273 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 79% of its contemporaries.