Title | A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms |
---|---|
Published in | BMC Genomics, December 2017 |
DOI | 10.1186/s12864-017-4270-9 |
Pubmed ID | |
Authors | Sing-Hoi Sze, Jonathan J. Parrott, Aaron M. Tarone |
Abstract | While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing number of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies. We develop a divide-and-conquer strategy that allows these algorithms to be utilized, by subdividing a large RNA-Seq data set into small libraries. Each individual library is assembled independently by an existing algorithm, and a merging algorithm is developed to combine these assemblies by picking a subset of high quality transcripts to form a large transcriptome. When compared to existing algorithms that return a single assembly directly, this strategy achieves accuracy comparable to or higher than that of memory-efficient algorithms that can process a large amount of RNA-Seq data, and accuracy comparable to or lower than that of memory-intensive algorithms that can only be used to construct small assemblies. Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies. |
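
The overall shape of the strategy described in the abstract (subdivide, assemble each library independently, then merge) might be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `assemble` callback, and the `library_size` and `min_length` parameters are all hypothetical, and the merge rule used here (drop transcripts that are too short or fully contained in a longer kept transcript) is a simple stand-in, not the merging algorithm developed in the paper.

```python
from typing import Callable, Iterable, List, Sequence


def split_into_libraries(reads: Sequence[str], library_size: int) -> List[List[str]]:
    """Divide step: cut the read set into consecutive chunks of at most library_size reads."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return [list(reads[i:i + library_size]) for i in range(0, len(reads), library_size)]


def merge_assemblies(assemblies: Iterable[Iterable[str]], min_length: int = 1) -> List[str]:
    """Combine step. ASSUMPTION: placeholder rule, not the paper's method.

    Keeps a transcript unless it is shorter than min_length or is an exact
    substring of a longer transcript that has already been kept.
    """
    # Deduplicate, then visit the longest transcripts first (ties broken alphabetically).
    candidates = sorted(
        {t for assembly in assemblies for t in assembly if len(t) >= min_length},
        key=lambda t: (-len(t), t),
    )
    kept: List[str] = []
    for transcript in candidates:
        if not any(transcript in longer for longer in kept):
            kept.append(transcript)
    return kept


def divide_and_conquer_assemble(
    reads: Sequence[str],
    library_size: int,
    assemble: Callable[[List[str]], List[str]],
    min_length: int = 1,
) -> List[str]:
    """Assemble each small library independently with `assemble`, then merge the results.

    `assemble` is a hypothetical wrapper around whichever existing assembler is used.
    """
    libraries = split_into_libraries(reads, library_size)
    return merge_assemblies((assemble(lib) for lib in libraries), min_length)
```

Because each library is handed to `assemble` separately, peak memory in this sketch is bounded by the largest single library rather than by the full data set, which is the property the abstract attributes to the divide-and-conquer approach.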
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United Kingdom | 4 | 21% |
Canada | 1 | 5% |
United States | 1 | 5% |
Venezuela, Bolivarian Republic of | 1 | 5% |
France | 1 | 5% |
New Zealand | 1 | 5% |
Australia | 1 | 5% |
India | 1 | 5% |
Taiwan | 1 | 5% |
Other | 0 | 0% |
Unknown | 7 | 37% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 11 | 58% |
Scientists | 8 | 42% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 25 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 9 | 36% |
Student > Master | 5 | 20% |
Researcher | 3 | 12% |
Student > Doctoral Student | 3 | 12% |
Student > Bachelor | 2 | 8% |
Other | 1 | 4% |
Unknown | 2 | 8% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 6 | 24% |
Agricultural and Biological Sciences | 6 | 24% |
Computer Science | 4 | 16% |
Environmental Science | 2 | 8% |
Pharmacology, Toxicology and Pharmaceutical Science | 1 | 4% |
Other | 2 | 8% |
Unknown | 4 | 16% |