TraRECo: a greedy approach based de novo transcriptome assembler with read error correction using consensus matrix

Overview of attention for article published in BMC Genomics, September 2018

Mentioned by
2 X users

Citations
3 Dimensions

Readers on
20 Mendeley
Title
TraRECo: a greedy approach based de novo transcriptome assembler with read error correction using consensus matrix
Published in
BMC Genomics, September 2018
DOI
10.1186/s12864-018-5034-x
Pubmed ID
Authors

Seokhyun Yoon, Daeseung Kim, Keunsoo Kang, Woong June Park

Abstract

The challenges when developing a good de novo transcriptome assembler include how to deal with read errors and sequence repeats. Almost all de novo assemblers utilize a de Bruijn graph, with which complexity grows linearly with data size while suffering from errors and repeats. Although one can correct the errors by inspecting the topological structure of the graph, this is not an easy task when there are too many branches. Two research directions are to improve either the graph reliability or the path search precision, and in this study, we focused on the former.

We present TraRECo, a greedy approach to de novo assembly employing error-aware graph construction. In the proposed approach, we built contigs by direct read alignment within a distance margin and performed a junction search to construct splicing graphs. While doing so, a contig of length l was represented by a 4 × l matrix (called a consensus matrix), in which each element was the base count of the aligned reads so far. A representative sequence was obtained by taking the majority in each column of the consensus matrix to be used for further read alignment. Once the splicing graphs had been obtained, we used IsoLasso to find paths with a noticeable read depth. The experiments using real and simulated reads show that the method provided considerable improvement in sensitivity and moderately better performance when comparing sensitivity and precision.

This was achieved by the error-aware graph construction using the consensus matrix, with which the reads having errors were made usable for the graph construction (otherwise, they might have been eventually discarded). This improved the quality of the coverage depth information used in the subsequent path search step and finally the reliability of the graph. De novo assembly is mainly used to explore undiscovered isoforms and must be able to represent as many reads as possible in an efficient way. In this sense, TraRECo provides us with a potential alternative for improving graph reliability even though the computational burden is much higher than the single k-mer in the de Bruijn graph approach.
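
The consensus matrix described in the abstract can be illustrated with a minimal Python sketch. This is not the TraRECo implementation: the function names (`new_consensus`, `representative`, `add_read`), the `max_mismatches` parameter, the use of Hamming distance as the "distance margin", and the fixed alignment offset are all assumptions made for illustration. The sketch also assumes reads contain only A, C, G and T, and it does not extend the contig.

```python
from typing import List

BASES = "ACGT"  # row order of the 4 x l matrix (an assumption of this sketch)


def new_consensus(seed: str) -> List[List[int]]:
    """Create a 4 x l matrix of base counts, seeded with a single read."""
    matrix = [[0] * len(seed) for _ in BASES]
    for col, base in enumerate(seed):
        matrix[BASES.index(base)][col] += 1
    return matrix


def representative(matrix: List[List[int]]) -> str:
    """Take the majority base in each column.

    Ties go to the base that comes first in BASES.
    """
    length = len(matrix[0])
    return "".join(
        BASES[max(range(len(BASES)), key=lambda row: matrix[row][col])]
        for col in range(length)
    )


def add_read(matrix: List[List[int]], read: str, offset: int,
             max_mismatches: int = 1) -> bool:
    """Add a read's base counts if it aligns within the distance margin.

    The margin here is a Hamming distance against the representative
    sequence at a given offset (hypothetical simplification).
    """
    if offset < 0 or offset + len(read) > len(matrix[0]):
        return False  # this sketch does not extend the contig
    rep = representative(matrix)
    distance = sum(1 for i, base in enumerate(read) if rep[offset + i] != base)
    if distance > max_mismatches:
        return False
    for i, base in enumerate(read):
        matrix[BASES.index(base)][offset + i] += 1
    return True


contig = new_consensus("ACGTAC")
add_read(contig, "ACGTAC", 0)   # exact match, counted
add_read(contig, "ACCTAC", 0)   # one mismatch, still counted
add_read(contig, "TTTTTT", 0)   # too distant, rejected
print(representative(contig))
```

In this sketch the read with a single mismatch still contributes its counts to every column, so it adds to the coverage depth while the majority vote keeps the representative sequence unchanged. That is the sense in which the abstract says reads with errors are made usable rather than discarded.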

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 20 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 20 100%

Demographic breakdown

Readers by professional status Count As %
Student > Ph. D. Student 5 25%
Researcher 4 20%
Student > Bachelor 3 15%
Other 2 10%
Student > Doctoral Student 2 10%
Other 3 15%
Unknown 1 5%
Readers by discipline Count As %
Biochemistry, Genetics and Molecular Biology 6 30%
Agricultural and Biological Sciences 4 20%
Medicine and Dentistry 3 15%
Engineering 2 10%
Sports and Recreations 1 5%
Other 2 10%
Unknown 2 10%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 06 September 2018.
All research outputs: #18,648,325 of 23,102,082 outputs
Outputs from BMC Genomics: #8,230 of 10,709 outputs
Outputs of similar age: #257,498 of 335,392 outputs
Outputs of similar age from BMC Genomics: #131 of 189 outputs
Altmetric has tracked 23,102,082 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 10,709 research outputs from this source. They receive a mean Attention Score of 4.7. This one is in the 12th percentile – i.e., 12% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 335,392 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 12th percentile – i.e., 12% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 189 others from the same source and published within six weeks on either side of this one. This one is in the 15th percentile – i.e., 15% of its contemporaries scored the same or lower than it.