
Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data

Overview of attention for article published in BMC Genomics, August 2016

About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Above-average Attention Score compared to outputs of the same age and source (52nd percentile)

Mentioned by

  • 6 X users

Citations

  • 10 Dimensions

Readers on

  • 33 Mendeley
Title
Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data
Published in
BMC Genomics, August 2016
DOI 10.1186/s12864-016-2896-7
Authors

Yongsheng Bai, Jeff Kinne, Brandon Donham, Feng Jiang, Lizhong Ding, Justin R. Hassler, Randal J. Kaufman

Abstract

Most existing tools for detecting next-generation sequencing-based splicing events focus on generic splicing events. Consequently, special types of non-canonical splicing events of short mRNA regions (IRE1α targeted) have not yet been thoroughly addressed at a genome-wide level using bioinformatics approaches in conjunction with next-generation technologies. During endoplasmic reticulum (ER) stress, the RNase Ire1α is known to splice out a short 26 nt region from the mRNA of the transcription factor Xbp1 non-canonically within the cytosol. This causes an open reading frame-shift that induces expression of many downstream genes in reaction to ER stress as part of the unfolded protein response (UPR). We previously published an algorithm termed "Read-Split-Walk" (RSW) to identify non-canonical splicing regions using RNA-Seq data and applied it to ER stress-induced Ire1α heterozygote and knockout mouse embryonic fibroblast cell lines. In this study, we have developed an improved algorithm, "Read-Split-Run" (RSR), for detecting genome-wide Ire1α-targeted genes with non-canonical spliced regions at a faster speed. We applied the RSR algorithm, using different combinations of several parameters, to the mouse embryonic fibroblast (MEF) cells previously tested with RSW and to the human Encyclopedia of DNA Elements (ENCODE) RNA-Seq data. We also compared the performance of RSR with two other tools for identifying alternative splicing events, TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012), using the spliced Xbp1 mRNA as a positive control: in these data sets, we identified it as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. This comparison was also extended to human ENCODE RNA-Seq data.
As proof of principle, the 26 nt non-conventional splice site in Xbp1 was detected as the top hit by our new RSR algorithm in heterozygote (Het) samples from both Thapsigargin (Tg) and Dithiothreitol (Dtt) treated experiments but was absent in the negative control Ire1α knock-out (KO) samples. Applying different combinations of parameters to the mouse MEF RNA-Seq data, we suggest a General Linear Model (GLM) for both Tg and Dtt treated experiments. We also ran RSR on a human ENCODE RNA-Seq dataset and identified 32,597 spliced regions for regular chromosomes. TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012) identified 237,155 spliced junctions and 9,129 exon skipping events (excluding chr14), respectively. Our Read-Split-Run algorithm also outperformed the other tools in ranking the Xbp1 gene as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. The RSR package, including source code, is available at http://bioinf1.indstate.edu/RSR, and its pipeline source code is also freely available at https://github.com/xuric/read-split-run for academic use.
Our new RSR algorithm can process massive amounts of human ENCODE RNA-Seq data to identify novel splice junction sites at a genome-wide level much more efficiently than the previous RSW algorithm. Our proposed model can also predict the number of spliced regions under any combination of parameters. Our pipeline can detect novel spliced sites for other species using RNA-Seq data generated under similar conditions.

X Demographics


The data were collected from the profiles of 6 X users who shared this research output.
Mendeley readers


The data shown below were compiled from readership statistics for 33 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United Kingdom   1       3%
Unknown          32      97%

Demographic breakdown

Readers by professional status   Count   As %
Student > Ph. D. Student         10      30%
Researcher                       8       24%
Student > Master                 4       12%
Student > Doctoral Student       2       6%
Other                            2       6%
Other                            3       9%
Unknown                          4       12%

Readers by discipline                                 Count   As %
Biochemistry, Genetics and Molecular Biology          11      33%
Agricultural and Biological Sciences                  11      33%
Computer Science                                      2       6%
Pharmacology, Toxicology and Pharmaceutical Science   1       3%
Veterinary Science and Veterinary Medicine            1       3%
Other                                                 3       9%
Unknown                                               4       12%
Attention Score in Context


This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 25 August 2016.
  • All research outputs: #13,900,658 of 23,577,761 outputs
  • Outputs from BMC Genomics: #5,124 of 10,800 outputs
  • Outputs of similar age: #186,425 of 345,964 outputs
  • Outputs of similar age from BMC Genomics: #120 of 273 outputs
Altmetric has tracked 23,577,761 research outputs across all sources so far. This one is in the 39th percentile – i.e., 39% of other outputs scored the same or lower than it.
So far Altmetric has tracked 10,800 research outputs from this source. They receive a mean Attention Score of 4.7. This one is in the 49th percentile – i.e., 49% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 345,964 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 44th percentile – i.e., 44% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 273 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 52% of its contemporaries.