Report for: Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data

Title	Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data
Published in	BMC Genomics, August 2016
DOI	10.1186/s12864-016-2896-7
Pubmed ID	27556805
Authors	Yongsheng Bai, Jeff Kinne, Brandon Donham, Feng Jiang, Lizhong Ding, Justin R. Hassler, Randal J. Kaufman
Abstract	Most existing tools for detecting next-generation sequencing-based splicing events focus on generic splicing events. Consequently, special types of non-canonical splicing events of short mRNA regions (IRE1α targeted) have not yet been thoroughly addressed at a genome-wide level using bioinformatics approaches in conjunction with next-generation technologies. During endoplasmic reticulum (ER) stress, the gene encoding the RNase Ire1α is known to splice out a short 26 nt region from the mRNA of the transcription factor Xbp1 non-canonically within the cytosol. This causes an open reading frame-shift that induces expression of many downstream genes in reaction to ER stress as part of the unfolded protein response (UPR). We previously published an algorithm termed "Read-Split-Walk" (RSW) to identify non-canonical splicing regions using RNA-Seq data and applied it to ER stress-induced Ire1α heterozygote and knockout mouse embryonic fibroblast cell lines. In this study, we have developed an improved algorithm "Read-Split-Run" (RSR) for detecting genome-wide Ire1α-targeted genes with non-canonical spliced regions at a faster speed. We applied the RSR algorithm using different combinations of several parameters to the mouse embryonic fibroblast (MEF) cells previously tested with RSW and to the human Encyclopedia of DNA Elements (ENCODE) RNA-Seq data. We also compared the performance of RSR with two other tools for identifying alternative splicing events, TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012), using the spliced Xbp1 mRNA as a positive control: in our data sets, we identified it as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. This comparison was also extended to human ENCODE RNA-Seq data.
As proof of principle, the 26 nt non-conventional splice site in Xbp1 was detected as the top hit by our new RSR algorithm in heterozygote (Het) samples from both Thapsigargin (Tg) and Dithiothreitol (Dtt) treated experiments but was absent in the negative control Ire1α knock-out (KO) samples. Applying different combinations of parameters to the mouse MEF RNA-Seq data, we suggest a General Linear Model (GLM) for both Tg and Dtt treated experiments. We also ran RSR on a human ENCODE RNA-Seq dataset and identified 32,597 spliced regions for regular chromosomes. TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012) identified 237,155 spliced junctions and 9,129 exon skipping events (excluding chr14), respectively. Our Read-Split-Run algorithm also outperformed the other tools in ranking the Xbp1 gene as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. The RSR package, including source code, is available at http://bioinf1.indstate.edu/RSR, and its pipeline source code is also freely available at https://github.com/xuric/read-split-run for academic use. Our new RSR algorithm can process massive amounts of human ENCODE RNA-Seq data to identify novel splice junction sites at a genome-wide level much more efficiently than the previous RSW algorithm. Our proposed model can also predict the number of spliced regions under any combination of parameters. Our pipeline can detect novel spliced sites for other species using RNA-Seq data generated under similar conditions.

X Demographics

The data shown below were collected from the profiles of 6 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United States	1	17%
France	1	17%
Unknown	4	67%

Demographic breakdown

Type	Count	As %
Scientists	3	50%
Members of the public	3	50%

Mendeley readers

The data shown below were compiled from readership statistics for 33 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United Kingdom	1	3%
Unknown	32	97%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	10	30%
Researcher	8	24%
Student > Master	4	12%
Student > Doctoral Student	2	6%
Other	2	6%
Other	3	9%
Unknown	4	12%

Readers by discipline	Count	As %
Biochemistry, Genetics and Molecular Biology	11	33%
Agricultural and Biological Sciences	11	33%
Computer Science	2	6%
Pharmacology, Toxicology and Pharmaceutical Science	1	3%
Veterinary Science and Veterinary Medicine	1	3%
Other	3	9%
Unknown	4	12%
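
The "As %" columns in the tables above appear to be each count divided by the table total and rounded to the nearest whole percent. The following is a minimal Python sketch under that assumption; the helper name `as_percent` is hypothetical, and the rounding rule is inferred from the figures shown rather than documented in this report.

```python
def as_percent(count: int, total: int) -> int:
    """Return count as a whole-number percentage of total (assumed rounding rule)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


# Counts copied from the "Readers by professional status" table.
# A list of pairs is used because the label "Other" appears twice.
status_counts = [
    ("Student > Ph. D. Student", 10),
    ("Researcher", 8),
    ("Student > Master", 4),
    ("Student > Doctoral Student", 2),
    ("Other", 2),
    ("Other", 3),
    ("Unknown", 4),
]

total = sum(count for _, count in status_counts)
rows = [(label, count, as_percent(count, total)) for label, count in status_counts]

for label, count, pct in rows:
    print(f"{label}\t{count}\t{pct}%")
```

Because each row is rounded independently, the percentages in a table need not sum to exactly 100%.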

Attention Score in Context

This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 25 August 2016.

Comparison group	Rank	Out of
All research outputs	#13,900,658	23,577,761 outputs
Outputs from BMC Genomics	#5,124	10,800 outputs
Outputs of similar age	#186,425	345,964 outputs
Outputs of similar age from BMC Genomics	#120	273 outputs

Altmetric has tracked 23,577,761 research outputs across all sources so far. This one is in the 39th percentile – i.e., 39% of other outputs scored the same or lower than it.

So far Altmetric has tracked 10,800 research outputs from this source. They receive a mean Attention Score of 4.7. This one is in the 49th percentile – i.e., 49% of its peers scored the same or lower than it.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 345,964 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 44th percentile – i.e., 44% of its contemporaries scored the same or lower than it.

We're also able to compare this research output to 273 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 52% of its contemporaries.
