Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains

Overview of attention for article published in BMC Bioinformatics, September 2015

Mentioned by: 2 X users
Citations: 14 Dimensions
Readers on: 34 Mendeley
Title
Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains
Published in
BMC Bioinformatics, September 2015
DOI 10.1186/s12859-015-0732-8
Pubmed ID
Authors

Li C. Xia, Dongmei Ai, Jacob A. Cram, Xiaoyi Liang, Jed A. Fuhrman, Fengzhu Sun

Abstract

Local trend (i.e., shape) analysis of time series data reveals co-changing patterns in the dynamics of biological systems. However, the slow permutation procedures used to evaluate the statistical significance of local trend scores have limited its application to high-throughput time series data analysis, e.g., data from next-generation sequencing studies. By extending the theories for the tail probability of the range of the sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (for series of more than 20 time points with no delay, and more than 30 time points with a delay of at most three time steps), in that the non-zero decimals of the p-values obtained by the approximation and by the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations, making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large-scale all-versus-all comparisons possible. We also propose a hybrid approach that integrates the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use by analyzing the Plymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data, and find interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package, which now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.
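
To make the permutation baseline mentioned in the abstract concrete, here is a minimal Python sketch of one common way to compute a local trend score and a permutation p-value for it. It is an illustration under stated assumptions, not the eLSA implementation and not the Markov chain approximation the paper proposes. The function names, the zero default threshold, the dynamic-programming form of the score, the choice to shuffle the raw series, and the add-one correction in the p-value are all assumptions made for this example.

```python
import random


def trend_series(values, threshold=0.0):
    """Turn a numeric series into steps of +1 (up), -1 (down) or 0 (flat).

    The threshold is a hypothetical parameter: changes whose magnitude does
    not exceed it are treated as flat.
    """
    trends = []
    for prev, curr in zip(values, values[1:]):
        change = curr - prev
        if change > threshold:
            trends.append(1)
        elif change < -threshold:
            trends.append(-1)
        else:
            trends.append(0)
    return trends


def local_trend_score(x, y, max_delay=0, threshold=0.0):
    """Best-matching contiguous run of co-changing (or anti-changing) steps.

    Assumed scoring scheme: dynamic programming over pairs (i, j) with
    |i - j| <= max_delay, normalised by the number of trend steps.
    """
    dx = trend_series(x, threshold)
    dy = trend_series(y, threshold)
    n = len(dx)
    if n == 0 or len(dy) != n:
        raise ValueError("series must have equal length of at least 2")
    pos = [[0] * (n + 1) for _ in range(n + 1)]  # runs of same-direction steps
    neg = [[0] * (n + 1) for _ in range(n + 1)]  # runs of opposite-direction steps
    best = 0
    for i in range(1, n + 1):
        for j in range(max(1, i - max_delay), min(n, i + max_delay) + 1):
            prod = dx[i - 1] * dy[j - 1]
            pos[i][j] = max(0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0, neg[i - 1][j - 1] - prod)
            best = max(best, pos[i][j], neg[i][j])
    return best / n


def permutation_p_value(x, y, max_delay=0, n_perm=1000, seed=None):
    """Estimate a p-value by shuffling x and recomputing the score each time."""
    rng = random.Random(seed)
    observed = local_trend_score(x, y, max_delay)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if local_trend_score(shuffled, y, max_delay) >= observed:
            hits += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)
```

Each permutation repeats the full score computation, so the cost grows with the number of permutations, the series length, and the number of series pairs. That cost is what an analytic approximation of the p-value avoids in all-versus-all comparisons.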

X Demographics

The X demographics data were collected from the profiles of the 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 34 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
India 1 3%
Unknown 33 97%

Demographic breakdown

Readers by professional status Count As %
Researcher 11 32%
Student > Master 6 18%
Student > Ph. D. Student 5 15%
Other 3 9%
Professor 3 9%
Other 4 12%
Unknown 2 6%
Readers by discipline Count As %
Agricultural and Biological Sciences 13 38%
Biochemistry, Genetics and Molecular Biology 3 9%
Computer Science 3 9%
Mathematics 2 6%
Engineering 2 6%
Other 5 15%
Unknown 6 18%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 22 September 2015.
All research outputs: #18,345,702 of 23,577,761 outputs
Outputs from BMC Bioinformatics: #6,094 of 7,418 outputs
Outputs of similar age: #186,795 of 275,771 outputs
Outputs of similar age from BMC Bioinformatics: #120 of 149 outputs
Altmetric has tracked 23,577,761 research outputs across all sources so far. This one is in the 19th percentile – i.e., 19% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,418 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 12th percentile – i.e., 12% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 275,771 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 27th percentile – i.e., 27% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 149 others from the same source and published within six weeks on either side of this one. This one is in the 10th percentile – i.e., 10% of its contemporaries scored the same or lower than it.