
Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies

Overview of attention for article published in Microbiome, November 2016

About this Attention Score

  • In the top 5% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (94th percentile)
  • Good Attention Score compared to outputs of the same age and source (76th percentile)

Mentioned by

  • 1 blog
  • 46 X users
  • 1 patent
  • 2 Facebook pages
  • 1 Google+ user
  • 1 Redditor
  • 1 research highlight platform

Citations

144 Dimensions

Readers on

386 Mendeley
Published in
Microbiome, November 2016
DOI 10.1186/s40168-016-0208-8
Pubmed ID
Authors

Jonathan Thorsen, Asker Brejnrod, Martin Mortensen, Morten A. Rasmussen, Jakob Stokholm, Waleed Abu Al-Soud, Søren Sørensen, Hans Bisgaard, Johannes Waage

Abstract

There is an immense scientific interest in the human microbiome and its effects on human physiology, health, and disease. A common approach for examining bacterial communities is high-throughput sequencing of 16S rRNA gene hypervariable regions, aggregating sequence-similar amplicons into operational taxonomic units (OTUs). Strategies for detecting differential relative abundance of OTUs between sample conditions include classical statistical approaches as well as a plethora of newer methods, many borrowing from the related field of RNA-seq analysis. This effort is complicated by unique data characteristics, including sparsity, sequencing depth variation, and nonconformity of read counts to theoretical distributions, which is often exacerbated by exploratory and/or unbalanced study designs. Here, we assess the robustness of available methods for (1) inference in differential relative abundance analysis and (2) beta-diversity-based sample separation, using a rigorous benchmarking framework based on large clinical 16S microbiome datasets from different sources. Running more than 380,000 full differential relative abundance tests on real datasets with permuted case/control assignments and in silico-spiked OTUs, we identify large differences in method performance on a range of parameters, including false positive rates, sensitivity to sparsity and case/control balances, and spike-in retrieval rate. In large datasets, methods with the highest false positive rates also tend to have the best detection power. For beta-diversity-based sample separation, we show that library size normalization has very little effect and that the distance metric is the most important factor in terms of separation power. Our results, generalizable to datasets from different sequencing platforms, demonstrate how the choice of method considerably affects analysis outcome. 
Here, we give recommendations for tools that exhibit low false positive rates, have good retrieval power across effect sizes and case/control proportions, and have low sparsity bias. Result output from some commonly used methods should be interpreted with caution. We provide an easily extensible framework for benchmarking of new methods and future microbiome datasets.
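
The permuted case/control idea described in the abstract can be illustrated with a short Python sketch. This is not the authors' framework: the toy count data, the function names (`welch_p_value`, `false_positive_rate`), and the choice of a Welch statistic with a normal approximation are all assumptions made for illustration. The principle is that once the labels are shuffled, no OTU is truly associated with case status, so every test that comes out significant is a false positive.

```python
import random
from statistics import NormalDist, mean, variance


def welch_p_value(a, b):
    """Two-sided p-value for a Welch statistic (normal approximation, an assumption)."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    if se2 == 0:
        return 1.0  # no variation in either group: no evidence of a difference
    z = (mean(a) - mean(b)) / se2 ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(counts, labels, n_permutations=20, alpha=0.05, seed=0):
    """Fraction of tests called significant after shuffling case/control labels.

    counts: one list per OTU, holding one value per sample.
    labels: one 0/1 value per sample (1 = case, 0 = control).
    """
    rng = random.Random(seed)
    labels = list(labels)
    hits = tests = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any real link between labels and counts
        for otu in counts:
            cases = [x for x, g in zip(otu, labels) if g == 1]
            controls = [x for x, g in zip(otu, labels) if g == 0]
            hits += welch_p_value(cases, controls) < alpha
            tests += 1
    return hits / tests


# Hypothetical sparse toy data: 50 OTUs, 60 samples, roughly 75% zeros.
data_rng = random.Random(1)
counts = [
    [data_rng.choice([0, 0, 0, data_rng.randint(1, 200)]) for _ in range(60)]
    for _ in range(50)
]
labels = [1] * 30 + [0] * 30
fpr = false_positive_rate(counts, labels)
print(f"Observed false positive rate at alpha = 0.05: {fpr:.3f}")
```

A well-calibrated test should return a rate close to the chosen alpha. Swapping another method into `welch_p_value` allows the same loop to compare methods on identical permutations.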

X Demographics

The data shown below were collected from the profiles of 46 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 386 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 3 <1%
Japan 2 <1%
Canada 1 <1%
New Zealand 1 <1%
Denmark 1 <1%
Sweden 1 <1%
Belgium 1 <1%
Estonia 1 <1%
Unknown 375 97%

Demographic breakdown

Readers by professional status Count As %
Student > Ph. D. Student 99 26%
Researcher 83 22%
Student > Master 57 15%
Student > Bachelor 25 6%
Other 16 4%
Other 47 12%
Unknown 59 15%
Readers by discipline Count As %
Agricultural and Biological Sciences 145 38%
Biochemistry, Genetics and Molecular Biology 57 15%
Immunology and Microbiology 27 7%
Environmental Science 17 4%
Computer Science 16 4%
Other 62 16%
Unknown 62 16%
Attention Score in Context

This research output has an Altmetric Attention Score of 37. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 30 June 2021.
  • All research outputs: #1,085,743 of 25,388,177 outputs
  • Outputs from Microbiome: #319 of 1,754 outputs
  • Outputs of similar age: #21,769 of 416,206 outputs
  • Outputs of similar age from Microbiome: #5 of 17 outputs
Altmetric has tracked 25,388,177 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 95th percentile: it's in the top 5% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 1,754 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 38.3. This one has done well, scoring higher than 81% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 416,206 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 94% of its contemporaries.
We're also able to compare this research output to 17 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 76% of its contemporaries.
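
The percentages quoted above can be reproduced from the rank and total in each comparison. The page does not state how Altmetric rounds, so the formula below is an assumption: it counts the outputs ranked at or below this one and rounds down to a whole percent. The function name `percentile_from_rank` is hypothetical. This convention happens to match all four figures given on this page.

```python
def percentile_from_rank(rank, total):
    """Whole-number percentile from a rank (assumed formula, rounded down)."""
    # Outputs ranked at or below this one, as a share of all outputs.
    return (total - rank + 1) * 100 // total


# Rank and total for each comparison, as listed on this page.
rankings = {
    "All research outputs": (1_085_743, 25_388_177),
    "Outputs from Microbiome": (319, 1_754),
    "Outputs of similar age": (21_769, 416_206),
    "Outputs of similar age from Microbiome": (5, 17),
}

for label, (rank, total) in rankings.items():
    print(f"{label}: {percentile_from_rank(rank, total)}")
```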