MERIT reveals the impact of genomic context on sequencing error rate in ultra-deep applications

Overview of attention for article published in BMC Bioinformatics, June 2018

About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Above-average Attention Score compared to outputs of the same age and source (52nd percentile)

Mentioned by: 4 X users

Citations: 5 Dimensions

Readers on: 18 Mendeley
Title: MERIT reveals the impact of genomic context on sequencing error rate in ultra-deep applications
Published in: BMC Bioinformatics, June 2018
DOI: 10.1186/s12859-018-2223-1
Pubmed ID:
Authors: Mohammad Hadigol, Hossein Khiabanian

Abstract

Rapid progress in high-throughput sequencing (HTS) and the development of novel library preparation methods have improved the sensitivity of detecting mutations in heterogeneous samples, specifically in high-depth (>500×) clinical applications. However, HTS methods are bounded by their technical and theoretical limitations, and sequencing errors cannot be completely eliminated. Comprehensive quantification of the background noise can highlight both the efficiency and the limitations of any HTS methodology, and help differentiate true mutations at low abundance from artifacts. We introduce MERIT (Mutation Error Rate Inference Toolkit), designed for in-depth quantification of erroneous substitutions and small insertions and deletions. MERIT incorporates an all-inclusive variant caller and considers genomic context, including the nucleotides immediately at 5′ and 3′, thereby establishing error rates for 96 possible substitutions as well as four single-base and 16 double-base indels. We applied MERIT to ultra-deep sequencing data (1,300,000×) obtained from the amplification of multiple clinically relevant loci, and showed a significant relationship between error rates and genomic contexts. In addition to observing a significant difference between transversion and transition rates, we identified variations of more than 100-fold within each error type at high sequencing depths. For instance, T>G transversions in trinucleotide GTCs occurred 133.5 ± 65.9 times more often than those in ATAs. Similarly, C>T transitions in GCGs were observed at a 73.8 ± 10.5 times higher rate than those in TCTs. We also devised an in silico approach to determine the optimal sequencing depth, where errors occur at rates similar to those of expected true mutations. Our analyses showed that increasing sequencing depth might improve sensitivity for detecting some mutations based on their genomic context. For example, the T>G error rate in GTCs did not change when sequenced beyond 10,000×; in contrast, the T>G rate in TTAs consistently improved even above 500,000×. Our results demonstrate significant variation in nucleotide misincorporation rates, and suggest that genomic context should be considered for comprehensive profiling of specimen-specific and sequencing artifacts in high-depth assays. These data provide strong evidence against assigning a single allele frequency threshold to call mutations, for it can result in substantial numbers of false positive as well as false negative variants, with important clinical consequences.
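The category counts quoted in the abstract (96 substitutions, four single-base and 16 double-base indels) can be reproduced with a short enumeration. The sketch below is illustrative only and is not MERIT's code: it assumes the common convention of referencing each substitution to the pyrimidine base (C or T) and labeling it with its 5′ and 3′ neighbors, and it assumes indels are categorized by the inserted or deleted sequence. The function names and the `GTC:T>G` label format are hypothetical.

```python
from itertools import product

BASES = "ACGT"

# Assumption: substitutions are referenced to the pyrimidine base, so the
# six classes below cover all twelve base changes once both strands are
# considered. This is a common convention, not confirmed as MERIT's own.
SUBSTITUTIONS = [
    ("C", "A"), ("C", "G"), ("C", "T"),
    ("T", "A"), ("T", "C"), ("T", "G"),
]


def substitution_contexts():
    """Return labels such as 'GTC:T>G' (trinucleotide, then the change)."""
    return [
        f"{five}{ref}{three}:{ref}>{alt}"
        for ref, alt in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def indel_sequences(length):
    """Return every possible inserted or deleted sequence of this length."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]


contexts = substitution_contexts()
print(len(contexts))            # 6 classes x 4 x 4 neighbors
print(len(indel_sequences(1)))  # single-base indel categories
print(len(indel_sequences(2)))  # double-base indel categories
```

Under these assumptions the enumeration yields 6 × 4 × 4 = 96 substitution contexts, including the `GTC:T>G` and `GCG:C>T` contexts mentioned in the abstract, plus 4 and 16 indel categories.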

X Demographics

The data shown below were collected from the profiles of 4 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 18 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    18       100%

Demographic breakdown

Readers by professional status                  Count    As %
Researcher                                      5        28%
Other                                           2        11%
Student > Bachelor                              1        6%
Student > Ph. D. Student                        1        6%
Professor                                       1        6%
Other                                           2        11%
Unknown                                         6        33%

Readers by discipline                           Count    As %
Biochemistry, Genetics and Molecular Biology    5        28%
Agricultural and Biological Sciences            2        11%
Chemical Engineering                            1        6%
Immunology and Microbiology                     1        6%
Neuroscience                                    1        6%
Other                                           0        0%
Unknown                                         8        44%
Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 12 June 2018.
All research outputs: #13,806,113 of 23,394,907 outputs
Outputs from BMC Bioinformatics: #4,269 of 7,372 outputs
Outputs of similar age: #171,072 of 329,741 outputs
Outputs of similar age from BMC Bioinformatics: #51 of 107 outputs
Altmetric has tracked 23,394,907 research outputs across all sources so far. This one is in the 39th percentile – i.e., 39% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,372 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 38th percentile – i.e., 38% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 329,741 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 46th percentile – i.e., 46% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 107 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 52% of its contemporaries.