Vindel: a simple pipeline for checking indel redundancy

Overview of attention for article published in BMC Bioinformatics, November 2014
About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Above-average Attention Score compared to outputs of the same age and source (55th percentile)

Mentioned by

5 X users

Readers on

16 Mendeley readers
Published in
BMC Bioinformatics, November 2014
DOI 10.1186/s12859-014-0359-1
Authors

Zhiyi Li, Xiaowei Wu, Bin He, Liqing Zhang

Abstract

Background: With the advance of next generation sequencing (NGS) technologies, a large number of insertion and deletion (indel) variants have been identified in human populations. Despite much research into variant calling, it has been found that a non-negligible proportion of the identified indel variants might be false positives due to sequencing errors, artifacts caused by ambiguous alignments, and annotation errors.

Results: In this paper, we examine indel redundancy in dbSNP, one of the central databases for indel variants, and develop a standalone computational pipeline, dubbed Vindel, to detect redundant indels. The pipeline first applies indel position information to form candidate redundant groups, then performs indel mutations to the reference genome to generate corresponding indel variant substrings. Finally the indel variant substrings in the same candidate redundant groups are compared in a pairwise fashion to identify redundant indels. We applied our pipeline to check for redundancy in the human indels in dbSNP. Our pipeline identified approximately 8% redundancy in insertion type indels, 12% in deletion type indels, and overall 10% for insertions and deletions combined. These numbers are largely consistent across all human autosomes. We also investigated indel size distribution and adjacent indel distance distribution for a better understanding of the mechanisms generating indel variants.

Conclusions: Vindel, a simple yet effective computational pipeline, can be used to check whether a set of indels are redundant with respect to those already in the database of interest such as NCBI's dbSNP. Of the approximately 5.9 million indels we examined, nearly 0.6 million are redundant, revealing a serious limitation in the current indel annotation. Statistics results prove the consistency of the pipeline on indel redundancy detection for all 22 chromosomes. Apart from the standalone Vindel pipeline, the indel redundancy check algorithm is also implemented in the web server http://bioinformatics.cs.vt.edu/zhanglab/indelRedundant.php.
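
The three steps named in the abstract (grouping by position, applying each indel to the reference, pairwise comparison of the resulting substrings) can be illustrated with a minimal Python sketch. This is not the authors' Vindel code. The `Indel` record, the function names, and the `max_gap` and `flank` parameters are all hypothetical, and the grouping rule used here (same type, same length, neighbouring positions within `max_gap`) is an assumption, since the abstract does not spell out how candidate groups are formed.

```python
from collections import defaultdict
from typing import NamedTuple


class Indel(NamedTuple):
    """Hypothetical record; field names are illustrative, not from Vindel."""
    ident: str  # identifier, e.g. an rs number
    pos: int    # 0-based reference index where the event is applied
    kind: str   # "ins" or "del"
    seq: str    # inserted bases, or the bases that are deleted


def candidate_groups(indels, max_gap):
    """Step 1 (assumed rule): chain together same-type, same-length indels
    whose neighbouring positions are at most max_gap apart."""
    by_key = defaultdict(list)
    for ind in indels:
        by_key[(ind.kind, len(ind.seq))].append(ind)
    groups = []
    for members in by_key.values():
        members.sort(key=lambda i: i.pos)
        current = [members[0]]
        for ind in members[1:]:
            if ind.pos - current[-1].pos <= max_gap:
                current.append(ind)
            else:
                if len(current) > 1:
                    groups.append(current)
                current = [ind]
        if len(current) > 1:
            groups.append(current)
    return groups


def mutate(ref, indel, start, end):
    """Step 2: apply one indel to the window ref[start:end].
    Assumes a deletion's seq matches the reference; this is not checked."""
    if indel.kind == "ins":
        return ref[start:indel.pos] + indel.seq + ref[indel.pos:end]
    return ref[start:indel.pos] + ref[indel.pos + len(indel.seq):end]


def redundant_pairs(ref, indels, max_gap=10, flank=5):
    """Step 3: within each group, compare variant substrings pairwise.
    One shared window per group keeps the substrings comparable."""
    pairs = []
    for group in candidate_groups(indels, max_gap):
        start = max(0, group[0].pos - flank)
        end = min(len(ref), max(i.pos + len(i.seq) for i in group) + flank)
        variants = [(i.ident, mutate(ref, i, start, end)) for i in group]
        for a in range(len(variants)):
            for b in range(a + 1, len(variants)):
                if variants[a][1] == variants[b][1]:
                    pairs.append((variants[a][0], variants[b][0]))
    return pairs


# Toy reference with a run of four T's (indices 2 to 5).
REF = "GATTTTCAG"
TOY_INDELS = [
    Indel("d1", 2, "del", "T"),
    Indel("d2", 5, "del", "T"),
    Indel("d3", 6, "del", "C"),
    Indel("i1", 2, "ins", "T"),
    Indel("i2", 6, "ins", "T"),
]
print(sorted(redundant_pairs(REF, TOY_INDELS)))
```

In the toy data, deleting a T at either end of the homopolymer run yields the same sequence, as does inserting a T at either end, so those pairs are reported while the deletion of C is not. Using one shared window per group, rather than a window centred on each indel, is a design choice of this sketch so that substrings from different positions line up.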

X Demographics

The data shown below were collected from the profiles of 5 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 16 Mendeley readers of this research output.

Geographical breakdown

Country        Count   As %
Sweden         1       6%
France         1       6%
Switzerland    1       6%
Unknown        13      81%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                      8       50%
Student > Ph. D. Student                        5       31%
Student > Bachelor                              1       6%
Professor > Associate Professor                 1       6%
Unknown                                         1       6%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences            6       38%
Biochemistry, Genetics and Molecular Biology    4       25%
Computer Science                                3       19%
Mathematics                                     1       6%
Unknown                                         2       13%

Attention Score in Context

This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 August 2015.

All research outputs: #14,152,800 of 24,666,614 outputs
Outputs from BMC Bioinformatics: #3,942 of 7,565 outputs
Outputs of similar age: #181,722 of 373,431 outputs
Outputs of similar age from BMC Bioinformatics: #62 of 137 outputs

Altmetric has tracked 24,666,614 research outputs across all sources so far. This one is in the 41st percentile – i.e., 41% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,565 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.5. This one is in the 45th percentile – i.e., 45% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 373,431 tracked outputs that were published within six weeks on either side of this one in any source. This one has received about average attention, scoring higher than 50% of its contemporaries.
We're also able to compare this research output to 137 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 55% of its contemporaries.