RepeatAnalyzer: a tool for analysing and managing short-sequence repeat data

Overview of attention for article published in BMC Genomics, June 2016

About this Attention Score

  • In the top 5% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (96th percentile)
  • High Attention Score compared to outputs of the same age and source (99th percentile)

Mentioned by

  • 9 news outlets
  • 9 X users

Citations

  • 34 Dimensions

Readers on

  • 60 Mendeley
Title
RepeatAnalyzer: a tool for analysing and managing short-sequence repeat data
Published in
BMC Genomics, June 2016
DOI 10.1186/s12864-016-2686-2
Authors

Helen N. Catanese, Kelly A. Brayton, Assefaw H. Gebremedhin

Abstract

Short-sequence repeats (SSRs) occur in both prokaryotic and eukaryotic DNA, inter- and intragenically, and may be exact or inexact copies. When heterogeneous SSRs are present in a given locus, we can take advantage of the pattern of different repeats to genotype strains based on the SSRs. Cataloguing and tracking these repeats can be difficult as diverse groups of researchers are involved in the identification of the repeats. Additionally, the task is error-prone when done manually. We developed RepeatAnalyzer, a new software tool capable of tracking, managing, analysing and cataloguing SSRs and genotypes using Anaplasma marginale as a model species. RepeatAnalyzer's analysis capability includes novel metrics for measuring regional genetic diversity (corresponding to variety and regularity of SSR occurrence). As a part of its visualization capabilities, RepeatAnalyzer produces high quality maps of the geographic distribution of genotypes or SSRs over a region of interest. RepeatAnalyzer's repeat identification functionality was validated for all SSRs and genotypes reported in 21 publications, using 380 A. marginale isolates gathered from the five publications within that list that provided access to their isolates. The tool produced accurate genotyping results in every case. In addition, it uncovered a number of errors in the published literature: 11 cases where SSRs were misreported, 5 cases where two different SSRs had been given the same name, and 16 cases where two or more names had been given to a single SSR. The analysis and visualization functionalities of the tool are demonstrated using several examples. RepeatAnalyzer is a robust software tool that can be used for storing, managing, and analysing short-sequence repeats for the purpose of strain identification. The tool can be used for any set of SSRs regardless of species. When applied to A. marginale, our test case, we show that genotype lengths for a given region follow a normal distribution, while SSR frequencies follow a power-law-like distribution. Further, we find that over 90% of repeats are 28 to 29 amino acids long, which is in agreement with conventional wisdom. Lastly, our analysis reveals that the most common edit distance is five or six, which is counter-intuitive since we expected that result to be closer to one, resulting from the simplest change from one repeat to another.
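
The edit-distance finding in the abstract can be made concrete with a small sketch. The abstract does not say which edit-distance definition RepeatAnalyzer uses, so the code below assumes plain Levenshtein distance (single-residue insertions, deletions and substitutions, each costing one). The function name `edit_distance` and the three repeat sequences are hypothetical, invented for illustration; they are not real A. marginale repeats and this is not the tool's own code.

```python
from collections import Counter
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-residue
    insertions, deletions, or substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


# Hypothetical repeats, invented for illustration only.
repeats = {
    "r1": "ADSSSAGGQQQESSVSSQSE",
    "r2": "ADSSSASGQQQESSVSSQSE",
    "r3": "TDSSSASGQQQESSVLSQSD",
}

# Tally how often each pairwise distance occurs across all repeat pairs.
histogram = Counter(
    edit_distance(x, y) for x, y in combinations(repeats.values(), 2)
)
print(histogram.most_common())
```

A tally of this kind over a real repeat catalogue is what would show whether the most common distance sits near one, as the authors expected, or near five or six, as they report.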

X Demographics

The data shown below were collected from the profiles of 9 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 60 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States    1       2%
Australia        1       2%
Unknown          58      97%

Demographic breakdown

Readers by professional status                  Count   As %
Student > Master                                16      27%
Researcher                                      16      27%
Student > Doctoral Student                      5       8%
Student > Ph. D. Student                        5       8%
Student > Bachelor                              3       5%
Other                                           5       8%
Unknown                                         10      17%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences            21      35%
Biochemistry, Genetics and Molecular Biology    12      20%
Computer Science                                5       8%
Veterinary Science and Veterinary Medicine      4       7%
Immunology and Microbiology                     1       2%
Other                                           3       5%
Unknown                                         14      23%
Attention Score in Context

This research output has an Altmetric Attention Score of 71. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 17 September 2016.
All research outputs: #545,980 of 23,881,329 outputs
Outputs from BMC Genomics: #56 of 10,793 outputs
Outputs of similar age: #11,422 of 342,792 outputs
Outputs of similar age from BMC Genomics: #2 of 188 outputs
Altmetric has tracked 23,881,329 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 97th percentile: it's in the top 5% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 10,793 research outputs from this source. They receive a mean Attention Score of 4.8. This one has done particularly well, scoring higher than 99% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 342,792 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 96% of its contemporaries.
We're also able to compare this research output to 188 others from the same source and published within six weeks on either side of this one. This one has done particularly well, scoring higher than 99% of its contemporaries.
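
As a rough sanity check, the rank figures on this page can be related to the quoted percentiles with a few lines of arithmetic. This sketch assumes the percentile is simply the share of outputs ranked below this one. Altmetric's actual method (for example, how ties and rounding are handled) is not stated on this page, so the helper `share_ranked_below` is a hypothetical approximation, not Altmetric's formula.

```python
def share_ranked_below(rank: int, total: int) -> float:
    """Percentage of outputs ranked below the given rank (1 = best).

    Assumed approximation only; ties and rounding are ignored.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    return 100.0 * (total - rank) / total


# (rank, total) pairs copied from the figures reported on this page.
rankings = {
    "All research outputs": (545_980, 23_881_329),
    "Outputs from BMC Genomics": (56, 10_793),
    "Outputs of similar age": (11_422, 342_792),
    "Outputs of similar age from BMC Genomics": (2, 188),
}

for label, (rank, total) in rankings.items():
    print(f"{label}: {share_ranked_below(rank, total):.1f}%")
```

Under this assumption each computed share lands within about one point of the percentile quoted above (97th, 99%, 96% and 99% respectively). The small differences suggest Altmetric rounds or handles ties differently than this sketch does.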