Effect of the sequence data deluge on the performance of methods for detecting protein functional residues

Overview of attention for article published in BMC Bioinformatics, February 2018

Mentioned by
2 X users

Citations
2 Dimensions

Readers on
14 Mendeley
2 CiteULike
Title
Effect of the sequence data deluge on the performance of methods for detecting protein functional residues
Published in
BMC Bioinformatics, February 2018
DOI 10.1186/s12859-018-2084-7
Authors

Diego Garrido-Martín, Florencio Pazos

Abstract

The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 14 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    14       100%

Demographic breakdown

Readers by professional status     Count    As %
Student > Ph.D. Student            4        29%
Student > Bachelor                 3        21%
Researcher                         3        21%
Student > Master                   1        7%
Professor > Associate Professor    1        7%
Other                              0        0%
Unknown                            2        14%

Readers by discipline                           Count    As %
Biochemistry, Genetics and Molecular Biology    4        29%
Agricultural and Biological Sciences            4        29%
Engineering                                     2        14%
Immunology and Microbiology                     1        7%
Computer Science                                1        7%
Other                                           0        0%
Unknown                                         2        14%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 05 March 2018.

All research outputs: #15,867,545 of 23,577,654 outputs
Outputs from BMC Bioinformatics: #5,494 of 7,400 outputs
Outputs of similar age: #212,663 of 331,261 outputs
Outputs of similar age from BMC Bioinformatics: #74 of 106 outputs

Altmetric has tracked 23,577,654 research outputs across all sources so far. This one is in the 22nd percentile – i.e., 22% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,400 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 17th percentile – i.e., 17% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 331,261 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 27th percentile – i.e., 27% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 106 others from the same source and published within six weeks on either side of this one. This one is in the 25th percentile – i.e., 25% of its contemporaries scored the same or lower than it.
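
Each percentile quoted above is described the same way: the share of peer outputs that scored the same as or lower than this one. A minimal sketch of that arithmetic follows, assuming a plain list of peer scores. The function name and the sample scores are hypothetical and are not Altmetric data; how Altmetric actually handles ties and rounding is not stated on this page and may differ.

```python
def percentile_same_or_lower(score, peer_scores):
    """Return the percentage of peers scoring the same as or lower than `score`."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    # Count peers whose score does not exceed the given score.
    same_or_lower = sum(1 for s in peer_scores if s <= score)
    return 100.0 * same_or_lower / len(peer_scores)


# Hypothetical Attention Scores for ten peer outputs (illustrative only).
peers = [0, 0, 1, 1, 2, 5, 8, 13, 21, 40]

# An output with a score of 1 is compared against those peers.
print(percentile_same_or_lower(1, peers))
```

With these made-up peers, four of the ten scores are 1 or lower, so the function reports the 40th percentile.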