Protein-protein interface hot spots prediction based on a hybrid feature selection strategy

Overview of attention for article published in BMC Bioinformatics, January 2018

Mentioned by
2 X users

Citations
88 Dimensions

Readers on
49 Mendeley
Published in
BMC Bioinformatics, January 2018
DOI 10.1186/s12859-018-2009-5
Authors

Yanhua Qiao, Yi Xiong, Hongyun Gao, Xiaolei Zhu, Peng Chen

Abstract

Hot spots are the interface residues that contribute most of the binding affinity in protein-protein interactions. A compact and relevant feature subset is important for building machine learning models that predict hot spots on protein-protein interfaces. Although various methods have been used to select relevant feature subsets from the many features that describe interface residues, finding the optimal feature subset for the final model remains a challenge.

In this study, we compared three feature selection methods in order to propose a new hybrid feature selection strategy. The strategy effectively reduced the feature space when we built prediction models for identifying hot spot residues. It was tested on eighty-two features, both conventional and newly proposed. Following the strategy, we combined the feature subsets selected individually by decision tree and by mRMR (maximum Relevance Minimum Redundancy), then used a PSFS (Pseudo Sequential Forward Selection) process to build a model with 6 features. On the independent test set, our model showed better or comparable predictive performance relative to other state-of-the-art methods (F-measure 0.622, recall 0.821). Analysis of the 6 features confirmed that our newly proposed feature CNSV_REL1 was important for the model. The analysis also showed that complementarity between features should be treated as an important consideration in feature selection.

Most importantly, this study proposed a new feature selection strategy and showed it to be effective in selecting the optimal feature subset for building prediction models, which can be used to predict hot spot residues on protein-protein interfaces. Moreover, two aspects, the generalization of single features and the complementarity between features, were shown to be of great importance and should be considered in feature selection methods. Finally, our newly proposed feature CNSV_REL1 was shown to be an alternative and effective feature for predicting hot spots. Our model is available to users through a webserver: http://zhulab.ahu.edu.cn/iPPHOT/.
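
The general idea in the abstract, pooling the candidates chosen by two selectors and then growing a small subset greedily, can be illustrated with a short Python sketch. This is not the authors' PSFS procedure, whose details the abstract does not give; it is plain greedy sequential forward selection. The function names, the feature names "a" to "d", and the toy scoring function are all hypothetical. In practice the score would be something like a cross-validated F-measure of a classifier trained on the candidate subset.

```python
from typing import Callable, List, Sequence, Tuple


def merge_candidates(*ranked_lists: Sequence[str]) -> List[str]:
    """Union of several ranked feature lists, keeping first-seen order."""
    seen = set()
    merged = []
    for ranked in ranked_lists:
        for name in ranked:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


def forward_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    max_features: int = 6,
) -> Tuple[List[str], float]:
    """Greedy forward selection: add the feature that most improves the score.

    Stops when no remaining feature improves the score or when
    max_features have been selected.
    """
    selected: List[str] = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        # Score every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no extension helps, so stop early
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


def toy_score(subset: List[str]) -> float:
    """Hypothetical score: per-feature gains minus a redundancy penalty."""
    gain = {"a": 0.5, "b": 0.4, "c": 0.3, "d": 0.05}
    total = sum(gain[f] for f in subset)
    if {"a", "b"} <= set(subset):
        total -= 0.45  # "a" and "b" carry overlapping information
    return total


# Hypothetical outputs of two selectors (e.g. a decision tree and mRMR).
merged = merge_candidates(["a", "b"], ["b", "c", "d"])
selected, best = forward_select(merged, toy_score, max_features=3)
print(selected, round(best, 2))
```

In this toy setting the feature "b" is the second strongest on its own but is skipped because it is redundant with "a". This is the kind of complementarity effect the abstract argues should be taken into account.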

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 49 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    49       100%

Demographic breakdown

Readers by professional status                  Count    As %
Student > Ph. D. Student                        12       24%
Student > Master                                9        18%
Researcher                                      6        12%
Professor                                       4        8%
Other                                           3        6%
Other                                           6        12%
Unknown                                         9        18%

Readers by discipline                           Count    As %
Biochemistry, Genetics and Molecular Biology    12       24%
Chemistry                                       8        16%
Engineering                                     5        10%
Medicine and Dentistry                          3        6%
Computer Science                                2        4%
Other                                           7        14%
Unknown                                         12       24%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 16 January 2018.
All research outputs: #15,867,545 of 23,577,654 outputs
Outputs from BMC Bioinformatics: #5,494 of 7,400 outputs
Outputs of similar age: #293,316 of 476,553 outputs
Outputs of similar age from BMC Bioinformatics: #81 of 125 outputs
Altmetric has tracked 23,577,654 research outputs across all sources so far. This one is in the 22nd percentile – i.e., 22% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,400 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 17th percentile – i.e., 17% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 476,553 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 29th percentile – i.e., 29% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 125 others from the same source and published within six weeks on either side of this one. This one is in the 28th percentile – i.e., 28% of its contemporaries scored the same or lower than it.
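
The percentile wording used above ("X% of other outputs scored the same or lower") can be read as a simple count over the comparison set. The Python sketch below follows that reading only. The function name and the sample scores are hypothetical, and Altmetric's actual computation (for example, how it handles ties or rounding) may differ.

```python
from typing import Sequence


def percentile_same_or_lower(score: float, other_scores: Sequence[float]) -> float:
    """Percentage of the other outputs whose score is the same or lower."""
    if not other_scores:
        raise ValueError("need at least one other output to compare against")
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100.0 * same_or_lower / len(other_scores)


# Hypothetical Attention Scores for ten other outputs.
others = [0, 1, 1, 2, 5, 9, 12, 30, 3, 4]
print(percentile_same_or_lower(1, others))
```

With these made-up scores, three of the ten other outputs score 1 or lower, so an output with a score of 1 would sit at the 30th percentile under this reading.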