KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily

Overview of attention for article published in BMC Genomics, June 2016

About this Attention Score

  • Good Attention Score compared to outputs of the same age (71st percentile)
  • Good Attention Score compared to outputs of the same age and source (70th percentile)

Mentioned by

  • 8 X users

Citations

  • 10 Dimensions

Readers on

  • 62 Mendeley
Title
KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily
Published in
BMC Genomics, June 2016
DOI 10.1186/s12864-016-2723-1
Authors

Tirso Pons, Miguel Vazquez, María Luisa Matey-Hernandez, Søren Brunak, Alfonso Valencia, Jose MG Izarzugaza

Abstract

The association between aberrant signal processing by protein kinases and human diseases such as cancer was established long ago. However, understanding the link between sequence variants in the protein kinase superfamily and the mechanistic complex traits at the molecular level remains challenging: cells tolerate most genomic alterations, and only a minor fraction disrupt molecular function sufficiently to drive disease.

KinMutRF is a novel random-forest method to automatically identify pathogenic variants in human kinases. Twenty-six decision trees, implemented as a random forest, weigh a battery of features that characterize the variants: a) at the gene level, including membership in a Kinbase group and Gene Ontology terms; b) at the PFAM domain level; and c) at the residue level, the types of amino acids involved, changes in biochemical properties, and functional annotations from UniProt, Phospho.ELM and FireDB.

KinMutRF identifies disease-associated variants satisfactorily (Acc: 0.88, Prec: 0.82, Rec: 0.75, F-score: 0.78, MCC: 0.68) when trained and cross-validated with the 3689 human kinase variants from UniProt that have been annotated as neutral or pathogenic. All unclassified variants were excluded from the training set. Furthermore, KinMutRF is discussed with respect to two independent kinase-specific sets of mutations not included in the training and testing, Kin-Driver (643 variants) and Pon-BTK (1495 variants). Moreover, we provide predictions for the 848 protein kinase variants in UniProt that remained unclassified.

A public implementation of KinMutRF, including documentation and examples, is available online (http://kinmut2.bioinfo.cnio.es). The source code for local installation is released under a GPL version 3 license, and can be downloaded from https://github.com/Rbbt-Workflows/KinMut2.

KinMutRF is capable of classifying kinase variation with good performance. Predictions by KinMutRF compare favorably in a benchmark with other state-of-the-art methods (i.e. SIFT, PolyPhen-2, MutationAssessor, MutationTaster, LRT, CADD, FATHMM, and VEST). Kinase-specific features rank as the most elucidatory in terms of information gain and are likely responsible for the improvement in prediction performance. This advocates for the development of family-specific classifiers able to exploit the discriminatory power of features unique to individual protein families.
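For readers unfamiliar with the performance figures quoted in the abstract (Acc, Prec, Rec, F-score, MCC), the following is a minimal sketch of how such measures are conventionally derived from a binary confusion matrix, with "pathogenic" as the positive class. It is not the authors' evaluation code. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the paper's data.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: pathogenic variants predicted pathogenic/neutral.
    tn/fp: neutral variants predicted neutral/pathogenic.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one variant is required")

    accuracy = (tp + tn) / total
    # Ratios with an empty denominator are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient: ranges from -1 to 1 and stays
    # informative when the two classes are of unequal size.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT the paper's confusion matrix.
example = classification_metrics(tp=75, fp=15, tn=185, fn=25)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

MCC is commonly reported alongside accuracy for variant classifiers because neutral and pathogenic variants are rarely balanced, and accuracy alone can look high for a predictor that favors the majority class.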

X Demographics

The data were collected from the profiles of 8 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 62 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Switzerland 1 2%
United Kingdom 1 2%
Denmark 1 2%
Spain 1 2%
Japan 1 2%
Unknown 57 92%

Demographic breakdown

Readers by professional status Count As %
Researcher 13 21%
Student > Ph. D. Student 9 15%
Student > Master 6 10%
Professor 4 6%
Librarian 3 5%
Other 15 24%
Unknown 12 19%
Readers by discipline Count As %
Biochemistry, Genetics and Molecular Biology 11 18%
Medicine and Dentistry 9 15%
Agricultural and Biological Sciences 8 13%
Computer Science 5 8%
Psychology 3 5%
Other 11 18%
Unknown 15 24%
Attention Score in Context

This research output has an Altmetric Attention Score of 5. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 25 June 2016.
All research outputs: #6,468,098 of 24,174,783 outputs
Outputs from BMC Genomics: #2,619 of 10,913 outputs
Outputs of similar age: #102,299 of 359,102 outputs
Outputs of similar age from BMC Genomics: #50 of 174 outputs
Altmetric has tracked 24,174,783 research outputs across all sources so far. This one has received more attention than most of these and is in the 73rd percentile.
So far Altmetric has tracked 10,913 research outputs from this source. They receive a mean Attention Score of 4.8. This one has done well, scoring higher than 75% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 359,102 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 71% of its contemporaries.
We're also able to compare this research output to 174 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 70% of its contemporaries.
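The percentages quoted above can be roughly reproduced from the rank and total in each context. The sketch below assumes the simplest possible definition, the share of tracked outputs ranked below this one; the page does not state Altmetric's exact formula (for example, how ties or rounding are handled), so the function `percent_outranked` is a hypothetical helper and its results may differ from the reported figures by about a percentage point.

```python
def percent_outranked(rank: int, total: int) -> float:
    """Percentage of outputs ranked below the given rank (1 = most attention)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# (context, rank, total, percentage reported on this page)
contexts = [
    ("All research outputs", 6_468_098, 24_174_783, 73),
    ("Outputs from BMC Genomics", 2_619, 10_913, 75),
    ("Outputs of similar age", 102_299, 359_102, 71),
    ("Outputs of similar age from BMC Genomics", 50, 174, 70),
]

for label, rank, total, reported in contexts:
    estimate = percent_outranked(rank, total)
    print(f"{label}: estimated {estimate:.1f}%, reported {reported}%")
```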