Proteochemometric modelling coupled to in silico target prediction: an integrated approach for the simultaneous prediction of polypharmacology and binding affinity/potency of small molecules

Overview of attention for article published in Journal of Cheminformatics, April 2015

Citations

32 Dimensions

Readers on

108 Mendeley
1 CiteULike
Published in
Journal of Cheminformatics, April 2015
DOI 10.1186/s13321-015-0063-9
Pubmed ID
Authors

Shardul Paricharak, Isidro Cortés-Ciriano, Adriaan P IJzerman, Thérèse E Malliavin, Andreas Bender

Abstract

The rampant increase of public bioactivity databases has fostered the development of computational chemogenomics methodologies to evaluate potential ligand-target interactions (polypharmacology) both in a qualitative and quantitative way. Bayesian target prediction algorithms predict the probability of an interaction between a compound and a panel of targets, thus assessing compound polypharmacology qualitatively, whereas structure-activity relationship techniques are able to provide quantitative bioactivity predictions. We propose an integrated drug discovery pipeline combining in silico target prediction and proteochemometric modelling (PCM) for the respective prediction of compound polypharmacology and potency/affinity. The proposed pipeline was evaluated on the retrospective discovery of Plasmodium falciparum DHFR inhibitors. The qualitative in silico target prediction model comprised 553,084 ligand-target associations (a total of 262,174 compounds), covering 3,481 protein targets and used protein domain annotations to extrapolate predictions across species. The prediction of bioactivities for plasmodial DHFR led to a recall value of 79% and a precision of 100%, where the latter high value arises from the structural similarity of plasmodial DHFR inhibitors and T. gondii DHFR inhibitors in the training set. Quantitative PCM models were then trained on a dataset comprising 20 eukaryotic, protozoan and bacterial DHFR sequences, and 1,505 distinct compounds (in total 3,099 data points). The most predictive PCM model exhibited $R^2_{0\,\mathrm{test}}$ and $\mathrm{RMSE}_{\mathrm{test}}$ values of 0.79 and 0.59 pIC50 units respectively, which was shown to outperform models based exclusively on compound ($R^2_{0\,\mathrm{test}}$/$\mathrm{RMSE}_{\mathrm{test}}$ = 0.63/0.78) and target information ($R^2_{0\,\mathrm{test}}$/$\mathrm{RMSE}_{\mathrm{test}}$ = 0.09/1.22), as well as inductive transfer knowledge between targets, with respective $R^2_{0\,\mathrm{test}}$ and $\mathrm{RMSE}_{\mathrm{test}}$ values of 0.76 and 0.63 pIC50 units.
Finally, both methods were integrated to predict the protein targets and the potency on plasmodial DHFR for the GSK TCAMS dataset, which comprises 13,533 compounds displaying strong anti-malarial activity. 534 of those compounds were identified as DHFR inhibitors by the target prediction algorithm, while the PCM algorithm identified 25 compounds, and 23 compounds (predicted pIC50 > 7) were identified by both methods. Overall, this integrated approach simultaneously provides target and potency/affinity predictions for small molecules.
Graphical abstract: Proteochemometric modelling coupled to in silico target prediction.
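The evaluation quantities named in the abstract (precision, recall, RMSE) and the final step of keeping compounds flagged by both methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the compound identifiers and the toy values are hypothetical, and only the pIC50 > 7 cut-off and the idea of intersecting the two hit lists come from the abstract.

```python
import math


def precision_recall(predicted, actual):
    """Precision and recall for a set of predicted actives vs. known actives."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted pIC50 values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def consensus_hits(target_hits, pcm_pic50, threshold=7.0):
    """Compounds flagged by target prediction AND predicted potent by PCM.

    target_hits: set of compound ids predicted to hit the target (hypothetical input).
    pcm_pic50:   dict mapping compound id -> predicted pIC50 (hypothetical input).
    threshold:   potency cut-off; 7.0 mirrors the pIC50 > 7 criterion in the abstract.
    """
    potent = {cid for cid, value in pcm_pic50.items() if value > threshold}
    return target_hits & potent


# Toy data, invented purely for illustration.
toy_target_hits = {"cmpd_1", "cmpd_2", "cmpd_3"}
toy_pcm_pic50 = {"cmpd_1": 7.4, "cmpd_2": 6.1, "cmpd_4": 8.0}
toy_consensus = consensus_hits(toy_target_hits, toy_pcm_pic50)
```

In this toy case only `cmpd_1` is both a predicted target hit and above the potency threshold, which mirrors how the reported 23 compounds form the overlap of the 534 target-prediction hits and the 25 PCM hits.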

Mendeley readers

The data shown below were compiled from readership statistics for 108 Mendeley readers of this research output. Click here to see the associated Mendeley record.

Geographical breakdown

| Country | Count | As % |
|---|---|---|
| Spain | 2 | 2% |
| United Kingdom | 2 | 2% |
| Germany | 1 | <1% |
| Italy | 1 | <1% |
| Denmark | 1 | <1% |
| United States | 1 | <1% |
| Unknown | 100 | 93% |

Demographic breakdown

| Readers by professional status | Count | As % |
|---|---|---|
| Student > Ph. D. Student | 20 | 19% |
| Researcher | 18 | 17% |
| Student > Master | 17 | 16% |
| Student > Bachelor | 12 | 11% |
| Student > Doctoral Student | 6 | 6% |
| Other | 22 | 20% |
| Unknown | 13 | 12% |

| Readers by discipline | Count | As % |
|---|---|---|
| Chemistry | 25 | 23% |
| Computer Science | 16 | 15% |
| Agricultural and Biological Sciences | 12 | 11% |
| Medicine and Dentistry | 10 | 9% |
| Biochemistry, Genetics and Molecular Biology | 8 | 7% |
| Other | 17 | 16% |
| Unknown | 20 | 19% |
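
The "As %" figures in the readership tables are consistent with each count divided by the 108 total Mendeley readers, rounded to the nearest whole percent, with shares below one percent shown as "<1%". The short Python sketch below reproduces that arithmetic; the rounding rule is inferred from the listed values rather than taken from Altmetric documentation, and the function name `as_percent` is hypothetical.

```python
def as_percent(count, total):
    """Format a reader count as a whole-number share of the total.

    Assumed rule (inferred from the tables): round to the nearest percent,
    and show any non-zero share below 1% as "<1%".
    """
    share = 100 * count / total
    if 0 < share < 1:
        return "<1%"
    return f"{round(share)}%"


TOTAL_READERS = 108  # total Mendeley readers stated on the page

# Counts copied from the geographical breakdown.
country_counts = {"Spain": 2, "Germany": 1, "Unknown": 100}
country_shares = {name: as_percent(n, TOTAL_READERS) for name, n in country_counts.items()}
```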