
An ensemble model of QSAR tools for regulatory risk assessment

Overview of attention for article published in Journal of Cheminformatics, September 2016

Mentioned by: 2 X users
Citations: 49 Dimensions
Readers on: 87 Mendeley
Title
An ensemble model of QSAR tools for regulatory risk assessment
Published in
Journal of Cheminformatics, September 2016
DOI 10.1186/s13321-016-0164-0
Pubmed ID
Authors

Prachi Pradeep, Richard J. Povinelli, Shannon White, Stephen J. Merrill

Abstract

Quantitative structure-activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment to protect human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques from machine learning can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model exposes a cut-off parameter that lets the user select the desired trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the Gold Carcinogenic Potency Database (480 chemicals).
Leave-one-out cross-validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8% and 80.4%, and balanced accuracy: 80.6% and 80.8%) and the highest inter-rater agreement (kappa, κ: 0.63 and 0.62) for both datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. This feature gives regulators additional control in grading a chemical based on the severity of the toxic endpoint under study.
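To make the ideas in the abstract concrete, here is a minimal sketch of how binary calls from several QSAR tools could be combined by a Bayesian classifier with an adjustable cut-off, and then scored by leave-one-out cross-validation. It assumes a naive Bayes formulation (tools treated as conditionally independent given the true class) with Laplace smoothing; the abstract only says "Bayesian classification", so the paper's actual model may differ. The data in `TOOL_CALLS` and `LABELS`, and all function names, are hypothetical and chosen for illustration only.

```python
# Hypothetical toy data: each tuple holds the 0/1 carcinogenicity calls of
# four tools for one chemical; LABELS holds the experimental outcome.
TOOL_CALLS = [
    (1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1), (1, 1, 1, 1), (1, 0, 0, 1),
    (0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 0), (0, 0, 0, 0), (0, 1, 1, 0),
]
LABELS = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def fit_naive_bayes(tool_calls, labels, alpha=1.0):
    """Estimate P(class) and P(tool j says 1 | class) with Laplace smoothing."""
    n_tools = len(tool_calls[0])
    prior, rate = {}, {}
    for c in (0, 1):
        rows = [x for x, y in zip(tool_calls, labels) if y == c]
        prior[c] = (len(rows) + alpha) / (len(labels) + 2 * alpha)
        rate[c] = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                   for j in range(n_tools)]
    return prior, rate


def posterior_positive(model, x):
    """P(class = 1 | tool calls x) under the conditional-independence assumption."""
    prior, rate = model
    score = {}
    for c in (0, 1):
        p = prior[c]
        for j, v in enumerate(x):
            p *= rate[c][j] if v == 1 else 1.0 - rate[c][j]
        score[c] = p
    return score[1] / (score[0] + score[1])


def predict(model, x, cutoff=0.5):
    """Call a chemical positive when the posterior reaches the cut-off."""
    return 1 if posterior_positive(model, x) >= cutoff else 0


def loo_predictions(tool_calls, labels, cutoff=0.5):
    """Leave-one-out: refit without chemical i, then predict chemical i."""
    preds = []
    for i in range(len(labels)):
        model = fit_naive_bayes(tool_calls[:i] + tool_calls[i + 1:],
                                labels[:i] + labels[i + 1:])
        preds.append(predict(model, tool_calls[i], cutoff))
    return preds


def metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, balanced accuracy and Cohen's kappa."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    n = len(pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (acc - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0
    return {"sensitivity": sens, "specificity": spec, "accuracy": acc,
            "balanced_accuracy": (sens + spec) / 2, "kappa": kappa}


# Sweep the cut-off to see the sensitivity/specificity trade-off.
for cutoff in (0.2, 0.5, 0.8):
    print(cutoff, metrics(LABELS, loo_predictions(TOOL_CALLS, LABELS, cutoff)))
```

Lowering the cut-off can only turn negative calls into positive ones, so sensitivity never decreases and specificity never increases. This is the lever the abstract describes for reducing false negatives when the toxic endpoint is severe.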

X Demographics

The data for this section were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 87 Mendeley readers of this research output.

Geographical breakdown

Country     Count   As %
Bulgaria    1       1%
Brazil      1       1%
Unknown     85      98%

Demographic breakdown

Readers by professional status   Count   As %
Researcher                       15      17%
Student > Master                 13      15%
Student > Ph. D. Student         12      14%
Other                            9       10%
Student > Bachelor               6       7%
Other                            13      15%
Unknown                          19      22%

Readers by discipline                                 Count   As %
Chemistry                                             20      23%
Pharmacology, Toxicology and Pharmaceutical Science   12      14%
Computer Science                                      6       7%
Biochemistry, Genetics and Molecular Biology          5       6%
Agricultural and Biological Sciences                  4       5%
Other                                                 14      16%
Unknown                                               26      30%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 23 September 2016.
All research outputs: #15,708,439 of 23,342,092 outputs
Outputs from Journal of Cheminformatics: #780 of 862 outputs
Outputs of similar age: #204,761 of 322,530 outputs
Outputs of similar age from Journal of Cheminformatics: #21 of 23 outputs
Altmetric has tracked 23,342,092 research outputs across all sources so far. This one is in the 22nd percentile – i.e., 22% of other outputs scored the same or lower than it.
So far Altmetric has tracked 862 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 11.0. This one is in the 5th percentile – i.e., 5% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 322,530 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 28th percentile – i.e., 28% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 23 others from the same source and published within six weeks on either side of this one. This one is in the 4th percentile – i.e., 4% of its contemporaries scored the same or lower than it.