
Can human experts predict solubility better than computers?

Overview of attention for article published in Journal of Cheminformatics, December 2017

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (89th percentile)
  • High Attention Score compared to outputs of the same age and source (80th percentile)

Mentioned by

1 blog
16 X users

Citations

48 Dimensions

Readers on

106 Mendeley
1 CiteULike
Title
Can human experts predict solubility better than computers?
Published in
Journal of Cheminformatics, December 2017
DOI 10.1186/s13321-017-0250-y
Authors

Samuel Boobier, Anne Osbourn, John B. O. Mitchell

Abstract

In this study, we design and carry out a survey, asking human experts to predict the aqueous solubility of druglike organic compounds. We investigate whether these experts, drawn largely from the pharmaceutical industry and academia, can match or exceed the predictive power of algorithms. Alongside this, we implement 10 typical machine learning algorithms on the same dataset. The best algorithm, a variety of neural network known as a multi-layer perceptron, gave an RMSE of 0.985 log S units and an R² of 0.706. We would not have predicted the relative success of this particular algorithm in advance. We found that the best individual human predictor generated an almost identical prediction quality with an RMSE of 0.942 log S units and an R² of 0.723. The collection of algorithms contained a higher proportion of reasonably good predictors, nine out of ten compared with around half of the humans. We found that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median generated excellent predictivity. While our consensus human predictor achieved very slightly better headline figures on various statistical measures, the difference between it and the consensus machine learning predictor was both small and statistically insignificant. We conclude that human experts can predict the aqueous solubility of druglike molecules essentially equally well as machine learning algorithms. We find that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median is a powerful way of benefitting from the wisdom of crowds.
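
The median-consensus idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the log S values are invented rather than taken from the study, the function names are hypothetical, and R² is computed as the coefficient of determination because the abstract does not say which definition the authors used.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def median_consensus(predictions):
    """Per-compound median across predictors.

    `predictions` holds one list per predictor, all in the same compound order.
    """
    return [statistics.median(per_compound) for per_compound in zip(*predictions)]


# Invented log S values for five hypothetical compounds (not data from the paper).
observed = [-2.0, -3.5, -1.0, -4.2, -2.8]
predictors = [
    [-2.5, -3.0, -1.2, -5.0, -2.0],
    [-1.5, -4.5, -0.5, -4.0, -3.5],
    [-2.1, -3.4, -2.5, -3.9, -2.9],
]

consensus_pred = median_consensus(predictors)
individual_rmse = [rmse(observed, p) for p in predictors]
consensus_rmse = rmse(observed, consensus_pred)
consensus_r2 = r_squared(observed, consensus_pred)

for i, value in enumerate(individual_rmse, start=1):
    print(f"predictor {i}: RMSE = {value:.3f}")
print(f"consensus:   RMSE = {consensus_rmse:.3f}, R^2 = {consensus_r2:.3f}")
```

In this toy data each predictor makes its large errors on different compounds, so the median discards them and the consensus has the lowest RMSE. That outcome is a property of the invented numbers. A median consensus is not guaranteed to beat every individual predictor in general.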

X Demographics

X Demographics

The data were collected from the profiles of 16 X users who shared this research output.
Mendeley readers

Mendeley readers

The data were compiled from readership statistics for 106 Mendeley readers of this research output.

Geographical breakdown

| Country | Count | As % |
|---------|-------|------|
| Unknown | 106   | 100% |

Demographic breakdown

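| Readers by professional status | Count | As % |
|--------------------------------|-------|------|
| Researcher                     | 18    | 17%  |
| Student > Ph. D. Student       | 15    | 14%  |
| Student > Master               | 10    | 9%   |
| Student > Bachelor             | 7     | 7%   |
| Student > Postgraduate         | 6     | 6%   |
| Other                          | 19    | 18%  |
| Unknown                        | 31    | 29%  |

| Readers by discipline                                | Count | As % |
|------------------------------------------------------|-------|------|
| Chemistry                                            | 20    | 19%  |
| Computer Science                                     | 9     | 8%   |
| Chemical Engineering                                 | 6     | 6%   |
| Pharmacology, Toxicology and Pharmaceutical Science  | 6     | 6%   |
| Agricultural and Biological Sciences                 | 6     | 6%   |
| Other                                                | 19    | 18%  |
| Unknown                                              | 40    | 38%  |
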
Attention Score in Context

This research output has an Altmetric Attention Score of 16. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 26 October 2020.
All research outputs: #2,247,306 of 25,081,285 outputs
Outputs from Journal of Cheminformatics: #192 of 942 outputs
Outputs of similar age: #49,242 of 451,331 outputs
Outputs of similar age from Journal of Cheminformatics: #4 of 15 outputs
Altmetric has tracked 25,081,285 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 91st percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 942 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 10.2. This one has done well, scoring higher than 79% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 451,331 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 89% of its contemporaries.
We're also able to compare this research output to 15 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 80% of its contemporaries.
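
The percentages quoted above can be reproduced from the rank and total in each comparison. The sketch below is an illustration only: the function name is hypothetical, and the formula is an assumption chosen because it matches all four quoted figures. The page does not state how Altmetric actually computes them.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Percentage of outputs ranked at or below `rank`, rounded down.

    Assumed formula, fitted to the figures quoted on this page:
    floor(100 * (total - (rank - 1)) / total), using integer arithmetic.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (100 * (total - (rank - 1))) // total


# (label, rank, total) as listed on the page.
comparisons = [
    ("All research outputs", 2_247_306, 25_081_285),
    ("Outputs from Journal of Cheminformatics", 192, 942),
    ("Outputs of similar age", 49_242, 451_331),
    ("Outputs of similar age from Journal of Cheminformatics", 4, 15),
]

for label, rank, total in comparisons:
    print(f"{label}: {percentile_from_rank(rank, total)}%")
```

Integer division avoids floating-point rounding at exact boundaries such as the 4-of-15 case.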