
Cross-validation pitfalls when selecting and assessing regression and classification models

Overview of attention for article published in Journal of Cheminformatics, March 2014

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (94th percentile)
  • High Attention Score compared to outputs of the same age and source (99th percentile)

Mentioned by

  • 1 policy source
  • 27 X users
  • 1 Facebook page
  • 1 Google+ user
  • 1 Redditor
  • 2 Q&A threads
  • 1 YouTube creator

Citations

  • 678 Dimensions

Readers on

  • 744 Mendeley
  • 2 CiteULike
Title: Cross-validation pitfalls when selecting and assessing regression and classification models
Published in: Journal of Cheminformatics, March 2014
DOI: 10.1186/1758-2946-6-10
Pubmed ID
Authors: Damjan Krstajic, Ljubomir J Buturovic, David E Leahy, Simon Thomas

Abstract

We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches.
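
The abstract's point that cross-validation estimates can have high variance can be seen by repeating k-fold cross-validation with different random splits of the same data. The following is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not name a specific method, the data are synthetic, the model is a one-feature least-squares line, and the function names (`fit_line`, `kfold_mse`, `repeated_kfold_mse`) are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def kfold_mse(xs, ys, k, rng):
    """Mean squared error from one k-fold split drawn with rng."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all points
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return sq_err / len(xs)


def repeated_kfold_mse(xs, ys, k=5, repeats=20, seed=0):
    """Run k-fold CV `repeats` times, each with a different random split."""
    rng = random.Random(seed)
    return [kfold_mse(xs, ys, k, rng) for _ in range(repeats)]


# Synthetic data (an assumption for illustration): y = 2x + Gaussian noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(40)]
ys = [2.0 * x + data_rng.gauss(0, 3) for x in xs]

scores = repeated_kfold_mse(xs, ys)
print(f"single split:  {scores[0]:.3f}")
print(f"across splits: min={min(scores):.3f} max={max(scores):.3f}")
print(f"repeat mean:   {statistics.fmean(scores):.3f}")
```

The error estimate from any single split depends on how the data happened to be partitioned. Averaging over repeated splits is one common way to reduce that dependence, at the cost of more computation.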

X Demographics

The demographic data for this section were collected from the profiles of 27 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 744 Mendeley readers of this research output.

Geographical breakdown

Country           Count   % of readers
Germany               5   <1%
United States         4   <1%
Brazil                3   <1%
United Kingdom        3   <1%
Italy                 1   <1%
Ecuador               1   <1%
Malaysia              1   <1%
Sweden                1   <1%
Finland               1   <1%
Other                 6   <1%
Unknown             718   97%

Demographic breakdown

Readers by professional status          Count   % of readers
Student > Ph. D. Student                  153   21%
Student > Master                          124   17%
Researcher                                112   15%
Student > Bachelor                         61   8%
Student > Doctoral Student                 41   6%
Other                                     106   14%
Unknown                                   147   20%

Readers by discipline                   Count   % of readers
Computer Science                          119   16%
Engineering                                92   12%
Agricultural and Biological Sciences       67   9%
Chemistry                                  52   7%
Medicine and Dentistry                     42   6%
Other                                     201   27%
Unknown                                   171   23%

Attention Score in Context

This research output has an Altmetric Attention Score of 27. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 01 December 2021.
  • All research outputs: #1,363,772 of 24,751,485 outputs
  • Outputs from Journal of Cheminformatics: #76 of 923 outputs
  • Outputs of similar age: #13,461 of 230,672 outputs
  • Outputs of similar age from Journal of Cheminformatics: #1 of 22 outputs
Altmetric has tracked 24,751,485 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 94th percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 923 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 10.3. This one has done particularly well, scoring higher than 91% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 230,672 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 94% of its contemporaries.
We're also able to compare this research output to 22 others from the same source and published within six weeks on either side of this one. This one has done particularly well, scoring higher than 99% of its contemporaries.