
Classifying publications from the clinical and translational science award program along the translational research spectrum: a machine learning approach

Overview of attention for article published in Journal of Translational Medicine, August 2016

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • Good Attention Score compared to outputs of the same age (73rd percentile)
  • High Attention Score compared to outputs of the same age and source (81st percentile)

Mentioned by

6 X users

Citations

43 Dimensions

Readers on

71 Mendeley
Published in
Journal of Translational Medicine, August 2016
DOI 10.1186/s12967-016-0992-8
Pubmed ID
Authors

Alisa Surkis, Janice A. Hogle, Deborah DiazGranados, Joe D. Hunt, Paul E. Mazmanian, Emily Connors, Kate Westaby, Elizabeth C. Whipple, Trisha Adamus, Meridith Mueller, Yindalon Aphinyanaphongs

Abstract

Translational research is a key area of focus of the National Institutes of Health (NIH), as demonstrated by the substantial investment in the Clinical and Translational Science Award (CTSA) program. The goal of the CTSA program is to accelerate the translation of discoveries from the bench to the bedside and into communities. Different classification systems have been used to capture the spectrum of basic to clinical to population health research, with substantial differences in the number of categories and their definitions. Evaluation of the effectiveness of the CTSA program and of translational research in general is hampered by the lack of rigor in these definitions and their application. This study adds rigor to the classification process by creating a checklist to evaluate publications across the translational spectrum and operationalizes these classifications by building machine learning-based text classifiers to categorize these publications. Based on collaboratively developed definitions, we created a detailed checklist for categories along the translational spectrum from T0 to T4. We applied the checklist to CTSA-linked publications to construct a set of coded publications for use in training machine learning-based text classifiers to classify publications within these categories. The training sets combined T1/T2 and T3/T4 categories due to the low frequency of these publication types compared to the frequency of T0 publications. We then compared classifier performance across different algorithms and feature sets and applied the classifiers to all publications in PubMed indexed to CTSA grants. To validate the algorithm, we manually classified the articles with the top 100 scores from each classifier. The definitions and checklist facilitated classification and resulted in good inter-rater reliability for coding publications for the training set. The classifiers performed very well, as measured by the area under the receiver operating characteristic curve (AUC): 0.94 for the T0 classifier, 0.84 for T1/T2, and 0.92 for T3/T4. The combination of definitions agreed upon by five CTSA hubs, a checklist that facilitates more uniform definition interpretation, and algorithms that perform well in classifying publications along the translational spectrum provide a basis for establishing and applying uniform definitions of translational research categories. The classification algorithms allow publication analyses that would not be feasible with manual classification, such as assessing the distribution and trends of publications across the CTSA network and comparing the categories of publications and their citations to assess knowledge transfer across the translational research spectrum.
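The pipeline the abstract describes (train text classifiers on manually coded publications, score unlabeled publications, and report AUC) can be sketched in miniature. The block below is illustrative only: the toy corpus, labels, and naive Bayes model are hypothetical stand-ins, not the authors' actual features or algorithms. AUC is computed as the Mann-Whitney rank statistic, i.e. the probability that a randomly chosen positive example outscores a randomly chosen negative one.

```python
# Minimal sketch of a bag-of-words text classifier with AUC evaluation.
# Corpus and labels are invented for illustration (1 = T0-style basic
# science, 0 = T3/T4-style population health); pure stdlib.
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_nb(docs, labels):
    """Train a binary multinomial naive Bayes model; return a scoring
    function giving the log-odds of class 1 (Laplace smoothing)."""
    counts = {0: Counter(), 1: Counter()}
    class_totals = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(tokenize(doc))
    vocab = set(counts[0]) | set(counts[1])
    priors = {c: math.log(class_totals[c] / len(labels)) for c in (0, 1)}
    total1 = sum(counts[1].values())
    total0 = sum(counts[0].values())

    def score(doc):
        s = priors[1] - priors[0]
        for w in tokenize(doc):
            if w not in vocab:
                continue  # ignore out-of-vocabulary words
            p1 = (counts[1][w] + 1) / (total1 + len(vocab))
            p0 = (counts[0][w] + 1) / (total0 + len(vocab))
            s += math.log(p1 / p0)
        return s

    return score

def auc(scores, labels):
    """AUC as the Mann-Whitney statistic: fraction of (positive, negative)
    pairs ranked correctly, with ties counting one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labeled titles standing in for the coded training set.
train_docs = [
    "protein kinase signaling in mouse models",           # T0
    "gene expression profiling of tumor cell lines",      # T0
    "community outreach program for diabetes screening",  # T3/T4
    "population health outcomes after policy change",     # T3/T4
]
train_labels = [1, 1, 0, 0]
score = train_nb(train_docs, train_labels)

test_docs = ["kinase signaling in cell lines", "community screening outcomes"]
test_labels = [1, 0]
test_scores = [score(d) for d in test_docs]
print(auc(test_scores, test_labels))  # 1.0 on this toy split
```

In practice the study's scale (all PubMed publications indexed to CTSA grants) calls for a proper toolkit such as scikit-learn rather than a hand-rolled model, but the ranking-by-score step is what made the top-100 manual validation described above possible.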

X Demographics

The data shown below were collected from the profiles of 6 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 71 Mendeley readers of this research output.

Geographical breakdown

Country | Count | As %
United States | 1 | 1%
Unknown | 70 | 99%

Demographic breakdown

Readers by professional status | Count | As %
Researcher | 11 | 15%
Student > Master | 8 | 11%
Student > Doctoral Student | 8 | 11%
Student > Bachelor | 7 | 10%
Student > Ph.D. Student | 6 | 8%
Other | 18 | 25%
Unknown | 13 | 18%
Readers by discipline | Count | As %
Medicine and Dentistry | 19 | 27%
Nursing and Health Professions | 5 | 7%
Social Sciences | 5 | 7%
Computer Science | 5 | 7%
Agricultural and Biological Sciences | 4 | 6%
Other | 15 | 21%
Unknown | 18 | 25%
Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 30 August 2016.
All research outputs
#5,828,064
of 23,577,654 outputs
Outputs from Journal of Translational Medicine
#915
of 4,185 outputs
Outputs of similar age
#97,743
of 369,295 outputs
Outputs of similar age from Journal of Translational Medicine
#15
of 83 outputs
Altmetric has tracked 23,577,654 research outputs across all sources so far. Compared to these, this one has done well and is in the 75th percentile: it's in the top 25% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 4,185 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 10.6. This one has done well, scoring higher than 78% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age, we can compare this Altmetric Attention Score to the 369,295 tracked outputs that were published within six weeks on either side of this one in any source. This one has received more attention than average, scoring higher than 73% of its contemporaries.
We're also able to compare this research output to 83 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 81% of its contemporaries.