Extracting a stroke phenotype risk factor from Veteran Health Administration clinical reports: an information content analysis

Overview of attention for article published in Journal of Biomedical Semantics, May 2016

Citations

39 Dimensions

Readers on

89 Mendeley
Title
Extracting a stroke phenotype risk factor from Veteran Health Administration clinical reports: an information content analysis
Published in
Journal of Biomedical Semantics, May 2016
DOI 10.1186/s13326-016-0065-1
Pubmed ID
Authors

Danielle L. Mowery, Brian E. Chapman, Mike Conway, Brett R. South, Erin Madden, Salomeh Keyhani, Wendy W. Chapman

Abstract

In the United States, 795,000 people suffer strokes each year; 10-15 % of these strokes can be attributed to stenosis caused by plaque in the carotid artery, a major stroke phenotype risk factor. Studies comparing treatments for the management of asymptomatic carotid stenosis are challenging for at least two reasons: 1) administrative billing codes (i.e., Current Procedural Terminology (CPT) codes) that identify carotid images do not denote which neurovascular arteries are affected and 2) the majority of the image reports are negative for carotid stenosis. Studies that rely on manual chart abstraction can be labor-intensive, expensive, and time-consuming. Natural Language Processing (NLP) can expedite manual chart abstraction by automatically filtering reports with no/insignificant carotid stenosis findings and flagging reports with significant carotid stenosis findings, thus potentially reducing effort, costs, and time. In this pilot study, we conducted an information content analysis of carotid stenosis mentions in terms of their report location (sections), report formats (structures), and linguistic descriptions (expressions) from Veteran Health Administration free-text reports. We assessed the ability of an NLP algorithm, pyConText, to discern reports with significant carotid stenosis findings from reports with no/insignificant carotid stenosis findings given these three document composition factors for two report types: radiology (RAD) and text integration utility (TIU) notes. We observed that most carotid mentions are recorded in prose using categorical expressions, within the Findings and Impression sections for RAD reports and within neither of these designated sections for TIU notes. For RAD reports, pyConText performed with high sensitivity (88 %), specificity (84 %), and negative predictive value (95 %) and reasonable positive predictive value (70 %). For TIU notes, pyConText performed with high specificity (87 %) and negative predictive value (92 %), reasonable sensitivity (73 %), and moderate positive predictive value (58 %). pyConText achieved its highest sensitivity when processing the full report rather than the Findings or Impression sections independently. We conclude that pyConText can reduce chart review efforts by filtering reports with no/insignificant carotid stenosis findings and flagging reports with significant carotid stenosis findings from the Veteran Health Administration electronic health record, and hence has utility for expediting a comparative effectiveness study of treatment strategies for stroke prevention.
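
As a rough illustration of how the four measures reported in the abstract relate to a report-level confusion matrix, the sketch below computes them in Python. The counts are hypothetical placeholders chosen for easy arithmetic, not figures from the study, and the names `ConfusionCounts` and `screening_metrics` are illustrative rather than part of pyConText.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Report-level counts against a manually reviewed reference standard."""
    tp: int  # significant stenosis, flagged by the classifier
    fp: int  # no/insignificant stenosis, but flagged
    tn: int  # no/insignificant stenosis, filtered out
    fn: int  # significant stenosis, but filtered out


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of positive reports caught
        "specificity": c.tn / (c.tn + c.fp),  # share of negative reports filtered
        "ppv": c.tp / (c.tp + c.fp),          # share of flagged reports truly positive
        "npv": c.tn / (c.tn + c.fn),          # share of filtered reports truly negative
    }


# Hypothetical counts for illustration only (not taken from the study).
counts = ConfusionCounts(tp=40, fp=10, tn=90, fn=10)
metrics = screening_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

For a filtering use case like the one described, a high negative predictive value matters most, because it indicates how rarely a report set aside as no/insignificant stenosis would actually contain a significant finding.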

Mendeley readers

The data shown below were compiled from readership statistics for 89 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    89       100%

Demographic breakdown

Readers by professional status    Count    As %
Researcher                        18       20%
Student > Ph. D. Student          12       13%
Student > Master                  9        10%
Student > Bachelor                5        6%
Other                             5        6%
Other                             14       16%
Unknown                           26       29%

Readers by discipline                           Count    As %
Medicine and Dentistry                          23       26%
Computer Science                                14       16%
Nursing and Health Professions                  4        4%
Agricultural and Biological Sciences            2        2%
Biochemistry, Genetics and Molecular Biology    2        2%
Other                                           13       15%
Unknown                                         31       35%
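
The "As %" column appears to be each count divided by the 89 total readers, rounded to the nearest whole percent. That rounding rule is inferred from the numbers, not documented on this page. A minimal Python sketch under that assumption reproduces the professional-status figures; the name `as_percent` is illustrative.

```python
TOTAL_READERS = 89

# Counts copied from the professional-status table; a list of pairs is used
# because the label "Other" appears twice in the source.
status_counts = [
    ("Researcher", 18),
    ("Student > Ph. D. Student", 12),
    ("Student > Master", 9),
    ("Student > Bachelor", 5),
    ("Other", 5),
    ("Other", 14),
    ("Unknown", 26),
]


def as_percent(count: int, total: int = TOTAL_READERS) -> int:
    """Share of total readers, rounded to the nearest whole percent (assumed rule)."""
    return round(100 * count / total)


percents = [as_percent(count) for _, count in status_counts]

for (label, count), pct in zip(status_counts, percents):
    print(f"{label}: {count} ({pct}%)")
```

Because each row is rounded independently, the percentages need not sum to exactly 100 even though the counts sum to 89.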