A hybrid solution for extracting structured medical information from unstructured data in medical records via a double-reading/entry system

Overview of attention for article published in BMC Medical Informatics and Decision Making, August 2016
Mentioned by
1 X user
1 Facebook page

Citations
31 Dimensions

Readers on
154 Mendeley
Title: A hybrid solution for extracting structured medical information from unstructured data in medical records via a double-reading/entry system
Published in: BMC Medical Informatics and Decision Making, August 2016
DOI: 10.1186/s12911-016-0357-5
Authors: Ligang Luo, Liping Li, Jiajia Hu, Xiaozhe Wang, Boulin Hou, Tianze Zhang, Lue Ping Zhao

Abstract

Healthcare providers around the world generate a huge amount of biomedical data, stored either in legacy (paper-based) systems or in electronic medical records (EMR); these data are collectively referred to as big biomedical data (BBD). To realize the promise of BBD for clinical use and research, an essential step is to extract key data elements from unstructured medical records into patient-centered electronic health records with computable data elements. Our objective is to introduce a novel solution, known as a double-reading/entry system (DRESS), for extracting clinical data from unstructured medical records (MR) and creating a semi-structured electronic health record database, and to demonstrate its reproducibility empirically. Using modern cloud-based technologies, we have developed a comprehensive system with multiple subsystems that cover capturing MRs in clinics, securely transferring MRs, storing and managing cloud-based MRs, facilitating both machine learning and manual reading, and performing iterative quality control before committing the semi-structured data to the desired database. To evaluate the reproducibility of the medical data elements extracted by DRESS, we conducted a blinded reproducibility study with 100 MRs from patients who had undergone surgical treatment of lung cancer in China. The study uses the Kappa statistic to measure concordance of discrete variables and the correlation coefficient to measure reproducibility of continuous variables. Using DRESS, we have demonstrated the feasibility of extracting clinical data from unstructured MRs to create a semi-structured, patient-centered electronic health record database. The reproducibility study with 100 patients' MRs showed an overall high reproducibility of 98%, with variation across six modules (pathology, radio/chemotherapy, clinical examination, surgery information, medical image and general patient information).
DRESS uses double reading, double entry, and independent adjudication to manually curate structured data elements from unstructured clinical data. Further, through distributed computing strategies, DRESS protects data privacy by dividing MR data into de-identified modules. Finally, through an internet-based computing cloud, DRESS enables many data specialists to work in a virtual environment to achieve the necessary scale of processing thousands of MRs within days. This hybrid system probably represents a workable solution to the big medical data challenge.
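
The abstract names two reproducibility measures (a Kappa statistic for discrete variables and a correlation coefficient for continuous ones) and a double-entry step followed by adjudication. The sketch below illustrates those ideas with the textbook unweighted Cohen's kappa and the Pearson correlation; the abstract does not say which variants the authors used, so both choices are assumptions. The function names and the sample entries are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of records where both readers match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each reader's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical label throughout
    return (observed - expected) / (1 - expected)


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


def discordant_indices(a, b):
    """Positions where the two entries differ, i.e. candidates for adjudication."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hypothetical double-entry data: tumour stage recorded by two readers.
reader_1 = ["I", "II", "II", "III", "I", "II"]
reader_2 = ["I", "II", "III", "III", "I", "II"]

kappa = cohen_kappa(reader_1, reader_2)             # 0.75 for this sample
to_review = discordant_indices(reader_1, reader_2)  # [2]
```

In this sample the readers agree on five of six records (observed agreement 5/6) while chance agreement is 1/3, giving a kappa of 0.75; the single discordant record at index 2 is the one that would be passed to an adjudicator.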

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 154 Mendeley readers of this research output.

Geographical breakdown

| Country | Count | As % |
| --- | --- | --- |
| Ireland | 1 | <1% |
| Unknown | 153 | 99% |

Demographic breakdown

| Readers by professional status | Count | As % |
| --- | --- | --- |
| Researcher | 27 | 18% |
| Student > Ph. D. Student | 26 | 17% |
| Student > Master | 22 | 14% |
| Student > Doctoral Student | 13 | 8% |
| Student > Bachelor | 12 | 8% |
| Other | 23 | 15% |
| Unknown | 31 | 20% |

| Readers by discipline | Count | As % |
| --- | --- | --- |
| Computer Science | 31 | 20% |
| Medicine and Dentistry | 30 | 19% |
| Nursing and Health Professions | 9 | 6% |
| Business, Management and Accounting | 8 | 5% |
| Engineering | 7 | 5% |
| Other | 28 | 18% |
| Unknown | 41 | 27% |

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 15 November 2016.
All research outputs: #18,956,502 of 23,491,765 outputs
Outputs from BMC Medical Informatics and Decision Making: #1,605 of 2,025 outputs
Outputs of similar age: #260,398 of 338,967 outputs
Outputs of similar age from BMC Medical Informatics and Decision Making: #30 of 38 outputs
Altmetric has tracked 23,491,765 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 2,025 research outputs from this source. They receive a mean Attention Score of 5.0. This one is in the 8th percentile – i.e., 8% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 338,967 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 12th percentile – i.e., 12% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 38 others from the same source and published within six weeks on either side of this one. This one is in the 10th percentile – i.e., 10% of its contemporaries scored the same or lower than it.