Title | An active learning-enabled annotation system for clinical named entity recognition |
---|---|
Published in | BMC Medical Informatics and Decision Making, July 2017 |
DOI | 10.1186/s12911-017-0466-9 |
Pubmed ID | |
Authors | Yukun Chen, Thomas A. Lasko, Qiaozhu Mei, Qingxia Chen, Sungrim Moon, Jingqi Wang, Ky Nguyen, Tolulola Dawodu, Trevor Cohen, Joshua C. Denny, Hua Xu |
Abstract | Active learning (AL) has shown promising potential to minimize annotation cost while maximizing performance in building statistical natural language processing (NLP) models. However, very few studies have investigated AL in a real-life setting in the medical domain. In this study, we developed the first AL-enabled annotation system for clinical named entity recognition (NER) with a novel AL algorithm. In addition to a simulation study evaluating the novel AL algorithm, we conducted user studies in which two nurses used the system, to assess the performance of AL in real-world annotation processes for building clinical NER models. The simulation results show that the novel AL algorithm outperformed the traditional AL algorithm and random sampling. However, the user study tells a different story: AL methods did not always perform better than random sampling for different users. We found that the increased information content of actively selected sentences is strongly offset by the increased time required to annotate them. Moreover, annotation time was not considered in the querying algorithms. Our future work includes developing better AL algorithms that incorporate an estimate of annotation time and evaluating the system with a larger number of users. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 1 | 100% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 1 | 100% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 80 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 14 | 18% |
Student > Bachelor | 11 | 14% |
Student > Master | 10 | 13% |
Researcher | 7 | 9% |
Student > Doctoral Student | 4 | 5% |
Other | 5 | 6% |
Unknown | 29 | 36% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 16 | 20% |
Engineering | 5 | 6% |
Medicine and Dentistry | 5 | 6% |
Nursing and Health Professions | 5 | 6% |
Agricultural and Biological Sciences | 3 | 4% |
Other | 11 | 14% |
Unknown | 35 | 44% |