Title | Natural language processing of radiology reports for the detection of thromboembolic diseases and clinically relevant incidental findings |
---|---|
Published in | BMC Bioinformatics, August 2014 |
DOI | 10.1186/1471-2105-15-266 |
Pubmed ID | |
Authors | Anne-Dominique Pham, Aurélie Névéol, Thomas Lavergne, Daisuke Yasunaga, Olivier Clément, Guy Meyer, Rémy Morello, Anita Burgun |
Abstract | Natural Language Processing (NLP) has been shown to be effective for analyzing the content of radiology reports and identifying diagnoses or patient characteristics. We evaluate the combination of NLP and machine learning to detect thromboembolic disease diagnoses and clinically relevant incidental findings in angiography and venography reports written in French. We model thromboembolic diagnoses and incidental findings as a set of concepts, modalities and relations between concepts that can be used as features by a supervised machine learning algorithm. A corpus of 573 radiology reports was de-identified and, with the support of NLP tools, manually annotated by a physician for relevant concepts, modalities and relations. A machine learning classifier was trained on the dataset, as interpreted by a physician, to diagnose deep-vein thrombosis, pulmonary embolism and clinically relevant incidental findings. Decision models accounted for the imbalanced nature of the data and exploited the structure of the reports. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Spain | 1 | 17% |
United Kingdom | 1 | 17% |
Norway | 1 | 17% |
Unknown | 3 | 50% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 3 | 50% |
Scientists | 3 | 50% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 3 | 2% |
India | 1 | <1% |
Brazil | 1 | <1% |
Japan | 1 | <1% |
Belgium | 1 | <1% |
Unknown | 116 | 94% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 22 | 18% |
Student > Master | 20 | 16% |
Student > Ph.D. Student | 16 | 13% |
Student > Bachelor | 9 | 7% |
Other | 6 | 5% |
Other | 22 | 18% |
Unknown | 28 | 23% |
Readers by discipline | Count | As % |
---|---|---|
Medicine and Dentistry | 36 | 29% |
Computer Science | 27 | 22% |
Engineering | 6 | 5% |
Linguistics | 2 | 2% |
Agricultural and Biological Sciences | 2 | 2% |
Other | 15 | 12% |
Unknown | 35 | 28% |