Title | Generalising semantic category disambiguation with large lexical resources for fun and profit |
---|---|
Published in | Journal of Biomedical Semantics, June 2014 |
DOI | 10.1186/2041-1480-5-26 |
Pubmed ID | |
Authors | Pontus Stenetorp, Sampo Pyysalo, Sophia Ananiadou, Jun’ichi Tsujii |
Abstract | Semantic Category Disambiguation (SCD) is the task of assigning the appropriate semantic category to given spans of text from a fixed set of candidate categories, for example Protein to "Fibrin". SCD is relevant to Natural Language Processing tasks such as Named Entity Recognition, coreference resolution and coordination resolution. In this work, we study machine learning-based SCD methods using large lexical resources and approximate string matching, aiming to generalise these methods with regard to domains, lexical resources and the composition of data sets. We specifically consider the applicability of SCD for the purposes of supporting human annotators and acting as a pipeline component for other Natural Language Processing systems. |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Mexico | 1 | 5% |
Netherlands | 1 | 5% |
China | 1 | 5% |
Australia | 1 | 5% |
Unknown | 18 | 82% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 5 | 23% |
Student > Ph.D. Student | 5 | 23% |
Student > Master | 3 | 14% |
Lecturer > Senior Lecturer | 2 | 9% |
Student > Doctoral Student | 1 | 5% |
Other | 4 | 18% |
Unknown | 2 | 9% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 12 | 55% |
Agricultural and Biological Sciences | 4 | 18% |
Linguistics | 1 | 5% |
Psychology | 1 | 5% |
Earth and Planetary Sciences | 1 | 5% |
Other | 1 | 5% |
Unknown | 2 | 9% |