
Generalising semantic category disambiguation with large lexical resources for fun and profit

Overview of attention for article published in Journal of Biomedical Semantics, June 2014

Citations
2 Dimensions

Readers on
22 Mendeley
Title
Generalising semantic category disambiguation with large lexical resources for fun and profit
Published in
Journal of Biomedical Semantics, June 2014
DOI
10.1186/2041-1480-5-26
Pubmed ID
Authors

Pontus Stenetorp, Sampo Pyysalo, Sophia Ananiadou, Jun’ichi Tsujii

Abstract

Semantic Category Disambiguation (SCD) is the task of assigning the appropriate semantic category to given spans of text from a fixed set of candidate categories, for example Protein to "Fibrin". SCD is relevant to Natural Language Processing tasks such as Named Entity Recognition, coreference resolution and coordination resolution. In this work, we study machine learning-based SCD methods using large lexical resources and approximate string matching, aiming to generalise these methods with regard to domains, lexical resources and the composition of data sets. We specifically consider the applicability of SCD for the purposes of supporting human annotators and acting as a pipeline component for other Natural Language Processing systems.
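
To make the task definition concrete, the following is a minimal sketch of lexicon-backed disambiguation with approximate string matching. It is not the authors' method: the abstract describes machine learning-based methods over large lexical resources, whereas this sketch assumes a tiny hypothetical lexicon (`TOY_LEXICON`), uses Python's standard `difflib` as a stand-in similarity measure, and simply picks the category whose best lexicon entry is closest to the span. In a learned system, the per-category scores would more plausibly serve as features for a classifier.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon for illustration only; real lexical resources
# contain far more entries per category.
TOY_LEXICON = {
    "Protein": ["fibrin", "fibrinogen", "thrombin"],
    "Cell": ["platelet", "erythrocyte"],
    "Organism": ["homo sapiens", "mus musculus"],
}


def similarity(a, b):
    """Case-insensitive approximate string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def category_scores(span, lexicon=TOY_LEXICON):
    """Best approximate-match score of the span against each category."""
    return {
        category: max(similarity(span, entry) for entry in entries)
        for category, entries in lexicon.items()
    }


def disambiguate(span, lexicon=TOY_LEXICON):
    """Pick the candidate category with the highest match score."""
    scores = category_scores(span, lexicon)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(disambiguate("Fibrin"))  # Protein (exact match after lowercasing)
```

Because the matching is approximate, a span such as "platelets" can still be mapped to a category even though only "platelet" appears in the lexicon.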

Mendeley readers


The data shown below were compiled from readership statistics for 22 Mendeley readers of this research output.

Geographical breakdown

Country        Count   As %
Mexico         1       5%
Netherlands    1       5%
China          1       5%
Australia      1       5%
Unknown        18      82%

Demographic breakdown

Readers by professional status          Count   As %
Researcher                              5       23%
Student > Ph. D. Student                5       23%
Student > Master                        3       14%
Lecturer > Senior Lecturer              2       9%
Student > Doctoral Student              1       5%
Other                                   4       18%
Unknown                                 2       9%

Readers by discipline                   Count   As %
Computer Science                        12      55%
Agricultural and Biological Sciences    4       18%
Linguistics                             1       5%
Psychology                              1       5%
Earth and Planetary Sciences            1       5%
Other                                   1       5%
Unknown                                 2       9%