Title | A generalizable NLP framework for fast development of pattern-based biomedical relation extraction systems |
---|---|
Published in | BMC Bioinformatics, August 2014 |
DOI | 10.1186/1471-2105-15-285 |
Pubmed ID | |
Authors | Yifan Peng, Manabu Torii, Cathy H Wu, K Vijay-Shanker |
Abstract | Text mining is increasingly used in the biomedical domain because of its ability to automatically gather information from large numbers of scientific articles. One important task in biomedical text mining is relation extraction, which aims to identify designated relations among biological entities reported in the literature. A relation extraction system achieving high performance is expensive to develop because of the substantial time and effort required for its design and implementation. Here, we report a novel framework to facilitate the development of a pattern-based biomedical relation extraction system. It has several unique design features: (1) leveraging syntactic variations possible in a language and automatically generating extraction patterns in a systematic manner, (2) applying sentence simplification to improve the coverage of extraction patterns, and (3) identifying referential relations between a syntactic argument of a predicate and the actual target expected in the relation extraction task. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Spain | 1 | 20% |
Norway | 1 | 20% |
Canada | 1 | 20% |
Unknown | 2 | 40% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 3 | 60% |
Members of the public | 2 | 40% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 2 | 4% |
Netherlands | 1 | 2% |
Spain | 1 | 2% |
France | 1 | 2% |
Japan | 1 | 2% |
Poland | 1 | 2% |
Unknown | 45 | 87% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 14 | 27% |
Researcher | 9 | 17% |
Student > Master | 7 | 13% |
Student > Doctoral Student | 5 | 10% |
Student > Bachelor | 5 | 10% |
Other | 7 | 13% |
Unknown | 5 | 10% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 28 | 54% |
Agricultural and Biological Sciences | 7 | 13% |
Biochemistry, Genetics and Molecular Biology | 4 | 8% |
Medicine and Dentistry | 3 | 6% |
Linguistics | 2 | 4% |
Other | 2 | 4% |
Unknown | 6 | 12% |