Title | A method for automatically extracting infectious disease-related primers and probes from the literature |
---|---|
Published in | BMC Bioinformatics, August 2010 |
DOI | 10.1186/1471-2105-11-410 |
Pubmed ID | |
Authors | Miguel García-Remesal, Alejandro Cuevas, Victoria López-Alonso, Guillermo López-Campos, Guillermo de la Calle, Diana de la Iglesia, David Pérez-Rey, José Crespo, Fernando Martín-Sánchez, Víctor Maojo |
Abstract | Primer and probe sequences are the main components of nucleic acid-based detection systems. Biologists use primers and probes for different tasks, some related to the diagnosis and prescription of infectious diseases. The biological literature is the main information source for empirically validated primer and probe sequences. Therefore, it is becoming increasingly important for researchers to navigate this important information. In this paper, we present a four-phase method for extracting and annotating primer/probe sequences from the literature. These phases are: (1) convert each document into a tree of paper sections, (2) detect the candidate sequences using a set of finite state machine-based recognizers, (3) refine problem sequences using a rule-based expert system, and (4) annotate the extracted sequences with their related organism/gene information. |
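
Phase (2) of the abstract refers to finite state machine-based recognizers, but this page does not describe how the authors built them. The sketch below is only a rough illustration of the general idea, not the paper's implementation: a hypothetical two-state machine that scans plain text and reports long runs of nucleotide letters as candidate sequences. The function name `find_candidate_sequences`, the restriction to the letters A, C, G and T, and the minimum length of 15 are all assumptions made for this example.

```python
from enum import Enum, auto

NUCLEOTIDES = set("ACGT")  # assumption: unambiguous DNA bases only
MIN_LENGTH = 15            # assumption: shorter runs are treated as noise


class State(Enum):
    OUTSIDE = auto()      # not currently inside a candidate sequence
    IN_SEQUENCE = auto()  # consuming nucleotide characters


def find_candidate_sequences(text, min_length=MIN_LENGTH):
    """Return (start_index, sequence) pairs for long runs of nucleotide letters."""
    state = State.OUTSIDE
    start = 0
    candidates = []
    # A trailing sentinel space flushes a run that ends at the end of the text.
    for i, ch in enumerate(text + " "):
        is_base = ch.upper() in NUCLEOTIDES
        if state is State.OUTSIDE and is_base:
            # Transition: first nucleotide letter opens a candidate run.
            state, start = State.IN_SEQUENCE, i
        elif state is State.IN_SEQUENCE and not is_base:
            # Transition: any other character closes the run; keep it if long enough.
            if i - start >= min_length:
                candidates.append((start, text[start:i].upper()))
            state = State.OUTSIDE
    return candidates


if __name__ == "__main__":
    sample = "The forward primer 5'-ACGTACGTACGTACGTAC-3' was used."
    for position, sequence in find_candidate_sequences(sample):
        print(position, sequence)
```

A recognizer this simple would also accept long nucleotide-only runs that are not primers or probes, which is presumably why the method described in the abstract follows detection with a rule-based refinement phase.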
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Mexico | 2 | 6% |
China | 1 | 3% |
Brazil | 1 | 3% |
Unknown | 28 | 88% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 10 | 31% |
Student > Ph. D. Student | 8 | 25% |
Professor | 4 | 13% |
Other | 3 | 9% |
Student > Master | 2 | 6% |
Other | 2 | 6% |
Unknown | 3 | 9% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 11 | 34% |
Biochemistry, Genetics and Molecular Biology | 6 | 19% |
Computer Science | 4 | 13% |
Engineering | 3 | 9% |
Nursing and Health Professions | 1 | 3% |
Other | 4 | 13% |
Unknown | 3 | 9% |