Ontological interpretation of biomedical database content

Overview of attention for article published in Journal of Biomedical Semantics, June 2017
Citations
8 (Dimensions)
Readers
31 (Mendeley)
Title
Ontological interpretation of biomedical database content
Published in
Journal of Biomedical Semantics, June 2017
DOI 10.1186/s13326-017-0127-z
Authors

Filipe Santana da Silva, Ludger Jansen, Fred Freitas, Stefan Schulz

Abstract

Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. The exact meaning of such annotations in the context of a database record is often ambiguous. We address this problem by grounding implicit and explicit database content in a formal-ontological framework.

By using a typical extract from the databases UniProt and Ensembl, annotated with content from GO, PR, ChEBI and NCBI Taxonomy, we created four ontological models (in OWL), which generate explicit, distinct interpretations under the BioTopLite2 (BTL2) upper-level ontology. The first three models interpret database entries as individuals (IND), defined classes (SUBC), and classes with dispositions (DISP), respectively; the fourth model (HYBR) is a combination of SUBC and DISP. For the evaluation of these four models, we consider (i) database content retrieval, using ontologies as query vocabulary; (ii) information completeness; and (iii) DL complexity and decidability. The models were tested under these criteria against four competency questions (CQs).

IND does not raise any ontological claim, besides asserting the existence of sample individuals and relations among them. Modelling patterns have to be created for each type of annotation referent. SUBC is interpreted regarding maximally fine-grained defined subclasses under the classes referred to by the data. DISP attempts to extract truly ontological statements from the database records, claiming the existence of dispositions. HYBR is a hybrid of SUBC and DISP and is more parsimonious regarding expressiveness and query answering complexity. For each of the four models, the four CQs were submitted as DL queries. This shows the ability to retrieve individuals with IND, and classes in SUBC and HYBR. DISP does not retrieve anything because the axioms with disposition are embedded in General Class Inclusion (GCI) statements.

Ambiguity of biological database content is addressed by a method that identifies implicit knowledge behind semantic annotations in biological databases and grounds it in an expressive upper-level ontology. The result is a seamless representation of database structure, content and annotations as OWL models.

Mendeley readers

The data shown below were compiled from readership statistics for 31 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United Kingdom    1       3%
Mexico            1       3%
Germany           1       3%
Unknown           28      90%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                      7       23%
Student > Ph. D. Student                        4       13%
Other                                           3       10%
Student > Bachelor                              3       10%
Professor                                       2       6%
Other                                           4       13%
Unknown                                         8       26%

Readers by discipline                           Count   As %
Computer Science                                7       23%
Agricultural and Biological Sciences            5       16%
Engineering                                     3       10%
Biochemistry, Genetics and Molecular Biology    3       10%
Business, Management and Accounting             1       3%
Other                                           3       10%
Unknown                                         9       29%