
NoGOA: predicting noisy GO annotations using evidences and sparse representation

Overview of attention for article published in BMC Bioinformatics, July 2017

Mentioned by
X (Twitter): 1 user

Citations
Dimensions: 12

Readers on
Mendeley: 16
Title
NoGOA: predicting noisy GO annotations using evidences and sparse representation
Published in
BMC Bioinformatics, July 2017
DOI 10.1186/s12859-017-1764-z
Pubmed ID
Authors

Guoxian Yu, Chang Lu, Jun Wang

Abstract

Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Owing to limited resources, only a small portion of annotations are manually checked by curators; the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that a considerable number of noisy (or incorrect) annotations remain. Given the wide application of annotations, identifying noisy annotations is an important but seldom studied open problem. We introduce a novel approach called NoGOA to predict noisy annotations. First, NoGOA applies sparse representation to the gene-term association matrix to reduce the impact of noisy annotations, and uses the sparse representation coefficients to measure the semantic similarity between genes. Second, it makes a preliminary prediction of the noisy annotations of a gene based on aggregated votes from the semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived in different periods, then weights entries of the association matrix by the estimated ratios and propagates the weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix and the aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods, and that removing noisy annotations improves the performance of gene function prediction. The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations.
Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .
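
The steps in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the toy GO fragment and gene names are hypothetical; an annotation that disappears between two archives is treated as noisy; an ancestor term keeps the largest weight it inherits; cosine similarity stands in for the sparse representation coefficients; and the final score is a simple convex blend controlled by a hypothetical parameter alpha.

```python
from collections import defaultdict
from math import sqrt

# Toy GO fragment (hypothetical IDs): child term -> set of parent terms.
PARENTS = {"t3": {"t2"}, "t2": {"t1"}, "t1": set()}

def ancestors(term):
    """Return all ancestors of `term` in the toy hierarchy."""
    seen, stack = set(), list(PARENTS.get(term, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS.get(t, ()))
    return seen

def noisy_ratio_per_code(old, new):
    """Per evidence code, the fraction of old direct annotations missing from `new`.

    ASSUMPTION: an annotation dropped between two archives counts as noisy.
    """
    total, removed = defaultdict(int), defaultdict(int)
    for gene, term, code in old:
        total[code] += 1
        if (gene, term, code) not in new:
            removed[code] += 1
    return {code: removed[code] / total[code] for code in total}

def evidence_weights(annots, ratio):
    """Weight each direct annotation by 1 - ratio and pass it up to ancestors.

    ASSUMPTION: an ancestor keeps the largest weight it inherits.
    """
    w = defaultdict(float)
    for gene, term, code in annots:
        weight = 1.0 - ratio.get(code, 0.0)
        for t in {term} | ancestors(term):
            w[gene, t] = max(w[gene, t], weight)
    return dict(w)

def cosine(u, v):
    """Cosine similarity between two weight dicts keyed by term.

    ASSUMPTION: a stand-in for sparse representation coefficients.
    """
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def noise_scores(w, alpha=0.5):
    """Blend evidence weight with neighbour votes; a lower score is more suspect."""
    rows = defaultdict(dict)
    for (gene, term), weight in w.items():
        rows[gene][term] = weight
    scores = {}
    for gene, row in rows.items():
        sims = {g: cosine(row, r) for g, r in rows.items() if g != gene}
        denom = sum(sims.values())
        for term, weight in row.items():
            # Similarity-weighted share of other genes that also carry `term`.
            vote = sum(s for g, s in sims.items() if term in rows[g])
            vote = vote / denom if denom else 0.0
            scores[gene, term] = alpha * weight + (1 - alpha) * vote
    return scores

# Two toy archives of (gene, term, evidence code) direct annotations.
old = {("g1", "t3", "IEA"), ("g2", "t3", "IEA"),
       ("g3", "t2", "IDA"), ("g1", "t2", "IDA")}
new = old - {("g1", "t3", "IEA")}  # one IEA annotation was later removed

ratio = noisy_ratio_per_code(old, new)
weights = evidence_weights(old, ratio)
scores = noise_scores(weights)
```

In this toy run the electronically inferred code (IEA) receives a higher estimated noisy ratio than the experimental code (IDA), so the pair ("g1", "t3") ends up with a lower score than ("g1", "t2") and would be flagged first. The real method replaces the similarity and blending steps with the formulations given in the paper.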

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 16 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    16       100%

Demographic breakdown

Readers by professional status                  Count    As %
Student > Bachelor                              5        31%
Researcher                                      5        31%
Student > Ph. D. Student                        3        19%
Student > Master                                1        6%
Professor > Associate Professor                 1        6%
Other                                           0        0%
Unknown                                         1        6%

Readers by discipline                           Count    As %
Biochemistry, Genetics and Molecular Biology    6        38%
Computer Science                                3        19%
Engineering                                     2        13%
Agricultural and Biological Sciences            1        6%
Economics, Econometrics and Finance             1        6%
Other                                           1        6%
Unknown                                         2        13%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 24 July 2017.
All research outputs: #20,436,330 of 22,990,068 outputs
Outputs from BMC Bioinformatics: #6,886 of 7,311 outputs
Outputs of similar age: #274,431 of 314,579 outputs
Outputs of similar age from BMC Bioinformatics: #83 of 95 outputs
Altmetric has tracked 22,990,068 research outputs across all sources so far. This one is in the 1st percentile – i.e., 1% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,311 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 1st percentile – i.e., 1% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 314,579 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 95 others from the same source and published within six weeks on either side of this one. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.