Enabling network inference methods to handle missing data and outliers

Overview of attention for article published in BMC Bioinformatics, September 2015

About this Attention Score

  • Above-average Attention Score compared to outputs of the same age (54th percentile)
  • Above-average Attention Score compared to outputs of the same age and source (54th percentile)

Mentioned by

twitter
4 X users

Citations

dimensions
22 Dimensions

Readers on

mendeley
54 Mendeley
Title
Enabling network inference methods to handle missing data and outliers
Published in
BMC Bioinformatics, September 2015
DOI 10.1186/s12859-015-0717-7
Pubmed ID
Authors

Abel Folch-Fortuny, Alejandro F. Villaverde, Alberto Ferrer, Julio R. Banga

Abstract

The inference of complex networks from data is a challenging problem in biological sciences, as well as in a wide range of disciplines such as chemistry, technology, economics, or sociology. The quantity and quality of the data greatly affect the results. While many methodologies have been developed for this task, they seldom take into account issues such as missing data or outlier detection and correction, which need to be properly addressed before network inference. Here we present an approach to (i) handle missing data and (ii) detect and correct outliers based on multivariate projection to latent structures. The method, called trimmed scores regression (TSR), enables network inference methods to analyse incomplete datasets by imputing the missing values coherently with the latent data structure. Furthermore, it substitutes the faulty values in a dataset by proper estimations. We provide an implementation of this approach, and show how it can be integrated with any network inference method as a preliminary data curation step. This functionality is demonstrated with a state-of-the-art network inference method based on mutual information distance and entropy reduction, MIDER. The methodology presented here enables network inference methods to analyse a large number of incomplete and faulty datasets that could not be reliably analysed so far. Our comparative studies show the superiority of TSR over other missing data approaches used by practitioners. Furthermore, the method allows for outlier detection and correction.
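
The abstract describes trimmed scores regression only at a high level. As a rough illustration of the idea of imputing missing values coherently with a latent structure, the sketch below fits a PCA model to mean-filled data, regresses each row's missing variables on the scores computed from its observed variables alone (the "trimmed" scores), and repeats until the imputed values stabilise. This is not the authors' implementation. The function name `tsr_impute`, its parameters, and the synthetic data are assumptions made for this example, and the outlier detection and correction step is not covered. It also assumes every column has at least one observed value.

```python
import numpy as np


def tsr_impute(X, n_components=2, max_iter=200, tol=1e-8):
    """Fill NaN entries of X with a TSR-style iteration (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column-mean imputation.
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    for _ in range(max_iter):
        previous = filled.copy()
        mean = filled.mean(axis=0)
        Z = filled - mean                      # centred data
        S = np.cov(Z, rowvar=False)            # covariance of the filled data
        # Loadings: eigenvectors of S with the largest eigenvalues.
        eigvals, eigvecs = np.linalg.eigh(S)
        P = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]

        for i in np.flatnonzero(missing.any(axis=1)):
            m = missing[i]                     # missing variables in row i
            o = ~m                             # observed variables in row i
            if not o.any():
                continue                       # fully missing row keeps the means
            P_o = P[o]
            # Trimmed scores: projection using the observed variables only.
            t = P_o.T @ Z[i, o]
            # Regress the missing variables on the trimmed scores.
            cov_mt = S[np.ix_(m, o)] @ P_o
            var_t = P_o.T @ S[np.ix_(o, o)] @ P_o
            filled[i, m] = mean[m] + cov_mt @ np.linalg.pinv(var_t) @ t

        if np.max(np.abs(filled - previous)) < tol:
            break
    return filled


# Synthetic example (hypothetical data): six variables driven by two latent factors.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
complete = scores @ loadings + 0.05 * rng.normal(size=(200, 6))

mask = rng.random(complete.shape) < 0.10       # remove about 10% of the entries
incomplete = complete.copy()
incomplete[mask] = np.nan

imputed = tsr_impute(incomplete, n_components=2)
mean_filled = np.where(mask, np.nanmean(incomplete, axis=0), incomplete)

rmse_tsr = np.sqrt(np.mean((imputed[mask] - complete[mask]) ** 2))
rmse_mean = np.sqrt(np.mean((mean_filled[mask] - complete[mask]) ** 2))
print(f"RMSE, TSR-style imputation: {rmse_tsr:.3f}")
print(f"RMSE, column-mean imputation: {rmse_mean:.3f}")
```

The completed matrix returned by `tsr_impute` could then be passed to a network inference tool as a preliminary data curation step, which is the workflow the abstract describes.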

X Demographics

The data shown below were collected from the profiles of 4 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 54 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Spain      1        2%
Brazil     1        2%
Unknown    52       96%

Demographic breakdown

Readers by professional status    Count    As %
Student > Ph. D. Student          14       26%
Researcher                        9        17%
Student > Master                  6        11%
Student > Bachelor                4        7%
Student > Doctoral Student        3        6%
Other                             12       22%
Unknown                           6        11%

Readers by discipline                   Count    As %
Computer Science                        10       19%
Agricultural and Biological Sciences    9        17%
Mathematics                             4        7%
Engineering                             4        7%
Medicine and Dentistry                  3        6%
Other                                   15       28%
Unknown                                 9        17%

Attention Score in Context

This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 18 September 2015.

All research outputs: #12,935,224 of 22,826,360 outputs
Outputs from BMC Bioinformatics: #3,788 of 7,287 outputs
Outputs of similar age: #118,321 of 266,946 outputs
Outputs of similar age from BMC Bioinformatics: #54 of 124 outputs

Altmetric has tracked 22,826,360 research outputs across all sources so far. This one is in the 42nd percentile – i.e., 42% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,287 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 45th percentile – i.e., 45% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 266,946 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 54% of its contemporaries.
We're also able to compare this research output to 124 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 54% of its contemporaries.