Report for: Enabling network inference methods to handle missing data and outliers

Title	Enabling network inference methods to handle missing data and outliers
Published in	BMC Bioinformatics, September 2015
DOI	10.1186/s12859-015-0717-7
Pubmed ID	26335628
Authors	Abel Folch-Fortuny, Alejandro F. Villaverde, Alberto Ferrer, Julio R. Banga
Abstract	The inference of complex networks from data is a challenging problem in biological sciences, as well as in a wide range of disciplines such as chemistry, technology, economics, or sociology. The quantity and quality of the data greatly affect the results. While many methodologies have been developed for this task, they seldom take into account issues such as missing data or outlier detection and correction, which need to be properly addressed before network inference. Here we present an approach to (i) handle missing data and (ii) detect and correct outliers based on multivariate projection to latent structures. The method, called trimmed scores regression (TSR), enables network inference methods to analyse incomplete datasets by imputing the missing values coherently with the latent data structure. Furthermore, it substitutes the faulty values in a dataset by proper estimations. We provide an implementation of this approach, and show how it can be integrated with any network inference method as a preliminary data curation step. This functionality is demonstrated with a state of the art network inference method based on mutual information distance and entropy reduction, MIDER. The methodology presented here enables network inference methods to analyse a large number of incomplete and faulty datasets that could not be reliably analysed so far. Our comparative studies show the superiority of TSR over other missing data approaches used by practitioners. Furthermore, the method allows for outlier detection and correction.
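The imputation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name tsr_impute, its parameters, the mean-based initial fill, and the convergence rule are all assumptions made for illustration. It uses the regression form usually given for TSR, in which the missing part of a row is estimated from the "trimmed" scores of its observed part, x_miss ≈ S_mo P_o (P_oᵀ S_oo P_o)⁻¹ P_oᵀ x_obs, where S is the covariance of the currently completed data and P holds the PCA loadings.

```python
import numpy as np


def tsr_impute(X, n_components=2, max_iter=500, tol=1e-8):
    """Illustrative iterative TSR-style imputation (hypothetical helper).

    Missing entries are marked with NaN. Observed entries are never modified.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    if not missing.any():
        return X.copy()

    # Assumption: start from column means of the observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    rows_with_gaps = np.flatnonzero(missing.any(axis=1))

    for _ in range(max_iter):
        mean = filled.mean(axis=0)
        Xc = filled - mean
        S = np.cov(Xc, rowvar=False)

        # PCA loadings: eigenvectors of S with the largest eigenvalues.
        eigvals, eigvecs = np.linalg.eigh(S)
        order = np.argsort(eigvals)[::-1][:n_components]
        P = eigvecs[:, order]

        updated = filled.copy()
        for i in rows_with_gaps:
            m = missing[i]
            o = ~m
            if not o.any():
                continue  # nothing observed: keep the current fill
            P_o = P[o]
            S_oo = S[np.ix_(o, o)]
            S_mo = S[np.ix_(m, o)]
            # Trimmed scores: projection of the observed part only.
            t = P_o.T @ Xc[i, o]
            G = P_o.T @ S_oo @ P_o
            # Least squares tolerates a singular G (few observed variables).
            coef = np.linalg.lstsq(G, t, rcond=None)[0]
            updated[i, m] = mean[m] + S_mo @ P_o @ coef

        change = np.max(np.abs(updated[missing] - filled[missing]))
        filled = updated
        if change < tol:
            break

    return filled
```

In this sketch the completed matrix returned by tsr_impute would then be passed to a network inference method as an ordinary complete dataset. The number of components is a tuning choice that the sketch leaves to the caller.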

X Demographics

The data shown below were collected from the profiles of 4 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United States	1	25%
Italy	1	25%
Spain	1	25%
Unknown	1	25%

Demographic breakdown

Type	Count	As %
Members of the public	3	75%
Scientists	1	25%

Mendeley readers

The data shown below were compiled from readership statistics for 54 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Spain	1	2%
Brazil	1	2%
Unknown	52	96%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	14	26%
Researcher	9	17%
Student > Master	6	11%
Student > Bachelor	4	7%
Student > Doctoral Student	3	6%
Other	12	22%
Unknown	6	11%

Readers by discipline	Count	As %
Computer Science	10	19%
Agricultural and Biological Sciences	9	17%
Mathematics	4	7%
Engineering	4	7%
Medicine and Dentistry	3	6%
Other	15	28%
Unknown	9	17%

Attention Score in Context

This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 18 September 2015.

Comparison group	Rank	Out of
All research outputs	#12,935,224	22,826,360 outputs
Outputs from BMC Bioinformatics	#3,788	7,287 outputs
Outputs of similar age	#118,321	266,946 outputs
Outputs of similar age from BMC Bioinformatics	#54	124 outputs

Altmetric has tracked 22,826,360 research outputs across all sources so far. This one is in the 42nd percentile – i.e., 42% of other outputs scored the same or lower than it.

So far Altmetric has tracked 7,287 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 45th percentile – i.e., 45% of its peers scored the same or lower than it.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 266,946 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 54% of its contemporaries.

We're also able to compare this research output to 124 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 54% of its contemporaries.
