Report for: Fusing literature and full network data improves disease similarity computation

Title	Fusing literature and full network data improves disease similarity computation
Published in	BMC Bioinformatics, August 2016
DOI	10.1186/s12859-016-1205-4
Pubmed ID	27578323
Authors	Ping Li, Yaling Nie, Jingkai Yu
Abstract	Identifying relatedness among diseases could help deepen understanding of the underlying pathogenic mechanisms of diseases, and facilitate drug repositioning projects. A number of methods for computing disease similarity have been developed; however, none of them were designed to utilize information from the entire protein interaction network, using instead only those interactions involving disease-causing genes. Most previously published methods required gene-disease association data; unfortunately, many diseases still have very few or no associated genes, which has impeded broad adoption of those methods. In this study, we propose a new method (MedNetSim) for computing disease similarity by integrating medical literature and the protein interaction network. MedNetSim consists of a network-based method (NetSim), which employs the entire protein interaction network, and a MEDLINE-based method (MedSim), which computes disease similarity by mining the biomedical literature. Among function-based methods, NetSim achieved the best performance. Its average AUC (area under the receiver operating characteristic curve) reached 95.2%. MedSim, whose performance was even comparable to some function-based methods, achieved the highest average AUC among all semantic-based methods. Integration of MedSim and NetSim (MedNetSim) further improved the average AUC to 96.4%. We further studied the effectiveness of different data sources. The quality of protein interaction data was found to be more important than its volume. In contrast, a higher volume of gene-disease association data was more beneficial, even at a lower reliability. Utilizing a higher volume of disease-related gene data further improved the average AUC of MedNetSim and NetSim to 97.5% and 96.7%, respectively. Integrating biomedical literature and the protein interaction network can be an effective way to compute disease similarity.
When sufficient disease-related gene data are lacking, literature-based methods such as MedSim can be a great addition to function-based algorithms. It may be beneficial to steer more resources toward studying gene-disease associations and improving the quality of protein interaction data. Disease similarities can be computed using the proposed methods at http://www.digintelli.com:8000/.

X Demographics

The data shown below were collected from the profiles of 3 X users who shared this research output.

Geographical breakdown

Country	Count	As %
Unknown	3	100%

Demographic breakdown

Type	Count	As %
Members of the public	2	67%
Scientists	1	33%

Mendeley readers

The data shown below were compiled from readership statistics for 48 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United States	1	2%
Netherlands	1	2%
South Africa	1	2%
Unknown	45	94%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	11	23%
Student > Ph. D. Student	9	19%
Student > Master	6	13%
Student > Doctoral Student	4	8%
Professor > Associate Professor	3	6%
Other	7	15%
Unknown	8	17%

Readers by discipline	Count	As %
Agricultural and Biological Sciences	12	25%
Computer Science	10	21%
Biochemistry, Genetics and Molecular Biology	9	19%
Medicine and Dentistry	2	4%
Pharmacology, Toxicology and Pharmaceutical Science	1	2%
Other	5	10%
Unknown	9	19%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 01 September 2016.

Comparison group	Rank	Out of
All research outputs	#15,381,871	22,884,315
Outputs from BMC Bioinformatics	#5,385	7,298
Outputs of similar age	#215,000	336,882
Outputs of similar age from BMC Bioinformatics	#84	136

Altmetric has tracked 22,884,315 research outputs across all sources so far. This one is in the 22nd percentile – i.e., 22% of other outputs scored the same or lower than it.

So far Altmetric has tracked 7,298 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 18th percentile – i.e., 18% of its peers scored the same or lower than it.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 336,882 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 27th percentile – i.e., 27% of its contemporaries scored the same or lower than it.

We're also able to compare this research output to 136 others from the same source and published within six weeks on either side of this one. This one is in the 35th percentile – i.e., 35% of its contemporaries scored the same or lower than it.
