Subtype identification from heterogeneous TCGA datasets on a genomic scale by multi-view clustering with enhanced consensus

Overview of attention for article published in BMC Medical Genomics, December 2017
About this Attention Score

  • Average Attention Score compared to outputs of the same age and source

Mentioned by: 2 X users
Citations: 17 (Dimensions)
Readers: 31 (Mendeley)
Title: Subtype identification from heterogeneous TCGA datasets on a genomic scale by multi-view clustering with enhanced consensus
Published in: BMC Medical Genomics, December 2017
DOI: 10.1186/s12920-017-0306-x
Pubmed ID
Authors: Menglan Cai, Limin Li

Abstract

The Cancer Genome Atlas (TCGA) has collected transcriptome, genome and epigenome information for more than 20 cancer types from thousands of patients. With these diverse data types available, it becomes necessary to combine them to capture the heterogeneity of biological processes and phenotypes, and then to identify homogeneous subtypes for cancers such as breast cancer. Many multi-view clustering approaches have been proposed to discover clusters across different data types, but the problem is challenging when the data types agree poorly on clustering structure. In this work, we first propose a multi-view clustering approach with consensus (CMC), which seeks consensus kernels among views by using the Hilbert-Schmidt Independence Criterion (HSIC). To handle poor agreement among views, we further propose a multi-view clustering approach with enhanced consensus (ECMC), which decomposes the kernel information in each view into a consensus part and a disagreement part. The consensus parts of different views are expected to be similar, and the disagreement parts should be independent of the consensus parts. Both the CMC and ECMC models can be solved by alternating updates with semi-definite programming. Our experiments on both simulated datasets and real-world benchmark datasets show that the ECMC model can achieve higher clustering accuracy than other state-of-the-art multi-view clustering approaches. We also apply the ECMC model to integrate mRNA expression, DNA methylation and microRNA (miRNA) expression data for five cancer data sets, and the survival analysis shows that our ECMC model outperforms other methods at identifying cancer subtypes. Using Fisher's combination test, we found that three computed subtypes roughly correspond to three known breast cancer subtypes: luminal B, HER2 and basal-like.
Integrating heterogeneous TCGA datasets with our proposed multi-view clustering approach, ECMC, can effectively identify cancer subtypes.
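
To make the role of the Hilbert-Schmidt Independence Criterion more concrete, the sketch below computes the standard biased empirical HSIC estimate, tr(KHLH)/(n-1)^2, between kernel matrices built from toy "views". It is only an illustration of the dependence measure named in the abstract, not the authors' CMC or ECMC optimization. The RBF kernel, the `gamma` value, the synthetic data and all function names are assumptions chosen for the example.

```python
import numpy as np


def rbf_kernel(X, gamma=0.1):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def hsic(K, L):
    """Biased empirical HSIC between two n x n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


# Toy data (hypothetical): views A and B share a two-cluster structure,
# while view C is pure noise with no relation to the clusters.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
view_a = centers[labels] + rng.normal(size=(40, 2))
view_b = centers[labels] + rng.normal(size=(40, 2))
view_c = rng.normal(size=(40, 2))

K_a, K_b, K_c = rbf_kernel(view_a), rbf_kernel(view_b), rbf_kernel(view_c)

hsic_ab = hsic(K_a, K_b)  # views that agree on cluster structure
hsic_ac = hsic(K_a, K_c)  # views that do not

print(f"HSIC(A, B) = {hsic_ab:.4f}")
print(f"HSIC(A, C) = {hsic_ac:.4f}")
```

In this toy setting the two views that share cluster structure give the larger HSIC value, which is the intuition behind using HSIC both to encourage agreement among consensus kernels and to encourage independence between consensus and disagreement parts.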

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 31 Mendeley readers of this research output.

Geographical breakdown

Country | Count | As %
Unknown | 31 | 100%

Demographic breakdown

Readers by professional status | Count | As %
Researcher | 6 | 19%
Student > Ph. D. Student | 6 | 19%
Student > Master | 3 | 10%
Student > Postgraduate | 2 | 6%
Student > Bachelor | 2 | 6%
Other | 1 | 3%
Unknown | 11 | 35%

Readers by discipline | Count | As %
Computer Science | 5 | 16%
Medicine and Dentistry | 2 | 6%
Biochemistry, Genetics and Molecular Biology | 2 | 6%
Nursing and Health Professions | 2 | 6%
Mathematics | 1 | 3%
Other | 5 | 16%
Unknown | 14 | 45%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 13 January 2018.
All research outputs: #18,579,736 of 23,012,811 outputs
Outputs from BMC Medical Genomics: #867 of 1,232 outputs
Outputs of similar age: #328,891 of 440,666 outputs
Outputs of similar age from BMC Medical Genomics: #13 of 19 outputs
Altmetric has tracked 23,012,811 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 1,232 research outputs from this source. They receive a mean Attention Score of 4.7. This one is in the 18th percentile – i.e., 18% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 440,666 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 14th percentile – i.e., 14% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 19 others from the same source and published within six weeks on either side of this one. This one is in the 31st percentile – i.e., 31% of its contemporaries scored the same or lower than it.
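
The percentile wording used above ("X% of other outputs scored the same or lower") can be illustrated with a small sketch. It follows only that stated definition and is not Altmetric's actual procedure. The function name and the peer scores are made up for the example.

```python
def percentile_same_or_lower(score, other_scores):
    """Percentage of other outputs whose score is the same as or lower than `score`."""
    if not other_scores:
        raise ValueError("need at least one other output to compare against")
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100.0 * same_or_lower / len(other_scores)


# Hypothetical Attention Scores of ten other outputs (illustrative values only).
peers = [0, 1, 1, 2, 5, 8, 13, 21, 34, 55]

# An output with a score of 1 is compared against those peers.
pct = percentile_same_or_lower(1, peers)
print(f"{pct:.0f}% of the other outputs scored the same or lower")
```

Because tied scores count as "the same or lower", many outputs sharing a low score can sit at the same percentile, so a percentile defined this way need not match a figure derived from rank position alone.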