
Classification of breast cancer patients using somatic mutation profiles and machine learning approaches

Overview of attention for article published in BMC Systems Biology, August 2016

About this Attention Score

  • Good Attention Score compared to outputs of the same age (68th percentile)
  • Good Attention Score compared to outputs of the same age and source (71st percentile)

Mentioned by

2 X users
1 patent

Citations

62 Dimensions

Readers on

123 Mendeley
Title
Classification of breast cancer patients using somatic mutation profiles and machine learning approaches
Published in
BMC Systems Biology, August 2016
DOI 10.1186/s12918-016-0306-z
Authors

Suleyman Vural, Xiaosheng Wang, Chittibabu Guda

Abstract

The high degree of heterogeneity observed in breast cancers makes it very difficult to classify patients into distinct clinical subgroups and consequently limits the ability to devise effective therapeutic strategies. Several classification strategies based on ER/PR/HER2 expression or on the expression profiles of a panel of genes have helped, but such methods often produce misleading results because expression levels are dynamic. In contrast, somatic DNA mutations are relatively stable and lead to the initiation and progression of many sporadic cancers. Hence, in this study, we explore the use of gene mutation profiles to classify, characterize and predict the subgroups of breast cancers.

We analyzed the whole exome sequencing data from 358 ethnically similar breast cancer patients in The Cancer Genome Atlas (TCGA) project. Somatic and non-synonymous single nucleotide variants identified from each patient were assigned a quantitative score (C-score) that represents the extent of negative impact on gene function. Using these scores with the non-negative matrix factorization method, we clustered the patients into three subgroups. By comparing the clinical stage of patients, we identified an early-stage-enriched and a late-stage-enriched subgroup. Comparison of the mutation scores of the early-stage-enriched and late-stage-enriched subgroups identified 358 genes that carry significantly higher mutation rates in the late-stage-enriched subgroup. Functional characterization of these genes revealed important functional gene families that carry a heavy mutational load in the late-stage-enriched subgroup of patients. Finally, using the identified subgroups, we also developed a supervised classification model to predict the stage of the patients.

This study demonstrates that gene mutation profiles can be effectively used with unsupervised machine-learning methods to identify clinically distinguishable breast cancer subgroups. The classification model developed in this study could provide a reasonable prediction of a patient's cancer stage based solely on the mutation profile. This study represents the first use of only somatic mutation profile data to identify and predict breast cancer subgroups, and this generic methodology can also be applied to other cancer datasets.
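
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy patient-by-gene score matrix is invented, the multiplicative-update variant of non-negative matrix factorization is an assumed choice, two clusters are used instead of the study's three to keep the toy data small, and assigning each patient to the factor with the largest weight is likewise an assumption. The names `nmf`, `cluster_patients` and `toy_scores` are hypothetical.

```python
import random


def nmf(V, k, iterations=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (n x m) as W (n x k) times H (k x m).

    Uses multiplicative updates for the squared-error objective.
    This variant is an assumption; the abstract does not name one.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        WtV = [[sum(W[i][a] * V[i][j] for i in range(n)) for j in range(m)]
               for a in range(k)]
        WtW = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in range(k)]
               for a in range(k)]
        H = [[H[a][j] * WtV[a][j]
              / (sum(WtW[a][b] * H[b][j] for b in range(k)) + eps)
              for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        VHt = [[sum(V[i][j] * H[a][j] for j in range(m)) for a in range(k)]
               for i in range(n)]
        HHt = [[sum(H[a][j] * H[b][j] for j in range(m)) for b in range(k)]
               for a in range(k)]
        W = [[W[i][a] * VHt[i][a]
              / (sum(W[i][b] * HHt[b][a] for b in range(k)) + eps)
              for a in range(k)] for i in range(n)]
    return W, H


def cluster_patients(V, k):
    """Label each patient (row of V) with the index of its largest W weight."""
    W, _ = nmf(V, k)
    return [max(range(k), key=lambda a: row[a]) for row in W]


# Invented toy data: 6 patients (rows) x 4 genes (columns).
# Each entry stands in for a per-gene mutation score; 0 means no scored variant.
toy_scores = [
    [22.0, 18.0, 0.0, 0.0],
    [19.0, 24.0, 0.0, 0.0],
    [25.0, 21.0, 0.0, 0.0],
    [0.0, 0.0, 17.0, 23.0],
    [0.0, 0.0, 21.0, 16.0],
    [0.0, 0.0, 24.0, 20.0],
]

labels = cluster_patients(toy_scores, k=2)
print(labels)
```

The cluster indices themselves are arbitrary; only the grouping of patients matters. On real data the factorization depends on the random starting point, so it is common to repeat it with several seeds and compare the resulting groupings.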

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 123 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Germany    1        <1%
Canada     1        <1%
Unknown    121      98%

Demographic breakdown

Readers by professional status                  Count    As %
Student > Ph. D. Student                        29       24%
Student > Master                                17       14%
Researcher                                      15       12%
Student > Bachelor                              11       9%
Student > Doctoral Student                      8        7%
Other                                           21       17%
Unknown                                         22       18%

Readers by discipline                           Count    As %
Biochemistry, Genetics and Molecular Biology    25       20%
Computer Science                                24       20%
Agricultural and Biological Sciences            16       13%
Medicine and Dentistry                          11       9%
Mathematics                                     3        2%
Other                                           15       12%
Unknown                                         29       24%
Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 21 June 2018.
All research outputs: #6,816,746 of 22,886,568 outputs
Outputs from BMC Systems Biology: #252 of 1,142 outputs
Outputs of similar age: #105,981 of 338,621 outputs
Outputs of similar age from BMC Systems Biology: #9 of 32 outputs
Altmetric has tracked 22,886,568 research outputs across all sources so far. This one has received more attention than most of these and is in the 69th percentile.
So far Altmetric has tracked 1,142 research outputs from this source. They receive a mean Attention Score of 3.6. This one has done well, scoring higher than 76% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 338,621 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 68% of its contemporaries.
We're also able to compare this research output to 32 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 71% of its contemporaries.
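
The relationship between a rank and a "scoring higher than N%" figure can be sketched as follows. This is an assumed, naive formula (the share of tracked outputs ranked below this one, rounded down), not Altmetric's documented method, and `percent_outscored` is a hypothetical name. It reproduces the two age-matched figures quoted above (68% and 71%), but gives 70 and 77 for the all-outputs and same-source comparisons, where the page reports 69 and 76, so Altmetric evidently applies a somewhat different convention there.

```python
def percent_outscored(rank: int, total: int) -> int:
    """Whole-number share of tracked outputs ranked below `rank` (rank 1 = most attention).

    Naive formula assumed for illustration; not Altmetric's documented method.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) * 100 // total


# Ranks and totals as quoted on this page.
similar_age = percent_outscored(105_981, 338_621)
similar_age_same_source = percent_outscored(9, 32)
print(similar_age, similar_age_same_source)
```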