
Semi-supervised adaptive-height snipping of the hierarchical clustering tree

Overview of attention for article published in BMC Bioinformatics, January 2015

About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Average Attention Score compared to outputs of the same age and source

Mentioned by

5 X users

Citations

10 Dimensions

Readers on

20 Mendeley
Title
Semi-supervised adaptive-height snipping of the hierarchical clustering tree
Published in
BMC Bioinformatics, January 2015
DOI 10.1186/s12859-014-0448-1
Pubmed ID
Authors

Askar Obulkasim, Gerrit A Meijer, Mark A van de Wiel

Abstract

Background: In genomics, hierarchical clustering (HC) is a popular method for grouping similar samples based on a distance measure. HC algorithms do not actually create clusters, but compute a hierarchical representation of the data set. Usually, a fixed height on the HC tree is used, and each contiguous branch of samples below that height is considered a separate cluster. Due to the fixed-height cutting, those clusters may not unravel significant functional coherence hidden deeper in the tree. Besides that, most existing approaches do not make use of available clinical information to guide cluster extraction from the HC. Thus, the identified subgroups may be difficult to interpret in relation to that information.

Results: We develop a novel framework for decomposing the HC tree into clusters by semi-supervised piecewise snipping. The framework, called guided piecewise snipping, utilizes both molecular data and clinical information to decompose the HC tree into clusters. It cuts the given HC tree at variable heights to find a partition (a set of non-overlapping clusters) which not only represents a structure deemed to underlie the data from which the HC tree is derived, but is also maximally consistent with the supplied clinical data. Moreover, the approach does not require the user to specify the number of clusters prior to the analysis. Extensive results on simulated and multiple medical data sets show that our approach consistently produces more meaningful clusters than the standard fixed-height cut and/or non-guided approaches.

Conclusions: The guided piecewise snipping approach features several novelties and advantages over existing approaches. The proposed algorithm is generic, and can be combined with other algorithms that operate on detected clusters. This approach represents an advancement in several regards: (1) a piecewise tree snipping framework that efficiently extracts clusters by snipping the HC tree possibly at variable heights while preserving the HC tree structure; (2) a flexible implementation allowing a variety of data types for both building and snipping the HC tree, including patient follow-up data like survival as auxiliary information. The data sets and R code are provided as supplementary files. The proposed method is available from Bioconductor as the R-package HCsnip.
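
The contrast the abstract draws between a fixed-height cut and guided, variable-height snipping can be illustrated with a toy sketch. This is not the authors' algorithm and not the HCsnip API: the `Node` class, the `impurity` score, the `penalty` parameter, and the six-sample tree are all assumptions invented for illustration. The sketch scores each subtree by how many samples disagree with the majority clinical label, adds a small per-cluster penalty, and keeps or snips each branch depending on which is cheaper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A dendrogram node; leaves carry a label, internal nodes a merge height."""
    height: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None


def leaves(node: Node) -> List[str]:
    """Sample labels under a node, in left-to-right order."""
    if node.label is not None:
        return [node.label]
    return leaves(node.left) + leaves(node.right)


def fixed_height_cut(node: Node, h: float) -> List[List[str]]:
    """Standard cut: each subtree merged at or below height h is one cluster."""
    if node.label is not None or node.height <= h:
        return [leaves(node)]
    return fixed_height_cut(node.left, h) + fixed_height_cut(node.right, h)


def impurity(members: List[str], clinical: Dict[str, str]) -> int:
    """Toy score (an assumption): samples disagreeing with the majority label."""
    labels = [clinical[m] for m in members]
    return len(labels) - max(labels.count(x) for x in set(labels))


def guided_snip(node: Node, clinical: Dict[str, str],
                penalty: float = 0.5) -> Tuple[float, List[List[str]]]:
    """Keep a subtree as one cluster or snip it, whichever costs less.

    Cost = total impurity + penalty per cluster. Snips can fall at different
    heights in different branches, and the cluster count is not given upfront.
    """
    members = leaves(node)
    keep_cost = impurity(members, clinical) + penalty
    if node.label is not None:
        return keep_cost, [members]
    left_cost, left_parts = guided_snip(node.left, clinical, penalty)
    right_cost, right_parts = guided_snip(node.right, clinical, penalty)
    if left_cost + right_cost < keep_cost:
        return left_cost + right_cost, left_parts + right_parts
    return keep_cost, [members]


def leaf(name: str) -> Node:
    return Node(0.0, label=name)


# Hypothetical tree: (a,b) and (c,d) merge low, (e,f) merges high.
ab = Node(1.0, leaf("a"), leaf("b"))
cd = Node(1.2, leaf("c"), leaf("d"))
abcd = Node(2.0, ab, cd)
ef = Node(3.0, leaf("e"), leaf("f"))
root = Node(4.0, abcd, ef)

# Hypothetical clinical labels used as the guiding side information.
clinical = {"a": "good", "b": "good", "c": "poor",
            "d": "poor", "e": "good", "f": "good"}

print(fixed_height_cut(root, 2.5))     # [['a', 'b', 'c', 'd'], ['e'], ['f']]
print(fixed_height_cut(root, 1.5))     # [['a', 'b'], ['c', 'd'], ['e'], ['f']]
print(guided_snip(root, clinical)[1])  # [['a', 'b'], ['c', 'd'], ['e', 'f']]
```

In this toy tree no single cut height yields the partition {a,b}, {c,d}, {e,f}: a height that separates (a,b) from (c,d) also splits e from f. The guided version snips the left branch at height 2.0 and leaves the (e,f) branch whole at height 3.0, which is the kind of variable-height result the abstract describes.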

X Demographics

The data shown below were collected from the profiles of 5 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 20 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States    1       5%
Unknown          19      95%

Demographic breakdown

Readers by professional status                  Count   As %
Student > Bachelor                              4       20%
Researcher                                      4       20%
Student > Ph. D. Student                        3       15%
Student > Master                                3       15%
Professor                                       2       10%
Other                                           2       10%
Unknown                                         2       10%

Readers by discipline                           Count   As %
Computer Science                                5       25%
Agricultural and Biological Sciences            3       15%
Psychology                                      2       10%
Biochemistry, Genetics and Molecular Biology    1       5%
Economics, Econometrics and Finance             1       5%
Other                                           5       25%
Unknown                                         3       15%
Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 21 August 2015.
All research outputs: #13,927,627 of 22,778,347 outputs
Outputs from BMC Bioinformatics: #4,468 of 7,276 outputs
Outputs of similar age: #181,437 of 352,360 outputs
Outputs of similar age from BMC Bioinformatics: #71 of 146 outputs
Altmetric has tracked 22,778,347 research outputs across all sources so far. This one is in the 37th percentile – i.e., 37% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,276 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 35th percentile – i.e., 35% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 352,360 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 47th percentile – i.e., 47% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 146 others from the same source and published within six weeks on either side of this one. This one is in the 48th percentile – i.e., 48% of its contemporaries scored the same or lower than it.
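
The percentile wording used above ("X% of other outputs scored the same or lower") can be read as a simple count. The sketch below follows that wording only; how Altmetric actually computes or rounds its percentiles is not stated on this page, and the function name and the list of scores are hypothetical.

```python
from typing import Sequence


def percent_same_or_lower(score: float, other_scores: Sequence[float]) -> float:
    """Percentage of the other outputs whose score is <= the given score."""
    if not other_scores:
        raise ValueError("need at least one other output to compare against")
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100.0 * same_or_lower / len(other_scores)


# Hypothetical Attention Scores of ten other outputs (not real Altmetric data).
others = [0, 1, 1, 2, 3, 5, 8, 13, 21, 40]

# An output with a score of 2 is matched or beaten from below by 4 of the 10.
print(percent_same_or_lower(2, others))  # 40.0
```

Because many outputs share the same low score, a percentile computed this way can differ from one derived from the rank position alone.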