Model-based clustering with certainty estimation: implication for clade assignment of influenza viruses

Overview of attention for article published in BMC Bioinformatics, July 2016

About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Average Attention Score compared to outputs of the same age and source

Mentioned by

5 X users
1 Facebook page

Citations

1 Dimensions

Readers on

10 Mendeley
Title
Model-based clustering with certainty estimation: implication for clade assignment of influenza viruses
Published in
BMC Bioinformatics, July 2016
DOI 10.1186/s12859-016-1147-x
Authors

Shunpu Zhang, Zhong Li, Kevin Beland, Guoqing Lu

Abstract

Clustering is a common technique used by molecular biologists to group homologous sequences and study evolution. Open issues remain, such as how to cluster molecular sequences accurately and, in particular, how to evaluate the certainty of clustering results. We presented a model-based clustering method to analyze molecular sequences, described a subset bootstrap scheme to evaluate the certainty of the clusters, and showed an intuitive way to examine clusters using 3D visualization. We applied this approach to influenza viral hemagglutinin (HA) sequences. Nine clusters were estimated for highly pathogenic H5N1 avian influenza, which agrees with previous findings. The certainty that a given sequence was correctly assigned to a cluster was 1.0 in every case, and the certainty for a given cluster was also very high (0.92-1.0), with an overall clustering certainty of 0.95. For influenza A H7 viruses, ten HA clusters were estimated, and the vast majority of sequences could be assigned to a cluster with a certainty of more than 0.99. The certainties for clusters, however, varied from 0.40 to 0.98; this variation is likely attributable to the heterogeneity of sequence data in different clusters. In both cases, the certainty values estimated using the subset bootstrap method were all higher than those calculated with the standard bootstrap method, suggesting that our bootstrap scheme is applicable to the estimation of clustering certainty. We formulated a clustering analysis approach with estimation of certainties and 3D visualization of sequence data. We analysed two sets of influenza A HA sequences, and the results indicate that our approach is applicable to clustering analysis of influenza viral sequences.
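
The abstract does not spell out the model-based clustering procedure or the subset bootstrap scheme, so the following is only a rough sketch of the general idea of estimating clustering certainty by resampling. It swaps in plain 1-D k-means for the model-based method and an ordinary bootstrap for the subset bootstrap; the function names, parameters, and toy data are all hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: abs(point - centers[i]))


def kmeans_1d(points, k, iterations=50):
    """Plain 1-D k-means (requires k >= 2); returns centers in ascending order."""
    lo, hi = min(points), max(points)
    # Deterministic start: centers spread evenly between the extremes.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center when a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    # Sorting gives a stable label order, so labels are comparable across fits.
    return sorted(centers)


def bootstrap_certainty(points, k, replicates=200, seed=0):
    """Per-point and per-cluster agreement with the reference clustering."""
    rng = random.Random(seed)
    reference_centers = kmeans_1d(points, k)
    reference = [nearest(p, reference_centers) for p in points]
    agree = [0] * len(points)
    for _ in range(replicates):
        # Ordinary bootstrap: resample the data with replacement and refit.
        sample = rng.choices(points, k=len(points))
        centers = kmeans_1d(sample, k)
        for i, p in enumerate(points):
            if nearest(p, centers) == reference[i]:
                agree[i] += 1
    # Certainty of a point: share of replicates that reproduce its label.
    point_certainty = [a / replicates for a in agree]
    # Certainty of a cluster: mean certainty of its reference members.
    cluster_certainty = [
        mean(c for c, label in zip(point_certainty, reference) if label == j)
        for j in range(k)
    ]
    return point_certainty, cluster_certainty


# Hypothetical toy data: two well-separated groups of 1-D values.
points = [0.8, 0.9, 0.95, 1.0, 1.05, 1.1, 1.15, 1.2,
          4.8, 4.9, 4.95, 5.0, 5.05, 5.1, 5.15, 5.2]
point_certainty, cluster_certainty = bootstrap_certainty(points, k=2)
```

In this sketch a low per-cluster value would mean that the cluster's members are often relabelled when the data are resampled, which is one plausible reading of the heterogeneity the abstract mentions for the H7 clusters.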

X Demographics

The data shown below were collected from the profiles of 5 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 10 Mendeley readers of this research output.

Geographical breakdown

Country   Count   As %
Brazil    1       10%
Unknown   9       90%

Demographic breakdown

Readers by professional status   Count   As %
Student > Ph. D. Student         5       50%
Student > Bachelor               1       10%
Researcher                       1       10%
Student > Postgraduate           1       10%
Unknown                          2       20%

Readers by discipline                  Count   As %
Agricultural and Biological Sciences   4       40%
Computer Science                       2       20%
Mathematics                            1       10%
Environmental Science                  1       10%
Medicine and Dentistry                 1       10%
Other                                  0       0%
Unknown                                1       10%

Attention Score in Context

This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 26 July 2016.
All research outputs: #13,240,863 of 22,881,154 outputs
Outputs from BMC Bioinformatics: #4,011 of 7,298 outputs
Outputs of similar age: #193,348 of 364,404 outputs
Outputs of similar age from BMC Bioinformatics: #51 of 108 outputs

Altmetric has tracked 22,881,154 research outputs across all sources so far. This one is in the 41st percentile – i.e., 41% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,298 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 42nd percentile – i.e., 42% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 364,404 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 46th percentile – i.e., 46% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 108 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 50% of its contemporaries.
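
The percentile statements above of the form "N% of other outputs scored the same or lower" can be read as a simple counting rule. The sketch below illustrates that reading only; the function name and the scores are hypothetical, and the page does not say exactly how Altmetric handles ties or rounding.

```python
def percentile_same_or_lower(score, other_scores):
    """Percent of other outputs whose score is the same as or lower than score."""
    if not other_scores:
        raise ValueError("need at least one other output to compare against")
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100 * same_or_lower / len(other_scores)


# Hypothetical Attention Scores of ten other outputs (not real Altmetric data).
others = [0, 1, 1, 3, 3, 5, 8, 12, 20, 45]
print(percentile_same_or_lower(3, others))  # 5 of 10 are <= 3, so 50.0
```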