
Efficient representation of uncertainty in multiple sequence alignments using directed acyclic graphs

Overview of attention for article published in BMC Bioinformatics, April 2015

About this Attention Score

  • Average Attention Score compared to outputs of the same age

Mentioned by

Twitter: 1 tweeter
F1000: 1 research highlight platform

Citations

14 Dimensions

Readers on

35 Mendeley
1 CiteULike
Published in
BMC Bioinformatics, April 2015
DOI 10.1186/s12859-015-0516-1
Authors

Joseph L Herman, Ádám Novák, Rune Lyngsø, Adrienn Szabó, István Miklós, Jotun Hein

Abstract

A standard procedure in many areas of bioinformatics is to use a single multiple sequence alignment (MSA) as the basis for various types of analysis. However, downstream results may be highly sensitive to the alignment used, and neglecting the uncertainty in the alignment can lead to significant bias in the resulting inference. In recent years, a number of approaches have been developed for probabilistic sampling of alignments, rather than simply generating a single optimum. However, this type of probabilistic information is currently not widely used in the context of downstream inference, since most existing algorithms are set up to make use of a single alignment. In this work we present a framework for representing a set of sampled alignments as a directed acyclic graph (DAG) whose nodes are alignment columns; each path through this DAG then represents a valid alignment. Since the probabilities of individual columns can be estimated from empirical frequencies, this approach enables sample-based estimation of posterior alignment probabilities. Moreover, due to conditional independencies between columns, the graph structure encodes a much larger set of alignments than the original set of sampled MSAs, such that the effective sample size is greatly increased. The alignment DAG provides a natural way to represent a distribution in the space of MSAs, and allows for existing algorithms to be efficiently scaled up to operate on large sets of alignments. As an example, we show how this can be used to compute marginal probabilities for tree topologies, averaging over a very large number of MSAs. This framework can also be used to generate a statistically meaningful summary alignment; example applications show that this summary alignment is consistently more accurate than the majority of the alignment samples, leading to improvements in downstream tree inference. 
Implementations of the methods described in this article are available at http://statalign.github.io/WeaveAlign.
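
The idea of an alignment DAG in the abstract can be illustrated with a minimal Python sketch. It is not the WeaveAlign implementation: the column encoding (per sequence, the number of residues consumed so far plus a gap flag) and the names `columns_of`, `build_dag`, and `count_paths` are assumptions made for illustration only. The sketch merges identical columns from sampled alignments into shared nodes, estimates each column's probability from its empirical frequency, and counts the paths through the graph.

```python
from collections import Counter, defaultdict

START, END = "START", "END"  # hypothetical sentinel nodes for this sketch


def columns_of(alignment):
    """Convert an alignment (equal-length gapped strings) into hashable columns.

    Assumed encoding: for each sequence, a column stores
    (residues consumed so far, whether this column holds a residue).
    Identical columns in different samples then compare equal.
    """
    consumed = [0] * len(alignment)
    cols = []
    for j in range(len(alignment[0])):
        col = []
        for i, row in enumerate(alignment):
            present = row[j] != "-"
            if present:
                consumed[i] += 1
            col.append((consumed[i], present))
        cols.append(tuple(col))
    return cols


def build_dag(samples):
    """Merge sampled alignments into a DAG whose nodes are alignment columns."""
    col_counts = Counter()
    edges = defaultdict(set)
    for aln in samples:
        path = [START] + columns_of(aln) + [END]
        col_counts.update(path[1:-1])
        for a, b in zip(path, path[1:]):
            edges[a].add(b)
    # Empirical frequency of each column across the samples.
    col_freq = {c: k / len(samples) for c, k in col_counts.items()}
    return edges, col_freq


def count_paths(edges):
    """Count START-to-END paths, i.e. alignments encoded by the DAG."""
    memo = {}

    def paths_from(node):
        if node == END:
            return 1
        if node not in memo:
            memo[node] = sum(paths_from(nxt) for nxt in edges[node])
        return memo[node]

    return paths_from(START)


# Two sampled alignments of the toy sequences AAGTT and AGT.
samples = [
    ["AAGTT", "A-GT-"],
    ["AAGTT", "-AG-T"],
]
edges, col_freq = build_dag(samples)
n_paths = count_paths(edges)
print(n_paths)  # 4
```

Both samples share the column that aligns the two G residues, so the paths merge there and split again. The graph therefore encodes four alignments although only two were sampled, which is the effect the abstract describes as an increased effective sample size.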

Mendeley readers

The data shown below were compiled from readership statistics for 35 Mendeley readers of this research output.

Geographical breakdown

Country          Count    Share of readers
Hungary          1        3%
United States    1        3%
Unknown          33       94%

Demographic breakdown

Readers by professional status                  Count    Share of readers
Student > Ph.D. Student                         15       43%
Student > Doctoral Student                      4        11%
Student > Bachelor                              4        11%
Researcher                                      3        9%
Student > Master                                2        6%
Other                                           3        9%
Unknown                                         4        11%

Readers by discipline                           Count    Share of readers
Computer Science                                10       29%
Biochemistry, Genetics and Molecular Biology    8        23%
Agricultural and Biological Sciences            8        23%
Medicine and Dentistry                          2        6%
Social Sciences                                 2        6%
Other                                           2        6%
Unknown                                         3        9%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 31 January 2017.
All research outputs: #4,603,790 of 8,982,225 outputs
Outputs from BMC Bioinformatics: #2,487 of 3,859 outputs
Outputs of similar age: #112,153 of 221,874 outputs
Outputs of similar age from BMC Bioinformatics: #93 of 116 outputs
Altmetric has tracked 8,982,225 research outputs across all sources so far. This one is in the 46th percentile – i.e., 46% of other outputs scored the same or lower than it.
So far Altmetric has tracked 3,859 research outputs from this source. They receive a mean Attention Score of 5.0. This one is in the 31st percentile – i.e., 31% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 221,874 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 45th percentile – i.e., 45% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 116 others from the same source and published within six weeks on either side of this one. This one is in the 18th percentile – i.e., 18% of its contemporaries scored the same or lower than it.