Universal sequence map (USM) of arbitrary discrete sequences

Overview of attention for article published in BMC Bioinformatics, February 2002

Readers on

44 Mendeley
1 CiteULike

Title	Universal sequence map (USM) of arbitrary discrete sequences
Published in	BMC Bioinformatics, February 2002
DOI	10.1186/1471-2105-3-6
Pubmed ID	11895567
Authors	Jonas S Almeida, Susana Vinga
Abstract	For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but has not been fully realized. The basic idea is that any sequence of symbols may define trajectories in a continuous space that conserve all of its statistical properties. Ideally, such a representation would allow scale-independent sequence analysis, without the context of a fixed memory length. A simple example would consist of being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. We have successfully identified such an iterative function for the bijective mapping psi of discrete sequences into objects of a continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences of arbitrary length with an arbitrary number of unique units, and generates a representation in which map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of sequences with 4 unit types (like DNA) as an order-free Markov chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. USM is shown to enable a statistical mechanics approach to sequence analysis. The scale-independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules.
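
The abstract describes an iterative mapping in the spirit of CGR without giving its formula, so the following is only a minimal sketch of a CGR-style forward iteration extended to an arbitrary alphabet, not the authors' implementation. The function names `usm_forward` and `similarity`, the binary corner assignment, the starting point at the center of the unit hypercube, and the use of the negative base-2 logarithm of the largest coordinate difference as a similarity estimate are all assumptions made for illustration.

```python
from math import ceil, log2


def usm_forward(seq, alphabet=None):
    """CGR-style forward iteration for any alphabet (illustrative sketch)."""
    symbols = sorted(set(seq)) if alphabet is None else list(alphabet)
    # One dimension per bit needed to index the alphabet (at least one).
    n_dims = max(1, ceil(log2(len(symbols))))
    # Assumption: each symbol is assigned a hypercube corner via its binary index.
    corner = {
        s: [(i >> b) & 1 for b in range(n_dims)]
        for i, s in enumerate(symbols)
    }
    point = [0.5] * n_dims  # assumed starting point: center of the hypercube
    coords = []
    for s in seq:
        # Move halfway from the current point toward the symbol's corner.
        point = [p + 0.5 * (c - p) for p, c in zip(point, corner[s])]
        coords.append(tuple(point))
    return coords


def similarity(p, q):
    """Assumed similarity estimate: -log2 of the largest coordinate gap."""
    delta = max(abs(a - b) for a, b in zip(p, q))
    return float("inf") if delta == 0 else -log2(delta)


if __name__ == "__main__":
    a = usm_forward("ACGT", alphabet="ACGT")
    b = usm_forward("TCGT", alphabet="ACGT")
    # Each additional shared unit halves the gap, raising the estimate by 1.
    for i in range(len(a)):
        print(i, similarity(a[i], b[i]))
```

In this sketch every shared unit halves the distance between the two trajectories, so the estimate rises by one per additional identical preceding unit. It can overestimate the true shared length, because the two points may already be close before the shared run begins.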

Mendeley readers

The data shown below were compiled from readership statistics for 44 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United States	3	7%
Denmark	1	2%
Italy	1	2%
Unknown	39	89%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	13	30%
Researcher	11	25%
Professor > Associate Professor	3	7%
Student > Bachelor	2	5%
Professor	2	5%
Other	6	14%
Unknown	7	16%

Readers by discipline	Count	As %
Agricultural and Biological Sciences	13	30%
Computer Science	12	27%
Biochemistry, Genetics and Molecular Biology	4	9%
Medicine and Dentistry	3	7%
Nursing and Health Professions	1	2%
Other	4	9%
Unknown	7	16%