Identification and utilization of arbitrary correlations in models of recombination signal sequences

Overview of attention for article published in Genome Biology, November 2002
Citations

55 Dimensions

Readers on

47 Mendeley
1 CiteULike
Title
Identification and utilization of arbitrary correlations in models of recombination signal sequences
Published in
Genome Biology, November 2002
DOI 10.1186/gb-2002-3-12-research0072
Authors

Lindsay G Cowell, Marco Davila, Thomas B Kepler, Garnett Kelsoe

Abstract

A significant challenge in bioinformatics is to develop methods for detecting and modeling patterns in variable DNA sequence sites, such as protein-binding sites in regulatory DNA. Current approaches sometimes perform poorly when positions in the site do not independently affect protein binding. We developed a statistical technique for modeling the correlation structure in variable DNA sequence sites. The method places no restrictions on the number of correlated positions or on their spatial relationship within the site. No prior empirical evidence for the correlation structure is necessary. We applied our method to the recombination signal sequences (RSS) that direct assembly of B-cell and T-cell antigen-receptor genes via V(D)J recombination. The technique is based on model selection by cross-validation and produces models that allow computation of an information score for any signal-length sequence. We also modeled RSS using order zero and order one Markov chains. The scores from all models are highly correlated with measured recombination efficiencies, but the models arising from our technique are better than the Markov models at discriminating RSS from non-RSS. Our model-development procedure produces models that estimate well the recombinogenic potential of RSS and are better at RSS recognition than the order zero and order one Markov models. Our models are, therefore, valuable for studying the regulation of both physiologic and aberrant V(D)J recombination. The approach could be equally powerful for the study of promoter and enhancer elements, splice sites, and other DNA regulatory sites that are highly variable at the level of individual nucleotide positions.
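The order zero and order one Markov chain baselines mentioned in the abstract can be illustrated with a minimal sketch. It assumes position-specific models fitted to aligned, equal-length sites, add-one pseudocounts, and a plain log-probability as the score; none of these choices is confirmed by the abstract, and the function names and toy sequences are hypothetical. It does not reproduce the authors' cross-validated correlation models or their information score.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def fit_order_zero(sites, pseudocount=1.0):
    """Per-position base probabilities; positions are treated as independent."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    model = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        model.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return model


def score_order_zero(model, seq):
    """Log-probability of seq under the independent-position model."""
    return sum(math.log(model[pos][base]) for pos, base in enumerate(seq))


def fit_order_one(sites, pseudocount=1.0):
    """Position-specific transitions: each base depends on its left neighbour."""
    length = len(sites[0])
    # The first position has no left neighbour, so it gets an order zero model.
    first = fit_order_zero([site[0] for site in sites], pseudocount)[0]
    transitions = []
    for pos in range(1, length):
        pairs = Counter((site[pos - 1], site[pos]) for site in sites)
        table = {}
        for prev in ALPHABET:
            total = sum(pairs[(prev, b)] for b in ALPHABET)
            total += pseudocount * len(ALPHABET)
            table[prev] = {
                b: (pairs[(prev, b)] + pseudocount) / total for b in ALPHABET
            }
        transitions.append(table)
    return first, transitions


def score_order_one(model, seq):
    """Log-probability of seq under the first-order model."""
    first, transitions = model
    score = math.log(first[seq[0]])
    for pos in range(1, len(seq)):
        score += math.log(transitions[pos - 1][seq[pos - 1]][seq[pos]])
    return score


# Toy training set (made up for illustration, not data from the article).
toy_sites = ["CACAGTG", "CACAGTG", "CACTGTG", "CACAGCG"]
m0 = fit_order_zero(toy_sites)
m1 = fit_order_one(toy_sites)

for candidate in ("CACAGTG", "TTTTTTT"):
    print(candidate, score_order_zero(m0, candidate), score_order_one(m1, candidate))
```

In this sketch a higher (less negative) score means the candidate looks more like the training sites. The order one model captures dependence only between adjacent positions, which is the limitation the abstract's method is meant to lift.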

Mendeley readers

The data shown below were compiled from readership statistics for 47 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    47       100%

Demographic breakdown

Readers by professional status    Count    As %
Student > Ph.D. Student           12       26%
Researcher                        8        17%
Professor                         7        15%
Student > Master                  4        9%
Unspecified                       2        4%
Other                             6        13%
Unknown                           8        17%

Readers by discipline                           Count    As %
Agricultural and Biological Sciences            24       51%
Biochemistry, Genetics and Molecular Biology    7        15%
Unspecified                                     2        4%
Engineering                                     2        4%
Immunology and Microbiology                     1        2%
Other                                           3        6%
Unknown                                         8        17%