A general integrative genomic feature transcription factor binding site prediction method applied to analysis of USF1 binding in cardiovascular disease

Overview of attention for article published in Human Genomics, April 2009
Readers on

Mendeley: 19
CiteULike: 2
Published in
Human Genomics, April 2009
DOI 10.1186/1479-7364-3-3-221
Pubmed ID
Authors

Tianyuan Wang, Terrence S Furey, Jessica J Connelly, Shihao Ji, Sarah Nelson, Steffen Heber, Simon G Gregory, Elizabeth R Hauser

Abstract

Transcription factors are key mediators of human complex disease processes. Identifying the target genes of transcription factors will increase our understanding of the biological network leading to disease risk. The prediction of transcription factor binding sites (TFBSs) is one method to identify these target genes; however, current prediction methods need improvement. We chose the transcription factor upstream stimulatory factor 1 (USF1) to evaluate the performance of our novel TFBS prediction method because of its known genetic association with coronary artery disease (CAD) and the recent availability of USF1 chromatin immunoprecipitation microarray (ChIP-chip) results. The specific goals of our study were to develop a novel and accurate genome-scale method for predicting USF1 binding sites and associated target genes to aid in the study of CAD. Previously published USF1 ChIP-chip data for 1 per cent of the genome were used to develop and evaluate several kernel logistic regression prediction models. A combination of genomic features (phylogenetic conservation, regulatory potential, presence of a CpG island and DNaseI hypersensitivity), as well as position weight matrix (PWM) scores, were used as variables for these models. Our most accurate predictor achieved an area under the receiver operator characteristic curve of 0.827 during cross-validation experiments, significantly outperforming standard PWM-based prediction methods. When applied to the whole human genome, we predicted 24,010 USF1 binding sites within 5 kilobases upstream of the transcription start site of 9,721 genes. These predictions included 16 of 20 genes with strong evidence of USF1 regulation. Finally, in the spirit of genomic convergence, we integrated independent experimental CAD data with these USF1 binding site prediction results to develop a prioritised set of candidate genes for future CAD studies.
We have shown that our novel prediction method, which employs genomic features related to the presence of regulatory elements, enables more accurate and efficient prediction of USF1 binding sites. This method can be extended to other transcription factors identified in human disease studies to help further our understanding of the biology of complex disease.
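
The abstract reports model quality as an area under the ROC curve (0.827 in cross-validation). As a rough illustration of what that number measures, the sketch below computes the area as the fraction of (bound, unbound) pairs in which the bound site receives the higher score, with ties counted as one half. This is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for a bound (positive) site, 0 for an unbound (negative) site.
    scores: predicted binding scores, higher meaning more likely bound.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

A value of 1.0 means every bound site outranks every unbound site, and 0.5 corresponds to random ordering, which is the scale on which the reported 0.827 should be read.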

Mendeley readers

The data shown below were compiled from readership statistics for 19 Mendeley readers of this research output. Click here to see the associated Mendeley record.

Geographical breakdown

Country         Count   As %
United States   3       16%
Unknown         16      84%

Demographic breakdown

Readers by professional status    Count   As %
Student > Master                  4       21%
Student > Ph. D. Student          4       21%
Professor                         3       16%
Researcher                        3       16%
Professor > Associate Professor   2       11%
Other                             3       16%

Readers by discipline                          Count   As %
Agricultural and Biological Sciences           6       32%
Medicine and Dentistry                         5       26%
Biochemistry, Genetics and Molecular Biology   3       16%
Computer Science                               3       16%
Neuroscience                                   1       5%
Other                                          0       0%
Unknown                                        1       5%