
Prediction of gene expression with cis-SNPs using mixed models and regularization methods

Overview of attention for article published in BMC Genomics, May 2017

Citations
29 Dimensions

Readers on
52 Mendeley
Title: Prediction of gene expression with cis-SNPs using mixed models and regularization methods
Published in: BMC Genomics, May 2017
DOI: 10.1186/s12864-017-3759-6
Pubmed ID:
Authors: Ping Zeng, Xiang Zhou, Shuiping Huang

Abstract

Gene expression in human tissues has been shown to be heritable, which makes it possible to predict gene expression using only SNPs. Predicting gene expression can inform the genetic architecture of individual functionally associated SNPs and help interpret the molecular basis of human diseases. We compared three types of methods for predicting gene expression using only cis-SNPs: a polygenic model, the linear mixed model (LMM); two sparse models, Lasso and elastic net (ENET); and a hybrid of LMM and sparse models, the Bayesian sparse linear mixed model (BSLMM). These three types of methods make very different assumptions about the underlying genetic architecture. The methods were evaluated in simulations under various scenarios and applied to the Geuvadis gene expression data. In the simulations, each of the four methods (Lasso, ENET, LMM and BSLMM) performed best when its modeling assumptions were satisfied, but BSLMM performed robustly across a range of scenarios. Judged by R² in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R² ≥ 0.05) across the chromosomes, nor any clear relationship between the proportion of predictive genes and the proportion of genes on each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R² ≥ 0.30) may have sparse genetic architectures, since Lasso, ENET and BSLMM outperformed LMM for these genes; this observation was validated in another gene expression dataset. We further showed that the predictive genes were enriched in approximately independent LD blocks.
Gene expression can be predicted from cis-SNPs alone using well-developed prediction models, and the predictive genes were enriched in some approximately independent LD blocks. Predicting gene expression can shed some light on the functional interpretation of SNPs identified in GWASs.
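
As a rough illustration of how the R² thresholds in the abstract could be applied, the sketch below scores each gene by the squared Pearson correlation between observed and predicted expression and then labels it using the 0.05 and 0.30 cut-offs. The abstract does not spell out how R² was computed, so the squared-correlation definition is an assumption here, and the gene names and expression values are hypothetical toy data, not results from the paper.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: this is one common definition of predictive R^2; the
    abstract itself does not say which definition the authors used.
    """
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return 0.0  # a constant vector carries no predictive signal
    return cov * cov / (var_o * var_p)


def classify_gene(r2, predictive=0.05, highly_predictive=0.30):
    """Label a gene using the two R^2 cut-offs quoted in the abstract."""
    if r2 >= highly_predictive:
        return "highly predictive"
    if r2 >= predictive:
        return "predictive"
    return "not predictive"


# Hypothetical held-out data: gene -> (observed expression, predicted expression)
genes = {
    "GENE_A": ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]),
    "GENE_B": ([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 1.0, 4.0]),
    "GENE_C": ([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0]),
}

labels = {
    name: classify_gene(r_squared(obs, pred))
    for name, (obs, pred) in genes.items()
}

for name, label in labels.items():
    print(name, label)
```

In practice the predicted values would come from a model such as Lasso, ENET, LMM or BSLMM fitted on training samples; the scoring step stays the same whichever model produces them.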

Mendeley readers


The data shown below were compiled from readership statistics for 52 Mendeley readers of this research output.

Geographical breakdown

Country | Count | As %
Unknown | 52 | 100%

Demographic breakdown

Readers by professional status | Count | As %
Student > Ph. D. Student | 12 | 23%
Student > Master | 6 | 12%
Researcher | 5 | 10%
Student > Doctoral Student | 4 | 8%
Other | 4 | 8%
Other | 7 | 13%
Unknown | 14 | 27%

Readers by discipline | Count | As %
Biochemistry, Genetics and Molecular Biology | 15 | 29%
Agricultural and Biological Sciences | 9 | 17%
Computer Science | 3 | 6%
Unspecified | 2 | 4%
Mathematics | 2 | 4%
Other | 4 | 8%
Unknown | 17 | 33%