Title | Distributed gene expression modelling for exploring variability in epigenetic function |
---|---|
Published in | BMC Bioinformatics, November 2016 |
DOI | 10.1186/s12859-016-1313-1 |
Pubmed ID | |
Authors | David M. Budden, Edmund J. Crampin |
Abstract | Predictive gene expression modelling is an important tool in computational biology due to the volume of high-throughput sequencing data generated by recent consortia. However, the scope of previous studies has been restricted to a small set of cell-lines or experimental conditions due to an inability to leverage distributed processing architectures for large, sharded data-sets. We present a distributed implementation of gene expression modelling using the MapReduce paradigm and prove that performance improves as a linear function of available processor cores. We then leverage the computational efficiency of this framework to explore the variability of epigenetic function across fifty histone modification data-sets from a variety of cancerous and non-cancerous cell-lines. We demonstrate that the genome-wide relationships between histone modifications and mRNA transcription are lineage-, tissue- and karyotype-invariant, and that models trained on matched -omics data from non-cancerous cell-lines are able to predict cancerous expression with equivalent genome-wide fidelity. |
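
The abstract names the MapReduce paradigm but gives no implementation details. As an illustration only, and not the authors' implementation, the sketch below shows one way a regression model could be fitted over sharded data: each shard is mapped to additive summary statistics, which are then reduced and solved once. The univariate least-squares model, the toy data, and the names `map_shard`, `combine` and `fit` are all assumptions made for this example.

```python
from functools import reduce

def map_shard(shard):
    """Map step (hypothetical): summarise one shard of (signal, expression) pairs."""
    n = len(shard)
    sx = sum(x for x, _ in shard)
    sy = sum(y for _, y in shard)
    sxx = sum(x * x for x, _ in shard)
    sxy = sum(x * y for x, y in shard)
    return (n, sx, sy, sxx, sxy)

def combine(a, b):
    """Reduce step: the statistics are additive, so shards can merge in any order."""
    return tuple(u + v for u, v in zip(a, b))

def fit(shards):
    """Fit y = slope * x + intercept from the merged statistics."""
    n, sx, sy, sxx, sxy = reduce(combine, map(map_shard, shards))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Toy data (assumed): two shards drawn from y = 2x + 1.
shards = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]
slope, intercept = fit(shards)
```

Because the map step touches each shard independently, the built-in `map` here could in principle be swapped for a parallel map across processor cores, which is the property that makes this style of computation attractive for sharded data-sets.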
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United Kingdom | 1 | 11% |
Italy | 1 | 11% |
France | 1 | 11% |
Unknown | 6 | 67% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 6 | 67% |
Science communicators (journalists, bloggers, editors) | 1 | 11% |
Scientists | 1 | 11% |
Practitioners (doctors, other healthcare professionals) | 1 | 11% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 14 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 3 | 21% |
Student > Bachelor | 2 | 14% |
Student > Postgraduate | 2 | 14% |
Student > Ph. D. Student | 2 | 14% |
Other | 1 | 7% |
Other | 2 | 14% |
Unknown | 2 | 14% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 8 | 57% |
Nursing and Health Professions | 1 | 7% |
Computer Science | 1 | 7% |
Medicine and Dentistry | 1 | 7% |
Unknown | 3 | 21% |