
PRS-on-Spark (PRSoS): a novel, efficient and flexible approach for generating polygenic risk scores

Overview of attention for article published in BMC Bioinformatics, August 2018

Mentioned by

2 X users

Citations

24 Dimensions

Readers on

79 Mendeley
Title
PRS-on-Spark (PRSoS): a novel, efficient and flexible approach for generating polygenic risk scores
Published in
BMC Bioinformatics, August 2018
DOI 10.1186/s12859-018-2289-9
Pubmed ID
Authors

Lawrence M. Chen, Nelson Yao, Elika Garg, Yuecai Zhu, Thao T. T. Nguyen, Irina Pokhvisneva, Shantala A. Hari Dass, Eva Unternaehrer, Hélène Gaudreau, Marie Forest, Lisa M. McEwen, Julia L. MacIsaac, Michael S. Kobor, Celia M. T. Greenwood, Patricia P. Silveira, Michael J. Meaney, Kieran J. O’Donnell

Abstract

Polygenic risk scores (PRS) describe the genomic contribution to complex phenotypes and consistently account for a larger proportion of variance in outcome than single nucleotide polymorphisms (SNPs) alone. However, there is little consensus on the optimal data input for generating PRS, and existing approaches largely preclude the use of imputed posterior probabilities and strand-ambiguous SNPs (i.e., A/T or C/G polymorphisms). Our ability to predict complex traits that arise from the additive effects of a large number of SNPs would likely benefit from a more inclusive approach. We developed PRS-on-Spark (PRSoS), software implemented in Apache Spark and Python that accommodates different data inputs and strand-ambiguous SNPs to calculate PRS. We compared the performance of PRSoS with that of an existing software package (PRSice v1.25) in generating PRS for major depressive disorder using a community cohort (N = 264). We found that PRSoS ran faster than PRSice v1.25 when PRS were generated for a large number of SNPs (~17 million SNPs; t = 42.865, p = 5.43E-04). We also show that the use of imputed posterior probabilities and the inclusion of strand-ambiguous SNPs increase the proportion of variance explained by a PRS for major depressive disorder (from 4.3% to 4.8%). PRSoS gives the user the ability to generate PRS using an inclusive and efficient approach that considers a larger number of SNPs than conventional approaches. We show that a PRS for major depressive disorder that includes strand-ambiguous SNPs, calculated using PRSoS, accounts for the largest proportion of variance in symptoms of depression in a community cohort, demonstrating the utility of this approach. The availability of this software will help users develop more informative PRS for a variety of complex phenotypes.
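To make the terms in the abstract concrete, here is a minimal Python sketch of a generic additive PRS computed from imputed genotype posterior probabilities, together with a check for strand-ambiguous allele pairs. It is not taken from the PRSoS code base: the function names, the ordering of the probability triple, and the unnormalized weighted sum are all assumptions made for illustration, and the abstract does not say how PRSoS handles normalization or missing genotypes.

```python
from typing import Sequence, Tuple

# Hypothetical helper names for illustration; not the PRSoS API.
# A SNP is strand-ambiguous when its two alleles are complementary bases.
AMBIGUOUS_PAIRS = {frozenset(("A", "T")), frozenset(("C", "G"))}


def is_strand_ambiguous(allele1: str, allele2: str) -> bool:
    """Return True for A/T or C/G polymorphisms."""
    return frozenset((allele1.upper(), allele2.upper())) in AMBIGUOUS_PAIRS


def expected_dosage(probs: Tuple[float, float, float]) -> float:
    """Expected count of the effect allele from a posterior probability triple.

    Assumed order: (P(0 copies), P(1 copy), P(2 copies)) of the effect allele.
    """
    _p_zero, p_one, p_two = probs
    return p_one + 2.0 * p_two


def polygenic_score(
    genotype_probs: Sequence[Tuple[float, float, float]],
    weights: Sequence[float],
) -> float:
    """Unnormalized additive score: sum of weight * expected dosage per SNP."""
    if len(genotype_probs) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(w * expected_dosage(p) for p, w in zip(genotype_probs, weights))


# One individual, three SNPs, with made-up probabilities and effect sizes.
example_probs = [(0.0, 0.0, 1.0), (0.1, 0.8, 0.1), (1.0, 0.0, 0.0)]
example_weights = [0.5, -0.2, 0.3]
example_score = polygenic_score(example_probs, example_weights)
```

Working from expected dosages rather than hard-called genotypes keeps the imputation uncertainty in the score, which is the kind of input the abstract describes as "imputed posterior probabilities".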

X Demographics


The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers


The data shown below were compiled from readership statistics for 79 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    79       100%

Demographic breakdown

Readers by professional status    Count    As %
Student > Ph. D. Student          19       24%
Researcher                        14       18%
Student > Master                  10       13%
Student > Bachelor                7        9%
Other                             4        5%
Other                             10       13%
Unknown                           15       19%
Readers by discipline                           Count    As %
Biochemistry, Genetics and Molecular Biology    17       22%
Agricultural and Biological Sciences            8        10%
Computer Science                                6        8%
Neuroscience                                    6        8%
Medicine and Dentistry                          6        8%
Other                                           19       24%
Unknown                                         17       22%
Attention Score in Context


This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 23 April 2019.
All research outputs
#18,646,262
of 23,099,576 outputs
Outputs from BMC Bioinformatics
#6,365
of 7,329 outputs
Outputs of similar age
#254,636
of 331,157 outputs
Outputs of similar age from BMC Bioinformatics
#79
of 98 outputs
Altmetric has tracked 23,099,576 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,329 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 5th percentile – i.e., 5% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 331,157 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 12th percentile – i.e., 12% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 98 others from the same source and published within six weeks on either side of this one. This one is in the 12th percentile – i.e., 12% of its contemporaries scored the same or lower than it.