Machine learning approach for pooled DNA sample calibration

Overview of attention for article published in BMC Bioinformatics, July 2015

About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Average Attention Score compared to outputs of the same age and source

Mentioned by

4 X users
1 Facebook page

Citations

3 Dimensions

Readers on

26 Mendeley
Title: Machine learning approach for pooled DNA sample calibration
Published in: BMC Bioinformatics, July 2015
DOI: 10.1186/s12859-015-0593-1
Authors: Andrew D Hellicar, Ashfaqur Rahman, Daniel V Smith, John M Henshall

Abstract

Despite ongoing reduction in genotyping costs, genomic studies involving large numbers of species with low economic value (such as Black Tiger prawns) remain cost prohibitive. In this scenario DNA pooling is an attractive option to reduce genotyping costs. However, genotyping of pooled samples comprising DNA from many individuals is challenging due to the presence of errors that exceed the allele frequency quantisation size and therefore cannot be simply corrected by clustering techniques. The solution to the calibration problem is a correction to the allele frequency to mitigate errors incurred in the measurement process. We highlight the limitations of the existing calibration solutions such as the fact they impose assumptions on the variation between allele frequencies 0, 0.5, and 1.0, and address a limited set of error types. We propose a novel machine learning method to address the limitations identified. The approach is tested on SNPs genotyped with the Sequenom iPLEX platform and compared to existing state of the art calibration methods. The new method is capable of reducing the mean square error in allele frequency to half that achievable with existing approaches. Furthermore for the first time we demonstrate the importance of carefully considering the choice of training data when using calibration approaches built from pooled data. This paper demonstrates that improvements in pooled allele frequency estimates result if the genotyping platform is characterised at allele frequencies other than the homozygous and heterozygous cases. Techniques capable of incorporating such information are described along with aspects of implementation.
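The abstract's closing claim, that characterising the platform at allele frequencies other than 0, 0.5, and 1.0 improves the estimates, can be illustrated with a small sketch. This is not the authors' machine learning method: the `distort` response curve is a hypothetical stand-in for platform error, and the calibrator is plain piecewise-linear interpolation, chosen only to show why extra calibration points can lower the mean square error.

```python
# Illustrative sketch only. The response curve and the interpolation
# calibrator are assumptions, not the method described in the paper.

def distort(p):
    """Hypothetical platform response: true frequency -> measured frequency."""
    return p ** 1.5  # monotone on [0, 1], fixes the endpoints 0 and 1

def make_calibrator(known_freqs):
    """Build an inverse map from measurements taken at known true frequencies."""
    points = sorted((distort(p), p) for p in known_freqs)

    def calibrate(measured):
        # Clamp measurements that fall outside the calibrated range.
        if measured <= points[0][0]:
            return points[0][1]
        if measured >= points[-1][0]:
            return points[-1][1]
        # Linear interpolation between the two neighbouring calibration points.
        for (m0, p0), (m1, p1) in zip(points, points[1:]):
            if m0 <= measured <= m1:
                return p0 + (p1 - p0) * (measured - m0) / (m1 - m0)

    return calibrate

def mean_square_error(calibrate, true_freqs):
    """MSE between calibrated estimates and the true allele frequencies."""
    errors = [(calibrate(distort(p)) - p) ** 2 for p in true_freqs]
    return sum(errors) / len(errors)

# Calibrate at homozygous/heterozygous points only, then at 11 points.
coarse = make_calibrator([0.0, 0.5, 1.0])
fine = make_calibrator([i / 10 for i in range(11)])

test_freqs = [i / 100 for i in range(101)]
mse_coarse = mean_square_error(coarse, test_freqs)
mse_fine = mean_square_error(fine, test_freqs)
print(f"MSE with 3 calibration points:  {mse_coarse:.6f}")
print(f"MSE with 11 calibration points: {mse_fine:.6f}")
```

Under this assumed response curve the 11-point calibrator tracks the true frequencies more closely than the 3-point one. Real platform errors are noisy and SNP-specific, so the size of any improvement on real data would differ.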

X Demographics

The data shown below were collected from the profiles of 4 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 26 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Ireland    1        4%
Unknown    25       96%

Demographic breakdown

Readers by professional status     Count    As %
Researcher                         7        27%
Student > Bachelor                 5        19%
Student > Ph.D. Student            4        15%
Student > Master                   2        8%
Professor > Associate Professor    1        4%
Other                              1        4%
Unknown                            6        23%

Readers by discipline                           Count    As %
Agricultural and Biological Sciences            7        27%
Computer Science                                4        15%
Engineering                                     3        12%
Biochemistry, Genetics and Molecular Biology    2        8%
Immunology and Microbiology                     1        4%
Other                                           2        8%
Unknown                                         7        27%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 July 2015.
All research outputs: #14,231,577 of 22,816,807 outputs
Outputs from BMC Bioinformatics: #4,724 of 7,284 outputs
Outputs of similar age: #134,996 of 262,224 outputs
Outputs of similar age from BMC Bioinformatics: #76 of 113 outputs
Altmetric has tracked 22,816,807 research outputs across all sources so far. This one is in the 35th percentile – i.e., 35% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,284 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 31st percentile – i.e., 31% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 262,224 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 45th percentile – i.e., 45% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 113 others from the same source and published within six weeks on either side of this one. This one is in the 31st percentile – i.e., 31% of its contemporaries scored the same or lower than it.
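
The percentile wording used above ("N% of other outputs scored the same or lower than it") can be sketched as a short calculation. The scores in this example are hypothetical, and the function only follows the definition as quoted; how Altmetric actually computes or rounds its percentiles is not described on this page.

```python
def percentile_rank(score, other_scores):
    """Percentage of the other outputs that scored the same or lower."""
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100 * same_or_lower / len(other_scores)

# Hypothetical Attention Scores for ten other outputs (not real data).
others = [0, 1, 1, 2, 2, 3, 5, 8, 13, 40]

# An output with a score of 2 is compared against the ten others.
print(percentile_rank(2, others))
```

Because tied scores count as "the same or lower", a percentile computed this way can differ from one estimated from the rank position alone.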