Report for: Patterns of cross-contamination in a multispecies population genomic project: detection, quantification, impact, and solutions


Title	Patterns of cross-contamination in a multispecies population genomic project: detection, quantification, impact, and solutions
Published in	BMC Biology, March 2017
DOI	10.1186/s12915-017-0366-6
Pubmed ID	28356154
Authors	Marion Ballenghien, Nicolas Faivre, Nicolas Galtier
Abstract	Contamination is a well-known but often neglected problem in molecular biology. Here, we investigated the prevalence of cross-contamination among 446 samples from 116 distinct species of animals, which were processed in the same laboratory and subjected to subcontracted transcriptome sequencing. Using cytochrome oxidase 1 as a barcode, we identified a minimum of 782 events of between-species contamination, with approximately 80% of our samples being affected. An analysis of laboratory metadata revealed a strong effect of the sequencing center: nearly all the detected events of between-species contamination involved species that were sent the same day to the same company. We introduce new methods to address the amount of within-species, between-individual contamination, and to correct for this problem when calling genotypes from base read counts. We report evidence for pervasive within-species contamination in this data set, and show that classical population genomic statistics, such as synonymous diversity, the ratio of non-synonymous to synonymous diversity, inbreeding coefficient FIT, and Tajima's D, are sensitive to this problem to various extents. Control analyses suggest that our published results are probably robust to the problem of contamination. Recommendations on how to prevent or avoid contamination in large-scale population genomics/molecular ecology are provided based on this analysis.


X Demographics

The data shown below were collected from the profiles of 132 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United States	30	23%
United Kingdom	11	8%
France	9	7%
Canada	6	5%
Germany	5	4%
Sweden	3	2%
Czechia	2	2%
Ecuador	2	2%
Ireland	2	2%
Other	17	13%
Unknown	45	34%

Demographic breakdown

Type	Count	As %
Scientists	75	57%
Members of the public	54	41%
Science communicators (journalists, bloggers, editors)	3	2%

Mendeley readers

The data shown below were compiled from readership statistics for 125 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United Kingdom	2	2%
United States	2	2%
Netherlands	1	<1%
Australia	1	<1%
Unknown	119	95%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	29	23%
Student > Ph. D. Student	22	18%
Student > Bachelor	12	10%
Student > Master	12	10%
Other	8	6%
Other	21	17%
Unknown	21	17%

Readers by discipline	Count	As %
Agricultural and Biological Sciences	53	42%
Biochemistry, Genetics and Molecular Biology	27	22%
Environmental Science	9	7%
Computer Science	3	2%
Medicine and Dentistry	2	2%
Other	7	6%
Unknown	24	19%
