Report for: SNooPer: a machine learning-based method for somatic variant identification from low-pass next-generation sequencing

Title	SNooPer: a machine learning-based method for somatic variant identification from low-pass next-generation sequencing
Published in	BMC Genomics, November 2016
DOI	10.1186/s12864-016-3281-2
Pubmed ID	27842494
Authors	Jean-François Spinella, Pamela Mehanna, Ramon Vidal, Virginie Saillour, Pauline Cassart, Chantal Richer, Manon Ouimet, Jasmine Healy, Daniel Sinnett
Abstract	Next-generation sequencing (NGS) allows unbiased, in-depth interrogation of cancer genomes. Many somatic variant callers have been developed yet accurate ascertainment of somatic variants remains a considerable challenge as evidenced by the varying mutation call rates and low concordance among callers. Statistical model-based algorithms that are currently available perform well under ideal scenarios, such as high sequencing depth, homogeneous tumor samples, high somatic variant allele frequency (VAF), but show limited performance with sub-optimal data such as low-pass whole-exome/genome sequencing data. While the goal of any cancer sequencing project is to identify a relevant, and limited, set of somatic variants for further sequence/functional validation, the inherently complex nature of cancer genomes combined with technical issues directly related to sequencing and alignment can affect either the specificity and/or sensitivity of most callers. For these reasons, we developed SNooPer, a versatile machine learning approach that uses Random Forest classification models to accurately call somatic variants in low-depth sequencing data. SNooPer uses a subset of variant positions from the sequencing output for which the class, true variation or sequencing error, is known to train the data-specific model. Here, using a real dataset of 40 childhood acute lymphoblastic leukemia patients, we show how the SNooPer algorithm is not affected by low coverage or low VAFs, and can be used to reduce overall sequencing costs while maintaining high specificity and sensitivity to somatic variant calling. When compared to three benchmarked somatic callers, SNooPer demonstrated the best overall performance. 
The flexibility of SNooPer's random forest protects against technical bias and systematic errors, and is appealing in that it does not rely on user-defined parameters. The code and user guide can be downloaded at https://sourceforge.net/projects/snooper/

X Demographics

The data shown below were collected from the profiles of 5 X users who shared this research output.

Geographical breakdown

Country	Count	As %
Iran, Islamic Republic of	1	20%
Unknown	4	80%

Demographic breakdown

Type	Count	As %
Members of the public	3	60%
Scientists	2	40%

Mendeley readers

The data shown below were compiled from readership statistics for 105 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Netherlands	1	<1%
France	1	<1%
Unknown	103	98%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	19	18%
Student > Master	17	16%
Student > Ph. D. Student	16	15%
Student > Bachelor	12	11%
Student > Doctoral Student	9	9%
Other	15	14%
Unknown	17	16%

Readers by discipline	Count	As %
Biochemistry, Genetics and Molecular Biology	28	27%
Agricultural and Biological Sciences	20	19%
Computer Science	12	11%
Engineering	5	5%
Medicine and Dentistry	4	4%
Other	12	11%
Unknown	24	23%

Attention Score in Context

This research output has an Altmetric Attention Score of 6. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 June 2019.

Context	Rank	Out of
All research outputs	#5,441,264	22,901,818 outputs
Outputs from BMC Genomics	#2,152	10,674 outputs
Outputs of similar age	#80,996	307,484 outputs
Outputs of similar age from BMC Genomics	#46	225 outputs

Altmetric has tracked 22,901,818 research outputs across all sources so far. Compared to these, this one has done well: it is in the 76th percentile, which places it in the top 25% of all research outputs ever tracked by Altmetric.

So far Altmetric has tracked 10,674 research outputs from this source. They receive a mean Attention Score of 4.7. This one has done well, scoring higher than 79% of its peers.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 307,484 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 73% of its contemporaries.

We're also able to compare this research output to 225 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 79% of its contemporaries.
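The percentages quoted above can be reproduced from the ranks listed earlier. The sketch below is only an illustration: the helper name `percentile_from_rank` is hypothetical, and the formula (share of outputs ranked below this one, rounded down to a whole percent) is an assumption that happens to match the reported figures, not Altmetric's documented method.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Share of outputs ranked below this one, as a whole percent (rounded down).

    Assumed formula, not taken from Altmetric documentation.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (total - rank) * 100 // total


# (rank, total) pairs copied from the report.
contexts = {
    "All research outputs": (5_441_264, 22_901_818),
    "Outputs from BMC Genomics": (2_152, 10_674),
    "Outputs of similar age": (80_996, 307_484),
    "Outputs of similar age from BMC Genomics": (46, 225),
}

percentiles = {
    name: percentile_from_rank(rank, total)
    for name, (rank, total) in contexts.items()
}

for name, pct in percentiles.items():
    print(f"{name}: scores higher than about {pct}% of outputs")
```

Under this assumption the four contexts come out at 76%, 79%, 73% and 79%, in line with the figures stated in the paragraphs above.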
