Distinguishing potential bacteria-tumor associations from contamination in a secondary data analysis of public cancer genome sequence data

Overview of attention for article published in Microbiome, January 2017

About this Attention Score

  • In the top 5% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (94th percentile)
  • Good Attention Score compared to outputs of the same age and source (72nd percentile)

Mentioned by

2 blogs
33 X users
2 Facebook pages
1 Wikipedia page
2 Google+ users


Citations

63 Dimensions

Readers on

140 Mendeley
Published in
Microbiome, January 2017
DOI 10.1186/s40168-016-0224-8
Pubmed ID

Kelly M. Robinson, Jonathan Crabtree, John S. A. Mattick, Kathleen E. Anderson, Julie C. Dunning Hotopp


A variety of bacteria are known to influence carcinogenesis. We therefore investigated whether publicly available whole genome and whole transcriptome sequencing data generated by large public cancer genome efforts, such as The Cancer Genome Atlas (TCGA), could be used to identify bacteria associated with cancer.

The Burrows-Wheeler aligner (BWA) was used to align a subset of Illumina paired-end sequencing data from TCGA to the human reference genome and all complete bacterial genomes in the RefSeq database in an effort to identify bacterial read pairs from the microbiome. Through careful consideration of all of the bacterial taxa present in the cancer types investigated, their relative abundance, and batch effects, we were able to identify some read pairs from certain taxa as likely resulting from contamination. In particular, the presence of Mycobacterium tuberculosis complex in the ovarian serous cystadenocarcinoma (OV) and glioblastoma multiforme (GBM) samples was correlated with the sequencing center of the samples. Additionally, there was a correlation between the presence of Ralstonia spp. and two specific plates of acute myeloid leukemia (AML) samples. In the end, associations remained for Pseudomonas-like and Acinetobacter-like read pairs in AML, and for Pseudomonas-like read pairs in stomach adenocarcinoma (STAD), that could not be explained by batch effects or systematic contamination as seen in other samples.

This approach suggests that it is possible to identify bacteria that may be present in human tumor samples from public genome sequencing data, and that these candidates can then be examined further experimentally. More weight should be given to this approach in the future when bacterial associations with diseases are suspected.
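
The contamination check described in the abstract, where the presence of a taxon tracks a sequencing center or a sample plate, can be illustrated as a 2x2 association test. The sketch below is not the authors' pipeline. The choice of Fisher's exact test, the function name `fisher_exact_two_sided`, and the toy counts are all assumptions made for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: batch (for example, one sequencing center vs. all others).
    Columns: taxon detected vs. taxon not detected.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x detections falling in the first batch
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Toy counts, invented for illustration (not data from the paper).
# Taxon found in 18 of 20 samples from one center but only 3 of 30 elsewhere:
p_batch = fisher_exact_two_sided(18, 2, 3, 27)
# Taxon found in half of the samples regardless of center:
p_even = fisher_exact_two_sided(5, 5, 5, 5)

print(f"batch-confounded taxon: p = {p_batch:.2e}")
print(f"evenly spread taxon:    p = {p_even:.2f}")
```

A very small p-value for a taxon against center or plate would flag it as a contamination candidate, whereas a taxon spread evenly across batches would need a different explanation. In practice a correction for testing many taxa would also be needed.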

X Demographics

The data were collected from the profiles of 33 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 140 Mendeley readers of this research output.

Geographical breakdown

Country          Count   % of readers
United States        1   <1%
Unknown            139   99%

Demographic breakdown

Readers by professional status                 Count   % of readers
Researcher                                        28   20%
Student > Ph. D. Student                          27   19%
Student > Master                                  16   11%
Student > Bachelor                                 8   6%
Other                                              7   5%
Other                                             16   11%
Unknown                                           38   27%

Readers by discipline                          Count   % of readers
Biochemistry, Genetics and Molecular Biology      33   24%
Agricultural and Biological Sciences              26   19%
Medicine and Dentistry                            15   11%
Immunology and Microbiology                        7   5%
Chemistry                                          3   2%
Other                                             10   7%
Unknown                                           46   33%

Attention Score in Context

This research output has an Altmetric Attention Score of 38. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 13 September 2022.
Altmetric has tracked 24,885,505 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 95th percentile: it's in the top 5% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 1,705 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 38.5. This one has done well, scoring higher than 82% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 429,335 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 94% of its contemporaries.
We're also able to compare this research output to 37 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 72% of its contemporaries.