A comprehensive and scalable database search system for metaproteomics

Overview of attention for article published in BMC Genomics, August 2016
About this Attention Score

  • Good Attention Score compared to outputs of the same age (70th percentile)
  • Good Attention Score compared to outputs of the same age and source (73rd percentile)

Mentioned by

10 tweeters

Citations

31 Dimensions

Readers on

95 Mendeley
Title
A comprehensive and scalable database search system for metaproteomics
Published in
BMC Genomics, August 2016
DOI 10.1186/s12864-016-2855-3
Pubmed ID
Authors

Sandip Chatterjee, Gregory S. Stupp, Sung Kyu Robin Park, Jean-Christophe Ducom, John R. Yates, Andrew I. Su, Dennis W. Wolan

Abstract

Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. 
The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.

Twitter Demographics

The data shown below were collected from the profiles of 10 tweeters who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 95 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United Kingdom   1       1%
Germany          1       1%
Unknown          93      98%

Demographic breakdown

Readers by professional status                 Count   As %
Student > Ph. D. Student                       22      23%
Researcher                                     18      19%
Student > Bachelor                             12      13%
Student > Master                               10      11%
Other                                          7       7%
Other                                          13      14%
Unknown                                        13      14%

Readers by discipline                          Count   As %
Agricultural and Biological Sciences           26      27%
Biochemistry, Genetics and Molecular Biology   22      23%
Engineering                                    6       6%
Immunology and Microbiology                    6       6%
Computer Science                               5       5%
Other                                          11      12%
Unknown                                        19      20%

Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 18 August 2017.
All research outputs: #2,857,132 of 11,626,228 outputs
Outputs from BMC Genomics: #1,753 of 6,949 outputs
Outputs of similar age: #69,279 of 238,527 outputs
Outputs of similar age from BMC Genomics: #69 of 265 outputs
Altmetric has tracked 11,626,228 research outputs across all sources so far. This one has received more attention than most of these and is in the 74th percentile.
So far Altmetric has tracked 6,949 research outputs from this source. They receive a mean Attention Score of 4.2. This one has gotten more attention than average, scoring higher than 73% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 238,527 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 70% of its contemporaries.
We're also able to compare this research output to 265 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 73% of its contemporaries.