GEMINI: a computationally-efficient search engine for large gene expression datasets

Overview of attention for article published in BMC Bioinformatics, February 2016

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • Good Attention Score compared to outputs of the same age (76th percentile)
  • Good Attention Score compared to outputs of the same age and source (74th percentile)

Mentioned by

twitter
13 X users

Readers on

mendeley
31 Mendeley
Title
GEMINI: a computationally-efficient search engine for large gene expression datasets
Published in
BMC Bioinformatics, February 2016
DOI 10.1186/s12859-016-0934-8
Pubmed ID
Authors

Timothy DeFreitas, Hachem Saddiki, Patrick Flaherty

Abstract

Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number or meta-data tag. However, in this search paradigm the form of the query (a text-based string) is mismatched with the form of the target (a genomic profile). To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an O(log n) expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.
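
To make the vantage-point tree idea in the abstract concrete, here is a minimal Python sketch of nearest-neighbor search over such a tree. It is not GEMINI's implementation: the Euclidean distance, the random choice of vantage point, and the names `VPNode`, `build_vp_tree` and `nearest` are all assumptions made for illustration, since this page does not describe those details.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    """One tree node: a vantage point, a split radius, and two subtrees."""

    def __init__(self, point, radius=0.0, inside=None, outside=None):
        self.point = point
        self.radius = radius
        self.inside = inside    # points closer to the vantage point than radius
        self.outside = outside  # points at distance >= radius


def build_vp_tree(points, dist=euclidean, rng=None):
    """Recursively split the points around a randomly chosen vantage point."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    vp = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vp)
    dists = [dist(vp, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]  # median distance to the vantage point
    inside = [p for p, d in zip(points, dists) if d < radius]
    outside = [p for p, d in zip(points, dists) if d >= radius]
    return VPNode(vp, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))


def nearest(node, query, dist=euclidean, best=None):
    """Return (distance, point) for the stored point closest to the query."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Visit the side that contains the query first.
    if d < node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, dist, best)
    # By the triangle inequality, the other side can hold a closer point
    # only if the current best ball reaches across the split boundary.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, dist, best)
    return best


if __name__ == "__main__":
    rng = random.Random(42)
    profiles = [tuple(rng.random() for _ in range(10)) for _ in range(1000)]
    tree = build_vp_tree(profiles)
    query = tuple(rng.random() for _ in range(10))
    distance, match = nearest(tree, query)
    print(f"nearest profile is at distance {distance:.4f}")
```

The pruning test is where the speed-up over brute force comes from: whenever the query lies far from the split boundary, an entire subtree is skipped. How often that happens depends on the data and the metric, which is consistent with the abstract's hedged "in certain circumstances".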

X Demographics

The data shown below were collected from the profiles of 13 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 31 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States        1     3%
Unknown             30    97%

Demographic breakdown

Readers by professional status                  Count   As %
Student > Ph. D. Student                            8    26%
Researcher                                          6    19%
Student > Master                                    4    13%
Student > Bachelor                                  3    10%
Student > Doctoral Student                          2     6%
Other                                               6    19%
Unknown                                             2     6%

Readers by discipline                           Count   As %
Computer Science                                    9    29%
Agricultural and Biological Sciences                6    19%
Biochemistry, Genetics and Molecular Biology        4    13%
Mathematics                                         3    10%
Engineering                                         3    10%
Other                                               4    13%
Unknown                                             2     6%
Attention Score in Context

This research output has an Altmetric Attention Score of 7. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 22 March 2016.
All research outputs: #4,460,991 of 22,851,489 outputs
Outputs from BMC Bioinformatics: #1,674 of 7,292 outputs
Outputs of similar age: #69,393 of 298,866 outputs
Outputs of similar age from BMC Bioinformatics: #37 of 144 outputs
Altmetric has tracked 22,851,489 research outputs across all sources so far. Compared to these, this one has done well and is in the 80th percentile: it's in the top 25% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 7,292 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one has done well, scoring higher than 76% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 298,866 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 76% of its contemporaries.
We're also able to compare this research output to 144 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 74% of its contemporaries.
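
The percentiles quoted above can be roughly cross-checked from the listed ranks. The sketch below assumes a simple formula, the share of outputs ranked below this one; that formula and the helper name `percentile_from_rank` are assumptions, not Altmetric's documented method, which may treat tied scores differently. Under that assumption it reproduces the stated 80th, 76th and 74th percentiles and lands within one point of the 76% quoted for BMC Bioinformatics outputs.

```python
import math


def percentile_from_rank(rank, total):
    """Approximate percentage of outputs ranked below the given rank.

    Assumed formula for illustration; ties between equal scores are ignored.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (1 - rank / total)


# (rank, total) pairs as listed on this page
contexts = {
    "All research outputs": (4_460_991, 22_851_489),
    "Outputs from BMC Bioinformatics": (1_674, 7_292),
    "Outputs of similar age": (69_393, 298_866),
    "Outputs of similar age from BMC Bioinformatics": (37, 144),
}

if __name__ == "__main__":
    for name, (rank, total) in contexts.items():
        pct = math.floor(percentile_from_rank(rank, total))
        print(f"{name}: scores higher than about {pct}% of outputs")
```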