Report for: Interactive knowledge discovery and data mining on genomic expression data with numeric formal concept analysis


Title	Interactive knowledge discovery and data mining on genomic expression data with numeric formal concept analysis
Published in	BMC Bioinformatics, September 2016
DOI	10.1186/s12859-016-1234-z
Pubmed ID	27628041
Authors	Jose M González-Calabozo, Francisco J Valverde-Albacete, Carmen Peláez-Moreno
Abstract	Gene Expression Data (GED) analysis poses a great challenge to the scientific community that can be framed into the Knowledge Discovery in Databases (KDD) and Data Mining (DM) paradigm. Biclustering has emerged as the machine learning method of choice to solve this task, but its unsupervised nature makes result assessment problematic. This is often addressed by means of Gene Set Enrichment Analysis (GSEA). We put forward a framework in which GED analysis is understood as an Exploratory Data Analysis (EDA) process where we provide support for continuous human interaction with data aiming at improving the step of hypothesis abduction and assessment. We focus on the adaptation to human cognition of data interpretation and visualization of the output of EDA. First, we give a proper theoretical background to biclustering using Lattice Theory and provide a set of analysis tools revolving around [Formula: see text]-Formal Concept Analysis ([Formula: see text]-FCA), a lattice-theoretic unsupervised learning technique for real-valued matrices. By using different kinds of cost structures to quantify expression we obtain different sequences of hierarchical biclusterings for gene under- and over-expression using thresholds. Consequently, we provide a method with interleaved analysis steps and visualization devices so that the sequences of lattices for a particular experiment summarize the researcher's vision of the data. This also allows us to define measures of persistence and robustness of biclusters to assess them. Second, the resulting biclusters are used to index external omics databases, for instance Gene Ontology (GO), thus offering a new way of accessing publicly available resources. This provides different flavors of gene set enrichment against which to assess the biclusters, by obtaining their p-values according to the terminology of those resources. We illustrate the exploration procedure on a real data example confirming results previously published.
The GED analysis problem gets transformed into the exploration of a sequence of lattices enabling the visualization of the hierarchical structure of the biclusters with a certain degree of granularity. The ability of FCA-based biclustering methods to index external databases such as GO allows us to obtain a quality measure of the biclusters, to observe the evolution of a gene throughout the different biclusters it appears in, and to look for relevant biclusters, by observing their genes and their persistence, in order to infer, for instance, hypotheses on their function.
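
The abstract's idea of obtaining a sequence of hierarchical biclusterings by sweeping thresholds can be illustrated with a minimal sketch. This is not the authors' method: it uses ordinary binary Formal Concept Analysis on a thresholded matrix rather than the numeric variant the paper describes, and the function name `concepts_at_threshold`, the toy matrix and the threshold values are all hypothetical.

```python
from itertools import combinations


def concepts_at_threshold(matrix, threshold):
    """Enumerate the formal concepts of the binary context obtained by
    marking an entry as present when its value is >= threshold.

    Each concept is an (extent, intent) pair: a maximal set of rows (genes)
    together with the maximal set of columns (conditions) they all share.
    Assumption: plain binary FCA, used here only as an illustration.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Binary incidence: row g "has" column c when the value reaches the threshold.
    has = [{c for c in range(n_cols) if row[c] >= threshold} for row in matrix]

    concepts = set()
    # Brute force over all column subsets; suitable for toy sizes only.
    for k in range(n_cols + 1):
        for cols in combinations(range(n_cols), k):
            # Extent: every row that has all the chosen columns.
            extent = frozenset(g for g in range(n_rows) if set(cols) <= has[g])
            # Intent: every column shared by all rows of the extent.
            if extent:
                intent = frozenset(set.intersection(*(has[g] for g in extent)))
            else:
                intent = frozenset(range(n_cols))
            concepts.add((extent, intent))
    return concepts


# Hypothetical toy expression matrix: 3 genes (rows) x 3 conditions (columns).
expression = [
    [0.9, 0.8, 0.1],
    [0.7, 0.9, 0.2],
    [0.1, 0.6, 0.8],
]

# Sweeping the threshold yields one concept lattice per value.
for threshold in (0.5, 0.75):
    found = concepts_at_threshold(expression, threshold)
    print(threshold, len(found))
```

Each threshold gives a different set of biclusters, so comparing the sets across thresholds is one simple way to see which biclusters persist.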


X Demographics

The data shown below were collected from the profiles of 5 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United Kingdom	1	20%
Unknown	4	80%

Demographic breakdown

Type	Count	As %
Members of the public	3	60%
Scientists	2	40%

Mendeley readers

The data shown below were compiled from readership statistics for 33 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
France	1	3%
Canada	1	3%
Unknown	31	94%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	11	33%
Student > Ph. D. Student	5	15%
Student > Bachelor	3	9%
Student > Master	3	9%
Professor	1	3%
Other	1	3%
Unknown	9	27%

Readers by discipline	Count	As %
Computer Science	12	36%
Biochemistry, Genetics and Molecular Biology	4	12%
Business, Management and Accounting	2	6%
Agricultural and Biological Sciences	2	6%
Engineering	2	6%
Other	2	6%
Unknown	9	27%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 16 September 2016.

Context	Rank	Out of
All research outputs	#13,989,437	22,888,307 outputs
Outputs from BMC Bioinformatics	#4,488	7,298 outputs
Outputs of similar age	#177,219	321,166 outputs
Outputs of similar age from BMC Bioinformatics	#59	120 outputs

Altmetric has tracked 22,888,307 research outputs across all sources so far. This one is in the 37th percentile – i.e., 37% of other outputs scored the same or lower than it.

So far Altmetric has tracked 7,298 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 35th percentile – i.e., 35% of its peers scored the same or lower than it.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 321,166 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 43rd percentile – i.e., 43% of its contemporaries scored the same or lower than it.

We're also able to compare this research output to 120 others from the same source and published within six weeks on either side of this one. This one is in the 45th percentile – i.e., 45% of its contemporaries scored the same or lower than it.
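
The percentile wording used above ("scored the same or lower") can be read as a simple share. The sketch below assumes only that definition; the report does not state Altmetric's exact procedure, and the function name `percentile_same_or_lower` and the peer scores are invented for illustration.

```python
def percentile_same_or_lower(score, other_scores):
    """Return the percentage of other outputs whose score is the same as
    or lower than `score`.

    Assumption: this follows only the plain-language definition quoted in
    the report, not a documented Altmetric algorithm.
    """
    if not other_scores:
        raise ValueError("other_scores must not be empty")
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100.0 * same_or_lower / len(other_scores)


# Hypothetical Attention Scores of ten other outputs.
peer_scores = [0, 1, 1, 2, 3, 5, 8, 13, 21, 40]
print(percentile_same_or_lower(2, peer_scores))
```

With these invented peers, four of the ten scores are 2 or lower, so an output scoring 2 sits in the 40th percentile.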
