GSNFS: Gene subnetwork biomarker identification of lung cancer expression data

Overview of attention for article published in BMC Medical Genomics, December 2016

Mentioned by: 2 X users
Citations: 14 Dimensions
Readers on: 19 Mendeley
Title
GSNFS: Gene subnetwork biomarker identification of lung cancer expression data
Published in
BMC Medical Genomics, December 2016
DOI 10.1186/s12920-016-0231-4
Pubmed ID
Authors

Narumol Doungpan, Worrawat Engchuan, Jonathan H. Chan, Asawin Meechai

Abstract

Gene expression has been used to identify disease gene biomarkers, but there are ongoing challenges. Single-gene or gene-set biomarkers do not provide sufficient understanding of complex disease mechanisms or of the relationships among the genes involved. Network-based methods have thus been considered for inferring the interactions within a group of genes to further study the disease mechanism. Recently, the Gene-Network-based Feature Set (GNFS) method, which can handle case-control and multiclass expression data for gene biomarker identification, was proposed; it partly takes network topology into account. However, its performance relies on a greedy search for building subnetworks and thus requires further improvement. In this work, we establish a new approach named Gene Sub-Network-based Feature Selection (GSNFS) by implementing the GNFS framework with two proposed searching and scoring algorithms, namely gene-set-based (GS) search and parent-node-based (PN) search, to identify subnetworks. An additional dataset is used to validate the results.

The two proposed searching algorithms of the GSNFS method for subnetwork expansion are concerned with the degree of connectivity and the scoring scheme for building subnetworks and their topology. In each iteration of expansion, the neighbour genes of the current subnetwork whose expression data improve the overall subnetwork score are recruited. While the GS search calculates the subnetwork score using the activity score of the current subnetwork and the gene expression values of its neighbours, the PN search uses the expression value of the corresponding parent of each neighbour gene. Four lung cancer expression datasets were used for subnetwork identification. In addition, the use of pathway data and protein-protein interaction data as network data, in order to consider the interactions among significant genes, is discussed.

Classification was performed to compare the performance of the gene subnetworks identified by the three subnetwork identification algorithms. The two proposed searching algorithms resulted in better classification and gene/gene-set agreement than the original greedy search of the GNFS method. The lung cancer subnetworks identified using the proposed searching algorithms improved the cross-dataset validation and increased the consistency of findings between two independent datasets. The homogeneity of the datasets was measured to assess dataset compatibility in cross-dataset validation. The lung cancer dataset with higher homogeneity showed a better result with the GS search, while the dataset with low homogeneity showed a better result with the PN search. The 10-fold cross-dataset validation on the independent lung cancer datasets showed higher classification performance for the proposed algorithms than for the greedy search in the original GNFS method.

The proposed searching algorithms recruit a larger number of genes in the subnetwork expansion step than the greedy algorithm. As a result, the subnetworks identified by the GSNFS method improved in classification performance and in gene/gene-set level agreement, depending on the homogeneity of the datasets used in the analysis. Some of the common genes obtained from the four datasets using the different searching algorithms are known to play a role in lung cancer. The improvement in classification performance and gene/gene-set level agreement, together with the biological relevance of the results, indicates the effectiveness of the GSNFS method for gene subnetwork identification using expression data.
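The expansion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the scoring function, so the sketch assumes an absolute Welch t-statistic between two classes as the score and the per-sample mean expression as the activity score. The baseline each candidate is compared against (the current subnetwork for GS, the parent gene alone for PN) is also an assumption, and the gene names, toy expression values, and the helpers `t_score`, `activity`, and `expand` are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_score(values, labels):
    """Absolute Welch t-statistic between class 0 and class 1.

    Assumed scoring scheme; needs at least two samples per class.
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def activity(genes, expr):
    """Per-sample mean expression of the member genes (assumed activity score)."""
    return [mean(sample) for sample in zip(*(expr[g] for g in genes))]


def expand(seed, graph, expr, labels, mode="GS", max_iter=10):
    """Grow a subnetwork from `seed`, recruiting neighbours that raise the score."""
    members = [seed]
    for _ in range(max_iter):
        # Map each unvisited neighbour to the member it was first reached from.
        frontier = {}
        for m in members:
            for nb in graph.get(m, ()):
                if nb not in members and nb in expr:
                    frontier.setdefault(nb, m)
        act = activity(members, expr)
        k = len(members)
        recruited = []
        for nb, parent in frontier.items():
            if mode == "GS":
                # GS: current subnetwork activity combined with the neighbour.
                base = act
                cand = [(a * k + x) / (k + 1) for a, x in zip(act, expr[nb])]
            else:
                # PN: only the neighbour's parent gene combined with the neighbour.
                base = expr[parent]
                cand = [(p + x) / 2 for p, x in zip(base, expr[nb])]
            if t_score(cand, labels) > t_score(base, labels):
                recruited.append(nb)
        if not recruited:
            break
        members.extend(recruited)
    return members, t_score(activity(members, expr), labels)


# Toy data (invented): genes A and B separate the classes, C is noise.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "A": [1.0, 1.2, 0.9, 2.0, 2.1, 1.9],
    "B": [1.1, 0.9, 1.0, 2.2, 1.8, 2.0],
    "C": [1.5, 1.0, 2.0, 1.2, 1.9, 1.4],
}
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

gs_members, gs_score = expand("A", graph, expr, labels, mode="GS")
pn_members, pn_score = expand("A", graph, expr, labels, mode="PN")
```

On this toy network both modes recruit B and reject C. The two modes diverge when a neighbour improves on its parent gene but not on the activity of the whole subnetwork, or the reverse.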

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 19 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    19       100%

Demographic breakdown

Readers by professional status                         Count    As %
Student > Ph. D. Student                               5        26%
Student > Master                                       4        21%
Student > Postgraduate                                 2        11%
Researcher                                             2        11%
Professor                                              1        5%
Other                                                  1        5%
Unknown                                                4        21%

Readers by discipline                                  Count    As %
Computer Science                                       4        21%
Biochemistry, Genetics and Molecular Biology           3        16%
Pharmacology, Toxicology and Pharmaceutical Science    1        5%
Nursing and Health Professions                         1        5%
Physics and Astronomy                                  1        5%
Other                                                  2        11%
Unknown                                                7        37%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 12 October 2017.
All research outputs: #18,563,836 of 22,992,311 outputs
Outputs from BMC Medical Genomics: #867 of 1,230 outputs
Outputs of similar age: #305,601 of 416,750 outputs
Outputs of similar age from BMC Medical Genomics: #8 of 11 outputs
Altmetric has tracked 22,992,311 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 1,230 research outputs from this source. They receive a mean Attention Score of 4.8. This one is in the 17th percentile – i.e., 17% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 416,750 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 15th percentile – i.e., 15% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 11 others from the same source and published within six weeks on either side of this one. This one is in the 18th percentile – i.e., 18% of its contemporaries scored the same or lower than it.
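The "scored the same or lower" wording used for the percentiles above can be read as a simple share of peers. The sketch below assumes that reading. The page does not spell out how Altmetric actually computes the figure, and the peer scores here are invented for illustration. The function name `percentile_same_or_lower` is hypothetical.

```python
def percentile_same_or_lower(score, peer_scores):
    """Percentage of peers whose score is the same as or lower than `score`."""
    if not peer_scores:
        raise ValueError("need at least one peer score")
    same_or_lower = sum(1 for s in peer_scores if s <= score)
    return 100.0 * same_or_lower / len(peer_scores)


# Made-up Attention Scores for ten peer outputs (not Altmetric data).
peers = [0.25, 0.5, 1, 1, 3, 4.8, 7, 12, 20, 35]
pct = percentile_same_or_lower(1, peers)
```

Because many outputs can share the same low score, how ties are counted changes the result, so a percentile of this kind cannot be recovered from the rank alone.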