Report for: GSNFS: Gene subnetwork biomarker identification of lung cancer expression data

Title	GSNFS: Gene subnetwork biomarker identification of lung cancer expression data
Published in	BMC Medical Genomics, December 2016
DOI	10.1186/s12920-016-0231-4
Pubmed ID	28117655
Authors	Narumol Doungpan, Worrawat Engchuan, Jonathan H. Chan, Asawin Meechai
Abstract	Gene expression has been used to identify disease gene biomarkers, but there are ongoing challenges. Single-gene or gene-set biomarkers are inadequate to provide sufficient understanding of complex disease mechanisms and the relationships among those genes. Network-based methods have thus been considered for inferring the interactions within a group of genes to further study the disease mechanism. Recently, the Gene-Network-based Feature Set (GNFS), which is capable of handling case-control and multiclass expression for gene biomarker identification, has been proposed, partly taking network topology into account. However, its performance relies on a greedy search for building subnetworks and thus requires further improvement. In this work, we establish a new approach named Gene Sub-Network-based Feature Selection (GSNFS) by implementing the GNFS framework with two proposed searching and scoring algorithms, namely gene-set-based (GS) search and parent-node-based (PN) search, to identify subnetworks. An additional dataset is used to validate the results. The two proposed searching algorithms of the GSNFS method for subnetwork expansion are concerned with the degree of connectivity and the scoring scheme for building subnetworks and their topology. In each iteration of expansion, the neighbour genes of the current subnetwork whose expression data improve the overall subnetwork score are recruited. While the GS search calculates the subnetwork score using an activity score of the current subnetwork and the gene expression values of its neighbours, the PN search uses the expression value of the corresponding parent of each neighbour gene. Four lung cancer expression datasets were used for subnetwork identification. In addition, the use of pathway data and protein-protein interaction data as network data, in order to consider the interactions among significant genes, is discussed.
Classification was performed to compare the performance of the identified gene subnetworks across three subnetwork identification algorithms. The two proposed searching algorithms resulted in better classification and gene/gene-set agreement than the original greedy search of the GNFS method. The lung cancer subnetworks identified using the proposed searching algorithms improved the cross-dataset validation and increased the consistency of findings between two independent datasets. The homogeneity of the datasets was measured to assess dataset compatibility in cross-dataset validation. The lung cancer dataset with higher homogeneity showed a better result with the GS search, while the dataset with low homogeneity showed a better result with the PN search. The 10-fold cross-dataset validation on the independent lung cancer datasets showed higher classification performance for the proposed algorithms than for the greedy search in the original GNFS method. The proposed searching algorithms provide a higher number of genes in the subnetwork expansion step than the greedy algorithm. As a result, the performance of the subnetworks identified by the GSNFS method was improved in terms of classification performance and gene/gene-set level agreement, depending on the homogeneity of the datasets used in the analysis. Some common genes obtained from the four datasets using different searching algorithms are genes known to play a role in lung cancer. The improvement in classification performance and gene/gene-set level agreement, together with the biological relevance, indicates the effectiveness of the GSNFS method for gene subnetwork identification using expression data.
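
The expansion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the scoring function, so the sketch assumes an absolute Welch-style t statistic on the per-sample mean expression of the subnetwork (its "activity"). The names `t_score`, `activity` and `expand`, and the toy genes A, B and C, are hypothetical. The loop loosely follows the GS-style idea of scoring the current subnetwork together with each neighbour; the PN variant is not shown.

```python
from statistics import mean, stdev


def t_score(values, labels):
    """Absolute Welch-style t statistic between class 1 and class 0 (assumed score)."""
    a = [v for v, y in zip(values, labels) if y == 1]
    b = [v for v, y in zip(values, labels) if y == 0]
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def activity(genes, expr):
    """Per-sample mean expression over the genes of a subnetwork (assumed activity)."""
    n_samples = len(next(iter(expr.values())))
    return [mean(expr[g][i] for g in genes) for i in range(n_samples)]


def expand(seed, graph, expr, labels, max_iter=10):
    """Grow a subnetwork from `seed`, recruiting neighbours that improve the score."""
    sub = {seed}
    best = t_score(activity(sub, expr), labels)
    for _ in range(max_iter):
        # Genes adjacent to the current subnetwork but not yet part of it.
        neighbours = {n for g in sub for n in graph.get(g, ())} - sub
        # Keep each neighbour that, added on its own, improves the current score.
        gains = [n for n in sorted(neighbours)
                 if t_score(activity(sub | {n}, expr), labels) > best]
        if not gains:
            break
        candidate = sub | set(gains)
        new_score = t_score(activity(candidate, expr), labels)
        if new_score <= best:
            break  # the joint addition did not help, so stop expanding
        sub, best = candidate, new_score
    return sub, best


# Toy data (hypothetical): three genes, six samples, first three samples are cases.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "A": [5.0, 6.0, 5.5, 1.0, 1.5, 1.2],
    "B": [6.0, 5.2, 5.5, 1.4, 1.0, 1.3],
    "C": [1.0, 5.0, 3.0, 5.0, 1.0, 3.0],
}
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

sub, score = expand("A", graph, expr, labels)
print(sorted(sub), round(score, 2))
```

In this toy setting, gene B separates the two classes and its noise partly cancels that of A, whereas gene C carries no class signal, so the choice of score decides which neighbours are recruited.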

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United States	1	50%
Unknown	1	50%

Demographic breakdown

Type	Count	As %
Members of the public	2	100%

Mendeley readers

The data shown below were compiled from readership statistics for 19 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Unknown	19	100%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	5	26%
Student > Master	4	21%
Student > Postgraduate	2	11%
Researcher	2	11%
Professor	1	5%
Other	1	5%
Unknown	4	21%

Readers by discipline	Count	As %
Computer Science	4	21%
Biochemistry, Genetics and Molecular Biology	3	16%
Pharmacology, Toxicology and Pharmaceutical Science	1	5%
Nursing and Health Professions	1	5%
Physics and Astronomy	1	5%
Other	2	11%
Unknown	7	37%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 12 October 2017.

Comparison group	Rank	Out of
All research outputs	#18,563,836	22,992,311 outputs
Outputs from BMC Medical Genomics	#867	1,230 outputs
Outputs of similar age	#305,601	416,750 outputs
Outputs of similar age from BMC Medical Genomics		11 outputs

Altmetric has tracked 22,992,311 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.

So far Altmetric has tracked 1,230 research outputs from this source. They receive a mean Attention Score of 4.8. This one is in the 17th percentile – i.e., 17% of its peers scored the same or lower than it.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 416,750 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 15th percentile – i.e., 15% of its contemporaries scored the same or lower than it.

We're also able to compare this research output to 11 others from the same source and published within six weeks on either side of this one. This one is in the 18th percentile – i.e., 18% of its contemporaries scored the same or lower than it.
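
The percentile wording used above ("scored the same or lower than it") can be read as a simple share of peers. The following Python sketch illustrates that reading only; it is not Altmetric's actual calculation. The function name `percentile_same_or_lower` and the list of peer scores are hypothetical.

```python
def percentile_same_or_lower(score, peer_scores):
    """Percentage of peers whose score is the same as or lower than `score`."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    same_or_lower = sum(1 for s in peer_scores if s <= score)
    return 100 * same_or_lower / len(peer_scores)


# Hypothetical Attention Scores for ten peer outputs (not taken from the report).
peers = [0, 1, 1, 2, 3, 5, 8, 12, 20, 40]
pct = percentile_same_or_lower(1, peers)
print(f"{pct:.0f}% of peers scored the same or lower")
```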
