
Variable selection for disease progression models: methods for oncogenetic trees and application to cancer and HIV

Overview of attention for article published in BMC Bioinformatics, August 2017

Mentioned by
2 X users

Citations
3 Dimensions

Readers on
21 Mendeley
Title
Variable selection for disease progression models: methods for oncogenetic trees and application to cancer and HIV
Published in
BMC Bioinformatics, August 2017
DOI 10.1186/s12859-017-1762-1
Pubmed ID
Authors

Katrin Hainke, Sebastian Szugat, Roland Fried, Jörg Rahnenführer

Abstract

Disease progression models are important for understanding the critical steps during the development of diseases. The models are embedded in a statistical framework to deal with random variations due to biology and the sampling process when observing only a finite population. Conditional probabilities are used to describe dependencies between events that characterise the critical steps in the disease process. Many different model classes have been proposed in the literature, from simple path models to complex Bayesian networks. Oncogenetic trees are a popular model class that is easy to understand yet flexible. They have been applied to describe the accumulation of genetic aberrations in cancer and HIV data. However, the number of potentially relevant aberrations is often far larger than the maximal number of events that can be used for reliably estimating the progression models. Still, there are only a few approaches to variable selection, and these have not yet been investigated in detail. We fill this gap and propose ten variable selection methods specifically for oncogenetic trees, some of which are completely new. We compare them in an extensive simulation study and on real data from cancer and HIV. It turns out that the preselection of events by clique identification algorithms performs best. Here, events are selected if they belong to the largest or the maximum weight subgraph in which all pairs of vertices are connected. The variable selection method of identifying cliques finds both the important frequent events and those related to disease pathways.
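
The clique-based preselection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the event graph or the vertex weights are built, so the graph `adj`, the `weights`, and the function names below are hypothetical. The sketch assumes an undirected graph whose vertices are events, positive vertex weights, and a basic Bron-Kerbosch enumeration (without pivoting) to list maximal cliques.

```python
from typing import Dict, FrozenSet, Iterator, Set

Graph = Dict[str, Set[str]]  # event -> set of neighbouring events


def maximal_cliques(adj: Graph) -> Iterator[FrozenSet[str]]:
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting)."""

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> Iterator[FrozenSet[str]]:
        # r: current clique, p: candidates to extend it, x: already handled
        if not p and not x:
            yield frozenset(r)
            return
        for v in sorted(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def select_largest_clique(adj: Graph) -> FrozenSet[str]:
    """Select the events in a clique with the most vertices."""
    return max(maximal_cliques(adj), key=len)


def select_max_weight_clique(adj: Graph, weights: Dict[str, float]) -> FrozenSet[str]:
    """Select the events in a clique with the largest total vertex weight.

    With positive weights, a maximum weight clique is always maximal,
    so searching the maximal cliques is sufficient.
    """
    return max(maximal_cliques(adj), key=lambda c: sum(weights[v] for v in c))


# Hypothetical toy graph: A, B, C form a triangle; C-D and D-E are edges.
adj: Graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
# Hypothetical weights, e.g. some measure of event importance.
weights = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 5.0, "E": 5.0}

largest = select_largest_clique(adj)
heaviest = select_max_weight_clique(adj, weights)
```

In this toy graph the two criteria select different events: the largest clique is the triangle A, B, C, while the maximum weight clique is the pair D, E. Exhaustive enumeration is exponential in the worst case, so this approach is only practical for small event graphs.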

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 21 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    21       100%

Demographic breakdown

Readers by professional status    Count    As %
Researcher                        5        24%
Student > Master                  3        14%
Other                             2        10%
Student > Doctoral Student        1        5%
Lecturer > Senior Lecturer        1        5%
Other                             1        5%
Unknown                           8        38%

Readers by discipline                                  Count    As %
Pharmacology, Toxicology and Pharmaceutical Science    2        10%
Engineering                                            2        10%
Neuroscience                                           2        10%
Biochemistry, Genetics and Molecular Biology           1        5%
Psychology                                             1        5%
Other                                                  3        14%
Unknown                                                10       48%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 02 August 2017.
All research outputs: #18,566,650 of 22,996,001 outputs
Outputs from BMC Bioinformatics: #6,347 of 7,310 outputs
Outputs of similar age: #243,077 of 317,441 outputs
Outputs of similar age from BMC Bioinformatics: #79 of 95 outputs
Altmetric has tracked 22,996,001 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,310 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 5th percentile – i.e., 5% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 317,441 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 12th percentile – i.e., 12% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 95 others from the same source and published within six weeks on either side of this one. This one is in the 10th percentile – i.e., 10% of its contemporaries scored the same or lower than it.