Title | Gene selection and classification of microarray data using random forest |
---|---|
Published in | BMC Bioinformatics, January 2006 |
DOI | 10.1186/1471-2105-7-3 |
Pubmed ID | |
Authors | Ramón Díaz-Uriarte, Sara Alvarez de Andrés |
Abstract | Selection of relevant genes for sample classification is a common task in most gene expression studies, where researchers try to identify the smallest possible set of genes that can still achieve good predictive performance (for instance, for future use with diagnostic purposes in clinical practice). Many gene selection approaches use univariate (gene-by-gene) rankings of gene relevance and arbitrary thresholds to select the number of genes, can only be applied to two-class problems, and use gene selection ranking criteria unrelated to the classification algorithm. In contrast, random forest is a classification algorithm well suited for microarray data: it shows excellent performance even when most predictive variables are noise, can be used when the number of variables is much larger than the number of observations and in problems involving more than two classes, and returns measures of variable importance. Thus, it is important to understand the performance of random forest with microarray data and its possible use for gene selection. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
India | 1 | 50% |
Unknown | 1 | 50% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 2 | 100% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 24 | 2% |
United Kingdom | 8 | <1% |
India | 7 | <1% |
Brazil | 7 | <1% |
Australia | 7 | <1% |
Spain | 6 | <1% |
Canada | 6 | <1% |
Germany | 5 | <1% |
Italy | 4 | <1% |
Other | 31 | 2% |
Unknown | 1400 | 93% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 392 | 26% |
Researcher | 231 | 15% |
Student > Master | 215 | 14% |
Student > Bachelor | 103 | 7% |
Student > Doctoral Student | 76 | 5% |
Other | 235 | 16% |
Unknown | 253 | 17% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 276 | 18% |
Agricultural and Biological Sciences | 255 | 17% |
Engineering | 124 | 8% |
Biochemistry, Genetics and Molecular Biology | 100 | 7% |
Environmental Science | 88 | 6% |
Other | 331 | 22% |
Unknown | 331 | 22% |