Title | Machine learning classifiers provide insight into the relationship between microbial communities and bacterial vaginosis |
---|---|
Published in | BioData Mining, August 2015 |
DOI | 10.1186/s13040-015-0055-3 |
Pubmed ID | |
Authors | Daniel Beck, James A. Foster |
Abstract | Bacterial vaginosis (BV) is a disease associated with the vaginal microbiome. It is highly prevalent and is characterized by symptoms including odor, discharge and irritation. No single microbe has been found to cause BV. In this paper we use random forests and logistic regression classifiers to model the relationship between the microbial community and BV. We use subsets of the microbial community features to determine which features are important to the classification models. We find that models generated using logistic regression and random forests perform nearly identically and identify largely similar important features. Only a few features are necessary to obtain high BV classification accuracy. Additionally, there appears to be substantial redundancy between the microbial community features. These results are in contrast to a previous study in which the important features identified by the classifiers were dissimilar. This difference appears to be the result of using different feature importance measures. It is not clear whether machine learning classifiers are capturing patterns different from simple correlations. |
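
The abstract's point that a few features can suffice, and that features can be redundant, may be easier to see in a small example. The sketch below is not the authors' pipeline: it uses synthetic data, a hand-written logistic regression rather than any particular library, and omits random forests entirely. All names (`train_logistic`, `make_synthetic`, `select`, `results`) and the data-generating assumptions are hypothetical and chosen only for illustration.

```python
import math
import random

def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def accuracy(w, b, X, y):
    """Fraction of samples whose predicted class matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        z = b + sum(wj * xj for wj, xj in zip(w, xi))
        correct += int((z > 0) == (yi == 1))
    return correct / len(X)

def make_synthetic(n, rng):
    """Assumed toy data: feature 0 tracks the label, feature 1 is a
    noisy copy of feature 0 (redundant), feature 2 is pure noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = rng.gauss(2.0 if label else -2.0, 1.0)
        X.append([signal, signal + rng.gauss(0.0, 0.5), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

def select(X, cols):
    """Keep only the listed feature columns."""
    return [[row[c] for c in cols] for row in X]

rng = random.Random(0)  # fixed seed so runs are repeatable
X_train, y_train = make_synthetic(300, rng)
X_test, y_test = make_synthetic(300, rng)

# Train one model per feature subset and compare held-out accuracy.
results = {}
for cols in ([0, 1, 2], [0], [1], [2]):
    w, b = train_logistic(select(X_train, cols), y_train)
    results[tuple(cols)] = accuracy(w, b, select(X_test, cols), y_test)
    print(cols, round(results[tuple(cols)], 3))
```

Under these assumptions, either of the two correlated features alone should classify about as well as the full set, while the noise feature alone should sit near chance. This mirrors the kind of subset comparison the abstract describes, without reproducing the paper's actual data or importance measures.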
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 3 | 33% |
Unknown | 6 | 67% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 7 | 78% |
Practitioners (doctors, other healthcare professionals) | 1 | 11% |
Scientists | 1 | 11% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 1 | 2% |
Slovenia | 1 | 2% |
Unknown | 43 | 96% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 12 | 27% |
Student > Master | 6 | 13% |
Student > Ph. D. Student | 6 | 13% |
Student > Bachelor | 4 | 9% |
Student > Postgraduate | 2 | 4% |
Other | 5 | 11% |
Unknown | 10 | 22% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 9 | 20% |
Biochemistry, Genetics and Molecular Biology | 7 | 16% |
Computer Science | 4 | 9% |
Medicine and Dentistry | 4 | 9% |
Immunology and Microbiology | 4 | 9% |
Other | 7 | 16% |
Unknown | 10 | 22% |