
Positive-unlabeled learning for the prediction of conformational B-cell epitopes

Overview of attention for article published in BMC Bioinformatics, December 2015

Citations: 24 (Dimensions)

Readers: 55 (Mendeley)
Title: Positive-unlabeled learning for the prediction of conformational B-cell epitopes
Published in: BMC Bioinformatics, December 2015
DOI: 10.1186/1471-2105-16-s18-s12
Pubmed ID
Authors: Jing Ren, Qian Liu, John Ellis, Jinyan Li

Abstract

Incomplete ground truth in the training data for B-cell epitopes is a challenging problem in computational epitope prediction. Only a small fraction of the surface residues of an antigen are confirmed as antigenic residues (positive training data); the remaining residues are unlabeled. Because some of these unlabeled residues may group together to form novel but currently unknown epitopes, it is misguided to classify all unlabeled residues uniformly as negative training data, as the traditional supervised learning scheme does. We propose a positive-unlabeled learning algorithm to address this problem. The key idea is to distinguish between epitope-likely residues and reliable negative residues in the unlabeled data. The method has two steps: (1) identify reliable negative residues using a weighted SVM with high recall; and (2) construct a classification model on the positive residues and the reliable negative residues. Complex-based 10-fold cross-validation showed that this method outperforms the commonly used predictors DiscoTope 2.0, ElliPro and SEPPA 2.0 in every aspect. We conducted four case studies, in which the approach was tested on antigens of West Nile virus, dihydrofolate reductase, beta-lactamase, and two Ebola antigens whose epitopes are currently unknown. All results were assessed on a newly established data set of antigen structures not bound by antibodies, rather than on antibody-bound antigen structures. Bound structures may contain unfair binding information, such as bound-state B-factors and protrusion index, which could exaggerate epitope prediction performance. Source code is available on request.
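
The two-step procedure described in the abstract could look roughly like the sketch below. It is not the authors' code, which the abstract says is available on request. A small linear SVM trained by subgradient descent stands in for their weighted SVM, and the toy feature vectors, the `pos_weight` value, the zero decision threshold, and all function names are assumptions chosen for illustration.

```python
import random


def train_weighted_svm(X, y, w_pos=1.0, w_neg=1.0, lam=0.01,
                       epochs=200, lr=0.1, seed=0):
    """Linear SVM trained by subgradient descent on a class-weighted
    hinge loss. Labels in y are +1 or -1. (Illustrative stand-in only.)"""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            margin = y[i] * score
            cost = w_pos if y[i] > 0 else w_neg
            # L2 regularization shrinks the weights a little every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Margin violated: move toward classifying X[i] correctly,
                # scaled by the class weight.
                w = [wj + lr * cost * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * cost * y[i]
    return w, b


def decision(model, x):
    """Signed score of x under a (weights, bias) model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def pu_two_step(positives, unlabeled, pos_weight=10.0):
    """Two-step positive-unlabeled learning sketch.

    Step 1: treat every unlabeled point as negative, but penalize errors on
    positives more heavily (pos_weight is an assumed value) so the model
    favors recall on positives. Unlabeled points it still scores below
    zero are kept as reliable negatives.
    Step 2: train a classifier on positives vs. reliable negatives only.
    """
    X1 = positives + unlabeled
    y1 = [1] * len(positives) + [-1] * len(unlabeled)
    step1 = train_weighted_svm(X1, y1, w_pos=pos_weight, w_neg=1.0)
    reliable_neg = [x for x in unlabeled if decision(step1, x) < 0.0]

    X2 = positives + reliable_neg
    y2 = [1] * len(positives) + [-1] * len(reliable_neg)
    final_model = train_weighted_svm(X2, y2)
    return final_model, reliable_neg


# Toy 2-D "residue features" (entirely made up for illustration).
positives = [[2.0, 2.0], [2.5, 1.5], [1.5, 2.5]]
unlabeled = [[2.2, 1.8],                      # plausibly an unknown epitope residue
             [-2.0, -2.0], [-2.5, -1.5], [-1.5, -2.5], [-1.0, -2.0]]
model, reliable_negatives = pu_two_step(positives, unlabeled)
```

The point of the first step is that unlabeled residues resembling known epitope residues are left out of the negative set rather than being forced into it, so the second model is not trained to reject them.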

Mendeley readers

The data shown below were compiled from readership statistics for 55 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 55 100%

Demographic breakdown

Readers by professional status Count As %
Researcher 13 24%
Student > Ph. D. Student 10 18%
Student > Bachelor 9 16%
Other 5 9%
Librarian 3 5%
Other 9 16%
Unknown 6 11%
Readers by discipline Count As %
Computer Science 10 18%
Agricultural and Biological Sciences 8 15%
Biochemistry, Genetics and Molecular Biology 6 11%
Medicine and Dentistry 6 11%
Immunology and Microbiology 4 7%
Other 11 20%
Unknown 10 18%