Title | Similarity maps - a visualization strategy for molecular fingerprints and machine-learning methods |
---|---|
Published in | Journal of Cheminformatics, September 2013 |
DOI | 10.1186/1758-2946-5-43 |
Pubmed ID | |
Authors | Sereina Riniker, Gregory A Landrum |
Abstract | Fingerprint similarity is a common method for comparing chemical structures. Similarity is an appealing approach because, with many fingerprint types, it provides intuitive results: a chemist looking at two molecules can understand why they have been determined to be similar. This transparency is partially lost with the fuzzier similarity methods that are often used for scaffold hopping and tends to vanish completely when molecular fingerprints are used as inputs to machine-learning (ML) models. Here we present similarity maps, a straightforward and general strategy to visualize the atomic contributions to the similarity between two molecules or the predicted probability of a ML model. We show the application of similarity maps to a set of dopamine D3 receptor ligands using atom-pair and circular fingerprints as well as two popular ML methods: random forests and naïve Bayes. An open-source implementation of the method is provided. |
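
The "atomic contributions" mentioned in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes a leave-one-atom-out scheme, in which an atom's weight is the drop in similarity when the fingerprint features involving that atom are removed. The toy molecule encoding, the radius-1 environment fingerprint, and the names `fingerprint`, `dice` and `atomic_weights` are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical toy encoding: (atom symbols, bonds as pairs of atom indices).
Molecule = Tuple[List[str], List[Tuple[int, int]]]
Feature = Tuple[str, Tuple[str, ...]]


def neighbors(mol: Molecule, i: int) -> List[int]:
    """Indices of the atoms bonded to atom i."""
    _, bonds = mol
    return [b if a == i else a for a, b in bonds if i in (a, b)]


def fingerprint(mol: Molecule, skip_atom: Optional[int] = None) -> Set[Feature]:
    """Toy circular-style fingerprint: one feature per atom, made of the
    atom symbol plus the sorted symbols of its direct neighbours.
    Features whose environment contains `skip_atom` are left out."""
    atoms, _ = mol
    features: Set[Feature] = set()
    for i, symbol in enumerate(atoms):
        nbrs = neighbors(mol, i)
        if skip_atom is not None and (i == skip_atom or skip_atom in nbrs):
            continue
        features.add((symbol, tuple(sorted(atoms[j] for j in nbrs))))
    return features


def dice(a: Set[Feature], b: Set[Feature]) -> float:
    """Dice similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def atomic_weights(ref: Molecule, probe: Molecule) -> List[float]:
    """Assumed scheme: weight of probe atom i = similarity with the full
    probe fingerprint minus similarity with atom i's features removed."""
    ref_fp = fingerprint(ref)
    base = dice(ref_fp, fingerprint(probe))
    return [
        base - dice(ref_fp, fingerprint(probe, skip_atom=i))
        for i in range(len(probe[0]))
    ]


# Ethanol (C-C-O) as reference, ethylamine (C-C-N) as probe.
ethanol: Molecule = (["C", "C", "O"], [(0, 1), (1, 2)])
ethylamine: Molecule = (["C", "C", "N"], [(0, 1), (1, 2)])
weights = atomic_weights(ethanol, ethylamine)
for symbol, weight in zip(ethylamine[0], weights):
    print(f"{symbol}: {weight:+.3f}")
```

Under these assumptions, a positive weight marks an atom whose removal lowers the similarity to the reference, and a negative weight marks an atom whose removal raises it. A similarity map would colour the atoms of the probe molecule by these weights.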
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Sweden | 1 | 14% |
United States | 1 | 14% |
Spain | 1 | 14% |
Germany | 1 | 14% |
Unknown | 3 | 43% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 5 | 71% |
Scientists | 2 | 29% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Brazil | 5 | 2% |
Germany | 4 | 1% |
Netherlands | 2 | <1% |
Portugal | 2 | <1% |
United States | 2 | <1% |
Italy | 1 | <1% |
China | 1 | <1% |
Kenya | 1 | <1% |
Japan | 1 | <1% |
Other | 1 | <1% |
Unknown | 260 | 93% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 61 | 22% |
Researcher | 55 | 20% |
Student > Master | 34 | 12% |
Student > Bachelor | 20 | 7% |
Other | 14 | 5% |
Other | 42 | 15% |
Unknown | 54 | 19% |
Readers by discipline | Count | As % |
---|---|---|
Chemistry | 71 | 25% |
Agricultural and Biological Sciences | 32 | 11% |
Computer Science | 30 | 11% |
Biochemistry, Genetics and Molecular Biology | 26 | 9% |
Pharmacology, Toxicology and Pharmaceutical Science | 16 | 6% |
Other | 40 | 14% |
Unknown | 65 | 23% |