Report for: The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS

Title	The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS
Published in	Journal of Cheminformatics, January 2016
DOI	10.1186/s13321-016-0113-y
Pubmed ID	26807157
Authors	Igor V. Tetko, Daniel M. Lowe, Antony J. Williams
Abstract	Melting point (MP) is an important property with regard to the solubility of chemical compounds. Its prediction from chemical structure remains a highly challenging task for quantitative structure-activity relationship studies. Success in this area of research critically depends on the availability of high quality MP data as well as accurate chemical structure representations in order to develop models. Currently, available datasets for MP predictions have been limited to around 50k molecules while far more data are routinely generated following the synthesis of novel materials. Significant amounts of MP data are freely available within the patent literature and, if they were available in the appropriate form, could potentially be used to develop predictive models. We have developed a pipeline for the automated extraction and annotation of chemical data from published PATENTS. Almost 300,000 data points have been collected and used to develop models to predict melting and pyrolysis (decomposition) points using tools available on the OCHEM modeling platform (http://ochem.eu). A number of technical challenges were simultaneously solved to develop models based on these data. These included the handling of sparse data matrices with >200,000,000,000 entries and parallel calculations using 32 × 6 cores per task using 13 descriptor sets totaling more than 700,000 descriptors. We showed that models developed using data collected from PATENTS had similar or better prediction accuracy compared to the highly curated data used in previous publications. The separation of data for chemicals that decomposed rather than melted, from compounds that did undergo a normal melting transition, was performed and models for both pyrolysis and MPs were developed. The accuracy of the consensus MP models for molecules from the drug-like region of chemical space was similar to their estimated experimental accuracy, 32 °C. Last but not least, important structural features related to the pyrolysis of chemicals were identified, and a model to predict whether a compound will decompose instead of melting was developed. We have shown that automated tools for the analysis of chemical information have reached a mature stage allowing for the extraction and collection of high quality data to enable the development of structure-activity relationship models. The developed models and data are publicly available at http://ochem.eu/article/99826.

X Demographics

The data shown below were collected from the profiles of 8 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United States	3	38%
Germany	1	13%
Canada	1	13%
Unknown	3	38%

Demographic breakdown

Type	Count	As %
Members of the public	6	75%
Scientists	2	25%

Mendeley readers

The data shown below were compiled from readership statistics for 78 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United States	1	1%
Bulgaria	1	1%
Germany	1	1%
Unknown	75	96%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	22	28%
Student > Ph. D. Student	18	23%
Student > Bachelor	9	12%
Other	4	5%
Student > Doctoral Student	3	4%
Other	12	15%
Unknown	10	13%

Readers by discipline	Count	As %
Chemistry	22	28%
Computer Science	10	13%
Pharmacology, Toxicology and Pharmaceutical Science	8	10%
Engineering	5	6%
Medicine and Dentistry	4	5%
Other	14	18%
Unknown	15	19%

Attention Score in Context

This research output has an Altmetric Attention Score of 21. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 November 2021.

Comparison group	Rank	Out of
All research outputs	#1,680,387	24,143,470 outputs
Outputs from Journal of Cheminformatics	#136	891 outputs
Outputs of similar age	#30,391	403,573 outputs
Outputs of similar age from Journal of Cheminformatics		16 outputs

Altmetric has tracked 24,143,470 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 93rd percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.

So far Altmetric has tracked 891 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 10.7. This one has done well, scoring higher than 84% of its peers.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 403,573 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 92% of its contemporaries.

We're also able to compare this research output to 16 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 87% of its contemporaries.
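The percentages quoted above can be checked against the ranks reported in this section. The sketch below is an illustration only: the helper name `percent_outscored` is hypothetical, and rounding down to a whole percent is an assumption, since the report does not state how Altmetric computes or rounds these figures. Under that assumption, the three ranks shown reproduce the 93%, 84% and 92% values in the text.

```python
def percent_outscored(rank: int, total: int) -> int:
    """Whole-number share of tracked outputs ranked below `rank` (rank 1 = best).

    Assumption: the share is rounded down; the report does not give the
    exact formula Altmetric uses.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (100 * (total - rank)) // total


# (rank, number of outputs) pairs as reported in this section.
reported_ranks = {
    "All research outputs": (1_680_387, 24_143_470),
    "Outputs from Journal of Cheminformatics": (136, 891),
    "Outputs of similar age": (30_391, 403_573),
}

for group, (rank, total) in reported_ranks.items():
    share = percent_outscored(rank, total)
    print(f"{group}: scores higher than {share}% of outputs")
```

The rank for outputs of similar age from Journal of Cheminformatics is not shown in this report, so the 87% figure for that group cannot be checked the same way.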
