Propersea - Property Prediction

Propersea is an online resource that provides predictions of a range of molecular and physicochemical properties for small molecules. The predicted properties include melting point, boiling point, density, logP, solubility, polarizability and more. It also predicts the IUPAC name of the molecule.

Propersea can be searched through the PSDI Cross Data Search service using a SMILES string, an InChI (including the InChI= prefix), or a structure. Once the search is complete, the results of the predictions for that molecule are shown.
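As a rough illustration of how a SMILES string and an InChI can be told apart before a search, here is a minimal Python sketch. The function name classify_query is hypothetical and is not part of Propersea or the PSDI Cross Data Search service. The rule used (an InChI starts with "InChI=", or with a version marker such as "1S/" when the prefix has been dropped) is an assumption made for the example, as is the choice to add the prefix back when it is missing.

```python
def classify_query(query: str) -> tuple[str, str]:
    """Return (kind, normalised_text) for a search string.

    Heuristic only (an assumption for this sketch): text that starts with
    "InChI=" is an InChI; text that starts with an InChI version marker
    ("1S/" or "1/") is treated as an InChI with the prefix left off, and
    the prefix is added back. Anything else is treated as SMILES.
    """
    text = query.strip()
    if text.startswith("InChI="):
        return "inchi", text
    if text.startswith(("1S/", "1/")):
        # A valid SMILES string cannot begin with a digit, so this is safe.
        return "inchi", "InChI=" + text
    return "smiles", text


if __name__ == "__main__":
    for example in ["CCO", "InChI=1S/CH4/h1H4", "1S/CH4/h1H4"]:
        print(example, "->", classify_query(example))
```

The sketch does not validate the identifier; it only decides which kind of string it most likely is.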

Property prediction

The properties are predicted through a variety of algorithms, including:

RDKit algorithms
Semi-empirical quantum methods
Fragment/atom contribution calculations
Bayesian Additive Regression Trees
Transformer neural networks

The predicted value is returned in the results interface. For properties predicted using the Bayesian algorithms, the interface also returns a 95% confidence interval, along with a measure of how closely the molecule resembles the molecules in the training set. Where a property prediction is deemed nonsensical because of the predicted phase, the property may be omitted from the results.
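To show what a 95% interval around a prediction means in practice, the sketch below computes a central interval from a set of sampled predictions, such as the posterior draws a Bayesian regression model produces. This is an assumption about how such an interval could be derived, not a description of how Propersea calculates it; the function name central_interval and the stand-in draws are hypothetical.

```python
def central_interval(draws, level=0.95):
    """Return (lower, upper) bounds of the central interval of the draws.

    The bounds are the (1 - level) / 2 and 1 - (1 - level) / 2 quantiles,
    found by linear interpolation between the sorted values.
    """
    if len(draws) < 2:
        raise ValueError("need at least two draws")
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0

    def quantile(q):
        position = q * (len(ordered) - 1)
        lower_index = int(position)
        upper_index = min(lower_index + 1, len(ordered) - 1)
        fraction = position - lower_index
        return ordered[lower_index] + fraction * (
            ordered[upper_index] - ordered[lower_index]
        )

    return quantile(tail), quantile(1.0 - tail)


if __name__ == "__main__":
    # Stand-in draws (not real predictions): the values 0.0, 1.0, ..., 100.0.
    stand_in_draws = [float(i) for i in range(101)]
    low, high = central_interval(stand_in_draws)
    print(f"95% interval: {low:.2f} to {high:.2f}")
```

A narrow interval suggests the sampled predictions agree closely; a wide one suggests the prediction is less certain.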

Propersea performs best for organic compounds; performance on inorganics, organometallics and inorganic-organic mixtures is known to be lower.

IUPAC Name Prediction

Propersea also features a novel machine learning model for generating IUPAC names. It is a sequence-to-sequence model that predicts the IUPAC name from the molecule's InChI string. The model was trained on a dataset of 10 million compounds and tested on a dataset of 200,000 compounds, achieving an accuracy of 90.7% for complete matches to the IUPAC name. It performs extremely well on organic compounds, and also handles isomers and tautomers that are adequately described by the InChI.

However, the current model does not perform well on inorganics, organometallics, and inorganic-organic mixtures. This is likely due in part to the limitations of the InChI in describing these molecules, and in part to the quality and quantity of such molecules in the training dataset. Work is ongoing to improve the performance of the model in these areas.
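The complete-match accuracy reported for the test set counts a prediction as correct only when the generated name is identical to the reference name. A minimal sketch of that metric follows; the function name exact_match_accuracy and the four example names are illustrative and are not taken from the Propersea test data. At the reported rate, about 181,400 of the 200,000 test names would be exact matches.

```python
def exact_match_accuracy(predicted, reference):
    """Return the fraction of predicted names identical to the reference names."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one name is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


if __name__ == "__main__":
    # Illustrative names only; the last prediction is deliberately wrong.
    predicted = ["ethanol", "propan-2-one", "benzene", "2-methylpropane"]
    reference = ["ethanol", "propan-2-one", "benzene", "2-methylpropan-1-ol"]
    print(exact_match_accuracy(predicted, reference))
```

Because the comparison is all-or-nothing, a name that differs by a single locant or stereodescriptor counts as a miss.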

For more information about this model, see "Translating the molecules: adapting neural machine translation to predict IUPAC names from a chemical identifier".

About this page

Creator: Cerys Willoughby
Last modified date: 2025-10-17
Citation: Please cite: Cerys Willoughby, Propersea - Property Prediction, https://guidance.psdi.ac.uk/docusaurus-pages/docs/guidance/psdi-resources/propersea/, PSDI (modified 2025-05-02)
License: CC-BY-4.0

If you would like to contribute content to the PSDI Knowledge Base or have feedback you would like to give on this guidance, please contact us.
