DeepSA: A Deep-learning Driven Predictor of Compound Synthesis Accessibility

As artificial intelligence advances, deep generative models are increasingly used for molecule generation. However, most new molecules produced by these models face considerable challenges in synthetic accessibility.

DeepSA predicts the synthesis accessibility of compounds and achieves a much higher early enrichment rate in identifying molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time needed for drug discovery and development. DeepSA is also available as a web server at https://bailab.siais.shanghaitech.edu.cn/deepsa

Requirements

Python == 3.8.13
scikit-learn == 1.0.2
pandas == 1.4.2
numpy == 1.21.6
matplotlib == 3.2.2

Dependencies can be installed using the following commands:

    conda create -n DeepSA python=3.8.13
    conda activate DeepSA

    pip3 install --upgrade pip==24.0
    # for cpu version
    pip3 install torch==1.12+cpu torchvision==0.13.0+cpu torchtext==0.13.0 -f https://download.pytorch.org/whl/cpu/torch_stable.html
    # for gpu version
    # pip3 install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchtext==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu113
    pip3 install autogluon==0.5.2
    pip3 install rdkit
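
One way to confirm that the pinned packages were actually installed is to compare the installed versions against the versions listed in this section. The snippet below is only a minimal sketch and is not part of DeepSA: the helper name `check_versions` is hypothetical, and the PyTorch packages are left out because their version strings carry a `+cpu` or `+cu113` suffix.

```python
from importlib import metadata

# Versions pinned in the Requirements section of this README.
PINNED = {
    "scikit-learn": "1.0.2",
    "pandas": "1.4.2",
    "numpy": "1.21.6",
    "matplotlib": "3.2.2",
    "autogluon": "0.5.2",
}


def check_versions(pinned):
    """Return {package: (expected, installed)} for each missing or mismatched package.

    `installed` is None when the package is not installed at all.
    (Hypothetical helper for illustration; not part of the DeepSA code base.)
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

If the script prints nothing, every pinned package was found at the expected version.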

Note 🔔

AutoGluon stopped supporting Python 3.8 in October. If you recently tried to configure the DeepSA environment, there is a high probability that the full AutoGluon package was not installed properly, which will prevent the program from running.

We recommend creating the environment with Python 3.9 or above and reconfiguring it.

The author is currently very busy, but we will update DeepSA as soon as possible to fix this issue. Thanks for your interest in DeepSA!

Data

The expanded training and test datasets can be downloaded from https://drive.google.com/drive/folders/1iup6T3Bqyy-uvpdFyP0Of_WQqn-9l62h?usp=sharing

Usage For Researchers

To train your own model, run the following from the command line:

    python DeepSA_training.py <dataset.csv/training.csv:test.csv> DeepSA ./data/test_set.list

To use the model we propose, run:

    python DeepSA.py <input_data.csv> DeepSA
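
For readers who start from a list of SMILES strings rather than a file, the sketch below writes them to a CSV that could be passed as `<input_data.csv>`. It is an illustration only: the helper `write_smiles_csv` is hypothetical, and the single `smiles` header column is an assumption about the expected layout, so check it against the example files in `data` before relying on it.

```python
import csv


def write_smiles_csv(path, smiles_list, column="smiles"):
    """Write one SMILES string per row under a single header column.

    The header name defaults to "smiles", which is an assumption and may
    need to be changed to match the format DeepSA.py expects.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])
        for smi in smiles_list:
            writer.writerow([smi])
    return path


if __name__ == "__main__":
    # Ethanol, benzene and aspirin as placeholder inputs.
    write_smiles_csv("input_data.csv", ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"])
```

The resulting file can then be supplied to the command above in place of `<input_data.csv>`.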

Online Server

We deployed a pre-trained model on a dedicated server, publicly available at https://bailab.siais.shanghaitech.edu.cn/deepsa, so that biomedical researchers can easily use DeepSA in their work.

Users can upload SMILES or CSV files to the server and quickly obtain the predicted results.

Citation

If you find this repository useful in your research, please consider citing our paper:

Wang, S., Wang, L., Li, F. et al. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 15, 103 (2023). https://doi.org/10.1186/s13321-023-00771-3

Contact

If you have any questions, please feel free to contact Shihang Wang (Email: [email protected]) or Lin Wang (Email: [email protected]).

Pull requests are very welcome!

Acknowledgments

We are grateful for the support from the HPC Platform of ShanghaiTech University.
Thank you all for your attention to this work.
