Diffleop

Summary

Diffleop is a 3D pocket-aware, affinity-guided diffusion model that performs scaffold decoration, fragment linking, and scaffold hopping within a unified framework, optimizing molecules for enhanced binding affinity.

[Figure: Diffleop architecture]

Install the conda environment from the YAML file

conda env create -f environment.yaml
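After the environment is created and activated, a quick check that the key packages resolve can save a failed training run. The sketch below assumes that environment.yaml installs PyTorch (`torch`) and RDKit (`rdkit`); this is inferred from the commands and code in this README rather than confirmed, so adjust the list as needed. The helper name `check_packages` is hypothetical.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a mapping of package name -> whether it can be found for import."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed package list; edit it to match environment.yaml.
    for name, ok in check_packages(["torch", "rdkit"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```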

Datasets

We constructed a demo dataset of 500 data points from the CrossDocked dataset for each of the two tasks, scaffold decoration and linker design.

Training

To train a model for the scaffold decoration task, run:

python -W ignore scripts/train.py configs/training_dec.yml --device cuda:0 --type dec

To train a model for the linker design task, run:

python -W ignore scripts/train.py configs/training_linker.yml --device cuda:0 --type linker
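The two training commands differ only in the config file and the `--type` value. As a rough sketch, a small helper can assemble either command so both tasks can be launched from one script. The names `build_train_command` and `train` are hypothetical and not part of the repository; the config paths and flags are copied from the commands above.

```python
import subprocess
import sys

# Config files as named in the training commands above.
CONFIGS = {
    "dec": "configs/training_dec.yml",
    "linker": "configs/training_linker.yml",
}


def build_train_command(task, device="cuda:0"):
    """Assemble the training command for 'dec' or 'linker' (hypothetical helper)."""
    if task not in CONFIGS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(CONFIGS)}")
    return [sys.executable, "-W", "ignore", "scripts/train.py",
            CONFIGS[task], "--device", device, "--type", task]


def train(task, device="cuda:0"):
    """Run training from the repository root; raises if the script exits with an error."""
    subprocess.run(build_train_command(task, device), check=True)
```

Calling `train("dec")` followed by `train("linker")` from the repository root would run the two tasks one after the other.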

Sampling

You can sample molecules for each input scaffold or set of fragments together with its protein pocket; adjust the corresponding parameters in the config file as needed. You can also download the model checkpoint file from this link and save it into ckpt/. Then run:

python -W ignore scripts/sample.py configs/sampling_dec.yml -i 1 --device cuda:0 --type dec

You will get .sdf files of the generated molecules in the directory outputs/sampling/sdf.
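To confirm that sampling produced output, it can help to count the molecules written to each file. The sketch below uses only the standard library and relies on the SDF convention that every record ends with a `$$$$` line. The helper name `count_sdf_records` is hypothetical, and the default directory is the one named above.

```python
from pathlib import Path


def count_sdf_records(sdf_dir="outputs/sampling/sdf"):
    """Map each .sdf file in sdf_dir to its number of '$$$$'-terminated records."""
    counts = {}
    for path in sorted(Path(sdf_dir).glob("*.sdf")):
        with path.open() as handle:
            counts[path.name] = sum(1 for line in handle if line.strip() == "$$$$")
    return counts


if __name__ == "__main__":
    for name, n in count_sdf_records().items():
        print(f"{name}: {n} molecule(s)")
```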

Evaluation

Before calculating the binding affinities between the generated molecules and the proteins, clone the TankBind repository and place ./scripts/evaluation/cal_affinity.py into TankBind/examples. After sampling molecules, run the evaluation script from that directory:

python cal_affinity.py --dataset_dir /path/to/dataset_dir --samples_dir ./outputs/sampling/sdf

Sampling for a specific protein pocket and specific fragment(s)

To generate molecules for your own pocket and fragment(s), you need to provide the PDB structure file of the protein pocket, the SDF file of the fragment(s), and the index of the anchor of the fragment(s). For example:

python scripts/sample_for_case.py configs/sampling_dec_for_case.yml -i 1 --device cuda:0 --type dec --protein_filename ./data/case/dec/2fjp_A.pdb --ligand_filename ./data/case/dec/retain.sdf --anchor_id_given_1 4

You can use the following code to visualize the atom indices of the molecule.

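from rdkit import Chem
from rdkit.Chem import Draw  # Draw must be imported explicitly

# Load the first molecule from the SDF file
mol = Chem.SDMolSupplier('/path/to/example.sdf')[0]
# Drop the 3D coordinates so the molecule is drawn as a 2D depiction
mol.RemoveAllConformers()
# Label each atom with its index
for atom in mol.GetAtoms():
    atom.SetProp('molAtomMapNumber', str(atom.GetIdx()))
Draw.MolToImage(mol, size=(500, 500))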
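An out-of-range anchor index is an easy mistake to make when preparing a custom case. The sketch below reads the atom count from the counts line of a V2000 molfile (the fourth line of the record, first three characters) and checks a candidate index against it, using only the standard library. It assumes the fragment file is in V2000 format and that the anchor index is 0-based, like the RDKit `GetIdx()` values shown by the visualization code; neither is confirmed by this README. Both helper names are hypothetical.

```python
def v2000_atom_count(sdf_text):
    """Read the atom count from the counts line of the first V2000 record."""
    lines = sdf_text.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("expected a V2000 molfile counts line on line 4")
    # The first three characters of the counts line hold the number of atoms.
    return int(lines[3][:3])


def anchor_in_range(sdf_text, anchor_id):
    """Check that an (assumed 0-based) anchor index refers to an existing atom."""
    return 0 <= anchor_id < v2000_atom_count(sdf_text)
```

For example, reading the fragment file with `open(path).read()` and calling `anchor_in_range(text, 4)` would check the anchor index before it is passed to `--anchor_id_given_1`.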