Replies: 2 comments
-
Hi, Sorry for the late reply. You can find an example script for calculating the bond difference between two molecules with the attached example script here: bond_difference.txt. To run the example script, please use the following command: conda create -n msentropy
conda activate msentropy
conda install -c conda-forge python=3.9 rdkit=2020.03.6
python bond_difference.txt Please be aware that the script only works with the rdkit version 2020.03.6. Using a different version may result in exceptions for some molecules. If you have any questions, feel free to let me know. The example scripts: from rdkit import Chem
from rdkit.Chem import rdFMCS
def __remove_bonds_in_smarts(mol, smarts):
removed_bond_number = 0
sub_atom = mol.GetSubstructMatch(Chem.MolFromSmarts(smarts))
# print(sub_atom)
for i in sub_atom:
all_bonds = mol.GetAtomWithIdx(i).GetBonds()
for bond in all_bonds:
atom_1, atom_2 = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
mol.RemoveBond(atom_1, atom_2)
if atom_1 in sub_atom and atom_2 in sub_atom:
removed_bond_number += 1
return mol, removed_bond_number
def calculate_similarity(mol1, mol2):
mol1 = Chem.rdchem.RWMol(mol1)
mol2 = Chem.rdchem.RWMol(mol2)
bond_number_common = 0
bond_number_mol1 = len(mol1.GetBonds())
bond_number_mol2 = len(mol2.GetBonds())
while True:
# print(Chem.MolToSmiles(mol1))
# print(Chem.MolToSmiles(mol2))
res = rdFMCS.FindMCS(
[mol1, mol2],
timeout=60,
threshold=1,
ringMatchesRingOnly=True,
completeRingsOnly=True,
atomCompare=rdFMCS.AtomCompare.CompareElements,
bondCompare=rdFMCS.BondCompare.CompareOrderExact,
ringCompare=rdFMCS.RingCompare.StrictRingFusion,
maximizeBonds=True,
matchValences=True,
)
if res.numBonds == 0:
break
common_s = res.smartsString
# print(common_s)
mol1, _ = __remove_bonds_in_smarts(mol1, common_s)
mol2, _ = __remove_bonds_in_smarts(mol2, common_s)
bond_number_common += res.numBonds
# print(bond_number_common)
return {
"mol1_bond_number": bond_number_mol1,
"mol2_bond_number": bond_number_mol2,
"common_bond_number": bond_number_common,
"bond_difference": (bond_number_mol1 + bond_number_mol2) / 2 - bond_number_common,
"bond_similarity": (2 * bond_number_common) / (bond_number_mol1 + bond_number_mol2),
}
if __name__ == "__main__":
import pprint
s1 = "InChI=1S/C10H7NO2/c12-10(13)9-6-5-7-3-1-2-4-8(7)11-9/h1-6H,(H,12,13)"
s2 = "InChI=1S/C10H7NO2/c12-10(13)9-5-7-3-1-2-4-8(7)6-11-9/h1-6H,(H,12,13)"
mol_a = Chem.MolFromInchi(s1)
mol_b = Chem.MolFromInchi(s2)
result = calculate_similarity(mol_a, mol_b)
print(f"Molecule 1: {Chem.MolToSmiles(mol_a)}")
print(f"Molecule 2: {Chem.MolToSmiles(mol_b)}")
pprint.pprint(result)
print()
s1 = "CC(CCC)c1ccccc1"
s2 = "CCC(CC)c1ccccc1"
mol_a = Chem.rdchem.RWMol(Chem.MolFromSmiles(s1))
mol_b = Chem.rdchem.RWMol(Chem.MolFromSmiles(s2))
result = calculate_similarity(mol_a, mol_b)
print(f"Molecule 1: {Chem.MolToSmiles(mol_a)}")
print(f"Molecule 2: {Chem.MolToSmiles(mol_b)}")
pprint.pprint(result) The output looks like:
|
Beta Was this translation helpful? Give feedback.
0 replies
-
Thank you for the detailed scripts. It works in my computer. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Thank you for the extensive investigation of 43 similarity scores leading to the conclusion of Spectral entropy outperforms MS/MS dot product similarity. I noticed the one bond difference is defined to classify true or false identification. I am wondering which tools are used for calculating the bond difference between molecules for Extended Data Fig. 7.
Beta Was this translation helpful? Give feedback.
All reactions