Fix division by 0 issue when all scores are zero #24

Open
AlanKerstjens wants to merge 1 commit into base: master

In graph_ga/goal_directed_generation.py the mating pool is selected through weighted random sampling, with the probability of a molecule being selected given by its score divided by the sum of all scores (sum_scores).

def make_mating_pool(population_mol: List[Mol], population_scores, offspring_size: int):
    """
    Given a population of RDKit Mol and their scores, sample a list of the same size
    with replacement using the population_scores as weights

    Args:
        population_mol: list of RDKit Mol
        population_scores: list of un-normalised scores given by ScoringFunction
        offspring_size: number of molecules to return

    Returns: a list of RDKit Mol (probably not unique)
    """
    # scores -> probs
    sum_scores = sum(population_scores)
    population_probs = [p / sum_scores for p in population_scores]
    mating_pool = np.random.choice(population_mol, p=population_probs, size=offspring_size, replace=True)
    return mating_pool

If sum_scores == 0.0, the normalisation divides by zero. This happens in practice: in the Valsartan SMARTS benchmark, which is binary in nature, a population of random starting molecules can all score zero.
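
For illustration, one possible guard is sketched below. It is not necessarily what the commit in this pull request does. It assumes that falling back to uniform sampling is acceptable when no molecule has a positive score, and that the population is non-empty. The name make_mating_pool_safe is hypothetical, and the function samples indices so that it works on any sequence without needing RDKit.

```python
from typing import List, Sequence, TypeVar

import numpy as np

T = TypeVar("T")


def make_mating_pool_safe(population: Sequence[T],
                          population_scores: Sequence[float],
                          offspring_size: int) -> List[T]:
    """Hypothetical variant of make_mating_pool that tolerates all-zero scores."""
    scores = np.asarray(population_scores, dtype=float)
    sum_scores = scores.sum()

    if sum_scores > 0:
        # Usual case: scores -> probabilities
        probs = scores / sum_scores
    else:
        # Assumed fallback: every molecule is equally likely to be picked
        probs = np.full(len(scores), 1.0 / len(scores))

    # Sample indices rather than the objects themselves
    indices = np.random.choice(len(population), p=probs,
                               size=offspring_size, replace=True)
    return [population[i] for i in indices]
```

With this guard, a call such as make_mating_pool_safe(mols, [0.0, 0.0, 0.0], 5) returns five molecules drawn uniformly instead of failing. Whether uniform sampling is the right behaviour for the genetic algorithm is a design choice for the maintainers.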
