Brainstorming ideas for speeding up YANK #582

Open
5 of 19 tasks
jchodera opened this issue Nov 25, 2016 · 4 comments

Comments

@jchodera
Member

jchodera commented Nov 25, 2016

I'm just collecting some ideas to consider after the 1.0 release.

Overall optimization

  • Optimize code based on profiling
  • Non-cubic/rectangular boxes (truncated octahedron, dodecahedron)
  • Tune nonbonded cutoffs to balance performance and overlap
  • Optimize PME parameters to maximize performance given error threshold
  • Use crude PME tolerances in intermediate states
  • Optimize alchemical pathway to minimize number of thermodynamic states
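
As a starting point for the profiling item above, here is a minimal sketch using the standard library's cProfile and pstats. The function `fake_iteration` is a hypothetical stand-in for whatever call is being measured; it is not part of YANK.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call trees come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def fake_iteration(n_steps):
    # Hypothetical stand-in for one sampler iteration.
    return sum(i * i for i in range(n_steps))


result, report = profile_call(fake_iteration, 10_000)
print(report)
```

Sorting by cumulative time tends to surface whole phases (propagation, energy computation, writing) rather than individual hot functions, which matches the way the ideas in this issue are grouped.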

Startup time

  • Use FIRE minimizer for minimizations instead of very slow LocalEnergyMinimizer
  • Pre-equilibrate fully-interacting system

Propagation

  • Adjust number of steps per alchemical state to minimize discrepancy between end times for propagation steps
  • Use better integrators that permit larger timestep, such as g-BAOAB and other multiple-timestep algorithms that make use of hydrogen mass repartitioning, solute/solvent splitting, force splitting, etc. This may give a 2-6x speedup of propagation.
  • Speed up ligand Monte Carlo using a CustomIntegrator or applying it only to weakly interacting replicas
  • Use distributed computing framework that allows multiple YANK jobs to share same set of worker nodes and interleave free energy calculations
  • Allow Monte Carlo rotation/translation to be disabled for Boresch restraints, since none will be accepted. (Perhaps ring flips could be used instead?) This would save ~16 s of 124 s per iteration for Abl:imatinib (~13%).
  • Consolidate sanity checks to shave off some time from _propagate_replicas.
  • Have OpenMM use pairlists for interaction groups (see openmm/openmm#1765, "Efficiency considerations in CustomNonbondedForce.addInteractionGroup?")
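
The first propagation idea (adjusting steps per alchemical state so replicas finish at about the same time) could be sketched roughly as follows. This is only an illustration under stated assumptions: `balanced_steps` is a hypothetical helper, the per-step timings are made-up numbers, and whether unequal step counts are acceptable for the sampling scheme is a separate question not addressed here.

```python
def balanced_steps(seconds_per_step, base_steps):
    """Pick a step count per state so every state takes about the same wall time.

    The slowest state runs base_steps; faster states get proportionally more
    steps so that they end at roughly the same moment.
    """
    target_seconds = max(seconds_per_step) * base_steps
    return [max(1, round(target_seconds / s)) for s in seconds_per_step]


# Hypothetical measured cost (seconds per MD step) for three alchemical states.
timings = [0.002, 0.0025, 0.004]
steps = balanced_steps(timings, base_steps=500)
wall_times = [n * s for n, s in zip(steps, timings)]
print(steps, wall_times)
```

The alternative framing, keeping step counts equal and assigning more replicas to faster devices, would need a scheduler rather than a per-state step table.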

Energy computation

Writing configurations

  • Have separate thread write data to disk
  • Only write full-system position checkpoints to disk every N iterations, and only resume from these points (for Abl:imatinib, writing currently consumes 42 s / 141 s, ~30%). (Implemented in "Checkpointing Feature" #675.)
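
The checkpoint-interval idea can be illustrated with a few lines of arithmetic. This is a sketch, not the implementation in #675: the function names are hypothetical, and the cost estimate assumes that the 141 s figure is the time for one iteration and that the whole 42 s write cost would be paid only on checkpoint iterations.

```python
def is_checkpoint_iteration(iteration, checkpoint_interval):
    """True when full positions should be written at this iteration."""
    return iteration % checkpoint_interval == 0


def last_resumable_iteration(last_completed, checkpoint_interval):
    """Most recent iteration that has a full checkpoint on disk."""
    return (last_completed // checkpoint_interval) * checkpoint_interval


def mean_seconds_per_iteration(total_seconds, write_seconds, checkpoint_interval):
    """Average cost when the write is paid once every checkpoint_interval iterations."""
    return (total_seconds - write_seconds) + write_seconds / checkpoint_interval


# Assumption: 141 s per iteration, of which 42 s is the full-position write.
estimate = mean_seconds_per_iteration(141.0, 42.0, checkpoint_interval=10)
resume_from = last_resumable_iteration(last_completed=37, checkpoint_interval=10)
print(estimate, resume_from)
```

The trade-off is that a crash can cost up to N - 1 iterations of work, so N has to be weighed against how often jobs are interrupted.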
@jchodera jchodera modified the milestone: yank 2.0 Nov 25, 2016
@jlerche

jlerche commented Nov 28, 2016

Re: separate thread writing to disk. Be aware that, because of the global interpreter lock (GIL), threads created with the threading library run inside the same OS process and cannot execute Python code in parallel. To get a separate process, the multiprocessing library would have to be used, with the overhead of loading a copy of the YANK Python files into memory.

Edit: I'm fuzzy on what exactly gets copied into memory, whether it's the entire application or just the module that spawns the process.
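
For reference, a thread can still be useful for this case: CPython releases the GIL while a thread is blocked in file I/O, so a writer thread can overlap disk writes with the main loop, although any pure-Python serialization work still competes for the GIL. A minimal sketch follows; the `BackgroundWriter` class and the record format are hypothetical and not part of YANK.

```python
import os
import queue
import tempfile
import threading


class BackgroundWriter:
    """Hypothetical helper: append text records to a file from a worker thread."""

    _STOP = object()  # sentinel that tells the worker to finish

    def __init__(self, path):
        self._path = path
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        with open(self._path, "a") as handle:
            while True:
                item = self._queue.get()
                if item is self._STOP:
                    break
                handle.write(item + "\n")
                handle.flush()

    def write(self, record):
        # Returns immediately; the worker thread performs the disk write.
        self._queue.put(record)

    def close(self):
        # Queue the sentinel after all pending records, then wait for the worker.
        self._queue.put(self._STOP)
        self._thread.join()


path = os.path.join(tempfile.mkdtemp(), "positions.log")
writer = BackgroundWriter(path)
for iteration in range(3):
    writer.write(f"iteration {iteration}")
writer.close()
```

A single worker consuming a FIFO queue keeps records in submission order, which matters if the file is later used to resume a run.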

@jchodera
Member Author

Thanks! I've used MPI processes in the past to do this kind of thing with Python, but there are now plenty of options for truly parallel processing that avoid the GIL.

@jchodera
Member Author

@Lnaden has implemented infrequent system checkpoints in #675!

@jchodera
Member Author

jchodera commented Jun 1, 2017

Switching to g-BAOAB with hydrogen mass repartitioning is probably our easiest win in terms of getting a big speed boost for little effort. Once @maxentile's assessment of HMR is complete in a couple of weeks, we'll have an idea of what the optimal settings are here.
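
For readers unfamiliar with hydrogen mass repartitioning, the core bookkeeping can be sketched without any MD library: mass is moved from each heavy atom to its bonded hydrogens so that the total mass is unchanged. The function below is a hypothetical illustration, and the 3.0 amu target is an arbitrary example value, not a recommended setting.

```python
def repartition_hydrogen_mass(masses, bonds, is_hydrogen, hydrogen_mass=3.0):
    """Return new masses with each bonded hydrogen raised to hydrogen_mass.

    The extra mass is taken from the heavy atom the hydrogen is bonded to,
    so the total mass of the system is conserved.
    """
    new_masses = list(masses)
    for i, j in bonds:
        if is_hydrogen[i] == is_hydrogen[j]:
            continue  # skip heavy-heavy and H-H bonds
        h, heavy = (i, j) if is_hydrogen[i] else (j, i)
        transfer = hydrogen_mass - new_masses[h]
        new_masses[h] += transfer
        new_masses[heavy] -= transfer
    return new_masses


# Methanol as a toy example: C, O, three H on C, one H on O (masses in amu).
masses = [12.011, 15.999, 1.008, 1.008, 1.008, 1.008]
is_hydrogen = [False, False, True, True, True, True]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]
new_masses = repartition_hydrogen_mass(masses, bonds, is_hydrogen)
print(new_masses)
```

Heavier hydrogens slow the fastest bond vibrations, which is what permits a larger timestep; how large is exactly the question the assessment mentioned above is meant to answer.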
