Skip to content

Protonation state sampling

John Chodera edited this page Sep 8, 2024 · 22 revisions

Strategies for constant-pH simulation for atomistic potentials without explicit bonded terms

Theory

Consider a system at equilibrium in a semigrand ensemble relevant for constant-pH simulation.

The thermodynamic parameters $\theta \equiv \{ \beta, p, \mu_H \}$ denote the external inverse thermal energy $\beta = (k_B T)^{-1}$, external pressure $p$, and external proton chemical potential $\mu_H$.

Denote the instantaneous configuration of the system by $x$, where the relevant scalar properties conjugate to the thermodynamic parameters are the potential energy $U(x)$, the instantaneous volume $V(x)$, and the instantaneous number of protons $N_H(x)$.

We can write the reduced potential $u(x; \theta)$ for this ensemble as

$$ u(x; \theta) \equiv \beta U(x) + p V(x) + \mu_H N_H(x) $$

(NOTE: The inverse thermal energy $\beta$ has been absorbed into $p$ and $\mu_H$ in this equation, which differs from how we traditionally write the reduced potential.)

The probability density of interest is given by

$$ \pi(x; \theta) = Z(\theta)^{-1} q(x; \theta) $$

where the unnormalized probability density $q(x; \theta)$ is given by the Boltzmann law

$$ q(x; \theta) \equiv e^{-u(x; \theta)} $$

and the partition function $Z(\theta)$ is given by

$$ Z(\theta) \equiv \sum_{N_H=1}^{\infty} \int dx q(x_{N_H}; \theta) $$

where we have written the explicit subscript $x_{N_H}$ to specify that there are exactly $N_H$ protons in the system.

Algorithms

For simplicity, we first consider algorithms for dealign with tautomerization. These algorithms can be easily adapted to handle protonation states in a straightforward manner.

Consider the tautomerization of imidazole

image

(TODO: Replace this with a better figure)

Instantaneous Monte Carlo with heavy atom centered Gaussians

The simplest approach we will consider is an instantaneous Metropolis Monte Carlo proposal where we propose to "hop" the proton from one titratable heavy atom site to another.

  • Step 1: Propose $x_{new} \sim p(x_{new} | x_{old})$ where the proton at position 1 is moved to a position that is a Gaussian random variable about the N at position 3.
  • Step 2: Accept or reject the move with the Metropolis-Hastings criteria

$$ P_{accept} = \min \{ 1, \frac{p(x_{old} | x_{new})}{p(x_{new} | x_{old})} \frac{\pi(x_{new})}{\pi(x_{old})} \} $$

where we can use the difference in reduced potentials $\Delta u \equiv u(x_{new}) - u_(x_{old})$ to write

$$ P_{accept} = \min \{ 1, \frac{p(x_{old} | x_{new})}{p(x_{new} | x_{old})} e^{-\Delta u} \} $$

Instantaneous Monte Carlo with improved proposal densities

We expect the average acceptance rate of the instantaneous heavy atom centered Monte Carlo proposal $< P_{accept} >$ to be low in both vacuum and solvent. In particular, the proposal ratio $r \equiv \frac{p(x_{old} | x_{new})}{p(x_{new} | x_{old})}$ will generally be low because the probability of generating the old proton placement $p(x_{old} | x_{new})$ will be low for this poor choice of proposal probability.

Instead, we can significantly improve the choice of proposal probability by learning the parameters of the Gaussian distribution in terms of the heavy atom and its connected neighbors. Suppose we have an arrangement $X-NH-Y$ within a molecule and we have learned deterministic functions for the displacement $\Delta \mu(x)$ and standard deviation $\sigma(x)$ of the position of $H$ given $X-N-Y$ from equilibrium simulations through some manner of equivariant neural network. We can use these in a Gaussian proposal density to sample the new proton position

$$ x_H \sim \mathcal{N}(x_N ; \Delta \mu(x), \sigma(x)) $$

where $x_H$ is the coordinates of the proton H being placed and $x_N$ is the coordinates of the heavy atom $N$ it is being placed near.

We expect this should significantly increase the acceptance rate $< P_{accept} >$ by improving the proposal probability ratio $r$ in vacuum.

Parallel instantaneous Monte Carlo proposals

Even with improved proton proposal distributions, in solvent, instantaneous placement of protons without allowing relaxation of the surrounding solvent will likely lead to low acceptance rates $< P_{accept} >$ due to the probability density ratio $s = \frac{\pi(x_{new})}{\pi(x_{old})}$ appearing in the Metropolis-Hastings acceptance probability.

A very simple way to improve this is to perform parallel proposals: If we propose $M$ moves in parallel, $x_{new}^{(1)}, \ldots, x_{new}^{(M)}$, we can modify the acceptance criteria to select either the old position $x_{old}$ or one of the new proposed positions via parallel Monte Carlo. Here, we select among the list of potential configurations $y \in \{ x_{old}, x_{new}^{(1)}, \ldots, x_{new}^{(M)} \}$ using the symmetric rule:

$$ P_{accept}(y) \propto p(y | x_{old}) \pi(y) $$

(TODO: Check that we have correctly included the proposal probabilities $p(y | x_{old})$.)

In the limit of $< P_{accept} > << 1/M$, this approach will give near-linear improvements in acceptance probability in $M$. This comes at the cost of $M$ energy evaluations, but it is possible that a very wide GPU can enable $M$ energies to be computed in parallel at a cost of $L < M$.

This idea is succinctly described in Waste recycling Monte Carlo.

Nonequilibrium candidate Monte Carlo (NCMC) proposals : transient augmentation

It is possible to achieve superlinear improvements in acceptance probability $< P_{accept} >$ through the use of Nonequilibrium Candidate Monte Carlo (NCMC).

There are numerous ways to use NCMC for this scenario. First, we consider building on the previous algorithms that draw a new proton position using one of the hydrogen position proposal probabilities $p(x_H | x)$ above in a scheme that temporarily augments the system with a new proton in addition to the old proton, switching the old proton interactions off as the new proton is switched on.

First, partition the atoms into the proton under consideration for transport---$x_{H, old}$---and all other protons---$x_{core}. We draw the position of a new additional proton $x_{H, new}$:

$$ x_{H, new}^{(0)} \sim p(x_{H} | x_{old}) $$

We then integrate $t = 1, \ldots, T$ steps of molecular dynamics using the alchemical potential energy function $U_t(x)$ given by an interpolative alchemical energy function such as

$$ U_t(x) = (1-\frac{t}{T}) U(x_{H, old}, x_{core}) + \frac{t}{T} U(x_{H, new}, x_{core}) $$

where $x_{core}$ denotes all atoms except the proton being hopped, and integrate the dimensionless work $w$, taking care to properly compute the work appropriate for the integrator being used (see NCMC). For example, if velocity Verlet integration is used, the dimensionless work $w$ is equivalent to the protocol work

$$ w = \sum_{t=1}^T [ u_{t}(x_t) - u_{t-1}(x_t) ]

The acceptance probability for $x_T$ is then given by replacing the reduced potential difference $\Delta u$ in the Metropolis-Hastings criteria with the work $w$:

$$ P_{accept} = \min \{ 1, \frac{p(x_{H, old, T} | x_T)}{p(x_{H, new, 0} | x_0)} e^{-\Delta w} \} $$

If rejected, however, the entire trajectory $x_{1}, \ldots, x_{T}$ must be discarded.

The velocity will also need to be reversed on acceptance or rejection to ensure the correct distribution is maintained---see NCMC for more details. If an MCMC scheme is used where protonation state moves and MD moves that begin with a new velocity draw from the Maxwell-Boltzmann distribution, this can be ignored since the velocity will be refreshed immediately afterwards anyway.

Nonequilibrium candidate Monte Carlo (NCMC) proposals : displacement

As an alternative to transiently augmenting the number of particles, we can instead propose a displacement $\Delta x$ that intends to transport the proton from one likely proton location to another, using an alchemical protocol similar to the Alchemical Transfer Method (ATM):

$$ \Delta x \sim p(\Delta x | x_0) $$

where the alchemical energy function becomes

$$ U_t(x) = (1-\frac{t}{T}) U(x) + \frac{t}{T} U(x + \Delta x) $$

The acceptance probability replaces the proposal ratio $r$ with one involving $\Delta x$:

$$ P_{accept} = \min \{ 1, \frac{p(-\Delta x | x_T)}{p(\Delta x | x_0)} e^{-\Delta w} \} $$

Nonequilibrium candidate Monte Carlo (NCMC) proposals : proton wires

Excess protons are surprisingly mobile in bulk water. Water wires form spontaneously and enable the rapid nonlocal translocation of excess protons. In confined geometries, proton transport over large distances can occur on the sub-picosecond timescale.

As a result, we may not need to be overly concerned about where a proton is initially placed if we use NCMC to transport or insert/delete a proton---spontaneous water wires may effectively shuttle protons to/from relevant locations on their own. Whether this timescale is practical, of course, is a question for the Experiments section below.

Suppose we select a position for placing the proton that is uniform within the simulation box:

$$ p(x_H | x) = 1/V(x) $$

If we again integrate an NCMC switching protocol over $T$ steps as the old proton is switched off and the new proton switched on, the acceptance probability simply becomes

$$ P_{accept} = \min \{ 1, \frac{1/V(x_T)}{1/V(x_0)} e^{-\Delta w} \} $$

The acceptance probability will increase superlinearly in $T$ until some timescale commensurate with proton transport within the system, when it will begin to increase sub-linearly with $T$.

Useful experiments

Clone this wiki locally