Analytical PCM 2nd derivative #302
base: master
Conversation
henryw7 commented Jan 10, 2025 (edited)
- Change all finite-difference implementations of the PCM second derivative to analytical form.
- Bug fix in the C-PCM gradient. The absolute error is at the 1e-9 level for a gradient calculation, but is magnified to the 1e-4 level when the Hessian is computed by finite difference.
- Bug fix in the SS(V)PE gradient. SS(V)PE and IEF-PCM do not share the same gradient algorithm; we were misled by the Q-Chem 6.1 implementation.
ngrids = grids.shape[0]
# n_charge_sum_per_thread = 1 # means every thread processes one pair and one grid
# n_charge_sum_per_thread = ngrids # or a larger number guarantees one thread processes one pair and all grid points
n_charge_sum_per_thread = 10
Why is n_charge_sum_per_thread hard-coded as 10?
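The coarsening the comments describe can be sketched on the CPU side; `sum_charges_coarsened` and `contrib` are hypothetical names for illustration only, not part of the kernel:

```python
import numpy as np

def sum_charges_coarsened(contrib, n_charge_sum_per_thread):
    """Hypothetical sketch: each "thread" accumulates a chunk of
    n_charge_sum_per_thread grid contributions into one partial sum,
    and the partial sums are then reduced, mimicking thread coarsening."""
    ngrids = contrib.shape[0]
    partials = []
    for start in range(0, ngrids, n_charge_sum_per_thread):
        partials.append(contrib[start:start + n_charge_sum_per_thread].sum())
    return sum(partials)

contrib = np.arange(25, dtype=np.float64)
# coarsening factor only changes how work is batched, not the result
assert np.isclose(sum_charges_coarsened(contrib, 10), contrib.sum())
```

Larger values trade parallelism for fewer atomic additions per grid point, so the best value presumably depends on the GPU and problem size.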
vK_1_d2K_q += cupy.einsum('Adi,i,BDi->ABdD', vK_1_dot_dDdx, A, dSdx_dot_q)
vK_1_d2K_q += cupy.einsum('Adi,BDi->ABdD', vK_1D_dot_dAdx, dSdx_dot_q)
vK_1_d2K_q += cupy.einsum('Adi,BDi->BADd', vK_1_dot_dDdx, dAdx_dot_Sq)
vK_1_d2K_q += cupy.einsum('Adi,i,BDi->BADd', vK_1_dot_dDdx, A, dSdx_dot_q)
Split the 3-tensor contraction into two 2-tensor contractions, or simply make use of broadcasting: A * dSdx_dot_q.
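A quick numpy sketch of the broadcasting variant (the shapes are illustrative; cupy.einsum follows the same semantics as numpy.einsum): folding the diagonal factor A into dSdx_dot_q over the shared last axis reproduces the 3-operand contraction.

```python
import numpy as np

rng = np.random.default_rng(0)
nA, nB, nd, nD, ni = 2, 2, 3, 3, 5   # illustrative sizes only
vK_1_dot_dDdx = rng.standard_normal((nA, nd, ni))
A = rng.standard_normal(ni)
dSdx_dot_q = rng.standard_normal((nB, nD, ni))

# 3-operand contraction, as in the current code
ref = np.einsum('Adi,i,BDi->ABdD', vK_1_dot_dDdx, A, dSdx_dot_q)

# fold A into dSdx_dot_q by broadcasting over the last axis,
# leaving a single 2-operand contraction
out = np.einsum('Adi,BDi->ABdD', vK_1_dot_dDdx, A * dSdx_dot_q)

assert np.allclose(ref, out)
```

The 2-operand form can also be dispatched to a single GEMM, which 3-operand einsum generally cannot.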
matrix_d2S[i*n + j + n2 * 7] = 0.0;
matrix_d2S[i*n + j + n2 * 8] = 0.0;
} else {
matrix_d2S[i*n + j          ] = dx * dx * S_direct_product_prefactor + S_xyz_diagonal_prefactor;
Set S_direct_product_prefactor = 0 when i == j? The if-else branch may cause warp divergence and slow down the GPU kernel.
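One way to read this suggestion (zeroing both prefactors on the diagonal via a mask, so every thread runs the same arithmetic) can be checked with a small numpy model; all array names here are illustrative stand-ins, not the kernel's actual variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
dx = rng.standard_normal((n, n))
S_direct_product_prefactor = rng.standard_normal((n, n))
S_xyz_diagonal_prefactor = rng.standard_normal((n, n))

# branchy reference: skip the i == j entries explicitly, as in the current kernel
branchy = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            branchy[i, j] = (dx[i, j] * dx[i, j] * S_direct_product_prefactor[i, j]
                             + S_xyz_diagonal_prefactor[i, j])

# branchless alternative: zero both prefactors on the diagonal with a mask,
# so the same formula is evaluated for every entry
mask = 1.0 - np.eye(n)
branchless = (dx * dx * (mask * S_direct_product_prefactor)
              + mask * S_xyz_diagonal_prefactor)

assert np.allclose(branchy, branchless)
```

In the CUDA kernel this would correspond to multiplying the prefactors by (i != j) instead of branching, which the compiler can turn into predicated instructions.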