Merge branch 'devel' into ase_db
njzjz committed Jul 9, 2021
2 parents cab15e4 + f0273f7 commit 325bf87
Showing 186 changed files with 68,730 additions and 2,633 deletions.
25 changes: 25 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
---
name: Bug report
about: Create a bug report to help us eliminate issues and improve dpdata. If this
doesn’t look right, [choose a different type](https://github.com/deepmodeling/dpdata/issues/new/choose).
title: "[BUG] _Replace With Suitable Title_"
labels: bug
assignees: ''

---

**Summary**

<!--Please provide a clear and concise description of what the bug is.-->

<!--Please provide the necessary information, including the software version and installation method, input file, commands run, error log, etc., IN AS MUCH DETAIL AS POSSIBLE to help locate and reproduce your problem.-->

<!--If applicable, specify what platform you are running on. -->

**Steps to Reproduce**

<!--Describe the steps required to (quickly) reproduce the issue. You can attach (small) files to the section below or add URLs from which an archive with all necessary files can be downloaded. Please try to create an input set that is as small as possible and reproduces the bug as quickly as possible. **NOTE:** the less effort and time it takes to reproduce your reported bug, the more likely it is that somebody will look into it and fix the problem.-->

**Further Information, Files, and Links**

<!--Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications-->
21 changes: 21 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,21 @@
---
name: Feature request
about: Suggest an idea for this project. If this doesn’t work right, [choose a different
type]( https://github.com/deepmodeling/dpdata/issues/new/choose).
title: "[Feature Request] _Replace with Title_"
labels: enhancement
assignees: ''

---

**Summary**

<!--Please provide a brief and concise description of the suggested feature or change-->

**Detailed Description**

<!--Please explain how you would like to see dpdata enhanced, what feature(s) you are looking for, what specific data format this will support or what specific problems this will solve. If possible, provide references to relevant background information like publications or web pages, and whether you are planning to implement the enhancement yourself or would like to participate in the implementation. If applicable add a reference to an existing bug report or issue that this will address.-->

**Further Information, Files, and Links**

<!--Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications-->
17 changes: 17 additions & 0 deletions .github/ISSUE_TEMPLATE/generic-issue.md
@@ -0,0 +1,17 @@
---
name: Generic issue
about: For issues that do not fit any of the other categories. If this doesn’t work
right, [choose a different type]( https://github.com/deepmodeling/dpdata/issues/new/choose).
title: ''
labels: wontfix
assignees: ''

---

**Summary**

<!--Please provide a clear and concise description of what the question is.-->

**Details**

<!--Please explain the issue in detail here-->
21 changes: 21 additions & 0 deletions .github/ISSUE_TEMPLATE/request-for-help.md
@@ -0,0 +1,21 @@
---
name: Request for Help
about: Don't post help requests here, go to [discussions](https://github.com/deepmodeling/dpdata/discussions)
instead. If this doesn’t look right, choose a different type.
title: ''
labels: ''
assignees: ''

---

Before asking questions, you can

- search the previous issues or discussions
- check the [README](https://github.com/deepmodeling/dpdata/#readme).

Please **do not** post requests for help (e.g. with installing or using dpdata) here.
Instead go to [discussions](https://github.com/deepmodeling/dpdata/discussions).

This issue tracker is for tracking dpdata development related issues only.

Thanks for your cooperation.
28 changes: 0 additions & 28 deletions .github/workflows/docs.yml

This file was deleted.

30 changes: 30 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,30 @@
name: Python package

on:
- push
- pull_request

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.6, 3.7, 3.8]

    steps:
    - uses: actions/checkout@v2
    # set up conda
    - name: Set up Python ${{ matrix.python-version }}
      uses: conda-incubator/setup-miniconda@v2
      with:
        auto-activate-base: true
        activate-environment: ""
    # install rdkit and openbabel
    - name: Install rdkit
      run: conda create -c conda-forge -n my-rdkit-env python=${{ matrix.python-version }} rdkit openbabel;
    - name: Install dependencies
      run: source $CONDA/bin/activate my-rdkit-env && pip install .[amber] coverage codecov
    - name: Test
      run: source $CONDA/bin/activate my-rdkit-env && cd tests && coverage run --source=../dpdata -m unittest && cd .. && coverage combine tests/.coverage && coverage report
    - name: Run codecov
      run: source $CONDA/bin/activate my-rdkit-env && codecov
1 change: 1 addition & 0 deletions .gitignore
@@ -20,4 +20,5 @@ dist
dpdata.egg-info
_version.py
!tests/cp2k/aimd/cp2k.log
!tests/cp2k/restart_aimd/ch4.log
__pycache__
16 changes: 0 additions & 16 deletions .travis.yml

This file was deleted.

68 changes: 67 additions & 1 deletion README.md
@@ -78,7 +78,8 @@ The `System` or `LabeledSystem` can be constructed from the following file forma
| PWmat | movement | True | True | LabeledSystem | 'pwmat/movement' |
| PWmat | OUT.MLMD | True | True | LabeledSystem | 'pwmat/out.mlmd' |
| Amber | multi | True | True | LabeledSystem | 'amber/md' |
| Gromacs | gro | False | False | System | 'gromacs/gro' |
| Amber/sqm | sqm.out | False | False | System | 'sqm/out' |
| Gromacs | gro | True | False | System | 'gromacs/gro' |


The class `dpdata.MultiSystems` can read data from a directory that may contain many files of different systems, or from a single xyz file that contains different systems.
@@ -197,3 +198,68 @@ perturbed_system = dpdata.System('./POSCAR').perturb(pert_num=3,
atom_pert_style='normal')
print(perturbed_system.data)
```

## replace
In the following example, 8 randomly chosen Hf atoms in the system are replaced by Zr atoms, with the atom positions unchanged.
```python
s=dpdata.System('tests/poscars/POSCAR.P42nmc',fmt='vasp/poscar')
s.replace('Hf', 'Zr', 8)
s.to_vasp_poscar('POSCAR.P42nmc.replace')
```
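
The selection step described above can be pictured with a small standalone sketch. This is not dpdata's implementation: `replace_random` is a hypothetical helper, and the list of 16 Hf and 32 O atoms is made up for illustration. It only shows the idea of picking a fixed number of atoms of one species at random and relabeling them while every index (and therefore every position) stays the same.

```python
import random


def replace_random(atom_names, initial, target, n, seed=None):
    """Hypothetical helper: relabel n randomly chosen `initial` atoms as `target`."""
    # Indices of all atoms that are eligible for replacement.
    candidates = [i for i, name in enumerate(atom_names) if name == initial]
    if n > len(candidates):
        raise ValueError(
            f"only {len(candidates)} {initial} atoms available, cannot replace {n}"
        )
    rng = random.Random(seed)
    chosen = rng.sample(candidates, n)  # n distinct indices, no repeats
    replaced = list(atom_names)  # copy, so the input list is left untouched
    for i in chosen:
        replaced[i] = target
    return replaced


# Made-up composition, for illustration only.
names = ["Hf"] * 16 + ["O"] * 32
new_names = replace_random(names, "Hf", "Zr", 8, seed=0)
print(new_names.count("Hf"), new_names.count("Zr"), new_names.count("O"))  # 8 8 32
```

Because only the labels change, the ordering of atoms is preserved, which is what keeps the coordinates attached to the same sites.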

# BondOrderSystem
A new class, `BondOrderSystem`, which inherits from `System`, is introduced in dpdata. It stores chemical bond and formal charge information in `BondOrderSystem.data['bonds']` and `BondOrderSystem.data['formal_charges']`. Currently, `BondOrderSystem` can only read the .mol and .sdf formats because it depends on rdkit, so rdkit must be installed to use this class. Other formats, such as pdb, must first be converted to .mol or .sdf (for example with Open Babel).
```python
import dpdata
system_1 = dpdata.BondOrderSystem("tests/bond_order/CH3OH.mol", fmt="mol") # read from .mol file
system_2 = dpdata.BondOrderSystem("tests/bond_order/methane.sdf", fmt="sdf") # read from .sdf file
```
In an sdf file, all molecules must have the same topology (i.e. they must be conformers of the same molecular configuration).
`BondOrderSystem` can also be initialized directly from a `rdkit.Chem.rdchem.Mol` object.
```python
from rdkit import Chem
from rdkit.Chem import AllChem
import dpdata

mol = Chem.MolFromSmiles("CC")
mol = Chem.AddHs(mol)
AllChem.EmbedMultipleConfs(mol, 10)
system = dpdata.BondOrderSystem(rdkit_mol=mol)
```

## Bond Order Assignment
The `BondOrderSystem` implements a more robust sanitization procedure for rdkit Mol objects, defined in `dpdata.rdkit.sanitize.Sanitizer`. This class defines three levels of sanitization: low, medium and high (the default is medium).
+ low: use the `rdkit.Chem.SanitizeMol()` function to sanitize the molecule.
+ medium: before calling rdkit, the program first assigns a formal charge to each atom to avoid inappropriate-valence exceptions. However, this mode requires the bond order information in the given molecule to be correct.
+ high: the program tries to fix inappropriate bond orders in aromatic heterocycles and in phosphate, sulfate, carboxyl, nitro, nitrine and guanidine groups. If this fails to sanitize the given molecule, the program then tries to call `obabel` to pre-process the mol and repeats the sanitization procedure. **That is to say, if you want to use this level of sanitization, please ensure `obabel` is installed in the environment.**

According to our tests, this sanitization procedure successfully reads 4852 small molecules in the PDBBind-refined-set. Note that the number of explicit hydrogens in the molecule file (mol/sdf) has to be correct, so we recommend using
`obabel xxx -O xxx -h` to pre-process the file. We do not implement this hydrogen-adding procedure in dpdata because we cannot ensure its correctness.

```python
import glob

import dpdata

# Read every ligand with the most thorough sanitization level.
for sdf_file in glob.glob("bond_order/refined-set-ligands/obabel/*sdf"):
    syst = dpdata.BondOrderSystem(sdf_file, sanitize_level='high', verbose=False)
```
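
Since the high level may fall back to calling `obabel`, it can help to check for the executable before choosing a level. The snippet below is a minimal sketch using only the standard library; `obabel_available` is a hypothetical helper, and falling back to `'medium'` is a choice made for this example, not something dpdata does automatically.

```python
import shutil


def obabel_available(executable="obabel"):
    """Hypothetical helper: True if the executable can be found on PATH."""
    return shutil.which(executable) is not None


# Assumption of this sketch: use 'high' only when obabel is present.
sanitize_level = "high" if obabel_available() else "medium"
print(f"using sanitize_level={sanitize_level!r}")
```

Keep in mind that the medium level relies on the bond orders in the input file already being correct, so the fallback is only appropriate for well-formed inputs.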
## Formal Charge Assignment
`BondOrderSystem` implements a method that assigns a formal charge to each atom based on the 8-electron rule (see below). Note that it only supports elements common in biological systems: B, C, N, O, P, S, As.
```python
import dpdata

syst = dpdata.BondOrderSystem("tests/bond_order/CH3NH3+.mol", fmt='mol')
print(syst.get_formal_charges()) # return the formal charge on each atom
print(syst.get_charge()) # return the total charge of the system
```

If a valence of 3 is detected on a carbon atom, its formal charge is set to -1, because in most cases (alkynyl anions, isonitriles, the cyclopentadienyl anion) the formal charge on a 3-valent carbon is -1, which is also consistent with the 8-electron rule.
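
The arithmetic behind the 8-electron rule can be written out as a short worked example. This is a simplified sketch, not dpdata's code: it assumes every atom completes an octet, and it is restricted to C, N and O because elements such as B, P and S often do not follow a plain octet count. `octet_formal_charge` and `VALENCE_ELECTRONS` are hypothetical names introduced here.

```python
# Standard valence electron counts for a few second-row elements.
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6}


def octet_formal_charge(element, total_bond_order):
    """Formal charge of an atom that is assumed to complete an octet.

    formal charge = valence electrons - nonbonding electrons - bonds,
    where a full octet implies nonbonding electrons = 8 - 2 * bonds.
    """
    valence = VALENCE_ELECTRONS[element]
    nonbonding = 8 - 2 * total_bond_order
    return valence - nonbonding - total_bond_order


print(octet_formal_charge("C", 3))  # 3-valent carbon -> -1
print(octet_formal_charge("N", 4))  # 4-valent nitrogen, as in CH3NH3+ -> 1
print(octet_formal_charge("C", 4))  # ordinary 4-valent carbon -> 0
```

Under this assumption the expression reduces to valence electrons plus bonds minus 8, which gives 4 + 3 - 8 = -1 for a 3-valent carbon.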

# Plugins

One can follow [a simple example](plugin_example/) to add their own format by creating and installing a plugin. It is critical to add the [Format](dpdata/format.py) class to `entry_points['dpdata.plugins']` in `setup.py`:
```py
entry_points={
    'dpdata.plugins': [
        'random=dpdata_random:RandomFormat'
    ]
},
```
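
After installing a plugin, one way to confirm that its entry point was registered is to list the `dpdata.plugins` group with the standard library. This is only a sketch: `list_entry_points` is a hypothetical helper, and how dpdata itself discovers plugins may differ from what is shown here.

```python
from importlib import metadata


def list_entry_points(group):
    """Hypothetical helper: (name, value) pairs registered under an entry-point group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later: selectable entry points.
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: a plain mapping of group name to entry points.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


# With the plugin from the example installed, the list should contain
# an entry named 'random'; without it, the list may simply be empty.
print(list_entry_points("dpdata.plugins"))
```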
13 changes: 13 additions & 0 deletions dpdata/__init__.py
@@ -9,3 +9,16 @@
    from ._version import version as __version__
except ImportError:
    from .__about__ import __version__

# BondOrder System has dependency on rdkit
try:
    # prevent conflict with dpdata.rdkit
    import rdkit as _
    USE_RDKIT = True
except ModuleNotFoundError:
    USE_RDKIT = False

if USE_RDKIT:
    from .bond_order_system import BondOrderSystem

