
Question about how to evaluate a trained model #12

Open
huishan-cen opened this issue Jun 13, 2021 · 3 comments

Comments

@huishan-cen

Hi, I recently started trying this repo and found it really cool!
I have managed to run the example in examples/example_scripts/train_model/ on some data and would like to use the final model to evaluate some other molecules. I know that the neuralxc sc ... command can do the testing if I provide a testing.traj.
However, I'd like to use the neuralxc eval ... command so that I don't have to re-train the same model.
The --hdf5 argument requires the path to the hdf5 file, the baseline data, and the reference data. I assume the last one refers to a testing.traj like the one used with neuralxc sc ... in the example. However, I'm not sure what the first two files refer to or how to get them, and I couldn't find an example in the repo. Could you please give some advice or examples?

Moreover, I'm wondering how to set n_max and l_max as mentioned in the paper. I can't seem to find these options in the hyperparameters.json or the basis.json file.

huishan-cen changed the title from "Question about how to evaluate a model" to "Question about how to evaluate a trained model" on Jun 13, 2021
@semodi
Owner

semodi commented Jul 5, 2021

Hi,

The gist is that, once you have a fully trained model, you can use it in self-consistent calculations with PySCF to "evaluate some other molecules". The fastest/easiest way to do so would be to put all molecules you want to compute into a .xyz or .traj file and use neuralxc engine on it. Inside the configuration file, you need to add a keyword "nxc": "path_to_model" in the engine section to instruct PySCF to use your newly trained model.
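
Schematically, the config could look something like this, written here as a Python dict dumped to JSON. Only the "nxc" entry is the keyword described above; the file name and the other keys ("application", "xc", "basis") are illustrative placeholders, so adapt them to the config you already use for neuralxc sc:

    import json

    # Sketch only: "nxc" is the keyword described above; the remaining keys and
    # values are placeholders for a typical PySCF setup and may not match the
    # actual NeuralXC config schema exactly.
    config = {
        "engine": {
            "application": "pyscf",            # placeholder key: which QM engine to run
            "xc": "PBE",                       # placeholder key: baseline functional
            "basis": "def2-TZVP",              # placeholder key: orbital basis set
            "nxc": "/path/to/trained_model",   # point PySCF at the trained NeuralXC model
        }
    }

    with open("config_eval.json", "w") as f:
        json.dump(config, f, indent=4)

You would then run neuralxc engine with this config on the .xyz/.traj file containing your molecules.
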
I updated both the README file and the documentation in the last pull request. You should find a more detailed answer to your question in the Model deployment section.

As for n_max and l_max, these options are only relevant if one is using a polynomial basis, as we did in the paper. In the newer examples provided, we use a Gaussian (PySCF) basis set for which these options have no effect. Again, sorry that this wasn't addressed in the previous docs, but it should now be covered in the updated version here and here.
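
If it helps, here is a rough sketch of what a polynomial-basis specification with explicit n_max/l_max could look like, again as a Python dict dumped to JSON. The per-species layout and the key names ("n", "l", "r_o" standing in for n_max, l_max and the radial cutoff) are illustrative and may not match the current basis.json format exactly, so check the updated docs for the real layout:

    import json

    # Illustrative layout: "n"/"l" stand in for n_max/l_max from the paper and
    # "r_o" for the radial cutoff; verify the actual key names against the docs.
    basis = {
        "O": {"n": 4, "l": 3, "r_o": 2.0},
        "H": {"n": 4, "l": 2, "r_o": 1.5},
    }

    with open("basis_poly.json", "w") as f:
        json.dump(basis, f, indent=4)
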

Let me know if that answers your question!

@huishan-cen
Author

huishan-cen commented Jul 14, 2021

Hi,
Thanks for the updated docs. I have managed to use the trained models with PySCF now. A couple of other questions:

  1. The energies provided for water in the quickstart tutorials are all quite small. I assume these are target energies, i.e. the difference between CCSD(T) and DFT+PBE in eV, since the total energy of water should be around -76 hartree?
  2. I have been trying to extract the descriptors c and d with the code. When I use the neuralxc pre command, I can see that the output is a numpy array stored in a .npy file. I think these are the unsymmetrised descriptors c of the structures in the input .xyz; however, I can't figure out the ordering of the array, i.e. which indices correspond to which atom or which n, l, m, etc.
  3. Can you confirm that the polynomial basis projector is only available with SIESTA as the QM engine? I get the impression that when using PySCF as the QM engine, only the GTO basis projector is available.

Thanks again for answering my questions.

@semodi
Owner

semodi commented Oct 16, 2021

Hi, to answer your questions:

  1. For the water dataset, the monomer energies are in fact not taken from CCSD(T) calculations but from a highly accurate water monomer PES, which sets the energy of the water monomer in its equilibrium geometry to zero. Right now NeuralXC doesn't really care about absolute energies but tries to get the relative energies between different conformers of the same molecule right.
  2. The output is produced by three nested for loops like this (see also the loading sketch at the end of this comment):

         idx = 0
         for n in range(1, n_max + 1):        # radial index
             for l in range(n):               # angular momentum, l < n
                 for m in range(-l, l + 1):   # magnetic quantum number
                     # flat index idx corresponds to (n, l, m)
                     idx += 1

     This should clarify the ordering.
  3. Any basis can be used with both SIESTA and PySCF as the QM engine; however, keep in mind that the grid kind has to be set to the right value: a Euclidean grid for SIESTA and a radial grid for PySCF (see the sketch just below). If you encounter problems, please feel free to attach your error messages here so that I can help troubleshoot.
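
Schematically, the grid setting could look something like this (the key names shown are illustrative placeholders rather than the exact NeuralXC schema; the only firm point is the pairing of engine and grid kind, SIESTA with a Euclidean grid and PySCF with a radial one):

    import json

    # Illustrative only: key names are placeholders; the engine/grid pairing
    # (SIESTA -> euclidean, PySCF -> radial) is the part that matters.
    preprocessor = {
        "projector": {
            "grid": "radial"   # use "euclidean" when running with SIESTA
        }
    }

    with open("pre_config.json", "w") as f:
        json.dump(preprocessor, f, indent=4)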

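To make the ordering in point 2 concrete, here is a small loading sketch. The file name, the n_max value, and the assumption that per-atom blocks are simply concatenated in the order the atoms appear in the input .xyz are placeholders; only the (n, l, m) loop order is taken from point 2:

    import numpy as np

    # Placeholder file name and n_max; only the (n, l, m) loop order below is
    # the documented part. The per-atom block layout is assumed, not guaranteed.
    coeffs = np.load("pre_output.npy")
    n_max = 4

    nlm = []                                  # flat index -> (n, l, m) for one atom
    for n in range(1, n_max + 1):
        for l in range(n):
            for m in range(-l, l + 1):
                nlm.append((n, l, m))

    print(len(nlm), "coefficients per atom block")
    for idx, (n, l, m) in enumerate(nlm):
        print(idx, "->", (n, l, m))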