
Unified error calculation
nerkulec committed Jul 17, 2024
1 parent d59a57f commit 4139713
Showing 31 changed files with 804 additions and 881 deletions.
2 changes: 1 addition & 1 deletion docs/source/advanced_usage/hyperparameters.rst
@@ -114,7 +114,7 @@ a physical validation metric such as

.. code-block:: python
parameters.running.after_before_training_metric = "band_energy"
parameters.running.after_training_metric = "band_energy"
Advanced optimization algorithms
********************************
3 changes: 2 additions & 1 deletion docs/source/advanced_usage/predictions.rst
@@ -40,6 +40,8 @@ Likewise, you can adjust the inference temperature via
calculator.data_handler.target_calculator.temperature = ...
.. _production_gpu:

Predictions on GPU
*******************

@@ -137,4 +139,3 @@ With the exception of the electronic density, which is saved into the ``.cube``
format for visualization with regular electronic structure visualization
software, all of these observables can be plotted with Python based
visualization libraries such as ``matplotlib``.
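As a purely illustrative sketch of such a plot (the ``energies`` and ``dos`` arrays below are hypothetical placeholders for an observable obtained from a MALA prediction, not part of the MALA API):

.. code-block:: python

    import numpy as np
    import matplotlib.pyplot as plt

    # Placeholder data; in practice these arrays would come from a MALA prediction.
    energies = np.linspace(-5.0, 15.0, 200)             # energy grid in eV
    dos = np.exp(-0.5 * ((energies - 5.0) / 2.0) ** 2)  # dummy DOS curve

    plt.plot(energies, dos)
    plt.xlabel("Energy (eV)")
    plt.ylabel("DOS")
    plt.savefig("dos.png")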

16 changes: 8 additions & 8 deletions docs/source/advanced_usage/trainingmodel.rst
@@ -77,7 +77,7 @@ Specifically, when setting

.. code-block:: python
parameters.running.after_before_training_metric = "band_energy"
parameters.running.after_training_metric = "band_energy"
the error in the band energy between actual and predicted LDOS will be
calculated and printed before and after network training (in meV/atom).
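As a brief illustration of this setting (the surrounding setup of ``parameters``, data handler, network and trainer is assumed to follow the basic training example):

.. code-block:: python

    # Report the band energy error (in meV/atom) before and after training.
    parameters.running.after_training_metric = "band_energy"
    trainer.train_network()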
@@ -205,21 +205,21 @@ visualization prior to training via
# 0: No visualization, 1: loss and learning rate, 2: like 1,
# but additionally weights and biases are saved
parameters.running.visualisation = 1
parameters.running.visualisation_dir = "mala_vis"
parameters.running.logging = 1
parameters.running.logging_dir = "mala_vis"
where ``visualisation_dir`` specifies some directory in which to save the
MALA visualization data. Afterwards, you can run the training without any
where ``logging_dir`` specifies some directory in which to save the
MALA logging data. Afterwards, you can run the training without any
other modifications. Once training is finished (or during training, in case
you want to use tensorboard to monitor progress), you can launch tensorboard
via

.. code-block:: bash
tensorboard --logdir path_to_visualization
tensorboard --logdir path_to_log_directory
The full path for ``path_to_visualization`` can be accessed via
``trainer.full_visualization_path``.
The full path for ``path_to_log_directory`` can be accessed via
``trainer.full_logging_path``.
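For reference, a minimal sketch of this logging workflow (the training setup itself, i.e. ``parameters``, data handler, network and ``trainer``, is assumed to follow the basic training example):

.. code-block:: python

    # Log loss and learning rate into a directory of our choice.
    parameters.running.logging = 1
    parameters.running.logging_dir = "mala_vis"

    trainer.train_network()

    # The resolved directory to pass to "tensorboard --logdir" afterwards.
    print(trainer.full_logging_path)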


Training in parallel
4 changes: 2 additions & 2 deletions docs/source/basic_usage/hyperparameters.rst
@@ -118,9 +118,9 @@ properties of the ``Parameters`` class:
during the optimization.
- ``network.layer_sizes``
- ``"int"``, ``"categorical"``
* - ``"trainingtype"``
* - ``"optimizer"``
- Optimization algorithm used during the NN optimization.
- ``running.trainingtype``
- ``running.optimizer``
- ``"categorical"``
* - ``"mini_batch_size"``
- Size of the mini batches used to calculate the gradient during
2 changes: 1 addition & 1 deletion docs/source/basic_usage/trainingmodel.rst
@@ -35,7 +35,7 @@ options to train a simple network with example data, namely
parameters.running.max_number_epochs = 100
parameters.running.mini_batch_size = 40
parameters.running.learning_rate = 0.00001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"
parameters.verbosity = 1 # level of output; 1 is standard, 0 is low, 2 is debug.
Here, we can see that the ``Parameters`` object contains multiple
8 changes: 7 additions & 1 deletion docs/source/install/installing_lammps.rst
@@ -41,18 +41,24 @@ The MALA team recommends to build LAMMPS with ``cmake``. To do so
* ``Kokkos_ARCH_GPUARCH=???``: Your GPU architecture (see `Kokkos instructions <https://docs.lammps.org/Build_extras.html#kokkos-package>`_)
* ``CMAKE_CXX_COMPILER=???``: Path to the ``nvcc_wrapper`` executable
shipped with the LAMMPS code, should be at ``/your/path/to/lammps/lib/kokkos/bin/nvcc_wrapper``
* For example, this configures the LAMMPS cmake build with Kokkos support

For example, this configures the LAMMPS cmake build with Kokkos support
for an Intel Haswell CPU and an Nvidia Volta GPU, with MPI support:

.. code-block:: bash
cmake ../cmake -D PKG_KOKKOS=yes -D BUILD_MPI=yes -D PKG_ML-SNAP=yes -D Kokkos_ENABLE_CUDA=yes -D Kokkos_ARCH_HSW=yes -D Kokkos_ARCH_VOLTA70=yes -D CMAKE_CXX_COMPILER=/path/to/lammps/lib/kokkos/bin/nvcc_wrapper -D BUILD_SHARED_LIBS=yes
.. note::
When using a GPU by setting ``parameters.use_gpu = True``, you *need* to
have a GPU version of ``LAMMPS`` installed. See :ref:`production_gpu` for
details.

* Build the library and executable with ``cmake --build .``
(Add ``--parallel=8`` for a faster build)
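Putting these steps together, a sketch of the full build sequence might look as follows (the out-of-source ``build`` directory is an assumption; adjust paths and Kokkos architecture flags to your hardware):

.. code-block:: bash

    # Illustrative only: cmake build of LAMMPS with Kokkos/CUDA and ML-SNAP.
    cd /path/to/lammps
    mkdir build && cd build
    cmake ../cmake -D PKG_KOKKOS=yes -D BUILD_MPI=yes -D PKG_ML-SNAP=yes \
        -D Kokkos_ENABLE_CUDA=yes -D Kokkos_ARCH_HSW=yes -D Kokkos_ARCH_VOLTA70=yes \
        -D CMAKE_CXX_COMPILER=/path/to/lammps/lib/kokkos/bin/nvcc_wrapper \
        -D BUILD_SHARED_LIBS=yes
    cmake --build . --parallel=8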



Installing the Python extension
********************************

23 changes: 15 additions & 8 deletions docs/source/install/installing_qe.rst
@@ -4,24 +4,25 @@ Installing Quantum ESPRESSO (total energy module)
Prerequisites
*************

To run the total energy module, you need a full Quantum ESPRESSO installation,
for which to install the Python bindings. This module has been tested with
version ``7.2.``, the most recent version at the time of this release of MALA.
Newer versions may work (untested), but installation instructions may vary.
To build and run the total energy module, you need a full Quantum ESPRESSO
installation, for which to install the Python bindings. This module has been
tested with version ``7.2.``, the most recent version at the time of this
release of MALA. Newer versions may work (untested), but installation
instructions may vary.

Make sure you have an (MPI-aware) F90 compiler such as ``mpif90`` (e.g.
Debian-ish machine: ``apt install openmpi-bin``, on an HPC cluster something
like ``module load openmpi gcc``). Make sure to use the same compiler
for QE and the extension. This should be the default case, but if problems
arise you can manually select the compiler via
``--f90exec=`` in ``build_total_energy_energy_module.sh``
``--f90exec=`` in ``build_total_energy_module.sh``

We assume that QE's ``configure`` script will find your system libs, e.g. use
``-lblas``, ``-llapack`` and ``-lfftw3``. We use those by default in
``build_total_energy_energy_module.sh``. If you have, say, the MKL library,
``build_total_energy_module.sh``. If you have, say, the MKL library,
you may see ``configure`` use something like ``-lmkl_intel_lp64 -lmkl_sequential -lmkl_core``
when building QE. In this case you have to modify
``build_total_energy_energy_module.sh`` to use the same libraries!
``build_total_energy_module.sh`` to use the same libraries!

Build Quantum ESPRESSO
**********************
@@ -35,10 +36,16 @@ Build Quantum ESPRESSO
* Change to the ``external_modules/total_energy_module`` directory of the
MALA repository

.. note::
At the moment, building QE using ``cmake`` `doesn't work together with the
build_total_energy_module.sh script
<https://github.com/mala-project/mala/issues/468>`_. Please use the
``configure`` + ``make`` build workflow.
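As a rough sketch, the ``configure`` + ``make`` route could look like this (the exact ``configure`` options and ``make`` target depend on your system and are assumptions here):

.. code-block:: bash

    # Illustrative only: build QE with its own configure script, then build
    # the Python extension as described in the next section.
    cd /path/to/your/q-e
    ./configure
    make pw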

Installing the Python extension
********************************

* Run ``build_total_energy_energy_module.sh /path/to/your/q-e``.
* Run ``build_total_energy_module.sh /path/to/your/q-e``.

* If the build is successful, a file named something like
``total_energy.cpython-39m-x86_64-linux-gnu.so`` will be generated. This is
2 changes: 1 addition & 1 deletion examples/advanced/ex01_checkpoint_training.py
@@ -26,7 +26,7 @@ def initial_setup():
parameters.running.max_number_epochs = 9
parameters.running.mini_batch_size = 8
parameters.running.learning_rate = 0.00001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"

# We checkpoint the training every 5 epochs and save the results
# as "ex07".
4 changes: 2 additions & 2 deletions examples/advanced/ex03_tensor_board.py
@@ -18,7 +18,7 @@
parameters.running.max_number_epochs = 100
parameters.running.mini_batch_size = 40
parameters.running.learning_rate = 0.001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"

# Turn the visualization on and select a folder to save the visualization
# files into.
@@ -45,6 +45,6 @@
trainer.train_network()
printout(
'Run finished, launch tensorboard with "tensorboard --logdir '
+ trainer.full_visualization_path
+ trainer.full_logging_path
+ '"'
)
@@ -21,7 +21,7 @@ def initial_setup():
parameters.running.max_number_epochs = 10
parameters.running.mini_batch_size = 40
parameters.running.learning_rate = 0.00001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"
parameters.hyperparameters.n_trials = 9
parameters.hyperparameters.checkpoints_each_trial = 5
parameters.hyperparameters.checkpoint_name = "ex05_checkpoint"
@@ -28,7 +28,7 @@
parameters.running.max_number_epochs = 5
parameters.running.mini_batch_size = 40
parameters.running.learning_rate = 0.00001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"
parameters.hyperparameters.n_trials = 10
parameters.hyperparameters.checkpoints_each_trial = -1
parameters.hyperparameters.checkpoint_name = "ex06"
@@ -44,7 +44,7 @@
parameters.targets.ldos_gridspacing_ev = 2.5
parameters.targets.ldos_gridoffset_ev = -5
parameters.hyperparameters.number_training_per_trial = 3
parameters.running.after_before_training_metric = "band_energy"
parameters.running.after_training_metric = "band_energy"

data_handler = mala.DataHandler(parameters)

@@ -21,7 +21,7 @@ def optimize_hyperparameters(hyper_optimizer):
parameters.running.max_number_epochs = 10
parameters.running.mini_batch_size = 40
parameters.running.learning_rate = 0.00001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"
parameters.hyperparameters.n_trials = 8
parameters.hyperparameters.hyper_opt_method = hyper_optimizer

@@ -64,7 +64,7 @@ def optimize_hyperparameters(hyper_optimizer):
data_handler.output_dimension,
]
hyperoptimizer.add_hyperparameter(
"categorical", "trainingtype", choices=["Adam", "SGD"]
"categorical", "optimizer", choices=["Adam", "SGD"]
)
hyperoptimizer.add_hyperparameter(
"categorical", "layer_activation_00", choices=["ReLU", "Sigmoid"]
2 changes: 1 addition & 1 deletion examples/basic/ex01_train_network.py
@@ -28,7 +28,7 @@
parameters.running.max_number_epochs = 100
parameters.running.mini_batch_size = 40
parameters.running.learning_rate = 0.00001
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"
# These parameters characterize how the LDOS and bispectrum descriptors
# were calculated. They are _technically_ not needed to train a simple
# network. However, it is useful to define them prior to training. Then,
6 changes: 3 additions & 3 deletions examples/basic/ex02_test_network.py
@@ -21,15 +21,15 @@
# It is recommended to enable the "lazy-loading" feature, so that
# data is loaded into memory one snapshot at a time during testing - this
# helps keep RAM requirement down. Furthermore, you have to decide which
# observables to test (usual choices are "band_energy", "total_energy" and
# "number_of_electrons") and whether you want the results per snapshot
# observables to test (usual choices are "band_energy", "total_energy")
# and whether you want the results per snapshot
# (output_format="list") or as an averaged value (output_format="mae")
####################

parameters, network, data_handler, tester = mala.Tester.load_run(
run_name=model_name, path=model_path
)
tester.observables_to_test = ["band_energy", "number_of_electrons"]
tester.observables_to_test = ["band_energy", "density"]
tester.output_format = "list"
parameters.data.use_lazy_loading = True

2 changes: 1 addition & 1 deletion examples/basic/ex04_hyperparameter_optimization.py
@@ -22,7 +22,7 @@
parameters.data.output_rescaling_type = "normal"
parameters.running.max_number_epochs = 20
parameters.running.mini_batch_size = 40
parameters.running.trainingtype = "Adam"
parameters.running.optimizer = "Adam"
parameters.hyperparameters.n_trials = 20

####################
