Update the hyperopt doc and script
1. Fix typos
2. Improve the warning message display
3. Add authors
tautomer committed Sep 19, 2024
1 parent 54a5bc3 commit 4635c4c
Showing 3 changed files with 42 additions and 30 deletions.
61 changes: 33 additions & 28 deletions docs/source/examples/hyperopt.rst
@@ -1,8 +1,8 @@
Hyperparameter optimization with Ax and Ray
===========================================

-Here is an example on how you can perform hyperparameter optimization
-sequentially (with Ax) or parallelly (with Ax and Ray).
+Here is an example of how you can perform hyperparameter optimization
+sequentially (with Ax) or in parallel (with Ax and Ray).

Prerequisites
-------------
@@ -31,12 +31,13 @@ Typical workflow
----------------

Ax is a package that can perform Bayesian optimization. With the given parameter
-range, a set of initial trials are generated. Then based on the metrics returned
+range, a set of initial trials is generated. Then based on the metrics returned
from these trials, new test parameters are generated. By default, this Ax
workflow can only be performed sequentially. We can combine Ray and Ax to
-utilize multiple GPU on the same node. Ray interfaces with Ax to pull trial
+utilize multiple GPUs on the same node. Ray interfaces with Ax to pull trial
parameters and then automatically distribute the trials to available resources.
-With this, we can perform asynchronous parallelized hyperparameter optimization.
+With this, we can perform an asynchronous parallelized hyperparameter
+optimization.
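
As a rough map of how these pieces fit together, here is a minimal sketch of
the combined Ax/Ray pattern described above. The experiment definition, the
`trainable` function, and the concurrency limit are illustrative assumptions,
not code from this repository:

.. code-block:: python

    from ax.service.ax_client import AxClient
    from ax.service.utils.instantiation import ObjectiveProperties
    from ray import tune
    from ray.tune.search.ax import AxSearch
    from ray.tune.search import ConcurrencyLimiter

    def trainable(config):
        # Stand-in for a real training run; returns the metric Ax optimizes.
        return {"Metric": (config["lr"] - 1e-3) ** 2}

    ax_client = AxClient()
    ax_client.create_experiment(
        name="sketch",
        parameters=[{"name": "lr", "type": "range", "bounds": [1e-5, 1e-1], "log_scale": True}],
        objectives={"Metric": ObjectiveProperties(minimize=True)},
    )

    # Ray pulls trial parameters from Ax and schedules trials onto free resources.
    algo = ConcurrencyLimiter(AxSearch(ax_client=ax_client), max_concurrent=4)
    tuner = tune.Tuner(trainable, tune_config=tune.TuneConfig(search_alg=algo, num_samples=8))
    tuner.fit()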


Create an Ax experiment
@@ -93,7 +94,7 @@ A few crucial things to note:
* You can give a range, choice, or fixed value to each parameter. You might want
to specify the data type as well. A fixed parameter makes sense here because
you can do the optimization with only a subset of parameters without the need
-of modifying your training function.
+to modify your training function.
* Constraints can be applied to the search space like the example shows, but
there is no easy way to achieve a constraint that contains mathematical
expressions (for example, `parameter_a < 2 * parameter_b`).
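
To make those points concrete, here is a hedged sketch of an experiment
definition with range, choice, and fixed parameters plus one linear
constraint; the parameter names and bounds are illustrative, not taken from
the example scripts:

.. code-block:: python

    from ax.service.ax_client import AxClient
    from ax.service.utils.instantiation import ObjectiveProperties

    ax_client = AxClient()
    ax_client.create_experiment(
        name="hyperopt_sketch",
        parameters=[
            # Range parameter searched on a log scale, with an explicit type.
            {"name": "learning_rate", "type": "range", "bounds": [1e-5, 1e-2],
             "value_type": "float", "log_scale": True},
            # Two integer range parameters that can appear in a constraint.
            {"name": "n_features", "type": "range", "bounds": [10, 120], "value_type": "int"},
            {"name": "n_layers", "type": "range", "bounds": [1, 4], "value_type": "int"},
            # A choice parameter.
            {"name": "activation", "type": "choice", "values": ["relu", "tanh"]},
            # A fixed parameter: part of the space, never varied.
            {"name": "batch_size", "type": "fixed", "value": 64},
        ],
        # Constraints are plain linear inequalities over range parameters.
        parameter_constraints=["n_features + n_layers <= 120"],
        objectives={"Metric": ObjectiveProperties(minimize=True)},
    )
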
@@ -128,7 +129,7 @@ Training function
^^^^^^^^^^^^^^^^^

You only need a minimal change to your existing training script to use it with
-Ax. In most case, you just have to wrap the whole script into a function
+Ax. In most cases, you just have to wrap the whole script into a function

.. code-block:: python
@@ -161,13 +162,13 @@ Ax. In most case, you just have to wrap the whole script into a function
Note how we can utilize the parameters passed in and return **Metric** at the
end. Apparently, we have the freedom to choose different metrics to return here.
-We can even use mathematically expressions to combine some metrics together.
+We can even use mathematical expressions to combine some metrics together.

.. note::
Ax does NOT create a directory for a trial. If your training function does
not take care of the working directory, all results will be saved into the
-same folder, i.e., `cwd`. To avoid this, the training function need create an
-unique path for each trial. In this example, we use the `trial_index` to
+same folder, i.e., `cwd`. To avoid this, the training function needs to create
+a unique path for each trial. In this example, we use the `trial_index` to
achieve this purpose. With Ray, this step is NOT necessary.
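
A bare-bones shape for such a training function is sketched below;
`run_training` is a hypothetical stand-in for the real model construction and
fitting code:

.. code-block:: python

    import os

    def run_training(**parameters):
        # Hypothetical placeholder for building and training the model.
        return 0.0

    def training(trial_index, **parameters):
        cwd = os.getcwd()
        # Give every trial its own working directory so outputs do not collide.
        trial_dir = os.path.join(cwd, "test_ax", str(trial_index))
        os.makedirs(trial_dir, exist_ok=True)
        os.chdir(trial_dir)
        try:
            metric = run_training(**parameters)
        finally:
            os.chdir(cwd)
        # Key the returned dict by the objective name used in create_experiment.
        return {"Metric": metric}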

.. _run-sequential-experiments:
@@ -182,16 +183,16 @@ Next, we can run the experiments
for k in range(30):
parameter, trial_index = ax_client.get_next_trial()
ax_client.complete_trial(trial_index=trial_index, raw_data=training(trial_index, **parameter))
-# Save experiment to file as JSON file
+# Save the experiment as a JSON file
ax_client.save_to_json_file(filepath="hyperopt.json")
data_frame = ax_client.get_trials_data_frame().sort_values("Metric")
data_frame.to_csv("hyperopt.csv", header=True)
For example, we will run 30 trials here and the results will be saved into a
-json file and a csv file. The json file will contain all the details of the
-trials, which can be used to restart the experiments or add additional
-experiments. As it contains too many details to be human-friendly, we save a
-more human-friendly csv that only contains the trial indices, parameters, and
+json file and a CSV file. The JSON file will contain all the details of the
+trials, which can be used to restart the experiment or add additional trials to
+the experiment. As it contains too many details to be human-friendly, we save a
+more human-friendly CSV that only contains the trial indices, parameters, and
metrics.
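
Once the loop finishes, the same client can report the best point found so
far; a small sketch using the service API, assuming the single objective is
named `Metric`:

.. code-block:: python

    best_parameters, values = ax_client.get_best_parameters()
    print("Best parameters:", best_parameters)
    if values is not None:
        means, covariances = values
        print("Predicted objective:", means["Metric"])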

Asynchronous parallelized optimization with Ray
@@ -268,7 +269,7 @@ Advanced details
Relative import
"""""""""""""""

-If you save the training function into a separated file and import it into the
+If you save the training function into a separate file and import it into the
Ray script, one line has to be added before the trials start,

.. code-block:: python
@@ -285,26 +286,26 @@ Callbacks for Ray
When running `ray.tune`, a set of callback functions can be called during the
process. Ray has a `documentation`_ on the callback functions. You can build
your own for your convenience. However, here is a callback function to save
-the json and csv files at the end of each trial and handle failed trials, which
+the JSON and CSV files at the end of each trial and handle failed trials, which
should cover the most basic functionalities.
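
A callback like the `AxLogger` defined below is attached through Ray's run
configuration. As a hedged usage sketch (the `RunConfig` import path differs
between Ray versions, and `trainable`/`algo` are as in the earlier sketch):

.. code-block:: python

    from ray import tune
    from ray.train import RunConfig  # in some Ray versions: ray.air.RunConfig

    tuner = tune.Tuner(
        trainable,
        tune_config=tune.TuneConfig(search_alg=algo, num_samples=30),
        run_config=RunConfig(callbacks=[AxLogger(ax_client, "hyperopt.json", "hyperopt.csv")]),
    )
    tuner.fit()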

.. code-block:: python
from ray.tune.logger import JsonLoggerCallback, LoggerCallback
class AxLogger(LoggerCallback):
def __init__(self, ax_client: AxClient, json_name: str, csv_name: str):
"""
-A logger callback to save the progress to json file after every trial ends.
+A logger callback to save the progress to a JSON file after every trial ends.
Similar to running `ax_client.save_to_json_file` every iteration in sequential
searches.
Args:
ax_client (AxClient): ax client to save
-json_name (str): name for the json file. Append a path if you want to save the \
-json file to somewhere other than cwd.
-csv_name (str): name for the csv file. Append a path if you want to save the \
-csv file to somewhere other than cwd.
+json_name (str): name for the JSON file. Append a path if you want to save the \
+JSON file to somewhere other than cwd.
+csv_name (str): name for the CSV file. Append a path if you want to save the \
+CSV file to somewhere other than cwd.
"""
self.ax_client = ax_client
self.json = json_name
@@ -383,11 +384,11 @@ Restart/extend an experiment
""""""""""""""""""""""""""""

.. note::
-Due to the complexity in handling the individual trial path with Ray, it is
+Due to the complexity of handling the individual trial path with Ray, it is
not possible to restart unfinished trials at this moment.

-Restarting an experiment or adding additional trials to an experiment share the
-same workflow. The key is the json file saved from the experiment. To reload the
+Restarting an experiment or adding additional trials to an experiment shares the
+same workflow. The key is the JSON file saved from the experiment. To reload the
experiment state:

.. code-block:: python
@@ -403,7 +404,7 @@ this experiment, simply increase `num_samples` in `ray.tune.TuneConfig`:
# this will end the experiment when 20 trials are finished
tune_config=tune.TuneConfig(search_alg=algo, num_samples=20)
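
Putting the pieces together, a hedged end-to-end sketch of reloading and
extending an experiment (`trainable` is again an assumed stand-in for the
real training function):

.. code-block:: python

    from ax.service.ax_client import AxClient
    from ray import tune
    from ray.tune.search.ax import AxSearch

    # Restore the experiment state saved by the earlier run.
    ax_client = AxClient.load_from_json_file(filepath="hyperopt.json")
    algo = AxSearch(ax_client=ax_client)
    tuner = tune.Tuner(
        trainable,
        tune_config=tune.TuneConfig(search_alg=algo, num_samples=20),
    )
    tuner.fit()
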
-Sometime, you may want to make changes to the experiment itself when reloading
+Sometimes, you may want to make changes to the experiment itself when reloading
the experiment, for example, the search space. This can easily achieved by

.. code-block:: python
@@ -441,10 +442,14 @@ after the `ax_client` object is reloaded.
If the original experiment is not created with this option, there is not much
we can do.

-The example scripts with a modified QM7 training are provided in `examples`_.
+The example scripts with a modified QM7 training script are provided in
+`examples`_. This tutorial is contributed by `Xinyang Li`_ and the examples
+scripts are developed by `Sakib Matin`_ and `Xinyang Li`_.

.. _ray: https://docs.ray.io/en/latest/
.. _Ax: https://github.com/facebook/Ax
.. _issue: https://github.com/facebook/Ax/issues/2711
.. _documentation: https://docs.ray.io/en/latest/tune/tutorials/tune-metrics.html
.. _examples: https://github.com/lanl/hippynn/tree/development/examples/hyperparameter_optimization
+.. _Xinyang Li: https://github.com/tautomer
+.. _Sakib Matin: https://github.com/sakibmatin
7 changes: 6 additions & 1 deletion examples/hyperparameter_optimization/ax_opt.py
@@ -19,6 +19,7 @@
"""
B-Opt tuning for HIPNN using AX.
+Originally developed by Sakib Matin (LANL) and modified by Xinyang Li (LANL).
"""

import json
@@ -31,7 +32,11 @@
import hippynn

warnings.warn(
"""Make sure to modify the dataset path in QM7_ax_example.py before running this example. For this test (Ax sequential optimization), a relative path can be used. The work directory for each trial will be ./test_ax/index. If the dataset is in ./dataset, the relative path should be ../../dataset. However, an absolute path is strongly recommended."""
"\nMake sure to modify the dataset path in QM7_ax_example.py before running this example.\n"
"For this test (Ax sequential optimization), a relative path can be used.\n"
"The work directory for each trial will be ./test_ax/index.\n"
"If the dataset is in ./dataset, the relative path should be ../../dataset.\n"
"However, an absolute path is strongly recommended."
)


4 changes: 3 additions & 1 deletion examples/hyperparameter_optimization/ax_opt_ray.py
@@ -22,6 +22,7 @@
"""
Hyperparameter tuning for HIPNN using AX and Ray.
+Developed by Xinyang Li (LANL) based on the Ax script.
"""

import shutil
@@ -141,7 +142,8 @@ def calculate_runtime(cls, trial: AXTrial):
if __name__ == "__main__":

warnings.warn(
"""Make sure to modify the dataset path in QM7_ax_example.py before running this example. For this test (Ray parallelized optimization), you MUST provide an absolute path to the dataset"""
"\nMake sure to modify the dataset path in QM7_ax_example.py before running this example.\n"
"For this test (Ray parallelized optimization), you MUST provide an absolute path to the dataset."
)

if len(sys.argv) == 2:
