diff --git a/docs/source/examples/hyperopt.rst b/docs/source/examples/hyperopt.rst
index 7846ac54..80fcb7c1 100644
--- a/docs/source/examples/hyperopt.rst
+++ b/docs/source/examples/hyperopt.rst
@@ -1,8 +1,8 @@
 Hyperparameter optimization with Ax and Ray
 ===========================================
 
-Here is an example on how you can perform hyperparameter optimization
-sequentially (with Ax) or parallelly (with Ax and Ray).
+Here is an example of how you can perform hyperparameter optimization
+sequentially (with Ax) or in parallel (with Ax and Ray).
 
 Prerequisites
 -------------
@@ -31,12 +31,13 @@ Typical workflow
 ----------------
 
 Ax is a package that can perform Bayesian optimization. With the given parameter
-range, a set of initial trials are generated. Then based on the metrics returned
+range, a set of initial trials is generated. Then, based on the metrics returned
 from these trials, new test parameters are generated. By default, this Ax
 workflow can only be performed sequentially. We can combine Ray and Ax to
-utilize multiple GPU on the same node. Ray interfaces with Ax to pull trial
-parameters and then automatically distribute the trials to available resources.
-With this, we can perform asynchronous parallelized hyperparameter optimization.
+utilize multiple GPUs on the same node. Ray interfaces with Ax to pull trial
+parameters and then automatically distributes the trials to available resources.
+With this, we can perform asynchronous, parallelized hyperparameter
+optimization.
 
 
 Create an Ax experiment
@@ -93,7 +94,7 @@ A few crucial things to note:
 * You can give a range, choice, or fixed value to each parameter. You might
   want to specify the data type as well. A fixed parameter makes sense here
   because you can do the optimization with only a subset of parameters without the need
-  of modifying your training function.
+  to modify your training function.
 * Constraints can be applied to the search space like the example shows, but
   there is no easy way to achieve a constraint that contains mathematical
   expressions (for example, `parameter_a < 2 * parameter_b`).
@@ -128,7 +129,7 @@ Training function
 ^^^^^^^^^^^^^^^^^
 
 You only need a minimal change to your existing training script to use it with
-Ax. In most case, you just have to wrap the whole script into a function
+Ax. In most cases, you just have to wrap the whole script into a function
 
 .. code-block:: python
 
@@ -161,13 +162,13 @@ Ax. In most case, you just have to wrap the whole script into a function
 Note how we can utilize the parameters passed in and return **Metric** at the
-end. Apparently, we have the freedom to choose different metrics to return here.
-We can even use mathematically expressions to combine some metrics together.
+end. Naturally, we have the freedom to choose different metrics to return here.
+We can even use mathematical expressions to combine some metrics together.
 
 .. note::
 
    Ax does NOT create a directory for a trial. If your training function does
   not take care of the working directory, all results will be saved into the
-   same folder, i.e., `cwd`. To avoid this, the training function need create an
-   unique path for each trial. In this example, we use the `trial_index` to
+   same folder, i.e., `cwd`. To avoid this, the training function needs to create
+   a unique path for each trial. In this example, we use the `trial_index` to
    achieve this purpose. With Ray, this step is NOT necessary.
 
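The bullets above describe the parameter specification without showing the call
that consumes it. Below is a minimal sketch of `AxClient.create_experiment` for
orientation; the experiment name, parameter names, bounds, and the constraint
are illustrative placeholders, not the tutorial's actual search space.

.. code-block:: python

    from ax.service.ax_client import AxClient
    from ax.service.utils.instantiation import ObjectiveProperties

    ax_client = AxClient()
    ax_client.create_experiment(
        name="hyperopt_example",  # hypothetical experiment name
        parameters=[
            # range parameter with an explicit data type and log-scale sampling
            {"name": "learning_rate", "type": "range", "bounds": [1e-4, 1e-1],
             "value_type": "float", "log_scale": True},
            {"name": "width", "type": "range", "bounds": [16, 64], "value_type": "int"},
            {"name": "depth", "type": "range", "bounds": [1, 4], "value_type": "int"},
            # choice parameter: picked from a discrete set of options
            {"name": "activation", "type": "choice", "values": ["tanh", "softplus"]},
            # fixed parameter: the tuned subset changes, the training function does not
            {"name": "batch_size", "type": "fixed", "value": 64},
        ],
        # "Metric" must match a key in the dict returned by the training function
        objectives={"Metric": ObjectiveProperties(minimize=True)},
        # only simple linear constraints over range parameters work as strings
        parameter_constraints=["width + depth <= 60"],
    )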
 .. _run-sequential-experiments:
@@ -182,16 +183,16 @@ Next, we can run the experiments
     for k in range(30):
         parameter, trial_index = ax_client.get_next_trial()
         ax_client.complete_trial(trial_index=trial_index, raw_data=training(trial_index, **parameter))
-    # Save experiment to file as JSON file
+    # Save the experiment as a JSON file
     ax_client.save_to_json_file(filepath="hyperopt.json")
     data_frame = ax_client.get_trials_data_frame().sort_values("Metric")
     data_frame.to_csv("hyperopt.csv", header=True)
 
 For example, we will run 30 trials here and the results will be saved into a
-json file and a csv file. The json file will contain all the details of the
-trials, which can be used to restart the experiments or add additional
-experiments. As it contains too many details to be human-friendly, we save a
-more human-friendly csv that only contains the trial indices, parameters, and
+JSON file and a CSV file. The JSON file will contain all the details of the
+trials, which can be used to restart the experiment or add additional trials to
+the experiment. As it contains too many details to be human-friendly, we save a
+more human-friendly CSV that only contains the trial indices, parameters, and
 metrics.
 
 Asynchronous parallelized optimization with Ray
@@ -268,7 +269,7 @@ Advanced details
 Relative import
 """""""""""""""
 
-If you save the training function into a separated file and import it into the
+If you save the training function into a separate file and import it into the
 Ray script, one line has to be added before the trials start,
 
 .. code-block:: python
@@ -285,7 +286,7 @@ Callbacks for Ray
 When running `ray.tune`, a set of callback functions can be called during the
-process. Ray has a `documentation`_ on the callback functions. You can build
+process. Ray has `documentation`_ on the callback functions. You can build
 your own for your convenience. However, here is a callback function to save
-the json and csv files at the end of each trial and handle failed trials, which
+the JSON and CSV files at the end of each trial and handle failed trials, which
 should cover the most basic functionalities.
 
 .. code-block:: python
@@ -293,18 +294,18 @@ should cover the most basic functionalities.
     from ray.tune.logger import JsonLoggerCallback, LoggerCallback
 
     class AxLogger(LoggerCallback):
         def __init__(self, ax_client: AxClient, json_name: str, csv_name: str):
             """
-            A logger callback to save the progress to json file after every trial ends.
+            A logger callback to save the progress to a JSON file after every trial ends.
             Similar to running `ax_client.save_to_json_file` every iteration in sequential
             searches.
 
             Args:
                 ax_client (AxClient): ax client to save
-                json_name (str): name for the json file. Append a path if you want to save the \
-                    json file to somewhere other than cwd.
-                csv_name (str): name for the csv file. Append a path if you want to save the \
-                    csv file to somewhere other than cwd.
+                json_name (str): name for the JSON file. Append a path if you want to save the \
+                    JSON file to somewhere other than cwd.
+                csv_name (str): name for the CSV file. Append a path if you want to save the \
+                    CSV file to somewhere other than cwd.
             """
             self.ax_client = ax_client
             self.json = json_name
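Since the diff only shows the logger's `__init__`, it may help to see how the
pieces assemble. The sketch below wires the `AxClient`, the `AxSearch`
algorithm, and the callback into a `tune.Tuner`; the `training` function and
the file names are placeholders, and the import location of `RunConfig` varies
between Ray releases, so treat this as orientation rather than the tutorial's
exact script.

.. code-block:: python

    from ray import tune
    from ray.tune.search.ax import AxSearch

    # ax_client and AxLogger as defined in the tutorial
    algo = AxSearch(ax_client=ax_client)
    logger = AxLogger(ax_client, "hyperopt_ray.json", "hyperopt_ray.csv")

    tuner = tune.Tuner(
        # reserve one GPU per trial so Ray can schedule trials in parallel
        tune.with_resources(training, resources={"gpu": 1}),
        tune_config=tune.TuneConfig(search_alg=algo, num_samples=10),
        # RunConfig may live under ray.air or ray.train in other releases
        run_config=tune.RunConfig(callbacks=[logger]),
    )
    tuner.fit()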
@@ -383,11 +384,11 @@ Restart/extend an experiment
 """"""""""""""""""""""""""""
 
 .. note::
-   Due to the complexity in handling the individual trial path with Ray, it is
+   Due to the complexity of handling the individual trial path with Ray, it is
    not possible to restart unfinished trials at this moment.
 
-Restarting an experiment or adding additional trials to an experiment share the
-same workflow. The key is the json file saved from the experiment. To reload the
+Restarting an experiment or adding additional trials to an experiment shares the
+same workflow. The key is the JSON file saved from the experiment. To reload the
 experiment state:
 
 .. code-block:: python
@@ -403,7 +404,7 @@ this experiment, simply increase `num_samples` in `ray.tune.TuneConfig`:
     # this will end the experiment when 20 trials are finished
     tune_config=tune.TuneConfig(search_alg=algo, num_samples=20)
 
-Sometime, you may want to make changes to the experiment itself when reloading
-the experiment, for example, the search space. This can easily achieved by
+Sometimes, you may want to make changes to the experiment itself when reloading
+the experiment, for example, the search space. This can easily be achieved by
 
 .. code-block:: python
@@ -441,10 +442,14 @@ after the `ax_client` object is reloaded. If the original experiment is not
 created with this option, there is not much we can do.
 
-The example scripts with a modified QM7 training are provided in `examples`_.
+The example scripts with a modified QM7 training script are provided in
+`examples`_. This tutorial was contributed by `Xinyang Li`_, and the example
+scripts were developed by `Sakib Matin`_ and `Xinyang Li`_.
 
 .. _ray: https://docs.ray.io/en/latest/
 .. _Ax: https://github.com/facebook/Ax
 .. _issue: https://github.com/facebook/Ax/issues/2711
 .. _documentation: https://docs.ray.io/en/latest/tune/tutorials/tune-metrics.html
 .. _examples: https://github.com/lanl/hippynn/tree/development/examples/hyperparameter_optimization
+.. _Xinyang Li: https://github.com/tautomer
+.. _Sakib Matin: https://github.com/sakibmatin
 
diff --git a/examples/hyperparameter_optimization/ax_opt.py b/examples/hyperparameter_optimization/ax_opt.py
index c2260712..08317f8a 100644
--- a/examples/hyperparameter_optimization/ax_opt.py
+++ b/examples/hyperparameter_optimization/ax_opt.py
@@ -19,6 +19,7 @@
 
 """
 B-Opt tuning for HIPNN using AX.
+Originally developed by Sakib Matin (LANL) and modified by Xinyang Li (LANL).
 """
 
 import json
@@ -31,7 +32,11 @@
 
 import hippynn
 
 warnings.warn(
-    """Make sure to modify the dataset path in QM7_ax_example.py before running this example. For this test (Ax sequential optimization), a relative path can be used. The work directory for each trial will be ./test_ax/index. If the dataset is in ./dataset, the relative path should be ../../dataset. However, an absolute path is strongly recommended."""
+    "\nMake sure to modify the dataset path in QM7_ax_example.py before running this example.\n"
+    "For this test (Ax sequential optimization), a relative path can be used.\n"
+    "The work directory for each trial will be ./test_ax/index.\n"
+    "If the dataset is in ./dataset, the relative path should be ../../dataset.\n"
+    "However, an absolute path is strongly recommended."
 )
diff --git a/examples/hyperparameter_optimization/ax_opt_ray.py b/examples/hyperparameter_optimization/ax_opt_ray.py
index 76b34944..a5acb324 100644
--- a/examples/hyperparameter_optimization/ax_opt_ray.py
+++ b/examples/hyperparameter_optimization/ax_opt_ray.py
@@ -22,6 +22,7 @@
 
 """
 Hyperparameter tuning for HIPNN using AX and Ray.
+Developed by Xinyang Li (LANL) based on the Ax script.
 """
 
 import shutil
@@ -141,7 +142,8 @@ def calculate_runtime(cls, trial: AXTrial):
 
 
 if __name__ == "__main__":
     warnings.warn(
-        """Make sure to modify the dataset path in QM7_ax_example.py before running this example. For this test (Ray parallelized optimization), you MUST provide an absolute path to the dataset"""
+        "\nMake sure to modify the dataset path in QM7_ax_example.py before running this example.\n"
+        "For this test (Ray parallelized optimization), you MUST provide an absolute path to the dataset."
     )
     if len(sys.argv) == 2:
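One piece the diff leaves as unchanged context is the actual reload call for
the restart workflow. A minimal sketch, assuming the `hyperopt.json` file saved
earlier in the tutorial, could look like the following; `AxSearch` and
`TuneConfig` are only needed when extending the experiment through Ray.

.. code-block:: python

    from ax.service.ax_client import AxClient
    from ray import tune
    from ray.tune.search.ax import AxSearch

    # restore the experiment state written by ax_client.save_to_json_file
    ax_client = AxClient.load_from_json_file(filepath="hyperopt.json")

    # hand the reloaded client to Ray; num_samples counts trials in total,
    # so 20 here ends the experiment once 20 trials have finished
    algo = AxSearch(ax_client=ax_client)
    tune_config = tune.TuneConfig(search_alg=algo, num_samples=20)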