Merge pull request #362 from autonomio/daily-dev

v.0.6.2 to Dev
autonomio · Aug 3, 2019 · 4bb792a · 4bb792a
2 parents 8f38a4a + 1f7cc71
commit 4bb792a

Showing 95 changed files with 6,014 additions and 1,281 deletions.
diff --git a/README.md b/README.md
@@ -61,9 +61,9 @@ Based on what no doubt constitutes a "biased" review (being our own) of more tha
 - model generalization evaluator
 - experiment analytics
 - Random search
+- Pseudo, Quasi, and Quantum Random optimizers
 - Grid search
-- Correlation based optimization
-- Pseudo, Quasi, and Quantum Random functions
+- Probabilistic optimization
 - Model candidate generality evaluation
 - Live training monitor
 - Experiment analytics

diff --git a/test/core_tests/__init__.py → docs/.nojekyll b/test/core_tests/__init__.py → docs/.nojekyll
diff --git a/docs/Analyze.md b/docs/Analyze.md
@@ -0,0 +1,61 @@
+# Analyze (previously Reporting)
+
+The experiment results can be analyzed through the [Analyze()](https://github.com/autonomio/talos/blob/master/talos/utils/reporting.py) utility. `Analyze()` may be used after Scan completes, or during an experiment (from a different shell / kernel).
+
+## Analyze Use
+
+```python
+r = Reporting('experiment_log.csv')
+
+# returns the results dataframe
+r.data
+
+# returns the highest value for 'val_fmeasure'
+r.high('val_fmeasure')
+
+# returns the number of rounds it took to find best model
+r.rounds2high()
+
+# draws a histogram for 'val_acc'
+r.plot_hist('val_acc')
+```
+
+`Reporting` works by loading the experiment log .csv file, which is saved locally as part of the experiment. The filename can be changed through the `dataset_name` and `experiment_no` arguments of `Scan()`.
+
+## Analyze Arguments
+
+`Analyze()` takes a single argument, `source`. This can be either the .csv file that results from `Scan()` or the class object returned by `Scan()`.
+
+The `Analyze` class object contains several useful properties.
+
+## Analyze Properties
+
+See docstrings for each function for a more detailed description.
+
+**`high`** The highest result for a given metric
+
+**`rounds`**  The number of rounds in the experiment
+
+**`rounds2high`** The number of rounds it took to get highest result
+
+**`low`** The lowest result for a given metric
+
+**`correlate`** A dataframe with Spearman correlation against a given metric
+
+**`plot_line`** A round-by-round line graph for a given metric
+
+**`plot_hist`** A histogram for a given metric where each observation is a permutation
+
+**`plot_corr`** A correlation heatmap where a single metric is compared against hyperparameters
+
+**`plot_regs`** A regression plot with data on two axes
+
+**`plot_box`** A box plot with data on two axes
+
+**`plot_bars`** A bar chart that allows up to 4 axes of data to be shown at once
+
+**`plot_kde`** Kernel Density Estimation type histogram with support for 1 or 2 axes of data
+
+**`table`** A sortable dataframe with a given metric and hyperparameters
+
+**`best_params`** A dictionary of parameters from the best model
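As a rough illustration of what `high` and `rounds2high` report, the sketch below recomputes both from a small experiment log using only the standard library. The log contents, the column names, and the helper functions are invented for the example. It is not the Talos implementation, and the zero-based round index is an assumption.

```python
import csv
import io

# Hypothetical experiment log: one row per round (permutation).
LOG = """round_epochs,val_acc,lr,batch_size
10,0.81,0.01,32
10,0.88,0.001,32
10,0.84,0.01,64
10,0.79,0.1,64
"""


def load_log(text):
    """Parse the CSV text into a list of dicts, one per round."""
    return list(csv.DictReader(io.StringIO(text)))


def high(rows, metric):
    """Highest value of the metric across all rounds."""
    return max(float(row[metric]) for row in rows)


def rounds2high(rows, metric):
    """Zero-based index of the round where the highest value was reached."""
    values = [float(row[metric]) for row in rows]
    return values.index(max(values))


rows = load_log(LOG)
best = high(rows, "val_acc")
best_round = rounds2high(rows, "val_acc")
```

Here the second row holds the best `val_acc`, so `best_round` points at index 1.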
diff --git a/docs/Asking_Help.md b/docs/Asking_Help.md
@@ -0,0 +1,15 @@
+# 💬 How to get Support
+
+| I want to...                     | Go to...                                                  |
+| -------------------------------- | ---------------------------------------------------------- |
+| **...troubleshoot**           | [Docs] · [Wiki] · [GitHub Issue Tracker]                   |
+| **...report a bug**           | [GitHub Issue Tracker]                                     |
+| **...suggest a new feature**  | [GitHub Issue Tracker]                                     |
+| **...get support**            | [Stack Overflow] · [Spectrum Chat]                         |
+| **...have a discussion**      | [Spectrum Chat]                                            |
+
+[github issue tracker]: https://github.com/autonomio/talos/issues
+[docs]: https://autonomio.github.io/docs_talos
+[wiki]: https://github.com/autonomio/talos/wiki
+[stack overflow]: https://stackoverflow.com/questions/tagged/talos
+[spectrum chat]: https://spectrum.chat/talos
diff --git a/docs/AutoModel.md b/docs/AutoModel.md
@@ -0,0 +1,28 @@
+# AutoModel
+
+`AutoModel` provides a meaningful way to test several network architectures in an automated manner. Currently there are five supported architectures:
+
+- conv1d
+- lstm
+- bidirectional_lstm
+- simplernn
+- dense
+
+`AutoModel` creates an input model for `Scan()`. It is optimized for use together with `AutoParams()` and expects one or more of the above architectures to be included in the params dictionary, for example:
+
+```python
+
+p = {...
+    'networks': ['dense', 'conv1d', 'lstm']
+    ...}
+
+```
+
+## AutoModel Arguments
+
+Argument | Input | Description
+--------- | ------- | -----------
+`task` | str or None | `binary`, `multi_label`, `multi_class`, or `continuous`
+`metric` | None or list | One or more Keras metric (functions) to be used in the model
+
+Setting `task` affects various aspects of the model and should be set according to the specific prediction task, or set to `None`, in which case the `metric` input is required.
diff --git a/docs/AutoParams.md b/docs/AutoParams.md
@@ -0,0 +1,82 @@
+# AutoParams
+
+`AutoParams()` allows automated generation of a comprehensive parameter dictionary to be used as input for `Scan()` experiments, as well as a streamlined way to manipulate parameter dictionaries.
+
+#### to automatically create a params dictionary
+
+```python
+p = talos.autom8.AutoParams().params
+
+```
+NOTE: The above example yields a very large permutation space so configure `Scan()` accordingly with `fraction_limit`.
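To get a feel for why the note above matters, the following sketch counts the permutations in a small, made-up parameter dictionary and shows how many rounds remain once only a fraction of the grid is sampled. The dictionary and both helper functions are hypothetical, and rounding down is an assumption rather than documented `Scan()` behavior.

```python
from math import prod

# Hypothetical parameter dictionary; each key maps to its candidate values.
params = {
    "lr": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
    "networks": ["dense", "conv1d", "lstm"],
}


def permutation_count(p):
    """Number of distinct combinations in the full grid."""
    return prod(len(values) for values in p.values())


def rounds_for_fraction(p, fraction_limit):
    """Rounds left when only a fraction of the grid is sampled (assumed to round down)."""
    return int(permutation_count(p) * fraction_limit)


total = permutation_count(params)
sampled = rounds_for_fraction(params, 0.25)
```

Even four short lists give 108 combinations. Every added parameter multiplies that number, which is why a small fraction is usually enough for early exploration.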
+
+#### an alternative way where a class object is returned
+
+```python
+param_object = talos.autom8.AutoParams()
+
+```
+
+Various properties can now be accessed through `param_object`; these are detailed below. For example:
+
+#### modifying a single parameter in the params dictionary
+
+```python
+param_object.batch_size(bottom_value=20, max_value=100, steps=10)
+```
+
+Now the modified params dictionary can be accessed through `param_object.params`.
+
+#### to append to an existing parameter dictionary
+
+```python
+params_dict = talos.autom8.AutoParams(p, task='multi_label').params
+
+```
+NOTE: when the dictionary is created for a prediction task other than 'binary', the `task` argument has to be declared accordingly (`binary`, `multi_label`, `multi_class`, or `continuous`).
+
+## AutoParams Arguments
+
+Argument | Input | Description
+--------- | ------- | -----------
+`params` | dict or None | If `None` then a new parameter dictionary is created
+`task` | str | 'binary', 'multi_class', 'multi_label', or 'continuous'
+`replace` | bool | Replace current dictionary entries with new ones.
+`auto` | bool | automatically generate or append params dictionary with all available parameters.
+`network` | bool | If `True` several model architectures will be added
+
+## AutoParams Properties
+
+The **`params`** property returns the parameter dictionary which can be used as an input to `Scan()`.
+
+The **`resample_params`** accepts `n` as input and resamples the params dictionary so that n values remain for each parameter.
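The resampling idea described above can be sketched in plain Python as follows. The function below is a hypothetical stand-in written for illustration, not the Talos method: it assumes uniform random sampling without replacement and keeps short lists unchanged.

```python
import random


def resample_params(params, n, seed=None):
    """Keep at most n randomly chosen values per parameter (hypothetical helper)."""
    rng = random.Random(seed)
    resampled = {}
    for key, values in params.items():
        values = list(values)
        # a parameter with fewer than n values keeps all of them
        k = min(n, len(values))
        resampled[key] = rng.sample(values, k)
    return resampled


# Made-up parameter dictionary for the example.
params = {
    "lr": [0.001, 0.01, 0.1, 1.0],
    "batch_size": [16, 32, 64, 128, 256],
    "epochs": [50],
}
reduced = resample_params(params, 2, seed=0)
```

After the call, every parameter has at most two values left, which shrinks the grid from 20 combinations to 4.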
+
+All other properties relate to manipulating individual parameters in the parameter dictionary.
+
+**`activations`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`batch_size`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`dropout`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`epochs`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`kernel_initializer`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`last_activation`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`layers`** For controlling the corresponding parameter (i.e. `hidden_layers`) in the parameters dictionary.
+
+**`losses`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`lr`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`networks`** For controlling the Talos preset network architectures (`dense`, `lstm`, `bidirectional_lstm`, `conv1d`, and `simplernn`). NOTE: the use of preset networks requires the use of the input model from `AutoModel()` for `Scan()`.
+
+**`neurons`** For controlling the corresponding parameter (i.e. `first_neuron`) in the parameters dictionary.
+
+**`optimizers`** For controlling the corresponding parameter in the parameters dictionary.
+
+**`shapes`** For controlling the Talos preset network shapes (`brick`, `funnel`, and `triangle`).
+
+**`shapes_slope`** For controlling the shape parameter with a floating point value to set the slope of the network from input layer to output layer.
diff --git a/docs/AutoPredict.md b/docs/AutoPredict.md
@@ -0,0 +1,33 @@
+# AutoPredict
+
+`AutoPredict()` automatically handles the process of finding the best models from a completed `Scan()` experiment, evaluates those models, and uses the winning model to make predictions on input data.
+
+```python
+scan_object = talos.autom8.AutoPredict(scan_object, x_val=x, y_val=y, x_pred=x)
+```
+
+NOTE: the input data must be in same format as 'x' that was used in `Scan()`.
+Also, `x_val` and `y_val` should not have been exposed to the model during the
+`Scan()` experiment.
+
+`AutoPredict()` will add four new properties to `Scan()`:
+
+**`preds_model`** contains the winning Keras model (function)
+
+**`preds_parameters`** contains the hyperparameters for the selected model
+
+**`preds_probabilities`** contains the prediction probabilities for `x_pred`
+
+**`predict_classes`** contains the predicted classes for `x_pred`
+
+## AutoPredict Arguments
+
+Argument | Input | Description
+--------- | ------- | -----------
+`scan_object` | class object | the class object returned from `Scan()`
+`x_val` | array or list of arrays | validation data features
+`y_val` | array or list of arrays | validation data labels
+`x_pred` | array or list of arrays | prediction data features
+`task` | string | 'binary', 'multi_class', 'multi_label', or 'continuous'
+`metric` | None | the metric against which the validation is performed
+`n_models` | int | number of promising models to be included in the evaluation process
+`folds` | None | number of folds to be used for cross-validation
+`shuffle` | None | if data is shuffled before splitting
+`asc` | None | should be True if metric is a loss
diff --git a/docs/AutoScan.md b/docs/AutoScan.md
@@ -0,0 +1,31 @@
+# AutoScan
+
+`AutoScan()` provides a streamlined way to conduct a hyperparameter search experiment with any dataset. It is particularly useful for early exploration because, with default settings, `AutoScan()` covers a very broad parameter space including all common hyperparameters, network shapes and sizes, as well as architectures.
+
+Configure the `AutoScan()` experiment and then use the property `start` in the returned class object to start the actual experiment.
+
+```python
+auto = talos.autom8.AutoScan(task='binary', max_param_values=2)
+auto.start(x, y, experiment_name='testing.new', fraction_limit=0.001)
+```
+
+NOTE: `auto.start()` accepts all `Scan()` arguments.
+
+## AutoScan Arguments
+
+Argument | Input | Description
+--------- | ------- | -----------
+`task` | str or None | `binary`, `multi_label`, `multi_class`, or `continuous`
+`max_param_values` | int | Number of parameter values to be included
+
+Setting `task` affects various aspects of the model and should be set according to the specific prediction task, or set to `None`, in which case the `metric` input is required.
+
+## AutoScan Properties
+
+The only property **`start`** starts the actual experiment. `AutoScan.start()` accepts the following arguments:
+
+Argument | Input | Description
+--------- | ------- | -----------
+`x` | array or list of arrays | prediction features
+`y` | array or list of arrays | prediction outcome variable
+`kwargs` | arguments | any `Scan()` argument can be passed into `AutoScan.start()`
diff --git a/docs/Custom_Reducers.md b/docs/Custom_Reducers.md
@@ -0,0 +1,17 @@
+# Custom Reducer
+
+A custom reduction strategy can be created and dropped into Talos. Read more about the reduction principle
+
+There are only two criteria to meet:
+
+- The input of the custom strategy is 2-dimensional
+- The output of the custom strategy is in the form:
+
+```python
+return label, value
+```
+Here `value` is any hyperparameter value, and `label` is the name of any hyperparameter. Any arbitrary strategy can be implemented, as long as the input and output criteria are met.
+
+The file containing the strategy can then be placed in `/reducers` in the Talos package, with corresponding changes made to `/reducers/reduce_run.py` to make the strategy available in `Scan()`. Having done this, the reduction strategy is now available as per the example [above](#probabilistic-reduction).
+
+A [pull request](https://github.com/autonomio/talos/pulls) is highly encouraged once a beneficial reduction strategy has been successfully added.
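The two criteria for a custom strategy (2-dimensional input, `label, value` output) can be sketched like this. The function name, its signature, and the rule it applies are all hypothetical; how Talos actually passes data to a reducer is not shown in this document, so treat this only as an illustration of the input and output shape.

```python
from collections import defaultdict


def worst_value_reducer(table, labels, metric_col=0):
    """Return (label, value) for the hyperparameter value with the lowest mean metric.

    table  : 2-dimensional list, one row per completed round
    labels : column names matching the columns of table
    """
    scores = defaultdict(list)
    for row in table:
        for col, cell in enumerate(row):
            if col != metric_col:
                scores[(labels[col], cell)].append(row[metric_col])
    # average the metric per (label, value) pair and pick the weakest pair
    means = {key: sum(vals) / len(vals) for key, vals in scores.items()}
    label, value = min(means, key=means.get)
    return label, value


# Made-up results: metric in the first column, hyperparameters after it.
labels = ["val_acc", "lr", "batch_size"]
table = [
    [0.90, 0.01, 32],
    [0.88, 0.01, 64],
    [0.60, 0.1, 32],
    [0.58, 0.1, 64],
]
label, value = worst_value_reducer(table, labels)
```

In this toy data the learning rate `0.1` has the lowest average accuracy, so the strategy returns that pair as the candidate to drop from the remaining permutations.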
diff --git a/docs/Deploy.md b/docs/Deploy.md
@@ -0,0 +1,36 @@
+# Deploy()
+
+A successful experiment can be deployed easily. Deploy() takes in the object from Scan() and creates a package locally that can be later activated with Restore().
+
+```python
+from talos import Deploy
+
+Deploy(scan_object, 'experiment_name')
+```
+
+When you've achieved a successful result, you can use `Deploy()` to prepare a production-ready package that can be easily transferred to another environment or system, or sent or uploaded. The deployment package consists of the best performing model, which is picked based on the `metric` argument.
+
+NOTE: for a metric that is to be minimized, set `asc=True` or otherwise
+you will end up with the model that has the highest loss.
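The effect of `asc` can be illustrated with a small selection routine. The helper and the result rows below are invented for the example and only mimic the idea of sorting by a metric; they are not how `Deploy()` is implemented.

```python
def pick_best(results, metric, asc=False):
    """Return the row with the best metric; asc=True means lower is better."""
    ordered = sorted(results, key=lambda row: row[metric], reverse=not asc)
    return ordered[0]


# Made-up experiment results, one dict per model.
results = [
    {"model_id": 0, "val_acc": 0.81, "val_loss": 0.52},
    {"model_id": 1, "val_acc": 0.88, "val_loss": 0.35},
    {"model_id": 2, "val_acc": 0.84, "val_loss": 0.41},
]

best_by_acc = pick_best(results, "val_acc")             # higher is better
best_by_loss = pick_best(results, "val_loss", asc=True)  # lower is better
wrong_for_loss = pick_best(results, "val_loss")          # asc left at False
```

With `asc` left at its default, selecting on `val_loss` returns model 0, the one with the highest loss, which is exactly the mistake the note warns about.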
+
+## Deploy Arguments
+
+Parameter | type | Description
+--------- | ------- | -----------
+`scan_object` | class object | a `Scan` object
+`model_name` | str | Name for the .zip file to be created.
+`metric` | str | The metric to be used for picking the best model.
+`asc` | bool | Make this True for metrics that are to be minimized (e.g. loss)
+
+## Deploy Package Contents
+
+The deploy package consists of:
+
+- details of the scan (details.txt)
+- model weights (model.h5)
+- model json (model.json)
+- results of the experiment (results.csv)
+- sample of x data (x.csv)
+- sample of y data (y.csv)
+
+The package can be restored into a copy of the original Scan object using the `Restore()` command.
diff --git a/docs/Evaluate.md b/docs/Evaluate.md
@@ -0,0 +1,34 @@
+## Evaluate()
+
+Once the `Scan()` experiment procedures have been completed, the resulting class object can be used as input for `Evaluate()` in order to evaluate one or more models.
+
+```python
+from talos import Evaluate
+
+# create the evaluate object
+e = Evaluate(scan_object)
+
+# perform the evaluation
+e.evaluate(x, y, average='macro')
+```
+
+NOTE: It's very important to save part of your data for evaluation, and keep it completely separated from the data you use for the actual experiment. A good approach would be where 50% of the data is saved for evaluation.
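One simple way to follow the advice above is to split the data once, before the experiment starts, and never pass the held-out half to `Scan()`. The sketch below uses only the standard library; the function name, the toy data, and the fixed seed are assumptions made for the example.

```python
import random


def holdout_split(x, y, eval_fraction=0.5, seed=0):
    """Shuffle indices once and split x, y into experiment and evaluation parts."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - eval_fraction))
    exp_ids, eval_ids = indices[:cut], indices[cut:]

    def take(data, ids):
        return [data[i] for i in ids]

    return take(x, exp_ids), take(y, exp_ids), take(x, eval_ids), take(y, eval_ids)


# Toy data: ten samples with two features each and a binary label.
x = [[i, i * 2] for i in range(10)]
y = [i % 2 for i in range(10)]

x_exp, y_exp, x_eval, y_eval = holdout_split(x, y, eval_fraction=0.5)
```

`x_exp` and `y_exp` would go into the experiment, while `x_eval` and `y_eval` are kept aside for the evaluation step.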
+
+### Evaluate Properties
+
+`Evaluate()` has just one property, **`evaluate`**, which is used for evaluating one or more models.
+
+### Evaluate.evaluate Arguments
+
+Parameter | Default | Description
+--------- | ------- | -----------
+`x` | NA | the predictor data x
+`y` | NA | the prediction data y (truth)
+`model_id` | None | the model_id to be used
+`folds` | None | number of folds to be used for cross-validation
+`shuffle` | None | if data is shuffled before splitting
+`average` | 'binary' | 'binary', 'micro', 'macro', 'samples', or 'weighted'
+`metric` | None | the metric against which the validation is performed
+`asc` | None | should be True if metric is a loss
+
+The above arguments are for the `evaluate` attribute of the `Evaluate` object.
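The `average` options share their names with the usual F1 averaging modes. Assuming that is what they denote here, the sketch below works out a macro-averaged F1 score by hand: the F1 score is computed per class and the per-class scores are then averaged with equal weight. The helper functions and labels are made up for the example.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Made-up labels for a three-class problem.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
score = macro_f1(y_true, y_pred)
```

Because every class counts equally, macro averaging is a reasonable choice when small classes matter as much as large ones.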