1.6.0
Highlights
- Minimal pip install requirements
- Provide separate installations for different CPU/GPU runtimes (CPU, CUDA, ROCm)
Python compatibility
Python 3.9.x
Documentation
- Improved the AMPL README with better logic flow and topic grouping.
- Enhanced the API documentation.
- Removed private modules from the API list
- Updated all Python code to the PEP 257 / Google docstring convention for consistent formatting, so that all public modules and functions are included in the API documentation.
Enhancements
- Provided Dockerfile for a local AMPL Docker image build.
- Added a parameter to train a model in production mode, where all data are used to train the model.
- Added full support for all XGBoost model parameters, including in hyperopt searches.
- Added split_strategy output column to compare_models.get_filesystem_perf_results.
- Added script for patching model tarballs to point to local copy of training data (needed for AD computation).
- Save the class_number parameter for multiclass classification models.
- Added option to map SMILES strings to canonical tautomers in standardization functions rdkit_smiles_from_smiles and base_smiles_from_smiles.
- Added model_file_reader module to simplify extraction of saved model metadata.
- Added function to plot predicted vs actual responses with saved regression models.
- Added module to plot nearest neighbor Tanimoto distance distributions between training and validation/test sets.
- Added module to plot response value distributions for split subsets.
- Updated diversity_plots to allow a user-specified color palette and to increase the resolution of the figure.
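
As an illustration of the nearest neighbor Tanimoto distance idea mentioned above, here is a minimal pure-Python sketch. It is not AMPL's implementation: the function names are hypothetical, and fingerprints are assumed to be represented as sets of on-bit indices rather than as RDKit bit vectors.

```python
def tanimoto_distance(fp_a, fp_b):
    """Tanimoto (Jaccard) distance between two fingerprints.

    Assumption: each fingerprint is a set of on-bit indices.
    """
    union = len(fp_a | fp_b)
    if union == 0:
        # Two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(fp_a & fp_b) / union


def nearest_neighbor_distances(query_fps, reference_fps):
    """For each query fingerprint, return the distance to its closest
    reference fingerprint (one value per query, not all pairwise values).

    Assumption: reference_fps is non-empty.
    """
    return [
        min(tanimoto_distance(q, r) for r in reference_fps)
        for q in query_fps
    ]


# Toy data: three training fingerprints and two test fingerprints.
train_fps = [{1, 2, 3, 4}, {2, 3, 5}, {7, 8, 9}]
test_fps = [{1, 2, 3}, {7, 8, 10, 11}]

nn_dists = nearest_neighbor_distances(test_fps, train_fps)
```

Plotting the distribution of `nn_dists` gives one value per test compound, which is what distinguishes a nearest neighbor plot from a plot of all pairwise distances.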
Bug Fixes
- Made get_featurized_data() check whether all the SMILES strings in a dataset are represented in the prefeaturized data.
- Fixed bug in setting response column weights to make it consistent across featurizers.
- Fixed error handling in rdkit_easy.mol_to_html to return empty string rather than None.
- Fixed the Tanimoto distance plot to reflect the nearest neighbor distance instead of all distances.
- Fixed freq_table's handling of NaNs in selected columns.
- Fixed bug in EmbeddingFeaturization where descriptors were not transformed before input to embedding model.
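
The kind of coverage check described for get_featurized_data() in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the AMPL code: `find_missing_smiles` is a hypothetical helper, and both inputs are assumed to be plain iterables of already standardized SMILES strings.

```python
def find_missing_smiles(dataset_smiles, prefeaturized_smiles):
    """Return the dataset SMILES that have no entry in the prefeaturized data.

    Hypothetical helper; assumes both inputs use the same SMILES
    standardization so that string equality is meaningful.
    """
    available = set(prefeaturized_smiles)
    # Preserve dataset order and report each missing SMILES once.
    missing = []
    seen = set()
    for smi in dataset_smiles:
        if smi not in available and smi not in seen:
            missing.append(smi)
            seen.add(smi)
    return missing


dataset = ["CCO", "c1ccccc1", "CC(=O)O", "CCO"]
prefeaturized = ["CCO", "CC(=O)O"]

missing = find_missing_smiles(dataset, prefeaturized)
all_covered = not missing
```

A caller could use `all_covered` to decide whether the prefeaturized data can be used as-is or whether the compounds in `missing` need to be featurized first.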