diff --git a/AUTHORS.rst b/AUTHORS.rst index 278be9a5..dfb4f0c4 100644 --- a/AUTHORS.rst +++ b/AUTHORS.rst @@ -1,4 +1,3 @@ -======= Credits ======= diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst index 73ac4f4f..8867adff 100644 --- a/CONTRIBUTING.rst +++ b/CONTRIBUTING.rst @@ -1,19 +1,28 @@ .. highlight:: shell -============ -Contributing -============ +Welcome to the Community +======================== -Contributions are welcome, and they are greatly appreciated! Every little bit -helps, and credit will always be given. +The MLPrimitives library is an open source compendium of all the possible data transforms +that are used by machine learning practitioners. -You can contribute in many ways: +It is a community driven effort, so it relies on the community. For this reason, we designed it +thoughtfully so that many of the contributions here can have a shelf life greater than any of the +machine learning libraries it integrates, as it represents the combined knowledge of all the +contributors and allows many different systems to be built using the annotations themselves. -Types of Contributions -====================== +So, are you ready to join the community? If so, please feel welcome and keep reading! + +Types of contributions +---------------------- + +There are several ways to contribute to a project like **MLPrimitives**, and they do not always +involve coding. + +If you want to contribute but do not know where to start, consider one of the following options: Reporting Issues ----------------- +~~~~~~~~~~~~~~~~ If there is something that you would like to see changed in the project, or that you just want to ask, please create an issue at https://github.com/HDI-Project/MLPrimitives/issues @@ -28,18 +37,18 @@ If you do so, please: Below there are some examples of the types of issues that you might want to create. Request new primitives -~~~~~~~~~~~~~~~~~~~~~~ +********************** -Sometimes you will feel that a necessary primitive is missing and should be integrated.
+Sometimes you will feel that a necessary primitive is missing and should be added. In this case, please create an issue indicating the name of the primitive and a link to its documentation. If the primitive documentation is unclear or not precise enough to know what needs to be -integrated only by reading it, please add as many details as necessary in the issue description. +done only by reading it, please add as many details as necessary in the issue description. Request new features -~~~~~~~~~~~~~~~~~~~~ +******************** If there is any other feature that you would like to see implemented, such as adding new functionalities to the existing custom primitives, or changing their behavior to cover @@ -49,7 +58,7 @@ If you do so, please indicate all the details about what you request as well as cases of the new feature. Report Bugs -~~~~~~~~~~~ +*********** If you find something that fails, please report it including: @@ -58,48 +67,11 @@ If you find something that fails, please report it including: * Detailed steps to reproduce the bug. Ask for Documentation -~~~~~~~~~~~~~~~~~~~~~ +********************* If there is something that is not documented well enough, do not hesitate to point at that in a new issue and request the necessary changes. -Implementing Changes --------------------- - -If you want to contribute to the project with your own changes, you are more than welcome -to do so! :) - -In this case, please do the following steps: - -1. Indicate your intentions in a GitHub issue, by saying so in the project description or in - a comment. If no issue exists yet for the changes that you want to implement, please - create one. -2. After you have done so, please wait for the feedback from the maintainers, who will approve - the issue and assign it to you, before proceeding to implement any changes. -3. Implement the necessary changes in your own fork of the project. Please implement them in - a branch named after the issue number and title. -4. 
Make sure that your changes include unit tests and that the existing tests and quality - checks are all executed successfully. -5. Push all your changes to GitHub and open a Pull Request, indicating what was implemented - in the description. - -Below there are some more details about each type of contribution possible. - -Integrate new primitives -~~~~~~~~~~~~~~~~~~~~~~~~ - -If you want to contribute integrating new third party primitives, you are welcome to contribute -the necessary JSON annotations and Python adapters. - -Implement new primitives -~~~~~~~~~~~~~~~~~~~~~~~~ - -If what you want to implement is not available in any third party library, you can also contribute -the necessary Python code directly in any of the `mlprimitives` sub-modules. - -In this case, please remember to also include the necessary JSON annotations, as well as the -corresponding documentation. - Write Documentation ~~~~~~~~~~~~~~~~~~~ @@ -108,115 +80,14 @@ docs, in docstrings, or even on the web in blog posts, articles, and such, so fe contribute any changes that you deem necessary, from fixing a simple typo, to writing whole new pages of documentation. -Get Started! -============ - -Ready to contribute? Here's how to set up `MLPrimitives` for local development. - -1. Fork the `MLPrimitives` repo on GitHub. -2. Clone your fork locally:: - - $ git clone git@github.com:your_name_here/MLPrimitives.git - -3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, - this is how you set up your fork for local development:: - - $ mkvirtualenv MLPrimitives - $ cd MLPrimitives/ - $ make install-develop - -4. Create a branch for local development:: - - $ git checkout -b name-of-your-bugfix-or-feature - - Now you can make your changes locally. - -5. While hacking your changes, make sure to cover all your developments with the required - unit tests, and that none of the old tests fail as a consequence of your changes. 
- For this, make sure to run the tests suite and check the code coverage:: - - $ make test # Run the tests - $ make coverage # Get the coverage report - -6. When you're done making changes, check that your changes pass flake8 and the - tests, including testing other Python versions with tox:: - - $ make lint # Check code styling - $ make test-all # Execute tests on all python versions +Contribute code +~~~~~~~~~~~~~~~ -7. Make also sure to include the necessary documentation in the code as docstrings following - the `google docstring`_ style. - If you want to view how your documentation will look like when it is published, you can - generate and view the docs with this command:: +Obviously, the main element in the MLPrimitives library is the code. - $ make viewdocs +If you are willing to contribute to it, please check the documentation for more details about +how to proceed! -8. Commit your changes and push your branch to GitHub:: - - $ git add . - $ git commit -m "Your detailed description of your changes." - $ git push origin name-of-your-bugfix-or-feature - -9. Submit a pull request through the GitHub website. - -.. _google docstring: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html - -Pull Request Guidelines -======================= - -Before you submit a pull request, check that it meets these guidelines: - -1. It resolves an open GitHub Issue and contains its reference in the title or - the comment. If there is no associated issue, feel free to create one. -2. Whenever possible, it resolves only **one** issue. If your PR resolves more than - one issue, try to split it in more than one pull request. -3. The pull request should include unit tests that cover all the changed code -4. If the pull request adds functionality, the docs should be updated. Put - your new functionality into a function with a docstring, and add the - feature to the list in README.rst. -5. The pull request should work for Python2.7, 3.4, 3.5 and 3.6. 
Check - https://travis-ci.org/HDI-Project/MLPrimitives/pull_requests - and make sure that all the checks pass. - -Unit Testing Guidelines -======================= - -All the Unit Tests should comply with the following requirements: - -1. Unit Tests should be based only in unittest and pytest modules. - -2. The tests that cover a module called ``mlprimitives/path/to/a_module.py`` should be - implemented in a separated module called ``tests/mlprimitives/path/to/test_a_module.py``. - Note that the module name has the ``test_`` prefix and is located in a path similar - to the one of the tested module, just inside te ``tests`` folder. - -3. Each method of the tested module should have at least one associated test method, and - each test method should cover only **one** use case or scenario. - -4. Test case methods should start with the ``test_`` prefix and have descriptive names - that indicate which scenario they cover. - Names such as ``test_some_methed_input_none``, ``test_some_method_value_error`` or - ``test_some_method_timeout`` are right, but names like ``test_some_method_1``, - ``some_method`` or ``test_error`` are not. - -5. Each test should validate only what the code of the method being tested does, and not - cover the behavior of any third party package or tool being used, which is assumed to - work properly as far as it is being passed the right values. - -6. Any third party tool that may have any kind of random behavior, such as some Machine - Learning models, databases or Web APIs, will be mocked using the ``mock`` library, and - the only thing that will be tested is that our code passes the right values to them. - -7. Unit tests should not use anything from outside the test and the code being tested. This - includes not reading or writting to any filesystem or database, which will be properly - mocked. 
- -Tips -==== - -To run a subset of tests:: - - $ pytest tests.test_mlprimitives Release Workflow ================ @@ -225,17 +96,26 @@ The process of releasing a new version involves several steps combining both ``g ``bumpversion`` which, briefly: 1. Merge what is in ``master`` branch into ``stable`` branch. -2. Update the version in ``setup.cfg``, ``mlprimitives/__init__.py`` and ``HISTORY.md`` files. -3. Create a new TAG pointing at the correspoding commit in ``stable`` branch. +2. Update the version in the code and configuration files. +3. Create a new git tag pointing at the corresponding commit in ``stable`` branch. 4. Merge the new commit from ``stable`` into ``master``. -5. Update the version in ``setup.cfg`` and ``mlprimitives/__init__.py`` to open the next - development interation. +5. Update the version in the code and configuration files again to start the next development iteration. + +.. note:: Before starting the process, make sure that ``HISTORY.md`` has been updated with a new + entry that explains the changes that will be included in the new version. + Normally this is just a list of the Pull Requests that have been merged to master + since the last release. + +Once this is done, run one of the following commands: + +1. If you are releasing a patch version:: + + make release + +2. If you are releasing a minor version:: + + make release-minor -**Note:** Before starting the process, make sure that ``HISTORY.md`` has a section titled -after thew new version that is about to be released with the list of changes that will be -included in the new version, and that these changes are all committed and available in the -``master`` branch. -Normally this is just a list of the Issues that have been closed since the latest version. +3. If you are releasing a major version:: -Once this is done, just run the commands ``make release`` and insert the PyPi username and -password when required.
+ make release-major diff --git a/Makefile b/Makefile index 45fb090f..79d5d4b3 100644 --- a/Makefile +++ b/Makefile @@ -136,7 +136,7 @@ view-docs: docs ## view docs in browser .PHONY: serve-docs serve-docs: view-docs ## compile the docs watching for changes - watchmedo shell-command -W -R -D -p '*.rst;*.md' -c '$(MAKE) -C docs html' . + watchmedo shell-command -W -R -D -p '*.rst;*.md' -c '$(MAKE) -C docs html' docs # RELEASE TARGETS diff --git a/docs/community/adapters.rst b/docs/community/adapters.rst new file mode 100644 index 00000000..a962a8d4 --- /dev/null +++ b/docs/community/adapters.rst @@ -0,0 +1,121 @@ +Contributing Adapters +===================== + +If the primitives that you want to add are not compliant with our `fit-produce` schema you will +probably need to either add an adapter or modify an existing one in order to add them. + +Creating a new Adapter +---------------------- + +If you want to create a new adapter, please follow these steps: + +1. If it does not exist yet, create a new GitHub issue requesting the primitive that requires it. + As indicated previously, provide as many details as possible about the new primitive, like + links to the documentation, what it does and what it is useful for, as well as details about + why you think a new adapter is needed and how it would work. +2. Indicate in the issue description or in a comment that you are available to apply the changes + yourself. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discuss with them about the need + of adding this primitive, as maybe there are other primitives that offer the same functionality, + and about whether the adapter will be needed or not, or what it should look like. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project.
Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + 1. Create an adapter python module and add it to the ``mlprimitives/adapters/`` directory. + The name of the module should be the name of the library that you want to create an adapter + for. For example, if you want to add an adapter to add primitives from the library called + ``cool-ml``, the name of the module should be ``mlprimitives/adapters/cool_ml.py``. + If the module already exists because there is another adapter for the same library, create + the new adapter within the same module. + 2. Inside the adapter module, try to name the class or function that you create as similar + as possible to the classes that you are writing the adapter for. + For example, the adapter class for the ``keras.Sequential`` class is called + ``mlprimitives.adapters.keras.Sequential``, while the adapter class for the + ``featuretools.dfs`` method is called ``mlprimitives.adapters.featuretools.DFS``. + 3. As usual, when writing python code, make sure to follow a coding style consistent with + the rest of the library, and to follow all the guidelines from the :ref:`contributing` + section. + 4. Do not forget to properly document your code and cover it with proper unit testing! + 5. Create at least one JSON annotation that uses your adapter. When doing so, make sure to + follow the corresponding conventions: + + 1. The name of the file should correspond to the fully qualified name of the class or + function that you are integrating, ignoring the fact that you are using an adapter. + For example, if you are adding the primitive ``CoolPrimitive`` from the module + ``cool_ml.module`` by using the ``mlprimitives.adapters.cool_ml.CoolML`` + adapter, the name of the file should be ``cool_ml.module.CoolPrimitive.json``. + 2.
Inside the JSON annotation, the ``primitive`` entry should have the fully qualified + name of your adapter class, and the ``fixed`` hyperparameters should contain all + the details that your adapter needs to know how to integrate the third party primitive. + 3. Add a proper description of what the primitive does in the corresponding entry, as well + as a link to its documentation. If there is no documentation available, put the link + to its source code. And don't forget to add your name and e-mail address to the + ``contributors`` list! + 4. Add a pipeline annotation that uses your primitive inside the pipelines folder, named + exactly like your primitive, and test it with the command + ``mlprimitives test pipelines/your.pipeline.json``. + If adding a pipeline is not possible for any reason, please inform the maintainers, as + this probably means that a new dataset needs to be added. + +5. Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +6. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. + +Modifying an existing Adapter +----------------------------- + +If an adapter for the library already exists but it does not properly cover one of the primitives +that you want to integrate, you might find that modifying the existing adapter adds this coverage. + +In this case, if you are sure that these modifications will not break previous functionality, +and the existing adapter can be safely modified, do the following steps: + +1. If it does not exist yet, create a new GitHub issue requesting the primitive that requires it. + As indicated previously, provide as many details as possible about the new primitive, like + links to the documentation, what it does and what it is useful for, as well as details about + why you think the current adapter needs to be modified and how. +2.
Indicate in the issue description or in a comment that you are available to apply the changes + yourself. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discuss with them about the need + of adding this primitive, as maybe there are other primitives that offer the same functionality, + and about whether the adapter can be modified or a new one created. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + 1. Do the necessary modifications in the existing adapter. + 2. As usual, when writing python code, make sure to follow a coding style consistent with + the rest of the library, and to follow all the guidelines from the :ref:`contributing` + section. + 3. Do not forget to properly document your code and cover it with proper unit testing! + 4. Create at least one new JSON annotation that uses the adapter. When doing so, make sure to + follow the corresponding conventions: + + 1. The name of the file should correspond to the fully qualified name of the class or + function that you are integrating, ignoring the fact that you are using an adapter. + For example, if you are adding the primitive ``CoolPrimitive`` from the module + ``cool_ml.module`` by using the ``mlprimitives.adapters.cool_ml.CoolML`` + adapter, the name of the file should be ``cool_ml.module.CoolPrimitive.json``. + 2. Inside the JSON annotation, the ``primitive`` entry should have the fully qualified + name of your adapter class, and the ``fixed`` hyperparameters should contain all + the details that your adapter needs to know how to integrate the third party primitive. + 3.
Add a proper description of what the primitive does in the corresponding entry, as well + as a link to its documentation. If there is no documentation available, put the link + to its source code. And don't forget to add your name and e-mail address to the + ``contributors`` list! + 4. Add a pipeline annotation that uses your primitive inside the pipelines folder, named + exactly like your primitive, and test it with the command + ``mlprimitives test pipelines/your.pipeline.json``. + If adding a pipeline is not possible for any reason, please inform the maintainers, as + this probably means that a new dataset needs to be added. + 5. Make sure that all the primitives that existed before that use the same adapter still + work by testing their corresponding pipelines with the command above. + +5. Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +6. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. diff --git a/docs/community/annotations.rst b/docs/community/annotations.rst new file mode 100644 index 00000000..4517e5e6 --- /dev/null +++ b/docs/community/annotations.rst @@ -0,0 +1,128 @@ +Contributing Annotations +======================== + +The simplest type of contributions are the ones that only involve modifications on JSON +annotations. + +These modifications can come in different ways: + +Creating an annotation for a new primitive +------------------------------------------ + +The most usual scenario will be adding a new primitive that does not exist yet in the repository +and that can be directly integrated by writing a simple JSON annotation. + +In this case, please follow these steps: + +1. If it does not exist yet, create a new GitHub issue requesting the new primitive.
As indicated + previously, provide as many details as possible about the new primitive, like links to the + documentation, what it does and what it is useful for. +2. Indicate in the issue description or in a comment that you are available to apply the changes + yourself. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discuss with them about the need + of adding this primitive, as maybe there are other primitives that offer the same functionality, + and about the best approach to add it. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + 1. Create a new JSON annotation. This can be made from scratch, or you can copy another one + and modify it. + 2. The name of the file should correspond to the fully qualified name of the class or function + that you are referencing inside the primitive. For example, if you are adding a primitive + that uses the class ``CoolPrimitive`` from the module ``super.cool.module``, the name of + the file should be ``super.cool.module.CoolPrimitive.json``. + 3. Add a proper description of what the primitive does in the corresponding entry, as well as a + link to its documentation. If there is no documentation available, put the link to its + source code. And don't forget to add your name and e-mail address to the ``contributors`` list! + 4. Add a pipeline annotation that uses your primitive inside the pipelines folder, named + exactly like your primitive, and test it with the command + ``mlprimitives test pipelines/your.pipeline.json``. + If adding a pipeline is not possible for any reason, please inform the maintainers, as + this probably means that a new dataset needs to be added. + +5.
Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +6. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. + +Modifying an existing annotation +-------------------------------- + +Sometimes you might think that an existing annotation needs to be modified in some way. + +Usually this is because of one of the following reasons: + +* There is an error in the JSON that prevents it from working properly +* Some hyperparameters are not properly exposed, or not exposed at all +* Documentation is not complete enough + +In this case, please follow these steps: + +1. Create a new GitHub issue explaining what needs to be changed and why. +2. Indicate in the issue description or in a comment that you are available to apply the changes + yourself. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discussion, as sometimes you might find + out that some of the things that you considered an error are actually intentional. For example, + a hyperparameter that you consider missing might have been intentionally left out for + performance or compatibility issues. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. Don't + forget to add your name and e-mail address to the ``contributors`` list while you are at it! +5. Make sure that the annotation still works by testing the corresponding pipeline. Normally, + this can be done by running the command ``mlprimitives test pipelines/your.pipeline.json``. +6.
Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +7. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. + +Creating a new version of an existing annotation +------------------------------------------------ + +Sometimes you might find that a primitive for a particular annotation already exists, but that +modifying it in some way allows adapting it more precisely to some particular scenarios while, +at the same time, making it unusable for others. + +Some examples of this would include: + +* Use the ``predict_proba`` method instead of the ``predict`` one in a scikit-learn classifier. +* Alter the hyperparameter ranges to make the primitive more efficient when working with certain + types of data or problems. + +In these cases, what you need to do is to create a new annotation which is basically a copy of +the other one with some modifications. + +In this case, please follow these steps: + +1. Create a new GitHub issue explaining what needs to be changed and why. +2. Indicate in the issue description or in a comment that you are available to apply the changes + yourself. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. As always, be open to discussion, as sometimes you + might find that the behavior which you want to cover is already achievable by using certain + ``init_params``. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + 1. Make a copy of the original JSON annotation. + 2.
The name of the file should be the same as the original one, with a suffix added after the + last dot ``.`` indicating what the changes are. For example, if you are adapting the + hyperparameters of the primitive named ``some.cool.primitive.json`` to improve its + performance on huge datasets, you might name the new file + ``some.cool.primitive.huge_datasets.json``. + 3. Apply the necessary changes to the new file and add it to the repository. Don't forget to + add your name and e-mail address to the ``contributors`` list while you are at it! + 4. Add a pipeline annotation that uses your primitive inside the pipelines folder, named + exactly like your primitive, and test it with the command + ``mlprimitives test pipelines/your.pipeline.json``. + If adding a pipeline is not possible for any reason, please inform the maintainers, as + this probably means that a new dataset needs to be added. + +5. Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +6. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. diff --git a/docs/community/contributing.rst b/docs/community/contributing.rst new file mode 100644 index 00000000..a3bf65dd --- /dev/null +++ b/docs/community/contributing.rst @@ -0,0 +1,111 @@ +.. _contributing: + +Contributing Guidelines +======================= + +Ready to contribute with your own code? Great! + +Before diving deeper into the contributing guidelines, please make sure to have read +the :ref:`concepts` section and to have gone through the :ref:`development` guide. + +Afterwards, please make sure to read the following contributing guidelines carefully, and +later on head to the step-by-step guides for each possible type of contribution.
+ +General Coding Guidelines +************************* + +Once you have set up your development environment, you are ready to start working on your +python code. + +When doing so, make sure to follow these guidelines: + +1. If it does not exist yet, create a new GitHub issue requesting the new primitive. As indicated + previously, provide as many details as possible about the new primitive, like links to the + documentation, what it does and what it is useful for. + +2. Indicate in the issue description or in a comment that you are available to apply the changes + yourself. + +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discuss with them about the need + of adding this primitive, as maybe there are other primitives that offer the same functionality, + and about the best approach to add it. + +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + You can create such a branch with the following command:: + + $ git checkout -b name-of-your-bugfix-or-feature + +5. While hacking your changes, make sure to cover all your developments with the required + unit tests, and that none of the old tests fail as a consequence of your changes. + For this, make sure to run the test suite and check the code coverage:: + + $ make test # Run the tests + $ make coverage # Get the coverage report + +6. If you are developing new primitives that can work as part of a Pipeline, please also + add a demo pipeline inside the ``pipelines`` folder and validate that it is running + properly with the command:: + + $ mlprimitives test pipelines/the_file_of_your_pipeline.json + +7.
When you're done making changes, check that your changes pass flake8 and the + tests, including testing other Python versions with tox:: + + $ make lint # Check code styling + $ make test-all # Execute tests on all python versions + +8. Also make sure to include the necessary documentation in the code as docstrings following + the `google docstring`_ style. + If you want to view how your documentation will look when it is published, you can + generate and view the docs with this command:: + + $ make view-docs + +9. Commit your changes and push your branch to GitHub:: + + $ git add . + $ git commit -m "Your detailed description of your changes." + $ git push origin name-of-your-bugfix-or-feature + +10. Submit a pull request through the GitHub website and wait for feedback from the maintainers. + +.. _google docstring: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html + + +Unit Testing Guidelines +*********************** + +If you are going to contribute Python code, we will ask you to write unit tests that cover +your development, following these requirements: + +1. Unit Tests should be based only on the unittest and pytest modules. + +2. The tests that cover a module called ``mlprimitives/path/to/a_module.py`` should be + implemented in a separate module called ``tests/mlprimitives/path/to/test_a_module.py``. + Note that the module name has the ``test_`` prefix and is located in a path similar + to the one of the tested module, just inside the ``tests`` folder. + +3. Each method of the tested module should have at least one associated test method, and + each test method should cover only **one** use case or scenario. + +4. Test case methods should start with the ``test_`` prefix and have descriptive names + that indicate which scenario they cover.
+ Names such as ``test_some_method_input_none``, ``test_some_method_value_error`` or + ``test_some_method_timeout`` are right, but names like ``test_some_method_1``, + ``some_method`` or ``test_error`` are not. + +5. Each test should validate only what the code of the method being tested does, and not + cover the behavior of any third party package or tool being used, which is assumed to + work properly as long as it is passed the right values. + +6. Any third party tool that may have any kind of random behavior, such as some Machine + Learning models, databases or Web APIs, will be mocked using the ``mock`` library, and + the only thing that will be tested is that our code passes the right values to them. + +7. Unit tests should not use anything from outside the test and the code being tested. This + includes not reading or writing to any file system or database, which will be properly + mocked. diff --git a/docs/community/custom.rst b/docs/community/custom.rst new file mode 100644 index 00000000..db98b664 --- /dev/null +++ b/docs/community/custom.rst @@ -0,0 +1,116 @@ +Contributing Custom Primitives +============================== + +Sometimes, the functionality that you want to add is not implemented yet by any other third +party tool, which means that you will need to implement it from scratch. + +In these cases, you can either create a new custom primitive or modify one of the existing ones to +add the new functionality to it. + +Creating a Custom Primitive +--------------------------- + +If you want to create a new custom primitive, please follow these steps: + +1. If it does not exist yet, create a new GitHub issue requesting a primitive that provides the + desired functionality, providing as many details as possible about the new primitive, including + a thorough description of what it does and what it is useful for. +2.
Indicate in the issue description or in a comment that you are available to apply the changes + yourself, and provide an initial implementation proposal as detailed as possible. Include in + this description the modules, classes and functions that you will create, as well as + a clear description of the inputs and outputs of the primitive. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discussing with them the need + for adding this primitive, as there may be other primitives that offer the same functionality, + or they may want to suggest a different implementation. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + 1. If it does not exist yet, create a Python module inside the ``mlprimitives/candidates/`` + folder named after the type of primitive that you want to implement. Some good names + could be ``text_preprocessing``, ``feature_extraction`` or ``timeseries_anomalies``. + 2. Implement the new primitive inside the corresponding module following as closely as + possible the implementation discussed in the GitHub issue. If you feel that you need to + deviate from it, please make a comment in the GitHub issue explaining why. + 3. As usual, when writing Python code, make sure to follow a coding style consistent with + the rest of the library, and to follow all the guidelines from the :ref:`contributing` + section. + 4. Do not forget to properly document your code and cover it with unit tests! + 5. Create at least one JSON annotation that uses your primitive. When doing so, make sure to + follow the corresponding conventions: + + 1.
The name of the file should correspond to the fully qualified name of the class or + function that implements the primitive. + For example, if you are adding the primitive ``YourPrimitive`` from the module + ``mlprimitives.candidates.your_module``, the name of the file should be + ``mlprimitives.candidates.your_module.YourPrimitive.json``. + 2. Add a proper description of what the primitive does in the corresponding entry, as well + as a link to its documentation. If there is no documentation available, put the link + to its source code. If the implementation follows a proposal from a scientific paper, + consider adding the link to the PDF as well. And don't forget to add your name and + e-mail address to the ``contributors`` list! + 3. Add a pipeline annotation that uses your primitive inside the pipelines folder, named + exactly like your primitive, and test it with the command + ``mlprimitives test pipelines/mlprimitives.candidates.your_module.YourPrimitive.json``. + If adding a pipeline is not possible for any reason, please inform the maintainers, as + this probably means that a new dataset needs to be added. + +5. Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +6. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. + +Modifying a Custom Primitive +---------------------------- + +If there is a custom primitive that covers the functionality that you want but it does not +support some particularities of your use case, you might want to modify it to add some new +features or extend its functionality. + +In this case, if you are sure that these modifications will not break previous functionality, +and that the existing primitive can be safely modified, follow these steps: + +1.
If it does not exist yet, create a new GitHub issue requesting the new feature, providing + as many details as possible about why the change is needed. +2. Indicate in the issue description or in a comment that you are available to apply the changes + yourself, and provide an implementation proposal as detailed as possible. +3. Wait for the feedback from the maintainers, who will approve the issue and assign it to you, + before proceeding to implement any changes. Be open to discussing with them the need + for adding this new feature, as there may be other primitives that offer the same functionality, + or they may want to suggest a different implementation. +4. Once the issue has been approved and assigned to you, implement the necessary changes in your + own fork of the project. Please implement them in a branch named after the issue number and + title, as this makes keeping track of the history of the project easier in the long run. + + 1. Make the necessary modifications in the existing primitive. + 2. As usual, when writing Python code, make sure to follow a coding style consistent with + the rest of the library, and to follow all the guidelines from the :ref:`contributing` + section. + 3. Do not forget to properly document your code and cover it with proper unit tests! + 4. Make sure that at least one JSON annotation exists that uses the new feature. + While doing so, make sure to follow the corresponding conventions: + + 1. The name of the file should correspond to the fully qualified name of the class or + function that implements the primitive. + For example, if you are adding the primitive ``YourPrimitive`` from the module + ``mlprimitives.candidates.your_module``, the name of the file should be + ``mlprimitives.candidates.your_module.YourPrimitive.json``. + 2. Add a proper description of what the primitive does in the corresponding entry, as well + as a link to its documentation. If there is no documentation available, put the link + to its source code.
If the implementation follows a proposal from a scientific paper, + consider adding the link to the PDF as well. And don't forget to add your name and + e-mail address to the ``contributors`` list! + 3. If you are creating a new annotation, also add a pipeline annotation that uses your + primitive inside the pipelines folder, named exactly like your primitive, and test it + with the command + ``mlprimitives test pipelines/mlprimitives.candidates.your_module.YourPrimitive.json``. + If adding a pipeline is not possible for any reason, please inform the maintainers, as + this probably means that a new dataset needs to be added. + 4. Make sure that all previously existing annotations that use the same primitive still + work by testing their corresponding pipelines with the command above. + +5. Review your changes and make sure that everything continues to work properly by executing the + ``make test-all`` command. +6. Push all your changes to GitHub and open a Pull Request, indicating in the description which + issue you are resolving and what the changes consist of. diff --git a/docs/community/welcome.rst b/docs/community/welcome.rst new file mode 100644 index 00000000..58e3886b --- /dev/null +++ b/docs/community/welcome.rst @@ -0,0 +1,87 @@ +Welcome to the Community +======================== + +The MLPrimitives library is an open source compendium of all the possible data transforms +that are used by machine learning practitioners. + +It is a community driven effort, so it relies on the community. For this reason, we designed it +thoughtfully so that many of the contributions here can have a shelf life greater than any of the +machine learning libraries it integrates, as it represents the combined knowledge of all the +contributors and allows many different systems to be built using the annotations themselves. + +So, are you ready to join the community? If so, please feel welcome and keep reading!
+ +Types of contributions +---------------------- + +There are several ways to contribute to a project like **MLPrimitives**, and they do not always +involve coding. + +If you want to contribute but do not know where to start, consider one of the following options: + +Reporting Issues +~~~~~~~~~~~~~~~~ + +If there is something that you would like to see changed in the project, or that you just want +to ask, please create an issue at https://github.com/HDI-Project/MLPrimitives/issues + +If you do so, please: + +* Explain in detail what you are requesting. +* Keep the scope as narrow as possible, to make it easier to implement or respond. +* Remember that this is a volunteer-driven project and that the maintainers will attend every + request as soon as possible, but that in some cases this might take some time. + +Below there are some examples of the types of issues that you might want to create. + +Request new primitives +********************** + +Sometimes you will feel that a necessary primitive is missing and should be added. + +In this case, please create an issue indicating the name of the primitive and a link to +its documentation. + +If the primitive documentation is unclear or not precise enough to know what needs to be +done only by reading it, please add as many details as necessary in the issue description. + +Request new features +******************** + +If there is any other feature that you would like to see implemented, such as adding new +functionalities to the existing custom primitives, or changing their behavior to cover +a broader range of cases, you can also create an issue. + +If you do so, please indicate all the details about what you request as well as some use +cases of the new feature. + +Report Bugs +*********** + +If you find something that fails, please report it including: + +* Your operating system name and version. +* Any details about your local setup that might be helpful in troubleshooting. +* Detailed steps to reproduce the bug. 
+ +Ask for Documentation +********************* + +If there is something that is not documented well enough, do not hesitate to point at that +in a new issue and request the necessary changes. + +Write Documentation +~~~~~~~~~~~~~~~~~~~ + +MLPrimitives could always use more documentation, whether as part of the official MLPrimitives +docs, in docstrings, or even on the web in blog posts, articles, and such, so feel free to +contribute any changes that you deem necessary, from fixing a simple typo, to writing whole +new pages of documentation. + +Contribute code +~~~~~~~~~~~~~~~ + +Obviously, the main element in the MLPrimitives library is the code. + +If you are willing to contribute to it, please head for the next sections for detailed guidelines +about how to do so. diff --git a/docs/conf.py b/docs/conf.py index 88fdc4a3..5296ad3f 100755 --- a/docs/conf.py +++ b/docs/conf.py @@ -1,7 +1,7 @@ #!/usr/bin/env python # -*- coding: utf-8 -*- # -# MLBlocks documentation build configuration file, created by +# MLPrimitives documentation build configuration file, created by # sphinx-quickstart on Fri Jun 9 13:47:02 2017. # # This file is execfile()d with the current directory set to its @@ -18,15 +18,10 @@ # relative to the documentation root, use os.path.abspath to make it # absolute, like shown here. 
-# import os -# import sys - import sphinx_rtd_theme # For read the docs theme from recommonmark.parser import CommonMarkParser # from recommonmark.transform import AutoStructify -# sys.path.insert(0, os.path.abspath('..')) - import mlprimitives # -- General configuration --------------------------------------------- @@ -44,18 +39,19 @@ 'sphinx.ext.viewcode', 'sphinx.ext.napoleon', # 'sphinx.ext.graphviz', - # 'IPython.sphinxext.ipython_console_highlighting', - # 'IPython.sphinxext.ipython_directive', + 'IPython.sphinxext.ipython_console_highlighting', + 'IPython.sphinxext.ipython_directive', + # 'sphinx.ext.autosectionlabel', ] -# ipython_execlines = ["import pandas as pd", "pd.set_option('display.width', 1000000)"] +ipython_execlines = ["import pandas as pd", "pd.set_option('display.width', 1000000)"] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix(es) of source filenames. # You can specify multiple suffix as a list of string: -source_suffix = ['.rst', '.md'] #, '.ipynb'] +source_suffix = ['.rst', '.md', '.ipynb'] # source_parsers = { # '.md': CommonMarkParser, diff --git a/docs/contributing.rst b/docs/contributing.rst deleted file mode 100644 index e582053e..00000000 --- a/docs/contributing.rst +++ /dev/null @@ -1 +0,0 @@ -.. include:: ../CONTRIBUTING.rst diff --git a/docs/getting_started/concepts.rst b/docs/getting_started/concepts.rst new file mode 100644 index 00000000..ddeeab41 --- /dev/null +++ b/docs/getting_started/concepts.rst @@ -0,0 +1,166 @@ +.. _concepts: + +Basic Concepts +============== + +Before diving into advanced usage and contributions, let's review the basic concept of the +library to help you get started. + +What is a primitive? +-------------------- + +A primitive is a data processing block. Along with a code that does the processing on the data, +a primitive also has an associated JSON file that has a number of annotations. 
These annotations +help automated algorithms to interpret the primitive, and data scientists to construct machine +learning pipelines with proper provenance and full transparency about each individual component. + +Types of Primitives +------------------- + +Not all primitives are the same, so in the following sections we review which types of +primitives there are. + +Function Primitives +~~~~~~~~~~~~~~~~~~~ + +The simplest type of primitive is a plain function that can be called directly, without +the need to create any class instance first. + +In most cases, if not all, these functions do not have any associated learning process, and their +behavior is always the same both during the fitting and the predicting phases of the pipeline. + +A simple example of such a primitive would be the ``numpy.argmax`` function, which expects a +2-dimensional array as input, and returns a 1-dimensional array that indicates the indices of +the maximum values along an axis. + +Class Primitives +~~~~~~~~~~~~~~~~ + +A more complex type of primitive is a class that needs to be instantiated before it can be +used. + +In most cases, these classes will have an associated learning process, and they will have some +fit method or equivalent that will be called during the fitting phase but not during the +predicting one. + +A simple example of such a primitive would be the ``sklearn.preprocessing.StandardScaler`` class, +which is used to standardize a set of values by calculating their z-score, which means centering +them around 0 and scaling them to unit variance. + +This primitive has an associated learning process, where it calculates the mean and standard +deviation of the training data, to later on use them to transform the prediction data to the same +center and scale. + +Types of Integrations +--------------------- + +Primitives can also be classified depending on how they are integrated into the project.
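As a rough illustration of the function/class distinction described in the Function Primitives and Class Primitives sections, here is a minimal pure-Python sketch. The names ``argmax_rows`` and ``ZScoreScaler`` and the ``fit``/``produce`` method pair are hypothetical stand-ins chosen for this example; they are not part of MLPrimitives, NumPy or scikit-learn, and only mimic the behavior described for ``numpy.argmax`` and ``StandardScaler``.

```python
import statistics


def argmax_rows(rows):
    """Function primitive sketch: stateless, so it behaves the same
    during the fitting and the predicting phases."""
    # For each row, return the index of its largest value.
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


class ZScoreScaler:
    """Class primitive sketch: learns state in ``fit`` and reuses it later."""

    def fit(self, values):
        # Learn the center and scale from the training data only.
        self.mean_ = statistics.mean(values)
        self.std_ = statistics.pstdev(values)  # population standard deviation
        return self

    def produce(self, values):
        # Apply the learned center and scale to any data, seen or unseen.
        return [(value - self.mean_) / self.std_ for value in values]


indices = argmax_rows([[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]])
scaler = ZScoreScaler().fit([2.0, 4.0, 6.0])
scaled = scaler.produce([4.0, 8.0])
```

The function needs no setup and can be called at any point, whereas the class has to be instantiated and fitted first, and the values it learned (mean and standard deviation) determine how later data is transformed.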
+ +Directly integrable primitives +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some libraries already follow the `fit-produce` abstraction. That is, they already have several +data processing building blocks that have this abstraction. Good examples of this type of +primitive are most of the estimators from the scikit-learn library. These building blocks usually +have these characteristics: + +* Tunable hyperparameters are simple values of the supported basic types: + * ``str`` + * ``bool`` + * ``int`` + * ``float`` +* Creating the class instance or calling the fit or produce methods does not require building + any complex structure before the call is made. +* The fitting and predicting phases each consist of a single method or function call. + +In this case, no additional code is necessary to adapt them and those blocks can be brought into +MLPrimitives using nothing more than a single JSON annotation file, which can be found in the +`mlprimitives/jsons folder`_. + +Examples +******** + +* `numpy.argmax`_ +* `sklearn.preprocessing.StandardScaler`_ +* `xgboost.XGBClassifier`_ + +.. note:: If the code is directly usable then why create a JSON annotation file? While the code is + directly usable, most building blocks do not have the associated metadata that we need for + automation. Usually when using scikit-learn for example, a data scientist goes through + the documentation to understand the different hyperparameters and their ranges, and has to do a + lot of manual inference before they can use them. + + +Primitives that require a Python adapter +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The second type of primitives are the ones that need some kind of adaptation process to fit our +API, but whose behaviour is not altered in any way by this process. The primitives that +are integrated in this way are the ones that have some of these characteristics: + +* Need some additional steps after the instantiation in order to be prepared to run.
+* The tunable hyperparameters need some kind of transformation or instantiation before they can be + passed to the primitive. +* The primitive cannot be applied directly to the inputs and outputs that we support, which need to + be manipulated in some way before they can be passed to it or to any other primitive. + +Some examples of these primitives are the Keras models, which need to be built in several steps +and later on compiled before they can be used, or some image transformation primitives which need +to be applied to the images one by one. These primitives consist of some Python code which can be +found in the ``mlprimitives.adapters`` module, as well as JSON annotations that point at the +corresponding functions or classes, which can also be found in the `mlprimitives/jsons folder`_. + +Examples +******** + +* LightFM +* Keras Sequential LSTMTextClassifier +* NetworkX Graph Feature Extraction + + +Custom primitives +~~~~~~~~~~~~~~~~~ + +The third type are custom primitives implemented specifically for this library. These custom +primitives may be implemented from scratch or they may be using third party tools in such a way +as to alter the third party tool’s native behavior to add new functionalities. + +This type of primitive consists of Python code that can be found inside the `mlprimitives/custom module`_, +as well as the corresponding JSON annotations, which can also be found in the `mlprimitives/jsons folder`_. + +Examples +******** + +* Preprocessing Class Encoder +* Vocabulary Counter +* Text Cleaner + + +Candidate primitives +******************** + +Since this is a project with a strong focus on community contributions, we want to make it easy +for everyone to contribute their own code without the need to have project maintainers who +carefully and thoroughly review all the new contributions, as this would make the contributing +process very slow.
However, having all the new primitives accepted and merged without a proper +review might compromise the project stability in some cases. + +For this reason, we have created the special `mlprimitives/candidates module`_, which includes +all the primitives that have been recently contributed but haven't gone through a proper testing +and review process yet. + +So, does this mean that these primitives do not work? Not at all! + +All the candidate primitives have gone through an initial testing and review process before being +accepted, so they are known to work. The only difference between these primitives and +the ones that you can find in `mlprimitives/custom module`_ is that the latter ones have gone +through a deeper code review in search of possible improvements in terms of performance and +functionality refinements. + + +.. _mlprimitives/jsons folder: https://github.com/HDI-Project/MLPrimitives/blob/master/mlprimitives/jsons +.. _mlprimitives/custom module: https://github.com/HDI-Project/MLPrimitives/blob/master/mlprimitives/custom +.. _mlprimitives/candidates module: https://github.com/HDI-Project/MLPrimitives/blob/master/mlprimitives/candidates +.. _numpy.argmax: https://github.com/HDI-Project/MLPrimitives/blob/master/mlprimitives/jsons/numpy.argmax.json +.. _sklearn.preprocessing.StandardScaler: https://github.com/HDI-Project/MLPrimitives/blob/master/mlprimitives/jsons/sklearn.preprocessing.StandardScaler.json +.. _xgboost.XGBClassifier: https://github.com/HDI-Project/MLPrimitives/blob/master/mlprimitives/jsons/xgboost.XGBClassifier.json diff --git a/docs/installation.rst b/docs/getting_started/install.rst similarity index 54% rename from docs/installation.rst rename to docs/getting_started/install.rst index 3bd4210e..bf1b5cb7 100644 --- a/docs/installation.rst +++ b/docs/getting_started/install.rst @@ -1,10 +1,8 @@ .. highlight:: shell -============ Installation ============ - Stable release -------------- @@ -23,30 +21,51 @@ you through the process. .. 
_pip: https://pip.pypa.io .. _Python installation guide: http://docs.python-guide.org/en/latest/starting/installation/ - From sources ------------ The sources for MLPrimitives can be downloaded from the `Github repo`_. -You can either clone the public repository: +You can either clone the ``stable`` branch from the public repository: .. code-block:: console - $ git clone git://github.com/HDI-Project/MLPrimitives + $ git clone --branch stable git://github.com/HDI-Project/MLPrimitives Or download the `tarball`_: .. code-block:: console - $ curl -OL https://github.com/HDI-Project/MLPrimitives/tarball/master + $ curl -OL https://github.com/HDI-Project/MLPrimitives/tarball/stable -Once you have a copy of the source, you can install it with: +Once you have a copy of the source, you can install it with this command: .. code-block:: console - $ make install-develop + $ make install + +.. _development: + +Development Setup +----------------- + +If you want to make changes in `MLPrimitives` and contribute them, you will need to prepare +your environment to do so. +These are the required steps: + +1. Fork the MLPrimitives `Github repo`_. + +2. Clone your fork locally:: + + $ git clone git@github.com:your_name_here/MLPrimitives.git + +3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, + this is how you set up your fork for local development:: + + $ mkvirtualenv MLPrimitives + $ cd MLPrimitives/ + $ make install-develop .. _Github repo: https://github.com/HDI-Project/MLPrimitives -.. _tarball: https://github.com/HDI-Project/MLPrimitives/tarball/master +.. 
_tarball: https://github.com/HDI-Project/MLPrimitives/tarball/stable diff --git a/docs/getting_started/quickstart.rst b/docs/getting_started/quickstart.rst new file mode 100644 index 00000000..01c3da40 --- /dev/null +++ b/docs/getting_started/quickstart.rst @@ -0,0 +1,114 @@ +Quickstart +========== + +Below is a short tutorial that will show you how to get started using MLPrimitives with `MLBlocks`_. + +In this tutorial we will learn how to: + +* Create a pipeline using multiple primitives +* Obtain the list of tunable hyperparameters from the pipeline +* Specify hyperparameters for each primitive in the pipeline +* Fit the pipeline using training data +* Use the pipeline to make predictions from new data + +Creating a pipeline +------------------- + +With MLBlocks, creating a pipeline is as simple as specifying a list of MLPrimitives and passing +them to the ``MLPipeline``: + +.. ipython:: python + + from mlblocks import MLPipeline + primitives = [ + 'mlprimitives.custom.feature_extraction.StringVectorizer', + 'sklearn.ensemble.RandomForestClassifier', + ] + pipeline = MLPipeline(primitives) + +Optionally, specific hyperparameters can be also set by specifying them in a dictionary: + +.. ipython:: python + + hyperparameters = { + 'sklearn.ensemble.RandomForestClassifier': { + 'n_estimators': 100 + } + } + pipeline = MLPipeline(primitives, hyperparameters) + +Once the pipeline has been instantiated, we can easily see what hyperparameters have been set +for each block, by calling the ``get_hyperparameters``. + +The output of this method is a dictionary which has the name of each block as keys and +a dictionary with the hyperparameters of the corresponding block as values. + +.. ipython:: python + + pipeline.get_hyperparameters() + +Tunable Hyperparameters +----------------------- + +One of the main features of MLPrimitives is the possibility to indicate the type and possible +values that each primitive hyperparameter accepts. 
+ +The list of possible hyperparameters and their details can easily be obtained from the pipeline +instance by calling its ``get_tunable_hyperparameters`` method. + +The output of this method is a dictionary that contains the list of tunable hyperparameters +for each block in the pipeline, ready to be passed to any hyperparameter tuning library such +as `BTB`_. + +.. ipython:: python + + pipeline.get_tunable_hyperparameters() + +Setting Hyperparameters +----------------------- + +Modifying the hyperparameters of an already instantiated pipeline can be done using the +``set_hyperparameters`` method, which expects a dictionary with the same format as the one returned +by the ``get_hyperparameters`` method. + +Note that if a subset of the hyperparameters is passed, only these will be modified, and the +other ones will remain unmodified. + +.. ipython:: python + + new_hyperparameters = { + 'sklearn.ensemble.RandomForestClassifier#1': { + 'max_depth': 15 + } + } + pipeline.set_hyperparameters(new_hyperparameters) + hyperparameters = pipeline.get_hyperparameters() + hyperparameters['sklearn.ensemble.RandomForestClassifier#1']['max_depth'] + +Making predictions +------------------ + +Once we have created the pipeline with the desired hyperparameters we can fit it +and then use it to make predictions on new data. + +To do this, we first call the ``fit`` method passing the training data and the corresponding +labels. + +.. ipython:: python + + from mlblocks.datasets import load_personae + dataset = load_personae() + X_train, X_test, y_train, y_test = dataset.get_splits(1) + pipeline.fit(X_train, y_train) + +Once we have fitted our model to our data, we can call the ``predict`` method passing new data +to obtain predictions from the pipeline. + +.. ipython:: python + + predictions = pipeline.predict(X_test) + predictions + dataset.score(y_test, predictions) + +.. _MLBlocks: https://github.com/HDI-Project/MLBlocks +.. 
_BTB: https://github.com/HDI-Project/BTB diff --git a/docs/images/dai-logo.png b/docs/images/dai-logo.png new file mode 100644 index 00000000..4abe1184 Binary files /dev/null and b/docs/images/dai-logo.png differ diff --git a/docs/index.rst b/docs/index.rst index 94c1630d..7cf5c745 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,24 +1,53 @@ -.. mdinclude:: ../README.md +Welcome to MLPrimitives! +======================== + +.. figure:: images/dai-logo.png + :width: 300 px + :alt: DAI-Lab Logo + + An open source project from Data to AI Lab at MIT. + +Overview +-------- + +This repository contains primitive annotations to be used by the MLBlocks library, as well as +the necessary Python code to make some of them fully compatible with the MLBlocks API requirements. +There is also a collection of custom primitives contributed directly to this library, which either +combine third party tools or implement new functionalities from scratch. + +Why did we create this library? +------------------------------- + +* Too many libraries in a fast growing field +* Huge societal need to build machine learning apps +* Domain expertise resides at several places (knowledge of math) +* No documented information about hyperparameters, behavior... + .. toctree:: - :hidden: - :titlesonly: + :caption: Getting Started + :maxdepth: 2 - Overview - installation - usage + Welcome + getting_started/install + getting_started/quickstart + getting_started/concepts .. toctree:: - :caption: Advanced Usage - :hidden: + :caption: Community + :maxdepth: 2 - API Reference + Community + Contributing + Annotations + Adapters + Custom Primitives .. 
toctree:: - :caption: Development Notes + :caption: Resources :hidden: - contributing + API Reference authors history diff --git a/docs/usage.rst b/docs/usage.rst deleted file mode 100644 index 2a2a918a..00000000 --- a/docs/usage.rst +++ /dev/null @@ -1,7 +0,0 @@ -===== -Usage -===== - -To use MLPrimitives in a project:: - - import mlprimitives diff --git a/setup.py b/setup.py index 25c95b91..a4f74b15 100644 --- a/setup.py +++ b/setup.py @@ -59,6 +59,7 @@ 'Sphinx>=1.7.1', 'sphinx_rtd_theme>=0.2.4', 'recommonmark>=0.4.0', + 'ipython==6.5.0', # style check 'flake8>=3.5.0',