Refactoring posts
mdyzma committed Jan 7, 2018
1 parent 9c30073 commit 1b6cbc9
Showing 32 changed files with 712 additions and 30 deletions.
2 changes: 1 addition & 1 deletion Gemfile.lock
@@ -86,7 +86,7 @@ DEPENDENCIES
tzinfo-data

RUBY VERSION
- ruby 2.4.2p198
+ ruby 2.3.3p222

BUNDLED WITH
1.16.1
171 changes: 171 additions & 0 deletions _drafts/2017-06-11-iris-nb-click.md
@@ -0,0 +1,171 @@
---
layout: post
author: Michal Dyzma
title: Naive Bayes classifier for Iris Data Set
date: 2017-06-12 14:53:32 +0200
comments: true
mathjax: false
categories: python naive-bayes machine-learning
keywords: python, naive-bayes, machine-learning
---
<!--
![banner][banner] -->
<br>
This is the beginning of my practical machine learning adventure. I intend to learn through practice, and the language I chose is Python. My learning sessions will consist of a few repeatable exercises that build a classical data science pipeline. For this session I chose the famous [__Iris Data Set__](https://archive.ics.uci.edu/ml/datasets/iris) to predict the flower class based on the given attributes. The algorithm will be the __Naive Bayes classifier__. When launched, the command line interface will accept four numbers as input (Petal Length, Petal Width, Sepal Length, Sepal Width). Based on these numbers it will use the trained model to classify an unknown Iris as one of the species: _Iris setosa_, _Iris virginica_ or _Iris versicolor_.

<br>
{% include note.html content="Source code from the article can be downloaded from this [GitHub repository](https://github.com/mdyzma/irispy)" %}

This is the first of many sessions whose goal is to get familiar with machine learning methods and to practice producing additional value from raw data. Each learning session will consist of four basic exercises:

1. Find data set
2. Clean the data
3. Choose and tune algorithm/algorithms
4. Visualize data

Sometimes I will use previously learned algorithms to run benchmarks and compare their performance on different data sets.

## Naive Bayes

“Naive Bayes” is a supervised machine learning algorithm used for classification. It applies Bayes' theorem under the "naive" assumption that the features are conditionally independent given the class. For each class, the prior probability of the class is multiplied by the likelihood of every feature value, and the class with the highest resulting posterior probability is chosen. For continuous attributes such as the Iris measurements, a common choice is Gaussian Naive Bayes, which models each feature within a class as a normal distribution.
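
To make the idea behind a Naive Bayes classifier concrete, here is a minimal sketch of Gaussian Naive Bayes in plain Python. It is not the code from the repository: the function names `fit` and `predict`, the `VAR_FLOOR` smoothing constant and the choice of a Gaussian likelihood are assumptions made for illustration, and the training data is only a handful of rows from the Iris data set.

```python
import math
from collections import defaultdict

# A handful of rows from the Iris data set, in the order
# (sepal length, sepal width, petal length, petal width), all in cm.
SAMPLES = [
    ((5.1, 3.5, 1.4, 0.2), "Iris setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris setosa"),
    ((4.7, 3.2, 1.3, 0.2), "Iris setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "Iris versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris virginica"),
    ((7.1, 3.0, 5.9, 2.1), "Iris virginica"),
]

# Assumed smoothing term: added to every variance so that a feature with
# identical values inside one class does not produce a zero variance.
VAR_FLOOR = 1e-2


def fit(samples):
    """Return {label: (log_prior, means, variances)} estimated from samples."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)

    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))  # one tuple of values per feature
        means = [sum(col) / len(col) for col in columns]
        variances = [
            sum((value - mean) ** 2 for value in col) / len(col) + VAR_FLOOR
            for col, mean in zip(columns, means)
        ]
        log_prior = math.log(len(rows) / len(samples))
        model[label] = (log_prior, means, variances)
    return model


def predict(model, features):
    """Return the label with the highest log posterior for the features."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        # Independence assumption: per-feature Gaussian log-likelihoods add up.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            for x, mean, var in zip(features, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


model = fit(SAMPLES)
print(predict(model, (5.0, 3.4, 1.5, 0.2)))
```

Working with log probabilities instead of multiplying raw probabilities avoids numerical underflow when many small likelihoods are combined. With so few training rows the estimated variances are crude, so this sketch only illustrates the mechanics.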

## Project structure

The basic project structure is:

{% highlight bash %}
.
├── .gitignore
├── features
│   ├── environment.py
│   ├── iris.feature
│   └── steps
│       └── iris_steps.py
├── irisvmpy
│   ├── __init__.py
│   ├── iris.py
│   └── test_iris.py
├── LICENSE
└── setup.py
{% endhighlight %}

## Setting pipeline







## Unit and acceptance tests

{% highlight bash %}
.
├── features
│   ├── environment.py
│   ├── iris.feature
│   └── steps
│       └── iris_steps.py
├── irisvmpy
│   ├── __init__.py
│   ├── iris.py
│   └── test_iris.py
...
{% endhighlight %}


## Command line interface



<br>
__irisvmpy/iris.py__
{% highlight python %}
import time

import click


@click.command()
@click.option('--petal-length', prompt='Petal Length',
              help='Unknown Iris Petal Length.', type=float)
@click.option('--petal-width', prompt='Petal Width',
              help='Unknown Iris Petal Width.', type=float)
@click.option('--sepal-length', prompt='Sepal Length',
              help='Unknown Iris Sepal Length.', type=float)
@click.option('--sepal-width', prompt='Sepal Width',
              help='Unknown Iris Sepal Width.', type=float)
def cli(petal_length, petal_width, sepal_length, sepal_width):
    """Ask for four Iris measurements and report the predicted species."""
    click.echo("Iris Flower classifier\n")
    click.echo("\nCalculating result...")
    time.sleep(1)
    # Placeholder (assumption): the trained model is not wired in yet.
    # Replace this with the classifier's prediction for the feature vector
    # (Petal Length, Petal Width, Sepal Length, Sepal Width).
    species = "unknown"
    click.echo()
    click.echo("Your Petal Length is: {}".format(petal_length))
    click.echo("Your Petal Width is: {}".format(petal_width))
    click.echo("Your Sepal Length is: {}".format(sepal_length))
    click.echo("Your Sepal Width is: {}".format(sepal_width))
    click.echo()
    click.echo("Your flower seems to be a fine representative of:")
    click.secho("{}".format(species), fg='green', bold=True)


if __name__ == "__main__":
    cli()
{% endhighlight %}



## Packaging


__setup.py__
{% highlight python %}
import codecs

# Workaround: register a stand-in for the Windows-only 'mbcs' codec so that
# looking it up does not fail on other platforms.
try:
    codecs.lookup('mbcs')
except LookupError:
    ascii = codecs.lookup('ascii')
    func = lambda name, enc=ascii: {True: enc}.get(name == 'mbcs')
    codecs.register(func)

from setuptools import setup, find_packages


requirements = [
    'scipy', 'numpy', 'scikit-learn', 'Click'
]

test_requirements = [
    'behave'
]

setup(
    name='irisvmpy',
    version='0.0.1',
    description='SVM classifier for iris data-set',
    author='Michal Dyzma',
    author_email='[email protected]',
    license='MIT',
    packages=find_packages(),
    install_requires=requirements,
    # Expose the test dependencies as an optional "test" extra.
    extras_require={'test': test_requirements},
    entry_points={
        'console_scripts': [
            'irisvmpy = irisvmpy.iris:cli',
        ],
    },
    classifiers=[
        # '1 - Alpha' is not a valid trove classifier; Alpha is level 3.
        'Development Status :: 3 - Alpha',
        'License :: OSI Approved :: MIT License',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3.6',
    ],
    zip_safe=False
)
{% endhighlight %}


<br>
{% include note.html content="Source code from the article can be downloaded from this [GitHub repository](https://github.com/mdyzma/irispy)" %}


<!-- Images -->

[banner]: /assets/2017-05-12/banner.jpg
<!-- [iris_cli]: /assets/2017-05-12/iris_cli.png -->
170 changes: 170 additions & 0 deletions _drafts/2017-06-12-iris-lr-click.md
@@ -0,0 +1,170 @@
---
layout: post
author: Michal Dyzma
title: Logistic regression and Iris Data Set
date: 2017-06-12 20:44:01 +0200
comments: true
mathjax: false
categories: python logistic-regression machine-learning
keywords: python, logistic-regression, machine-learning
---

<!-- ![banner][banner] -->
<br>
This is the beginning of my practical machine learning adventure. I intend to learn through practice, and the language I chose is Python. My learning sessions will consist of a few repeatable exercises that build a classical data science pipeline. For this session I chose the famous [__Iris Data Set__](https://archive.ics.uci.edu/ml/datasets/iris) to predict the flower class based on the given attributes. The algorithm will be the __Logistic regression__ classifier. When launched, the command line interface will accept four numbers as input (Petal Length, Petal Width, Sepal Length, Sepal Width). Based on these numbers it will use the trained model to classify an unknown Iris as one of the species: _Iris setosa_, _Iris virginica_ or _Iris versicolor_.

<br>
{% include note.html content="Source code from the article can be downloaded from this [GitHub repository](https://github.com/mdyzma/irispy)" %}

This is the first of many sessions whose goal is to get familiar with machine learning methods and to practice producing additional value from raw data. Each learning session will consist of four basic exercises:

1. Find data set
2. Clean the data
3. Choose and tune algorithm/algorithms
4. Visualize data

Sometimes I will use previously learned algorithms to run benchmarks and compare their performance on different data sets.

## Logistic regression

“Logistic regression” is a supervised machine learning algorithm used for classification, despite the word "regression" in its name. It computes a weighted sum of the features plus a bias term and passes the result through the logistic (sigmoid) function, which yields a probability between 0 and 1. The weights are fitted by maximizing the likelihood of the training labels. For more than two classes, as in the Iris data set with its three species, the model is extended with a one-vs-rest or multinomial (softmax) scheme.
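
To illustrate how a logistic regression classifier works, here is a minimal sketch of the binary case in plain Python, trained with batch gradient descent. It is not the code from the repository: the function names, the learning rate and the number of epochs are assumptions chosen for illustration, and the problem is reduced to two species and two features (petal length and petal width) using a handful of rows from the Iris data set.

```python
import math

# Petal length and petal width (cm) for a few rows of the Iris data set.
# Label 0 = Iris setosa, label 1 = Iris versicolor.
SAMPLES = [
    ((1.4, 0.2), 0),
    ((1.4, 0.2), 0),
    ((1.3, 0.2), 0),
    ((4.7, 1.4), 1),
    ((4.5, 1.5), 1),
    ((4.9, 1.5), 1),
]


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(model, features):
    """Probability that the features belong to class 1."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit(samples, learning_rate=0.1, epochs=10000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in samples:
            # Gradient of the log-loss w.r.t. the linear score is (p - y).
            error = predict_proba((weights, bias), features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [
            w - learning_rate * g / len(samples)
            for w, g in zip(weights, grad_w)
        ]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias


def predict(model, features):
    """Return 1 if the predicted probability reaches 0.5, otherwise 0."""
    return int(predict_proba(model, features) >= 0.5)


model = fit(SAMPLES)
for features, label in SAMPLES:
    print(features, label, predict(model, features))
```

The two classes here are linearly separable, so the weights keep growing with more epochs; a real model would normally add regularization, which this sketch leaves out for brevity.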

## Project structure

The basic project structure is:

{% highlight bash %}
.
├── .gitignore
├── features
│   ├── environment.py
│   ├── iris.feature
│   └── steps
│       └── iris_steps.py
├── irisvmpy
│   ├── __init__.py
│   ├── iris.py
│   └── test_iris.py
├── LICENSE
└── setup.py
{% endhighlight %}

## Setting pipeline







## Unit and acceptance tests

{% highlight bash %}
.
├── features
│   ├── environment.py
│   ├── iris.feature
│   └── steps
│       └── iris_steps.py
├── irisvmpy
│   ├── __init__.py
│   ├── iris.py
│   └── test_iris.py
...
{% endhighlight %}


## Command line interface



<br>
__irisvmpy/iris.py__
{% highlight python %}
import time

import click


@click.command()
@click.option('--petal-length', prompt='Petal Length',
              help='Unknown Iris Petal Length.', type=float)
@click.option('--petal-width', prompt='Petal Width',
              help='Unknown Iris Petal Width.', type=float)
@click.option('--sepal-length', prompt='Sepal Length',
              help='Unknown Iris Sepal Length.', type=float)
@click.option('--sepal-width', prompt='Sepal Width',
              help='Unknown Iris Sepal Width.', type=float)
def cli(petal_length, petal_width, sepal_length, sepal_width):
    """Ask for four Iris measurements and report the predicted species."""
    click.echo("Iris Flower classifier\n")
    click.echo("\nCalculating result...")
    time.sleep(1)
    # Placeholder (assumption): the trained model is not wired in yet.
    # Replace this with the classifier's prediction for the feature vector
    # (Petal Length, Petal Width, Sepal Length, Sepal Width).
    species = "unknown"
    click.echo()
    click.echo("Your Petal Length is: {}".format(petal_length))
    click.echo("Your Petal Width is: {}".format(petal_width))
    click.echo("Your Sepal Length is: {}".format(sepal_length))
    click.echo("Your Sepal Width is: {}".format(sepal_width))
    click.echo()
    click.echo("Your flower seems to be a fine representative of:")
    click.secho("{}".format(species), fg='green', bold=True)


if __name__ == "__main__":
    cli()
{% endhighlight %}


## Packaging


__setup.py__
{% highlight python %}
import codecs

# Workaround: register a stand-in for the Windows-only 'mbcs' codec so that
# looking it up does not fail on other platforms.
try:
    codecs.lookup('mbcs')
except LookupError:
    ascii = codecs.lookup('ascii')
    func = lambda name, enc=ascii: {True: enc}.get(name == 'mbcs')
    codecs.register(func)

from setuptools import setup, find_packages


requirements = [
    'scipy', 'numpy', 'scikit-learn', 'Click'
]

test_requirements = [
    'behave'
]

setup(
    name='irisvmpy',
    version='0.0.1',
    description='SVM classifier for iris data-set',
    author='Michal Dyzma',
    author_email='[email protected]',
    license='MIT',
    packages=find_packages(),
    install_requires=requirements,
    # Expose the test dependencies as an optional "test" extra.
    extras_require={'test': test_requirements},
    entry_points={
        'console_scripts': [
            'irisvmpy = irisvmpy.iris:cli',
        ],
    },
    classifiers=[
        # '1 - Alpha' is not a valid trove classifier; Alpha is level 3.
        'Development Status :: 3 - Alpha',
        'License :: OSI Approved :: MIT License',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3.6',
    ],
    zip_safe=False
)
{% endhighlight %}


<br>
{% include note.html content="Source code from the article can be downloaded from this [GitHub repository](https://github.com/mdyzma/irispy)" %}


<!-- Images -->

[banner]: /assets/2017-05-12/banner.jpg
<!-- [iris_cli]: /assets/2017-05-12/iris_cli.png -->