The databeer project aims to extract intelligence from the hundreds of thousands of beer recipes accessible online. The first steps, which we are currently working on, are data crawling and data modelling. Then we'll move on to the machine learning part.
We use Scrapy, a powerful and versatile web crawling framework for Python.
The first source we crawled contains approximately 300,000 recipes. The Scrapy files can be found in databeer/brewtoad, and the extracted data is written to CSV files in databeer/brewtoad/csv.
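To show the kind of output involved, here is a sketch of serialising crawled recipes to CSV with the standard library. The field names are illustrative only, not the actual schema of the files in databeer/brewtoad/csv:

```python
import csv
import io

# Hypothetical recipe records; real ones come from the crawler.
recipes = [
    {"name": "Simple Pale Ale", "style": "American Pale Ale", "ibu": 38.5},
    {"name": "Dry Stout", "style": "Irish Stout", "ibu": 42.0},
]

# Write to an in-memory buffer; the project writes to files instead.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "style", "ibu"])
writer.writeheader()
writer.writerows(recipes)
print(buf.getvalue())
```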
TBD
This part is still at an early phase, and most notebooks are not fully commented or documented. For the time being, these are mostly sandboxes to play with the data and find ideas for further applications.
To keep the notebooks light, some functions are defined in the utils.py file. This file is also a work in progress and will be refactored in the future.
Various aggregations and other tests on the data.
An attempt to focus on Hops data.
Steps 1 and 2 gave us some ideas about what we'd like to obtain and how we might be able to do so. Here are some examples:
- From the sequence of hops in each recipe (each hop defined by time, alpha acid content and relative quantity), train a Recurrent Neural Network (RNN) to suggest an additional hop given a list of hops. This might also be done with a Hidden Markov Model (HMM)
- Same as for hops, but for fermentables
- Same, but for the full recipe (not thought through in detail for now; might not be ideal)
- IBU calculation: improve on the approximate formula used by most brewers
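On the last idea: the approximation most commonly used by brewers is Tinseth's formula (the README does not name it, so this is an assumption). A minimal sketch, to show what a learned IBU model would be benchmarked against; the example values are illustrative:

```python
import math


def tinseth_ibu(alpha_pct, mass_g, boil_min, volume_l, gravity):
    """Estimate IBUs for a single hop addition with Tinseth's approximation."""
    # Alpha acids added to the wort, in mg/L.
    mg_per_l = alpha_pct / 100 * mass_g * 1000 / volume_l
    # Utilisation = bigness factor (wort gravity) * boil-time factor.
    bigness = 1.65 * 0.000125 ** (gravity - 1.0)
    boil_factor = (1 - math.exp(-0.04 * boil_min)) / 4.15
    return mg_per_l * bigness * boil_factor


# Example: 28 g of 5.5% alpha hops boiled 60 min in 20 L of 1.050 wort.
print(round(tinseth_ibu(5.5, 28, 60, 20, 1.050), 1))
```

Per-addition IBUs are summed over all hop additions of a recipe, which is exactly the hops data (time, alpha, quantity) mentioned above.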