Decision Trees


Objective

Learn how decision tree algorithms work and how to apply them to classification and regression problems.

Essential Reading

Understanding Classifications

Read the basics of classification

Understanding Regressions

Read the basics of regression

Decision Trees

Implementing Decision Trees in Scikit-Learn

Extra Reading

Knowledge Check

  • What problems can decision trees solve: classification, regression, or both?
  • What are the strengths and weaknesses of decision trees?
  • What is a 'greedy algorithm'?
  • How can we stop the tree from dividing further?
  • Name some 'stopping criteria' that stop the tree from dividing further
  • What is 'pruning'?
  • What is the Gini index?
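As a starting point for the Gini question, here is a minimal sketch of the Gini impurity of a node, computed as one minus the sum of squared class proportions, along with the size-weighted impurity of a two-way split. The function names `gini_impurity` and `weighted_gini` are hypothetical helpers chosen for this illustration; they are not part of any library.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of one node: 1 - sum(p_k ** 2) over the classes k."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention for this sketch: an empty node counts as pure
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_gini(left, right):
    """Impurity of a two-way split, weighting each child by its size."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n


parent = ["yes", "yes", "yes", "no", "no", "no"]
pure_split = weighted_gini(["yes", "yes", "yes"], ["no", "no", "no"])
mixed_split = weighted_gini(["yes", "no", "no"], ["yes", "yes", "no"])

print("parent impurity:", gini_impurity(parent))
print("pure split impurity:", pure_split)
print("mixed split impurity:", mixed_split)
```

A greedy tree builder evaluates candidate splits one node at a time and keeps the one with the lowest weighted impurity, which is why the pure split above would be preferred over the mixed one.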

Exercises

Difficulty Level

★☆☆ - Easy
★★☆ - Medium
★★★ - Challenging
★★★★ - Bonus

EX-1: DT Classification - Synthetic data (★☆☆)

Use scikit-learn's make_blobs or make_classification to generate some sample data.

Then fit a decision tree (DT) classifier to separate the classes.
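One possible way to approach EX-1 is sketched below. It assumes scikit-learn is installed; the sample size, number of centers, and the `max_depth` and `min_samples_leaf` values are arbitrary choices for illustration, not values prescribed by the exercise. The last two double as examples of stopping criteria.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Generate three synthetic 2-D clusters (all parameters are illustrative).
X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# max_depth and min_samples_leaf act as stopping criteria for tree growth.
clf = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5, random_state=42)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)  # mean accuracy on the held-out split
print("tree depth:", clf.get_depth())
print("test accuracy:", accuracy)
```

Try raising `cluster_std` so the blobs overlap, then compare the test accuracy of a shallow tree against an unrestricted one.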

EX-2: DT Classification (★★☆)

  • Use the Bank marketing dataset
  • You may need to encode the categorical variables
  • Use a DT to predict the binary yes/no outcome
  • Visualize the tree
  • Create a confusion matrix
  • What is the accuracy of the model?
  • Run cross-validation to gauge the accuracy of the model
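The steps above could be sketched as follows. Because the real dataset is not bundled here, the sketch builds a small synthetic stand-in: the column names (`age`, `balance`, `job`), the labeling rule, and all hyperparameters are assumptions made up for this illustration and do not describe the actual Bank marketing data.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (hypothetical columns, NOT the real dataset).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "balance": rng.normal(1000, 500, size=n),
    "job": rng.choice(["admin", "technician", "retired"], size=n),
})
# Made-up rule that produces a yes/no target for the sketch.
df["y"] = np.where((df["balance"] > 1000) | (df["job"] == "retired"), "yes", "no")

# Encode: one-hot the categorical column, map the target to 0/1.
X = pd.get_dummies(df.drop(columns="y"), columns=["job"], dtype=int)
y = (df["y"] == "yes").astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
accuracy = accuracy_score(y_test, y_pred)

# 5-fold cross-validation on the full data gives a less optimistic estimate.
cv_scores = cross_val_score(
    DecisionTreeClassifier(max_depth=4, random_state=0), X, y, cv=5
)

# Text rendering of the fitted tree; plot_tree is a graphical alternative.
tree_text = export_text(clf, feature_names=list(X.columns))

print(tree_text)
print("confusion matrix:\n", cm)
print("test accuracy:", accuracy)
print("mean CV accuracy:", cv_scores.mean())
```

With the real data, replace the synthetic DataFrame with the loaded file and encode every categorical column, not just one.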

EX-3: DT Regression - Synthetic data (★☆☆)

Use scikit-learn's make_regression to generate some sample data.

Fit a decision tree regressor (DecisionTreeRegressor in scikit-learn) to this data.
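A minimal sketch of EX-3, assuming scikit-learn is installed. The single feature, the noise level, and the two depths being compared are arbitrary illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# One noisy feature keeps the piecewise-constant fit easy to plot and inspect.
X, y = make_regression(n_samples=200, n_features=1, noise=10.0, random_state=0)

shallow = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
deep = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, y)

# score() returns R^2; here it is measured on the training data only.
r2_shallow = shallow.score(X, y)
r2_deep = deep.score(X, y)

print("leaves (depth 2):", shallow.get_n_leaves(), "train R^2:", r2_shallow)
print("leaves (depth 5):", deep.get_n_leaves(), "train R^2:", r2_deep)
```

The deeper tree fits the training data at least as well, but training R² says nothing about overfitting, so hold out a test split before drawing conclusions.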

EX-4: DT Regression (★★☆)

  • Use the Bike sharing dataset
  • Use a decision tree regressor (DecisionTreeRegressor) to predict bike demand
  • Visualize the tree
  • Use RMSE and R² to evaluate the model
  • Use cross-validation to test the model's performance thoroughly
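The evaluation steps above might look like the sketch below. It uses a synthetic stand-in rather than the real Bike sharing data: the features (`hour`, `temp`), the formula that generates demand, and the hyperparameters are all assumptions invented for this illustration.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (hypothetical features, NOT the real dataset).
rng = np.random.default_rng(1)
n = 500
hour = rng.integers(0, 24, size=n)
temp = rng.uniform(0, 35, size=n)
# Made-up demand: a peak around 5 pm, a temperature effect, and noise.
demand = 50 + 8 * temp - 3 * np.abs(hour - 17) + rng.normal(0, 10, size=n)

X = np.column_stack([hour, temp])
y = demand

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

reg = DecisionTreeRegressor(max_depth=5, min_samples_leaf=10, random_state=1)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

# RMSE is the square root of the mean squared error, in the target's units.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = r2_score(y_test, y_pred)

# scikit-learn scorers are "higher is better", so RMSE is returned negated.
neg_rmse = cross_val_score(
    DecisionTreeRegressor(max_depth=5, min_samples_leaf=10, random_state=1),
    X, y, cv=5, scoring="neg_root_mean_squared_error",
)
cv_rmse = -neg_rmse

print("test RMSE:", rmse)
print("test R^2:", r2)
print("CV RMSE per fold:", cv_rmse)
```

A large spread between the per-fold RMSE values suggests the model's performance depends heavily on which rows land in the training set.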

More Exercises