CART

An implementation of the Classification and Regression Tree (CART) algorithm, used to train a decision tree for classification.

  • Dataset
    • The Red Wine Quality Data Set with 1599 samples.
    • Each sample has 11 features (attributes) and a quality score (QS) ranging from 0 to 10.
    • https://archive.ics.uci.edu/ml/datasets/wine+quality
    • The samples are labelled into 3 classes based on their quality scores: "good/2" (QS > 6), "normal/1" (QS = 6), and "bad/0" (QS < 6).
    • The data set is partitioned into a training set of 1000 samples and a testing set of 599 samples.
  • Requirements
    • Each tree node should have either zero or two child nodes (binary split). Each split should be based on exactly one feature.
    • Use Gini impurity as the measure when making split decisions.
    • Implement 5-fold cross-validation and grid search to find the best max_depth, which is the only hyperparameter to be searched.
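
The requirements above can be illustrated with a minimal sketch in plain Python. It is not the repository's actual code: the function names (`quality_to_class`, `gini`, `best_split`, `k_fold_indices`) are hypothetical, and it assumes that features are numeric, that candidate thresholds are midpoints between consecutive distinct values, and that folds are contiguous without shuffling. It shows the class labelling, the Gini impurity measure, an exhaustive search for the best single-feature binary split, and the index bookkeeping for 5-fold cross-validation.

```python
from collections import Counter


def quality_to_class(qs):
    """Map a quality score to a class: 2 = good (QS > 6), 1 = normal (QS = 6), 0 = bad (QS < 6)."""
    if qs > 6:
        return 2
    if qs == 6:
        return 1
    return 0


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(X, y):
    """Return (feature_index, threshold, weighted_gini) for the best binary split.

    X is a list of rows (lists of numbers) and y the matching class labels.
    Returns None when no feature has two distinct values to split on.
    """
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Assumption: candidate thresholds are midpoints between distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][j] <= threshold]
            right = [y[i] for i in range(n) if X[i][j] > threshold]
            # Impurity of the split, weighted by the size of each child.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best


def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds; return a list of (train, validation) index lists."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, validation))
        start += size
    return folds
```

A grid search over max_depth would then train one tree per candidate depth on each of the five training folds, average the validation accuracy across folds, and keep the depth with the highest average. With the 1000-sample training set, `k_fold_indices(1000)` yields folds of 800 training and 200 validation indices; shuffling the indices first would usually be advisable if the data file is ordered.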