Faculty Advisor - Dr. Babatunde Olubando
Contact info: [email protected] / Slack
**Write your own K Nearest Neighbor Classifier to predict breast cancer**
- Download the wdbc.data and wdbc.names files
- Read and store the contents of the wdbc.data file. The only fields (columns) that will be used in the classifier are radius and compactness. Split its contents into two datasets, 70% for training and 30% for testing.
- Use the Lab 1 code as a guide to build your own Nearest Neighbor classifier (K = 1).
- Display the percent error after classifying the test data
- Loop the code so that it displays the classification error for K = 1..10
- For K = 2, produce a graph showing the data from the two classes (radius vs. compactness) and the classification of the test data (see figure)
Percent error (K=1): 16.3317%
Percent error (K=2): 12.8141%
Percent error (K=3): 14.8241%
Percent error (K=4): 13.3166%
Percent error (K=5): 14.3216%
Percent error (K=6): 14.5729%
Percent error (K=7): 14.3216%
Percent error (K=8): 13.3166%
Percent error (K=9): 14.8241%
Percent error (K=10): 13.5678%
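The exact percentages depend on how the data is split, so a different split will not necessarily reproduce the listing above. As a minimal sketch of the loading, splitting, classification, and error-loop steps (not the Lab 1 code, which is not shown here), the following uses only the Python standard library. It assumes wdbc.data is a headerless comma-separated file with the ID in column 0, the diagnosis (M or B) in column 1, and mean radius in column 2. The second feature index (7, mean compactness) should be changed if a different field is used. The function names, the fixed seed, and the tie-breaking rule are illustrative choices, not part of the assignment.

```python
import csv
import math
import random
from collections import Counter


def load_wdbc(path, feature_cols=(2, 7)):
    """Read wdbc.data and return a list of (features, label) pairs.

    Assumed layout: column 0 = ID, column 1 = diagnosis ('M' or 'B'),
    column 2 = mean radius, column 7 = mean compactness.
    """
    samples = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            features = tuple(float(row[c]) for c in feature_cols)
            samples.append((features, row[1]))
    return samples


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into train and test lists."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, point, k):
    """Label `point` by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    top = max(votes.values())
    # Tie-break (an assumption): among the labels tied for most votes,
    # return the one belonging to the closest neighbor.
    for _, label in nearest:
        if votes[label] == top:
            return label


def percent_error(train, test, k):
    """Percentage of test samples whose predicted label is wrong."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in test)
    return 100.0 * wrong / len(test)


def main(path="wdbc.data"):
    """Print the percent error for K = 1..10 on a 70/30 split."""
    train, test = split_train_test(load_wdbc(path))
    for k in range(1, 11):
        print(f"Percent error (K={k}): {percent_error(train, test, k):.4f}%")
```

Calling `main()` from the directory that holds wdbc.data prints one line per K in the same format as the listing above. With an even K such as 2, the two neighbors can disagree, so the tie-breaking rule affects the result. Here a tie goes to the nearest neighbor, which is only one possible choice.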
![Radius vs. compactness plot of the two classes and the test-data predictions for K = 2](images/image1.png)
The X-axis is radius and the Y-axis is compactness. Red circles are benign samples and blue circles are malignant samples. Red crosses are benign predictions and blue crosses are malignant predictions. If a circle and its cross agree in color, the prediction is correct; if not, it is incorrect.
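A figure in this style could be drawn with matplotlib roughly as follows. This is a sketch under assumptions: it uses matplotlib, which the assignment does not name, and the function names, the `out_path` default, and the `"B"`/`"M"` label strings are hypothetical. It also assumes the circles show the true class of every plotted point and the crosses show the predicted class of each test point, which is one reading of the caption.

```python
def group_by_label(points, labels):
    """Group 2-D points by label: {label: ([x, ...], [y, ...])}."""
    groups = {}
    for (x, y), label in zip(points, labels):
        xs, ys = groups.setdefault(label, ([], []))
        xs.append(x)
        ys.append(y)
    return groups


# Color scheme taken from the caption: benign = red, malignant = blue.
COLORS = {"B": "red", "M": "blue"}


def plot_knn_result(points, true_labels, test_points, predicted_labels,
                    out_path="knn_k2.png"):
    """Draw true classes as hollow circles and test predictions as crosses."""
    # Imported here so group_by_label works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, (xs, ys) in group_by_label(points, true_labels).items():
        ax.scatter(xs, ys, marker="o", facecolors="none",
                   edgecolors=COLORS[label], label=f"{label} (true)")
    for label, (xs, ys) in group_by_label(test_points, predicted_labels).items():
        ax.scatter(xs, ys, marker="x", color=COLORS[label],
                   label=f"{label} (predicted)")
    ax.set_xlabel("Radius")
    ax.set_ylabel("Compactness")
    ax.legend()
    fig.savefig(out_path)
    return fig
```

Hollow circles keep the crosses visible when both markers sit on the same test point, so a color mismatch between the two is easy to spot.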