CST383 - Week 7
- YZ

- Aug 29, 2021
- 1 min read

This week we started learning about decision trees. Trees perform classification through recursive partitioning of the feature space. Decision trees are easy to use, highly interpretable, and work well with both numeric and categorical data. However, they tend to overfit and are sensitive to the training data. We are aiming for the peak of the validation curve so we can balance bias and variance: the model shouldn't be too simple, but it also shouldn't be too sensitive to the training data.

The Gini index measures the purity of the classification at each node in the tree. When the purity improvement from a split becomes very small, that is a good sign to stop growing the tree. Trees can also perform regression; there, the goal is to choose the split that gives the biggest reduction in the MSE. Lastly, we learned about system design.
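As a quick sketch of the two split criteria mentioned above, here is a small, hypothetical Python illustration (standard library only, not code from the labs): a Gini impurity function for classification, and a brute-force search for the single threshold that gives the largest MSE reduction for regression.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels: 1 - sum(p_k^2).
    0 means a pure node; 0.5 is the worst case for two classes."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def mse(ys):
    """Mean squared error of predicting the mean of ys."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def best_split(xs, ys):
    """Return (threshold, MSE reduction) for the best split on x.
    Tries every candidate threshold and keeps the one whose
    weighted child MSE drops furthest below the parent MSE."""
    parent = mse(ys)
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        weighted = (len(left) * mse(left) + len(right) * mse(right)) / len(ys)
        reduction = parent - weighted
        if best is None or reduction > best[1]:
            best = (t, reduction)
    return best

print(gini(["a", "a", "a", "a"]))              # 0.0 (pure node)
print(gini(["a", "a", "b", "b"]))              # 0.5 (50/50 split)
print(best_split([1, 2, 3, 10, 11, 12],
                 [1, 1, 1, 9, 9, 9]))          # splits cleanly at x = 10
```

A real decision tree repeats this search recursively at every node and over every feature; libraries like scikit-learn do exactly this under the hood when you fit a `DecisionTreeClassifier` or `DecisionTreeRegressor`.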
We completed 3 labs relating to decision trees, as well as 2 homework assignments reinforcing last week's material on linear regression.