Decision Tree

Recursive binary splitting for classification and regression with pruning

A supervised learning method used for regression and classification.

Terminology

  1. Root Node: Starting point of the tree
  2. Decision Node: Internal node where a split occurs
  3. Leaf Node: Terminal node containing predictions
  4. Splitting: Process of dividing a node into sub-nodes

Feature Selection Measures

For Classification:

  1. Gini Index: $\sum_{k=1}^K \hat{p}_k(1-\hat{p}_k)$ — Range: [0, 0.5] for binary
  2. Entropy: $-\sum_{k=1}^K \hat{p}_k\log(\hat{p}_k)$ — Range: [0, $\log(K)$]
  3. Misclassification Error: $1 - \max_k(\hat{p}_k)$
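
The three impurity measures above can be sketched directly from their formulas. This is a minimal NumPy illustration (the function names are mine, not from the text), assuming the natural logarithm for entropy and the convention $0 \log 0 = 0$:

```python
import numpy as np

def gini(p):
    """Gini index: sum_k p_k * (1 - p_k)."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * (1 - p)))

def entropy(p):
    """Entropy: -sum_k p_k * log(p_k), treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]  # drop zero-probability classes to avoid log(0)
    return float(-np.sum(nz * np.log(nz)))

def misclassification_error(p):
    """1 - max_k p_k."""
    return float(1 - np.max(p))
```

All three reach 0 for a pure node (one class has probability 1) and their maximum at the uniform distribution, e.g. `gini([0.5, 0.5])` is 0.5.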

For Regression:

  1. RSS: $\sum_{j=1}^J \sum_{i \in R_j} (y_i - \hat{y}_{R_j})^2$, where $\hat{y}_{R_j}$ is the mean response in region $R_j$
  2. MSE: $\frac{1}{N} \sum_{j=1}^J \sum_{i \in R_j} (y_i - \hat{y}_{R_j})^2$
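
Since each region predicts the mean response of its training points, the two criteria above reduce to a sum of within-region squared deviations. A small sketch (helper names are illustrative):

```python
import numpy as np

def region_rss(y_regions):
    """RSS over regions; each region R_j predicts its mean response."""
    return float(sum(np.sum((y - y.mean()) ** 2) for y in y_regions))

def region_mse(y_regions):
    """MSE: the RSS divided by the total number of observations N."""
    n = sum(len(y) for y in y_regions)
    return region_rss(y_regions) / n

# Example: region [1, 1] contributes 0; region [3, 5] has mean 4,
# so it contributes (3-4)^2 + (5-4)^2 = 2.
regions = [np.array([1.0, 1.0]), np.array([3.0, 5.0])]
```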

Building Process

Trees are built with a top-down, greedy approach (recursive binary splitting): at each node, the split that maximizes information gain is chosen without looking ahead. Information gain:

\[IG = I(\text{parent}) - \sum_{j=1}^m \frac{N_j}{N}\, I(\text{child}_j)\]
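
Applied to a candidate binary split, the gain is the parent's impurity minus the sample-weighted impurity of the children. A minimal sketch using the Gini index (function names are mine):

```python
import numpy as np

def gini_from_labels(y):
    """Gini index computed from a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * (1 - p)))

def information_gain(parent, children, impurity=gini_from_labels):
    """IG = I(parent) - sum_j (N_j / N) * I(child_j)."""
    n = len(parent)
    weighted = sum(len(c) / n * impurity(c) for c in children)
    return impurity(parent) - weighted

# A perfect split of [0, 0, 1, 1] into pure children recovers
# the full parent impurity of 0.5 as gain.
ig = information_gain([0, 0, 1, 1], [[0, 0], [1, 1]])
```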

Stopping Criteria:

  1. Minimum samples at internal node
  2. Minimum samples at leaf node
  3. Maximum depth of tree
  4. Maximum number of leaf nodes
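
In scikit-learn, the four stopping criteria above map directly to constructor arguments of `DecisionTreeClassifier` (the specific values here are arbitrary, for illustration only):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

clf = DecisionTreeClassifier(
    min_samples_split=10,  # 1. minimum samples to split an internal node
    min_samples_leaf=5,    # 2. minimum samples required at a leaf node
    max_depth=3,           # 3. maximum depth of the tree
    max_leaf_nodes=8,      # 4. maximum number of leaf nodes
    random_state=0,
)
clf.fit(X, y)
```

Growth stops at a node as soon as any one of the criteria is hit, so tightening any single argument yields a smaller tree.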

Pruning

Cost complexity (weakest-link) pruning minimizes:

\[\sum_{m=1}^{|T|} \sum_{i:x_i \in R_m} (y_i - \hat{y}_{R_m})^2 + \alpha |T|\]
where $|T|$ is the number of terminal nodes of tree $T$, $\hat{y}_{R_m}$ is the mean response in region $R_m$, and $\alpha$ is the complexity parameter (typically chosen by cross-validation).
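
scikit-learn implements this criterion via the `ccp_alpha` parameter; `cost_complexity_pruning_path` returns the effective $\alpha$ values at which subtrees are pruned away. A sketch (the value 0.02 is arbitrary; in practice $\alpha$ would be cross-validated):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Fully grown tree (alpha = 0): no penalty on |T|.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Effective alphas along the pruning path, in increasing order.
path = full.cost_complexity_pruning_path(X, y)

# Larger alpha penalizes |T| more heavily, yielding a smaller subtree.
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)
```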

Advantages and Disadvantages

Advantages: Interpretable, handles numerical/categorical data, minimal preprocessing, captures non-linear relationships.

Disadvantages: High variance (small changes in the training data can produce very different trees), prone to overfitting without pruning, biased toward dominant classes on imbalanced data.