Algorithm: Decision Tree, Entropy, Information Gain and Continuous Features

The decision tree is the foundation of the random forest.

A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statements.

Based on historical samples and experience, we can build many different decision trees (models), so we define a method to evaluate the performance of each decision tree.

Decision boundary of a decision tree

The structure of a decision tree is complex; learning it is a structured prediction problem.

NP-hard problem

A problem that cannot be solved in polynomial time is called NP-hard; finding the optimal decision tree is such a problem.

We use an approximate approach, such as a greedy algorithm, to solve it.

Information Gain

Type 1 of the decision tree

Type 2 of the decision tree

We change the order of the two features to obtain a new decision tree.

If the dimension of the input data is large, the number of possible decision trees is very large!

We use a greedy algorithm to obtain the decision tree model.

The first step is to choose the root of the tree.

How do we choose the features?

Information entropy
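
For reference, the standard definition of entropy for a label distribution with class proportions p_1, ..., p_k:

    H(Y) = -\sum_{i=1}^{k} p_i \log_2 p_i

Entropy is 0 when every sample has the same label (no uncertainty) and largest when the labels are evenly split.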

How do we express the reduction of uncertainty (entropy)? We take the old uncertainty and subtract the current uncertainty (entropy); this difference is the information gain.
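
A minimal Python sketch of these two quantities (the function and variable names are illustrative, not from the original notes):

    from collections import Counter
    from math import log2

    def entropy(labels):
        """H(Y) = -sum(p_i * log2(p_i)) over the class proportions."""
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def information_gain(labels, feature_values):
        """Old uncertainty minus the weighted uncertainty after splitting on the feature."""
        n = len(labels)
        remainder = 0.0
        for v in set(feature_values):
            subset = [y for y, x in zip(labels, feature_values) if x == v]
            remainder += len(subset) / n * entropy(subset)
        return entropy(labels) - remainder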

A process to build the decision tree

Calculate the information gain for the feature 'time'.

Calculate the information gain for the feature 'match type'.

Calculate the information gain for the feature 'court surface'.

Calculate the information gain for the feature 'best effort'.

We choose the feature with the maximum information gain as the root!

In the situation above, we choose 'court surface'.
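
A hedged sketch of this root-selection step, reusing the entropy/information_gain helpers above; the dataset below is a made-up stand-in for the tennis table on the slides, not the actual values:

    # Hypothetical toy data; the real slide dataset is not reproduced here.
    play = ['yes', 'no', 'yes', 'yes', 'no', 'yes']
    features = {
        'time':          ['day', 'night', 'day', 'day', 'night', 'night'],
        'match type':    ['masters', 'grand slam', 'masters', 'grand slam', 'masters', 'masters'],
        'court surface': ['clay', 'grass', 'hard', 'clay', 'grass', 'hard'],
        'best effort':   ['yes', 'no', 'yes', 'no', 'yes', 'no'],
    }

    # Greedy step: the feature with the largest information gain becomes the root.
    gains = {name: information_gain(play, values) for name, values in features.items()}
    root = max(gains, key=gains.get)
    print(root, gains)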

For clay, we need to do a further split.

We obtain the final decision tree.

Prevent the decision tree from overfitting

We try to keep the decision tree simple, for example by reducing the number of leaves.

The more leaves a decision tree has, the more complex it is.

Sometimes we constrain the depth of the decision tree to control the number of leaves.
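
In a library such as scikit-learn these constraints are exposed as parameters; a minimal sketch (the parameter values below are illustrative, not tuned):

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    # A shallower tree with fewer leaves is simpler and less prone to overfitting.
    clf = DecisionTreeClassifier(criterion='entropy', max_depth=3, max_leaf_nodes=8)
    clf.fit(X, y)
    print(clf.get_depth(), clf.get_n_leaves())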

How do we deal with continuous-valued features?

For a discrete feature, we simply build a branch for each value of the feature.

For a continuous-valued feature, we calculate the information gain for each candidate split rule (threshold) and choose the best one (maximum information gain), as in the sketch below.

We can reuse a continuous-valued feature in a subtree.
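
A minimal sketch of the split-rule search for a continuous-valued feature, reusing information_gain from above: candidate thresholds are the midpoints between consecutive sorted values, and we keep the one with the largest gain (the numbers are illustrative):

    def best_threshold(labels, values):
        """Return (threshold, gain) for the best split rule 'value <= threshold'."""
        xs = sorted(set(values))
        candidates = []
        for lo, hi in zip(xs, xs[1:]):
            t = (lo + hi) / 2
            side = ['left' if v <= t else 'right' for v in values]
            candidates.append((information_gain(labels, side), t))
        gain, t = max(candidates)
        return t, gain

    # Illustrative example: a continuous 'temperature' feature with play/no-play labels.
    print(best_threshold(['no', 'no', 'yes', 'yes', 'no'], [15, 18, 21, 24, 30]))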

Decision tree for the regression problem

Sometimes we use the accuracy rate to measure the performance of the decision tree as a classifier.

We use the MSE (mean squared error) to measure the performance of the decision tree as a regressor.

For classification problems, we use entropy to evaluate the splits of the decision tree.

For regression problems, we use the variance (or standard deviation) to evaluate the splits of the decision tree.
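
A minimal sketch of the regression criterion: a split's quality is the reduction in variance (equivalently, the MSE around each subset's mean), with made-up numbers for illustration:

    def variance(ys):
        """Mean squared deviation from the mean, i.e. the MSE of predicting the mean."""
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys) / len(ys)

    def variance_reduction(ys, feature_values):
        """Parent variance minus the weighted variance of the child subsets."""
        n = len(ys)
        remainder = 0.0
        for v in set(feature_values):
            subset = [y for y, x in zip(ys, feature_values) if x == v]
            remainder += len(subset) / n * variance(subset)
        return variance(ys) - remainder

    # Illustrative targets and a binary split; each leaf would predict its subset mean.
    print(variance_reduction([3.0, 2.5, 4.1, 3.8], ['a', 'a', 'b', 'b']))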

An example (worked through in figures that are not reproduced in these notes).
