Channel: Active questions tagged python - Stack Overflow

What will be the Information Gain for the variable that will be placed in the root of the tree?


I am trying to solve this problem from Stepic:

Download a dataset with three variables: sex, exang, num. Imagine that we want to use a decision tree to classify whether or not a patient has heart disease (variable num) based on two criteria: sex and the presence / absence of angina pectoris (exang). Train a decision tree on this data, use entropy as a criterion. Specify what the Information Gain value will be for the variable that will be placed in the root of the tree. The answer must be a number with precision 3 decimal places.
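For reference, Information Gain is usually defined as the entropy of the parent node minus the sample-weighted average entropy of the child nodes. Below is a minimal sketch of that definition in plain Python. The helper names `entropy` and `information_gain` and the toy lists are made up for illustration; they are not the Stepic dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Parent entropy minus the sample-weighted entropy of the groups
    obtained by splitting on each distinct value of `feature`."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Made-up toy data (assumption), standing in for exang and num.
toy_exang = [0, 0, 0, 0, 1, 1, 1, 1]
toy_num = [0, 0, 0, 1, 1, 1, 1, 0]

toy_ig = information_gain(toy_exang, toy_num)
print(round(toy_ig, 3))
```

Running the same function on each candidate column of the real data and taking the larger value should give the gain of the variable chosen for the root, assuming the tree picks the split with the highest gain.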

This is what I did:

    clf = tree.DecisionTreeClassifier()
    clf.fit(X, y)
    tree.plot_tree(clf, filled=True)
    l_node = clf.tree_.children_left[0]
    r_node = clf.tree_.children_left[1]
    n1 = clf.tree_.n_node_samples[l_node]
    n2 = clf.tree_.n_node_samples[r_node]
    e1 = clf.tree_.impurity[l_node]
    e2 = clf.tree_.impurity[r_node]
    n = n1 + n2
    ig = 0.996 - (n1 * e1 + n2 * e2) / n


The Information Gain I get is 0.607, but when I submit this value it is marked as incorrect. What am I doing wrong?
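Three things in the snippet above look likely to affect the result. First, `DecisionTreeClassifier()` uses the Gini criterion by default, so `impurity` holds Gini values unless `criterion="entropy"` is passed. Second, `children_left[1]` is the left child of node 1, not the right child of the root, which is `children_right[0]`. Third, the hard-coded 0.996 can be replaced by the root's own impurity, `impurity[0]`. Below is a hedged sketch of the computation with those points changed. The arrays `X` and `y` are made-up stand-in data (an assumption, since the real dataset is not shown), so the printed value is not the answer to the exercise.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in data (assumption): columns are [sex, exang], y plays the role of num.
X = np.array([[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [1, 1], [1, 1], [1, 1]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

# Use entropy explicitly; the default criterion is Gini.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

t = clf.tree_
left = t.children_left[0]    # left child of the root (node 0)
right = t.children_right[0]  # right child of the root (node 0)

n = t.n_node_samples[0]
n_left = t.n_node_samples[left]
n_right = t.n_node_samples[right]

# Root entropy minus the sample-weighted entropy of the root's two children.
ig = t.impurity[0] - (n_left * t.impurity[left] + n_right * t.impurity[right]) / n

root_feature = t.feature[0]  # index of the column used at the root
print(root_feature, round(ig, 3))
```

With the real data, `root_feature` shows which of the two columns ended up in the root, and `ig` rounded to three decimals is the value to compare against the expected answer.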

