def predict_train(x_train, y_train):
    '''
    Train a decision tree using information entropy as the split criterion.
    Reference: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier
    '''
    clf = tree.DecisionTreeClassifier(criterion='entropy')
    # print(clf)
    clf.fit(x_train, y_train)
    # feature_importances_ reflects the influence of each feature: the larger
    # the value, the bigger the role that feature plays in the classification.
    print('feature_importances_: %s' % clf.feature_importances_)

    # Print the predictions, the true labels, and the accuracy.
    # Note that the model is evaluated on the same data it was trained on.
    y_pre = clf.predict(x_train)
    # print(x_train)
    print(y_pre)
    print(y_train)
    print(np.mean(y_pre == y_train))
    return y_pre, clf


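Passing `criterion='entropy'` asks the tree to prefer splits that reduce the information entropy of the labels the most. The following is a minimal standalone sketch of that calculation, not scikit-learn's internal implementation; the helper names `entropy` and `information_gain` and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(labels, left, right):
    """Entropy reduction obtained by splitting `labels` into `left` and `right`."""
    total = len(labels)
    weighted = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(labels) - weighted


# Hypothetical toy labels: two samples of class 0 and two of class 1.
labels = [0, 0, 1, 1]
perfect_gain = information_gain(labels, [0, 0], [1, 1])  # classes fully separated
useless_gain = information_gain(labels, [0, 1], [0, 1])  # classes still mixed
print(perfect_gain, useless_gain)
```

A split that separates the two classes completely recovers the full 1 bit of entropy, while a split that leaves both halves evenly mixed gains nothing, so an entropy-based tree would choose the first.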
def show_precision_recall(x, y, clf, y_train, y_pre):