hub / github.com/Dod-o/Statistical-Learning-Method_Code / calcH_D_A

Function calcH_D_A

DecisionTree/DecisionTree.py:101–121

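Computes the empirical conditional entropy. :param trainDataArr_DevFeature: array containing only the selected feature's column after slicing :param trainLabelArr: array of labels :return: the empirical conditional entropy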

(trainDataArr_DevFeature, trainLabelArr)
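
For reference, the quantity this function accumulates can be written as the formula below. This is a sketch reconstructed from the comments in the source, which name the weight |Di| / |D| and the term H(Di). The symbols are assumptions for illustration: A is the feature held in trainDataArr_DevFeature, n is the number of distinct values A takes, D_i is the subset of samples where A takes its i-th value, and H(D_i) is the empirical entropy of the labels in that subset, as returned by calc_H_D.

```latex
H(D \mid A) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} \, H(D_i)
```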

Source from the content-addressed store, hash-verified

def calcH_D_A(trainDataArr_DevFeature, trainLabelArr):
    '''
    Compute the empirical conditional entropy H(D|A).
    :param trainDataArr_DevFeature: array containing only the selected feature's column after slicing
    :param trainLabelArr: array of labels
    :return: the empirical conditional entropy
    '''
    # Start the accumulator at 0
    H_D_A = 0
    # Put the feature column into a set so that we know which distinct values the feature currently takes
    trainDataSet = set(trainDataArr_DevFeature)

    # Add one term of the empirical conditional entropy for each distinct feature value
    for i in trainDataSet:
        # Boolean mask selecting the samples whose feature value equals i (the subset Di)
        mask = trainDataArr_DevFeature == i
        # trainDataArr_DevFeature[mask].size / trainDataArr_DevFeature.size is |Di| / |D|
        # calc_H_D(trainLabelArr[mask]) is H(Di)
        H_D_A += trainDataArr_DevFeature[mask].size / trainDataArr_DevFeature.size \
                 * calc_H_D(trainLabelArr[mask])
    # Return the resulting empirical conditional entropy
    return H_D_A
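
To make the computation concrete, here is a minimal standard-library sketch of the same weighted sum on a made-up four-sample dataset. It is not the repository's code: the original works on NumPy arrays with boolean masks, while this version uses plain lists so it runs without dependencies. The names entropy and conditional_entropy are hypothetical stand-ins for calc_H_D and calcH_D_A, and the base-2 logarithm is an assumption about how calc_H_D measures entropy.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical entropy H(D) of a label list, in bits (base 2 is an assumption)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(feature_column, labels):
    """Empirical conditional entropy: sum over feature values of |Di|/|D| * H(Di)."""
    total = len(labels)
    h_d_a = 0.0
    for value in set(feature_column):
        # Labels of the samples whose feature equals this value (the subset Di)
        subset = [y for x, y in zip(feature_column, labels) if x == value]
        # Weight H(Di) by the share of samples that fall into Di
        h_d_a += len(subset) / total * entropy(subset)
    return h_d_a


# Toy data, invented for illustration: one binary feature, binary labels
feature = [0, 0, 1, 1]
labels = [0, 1, 1, 1]

h_d = entropy(labels)
h_d_a = conditional_entropy(feature, labels)
gain = h_d - h_d_a  # information gain of splitting on this feature

print(h_d_a)  # 0.5
```

The feature value 0 selects labels [0, 1], whose entropy is 1 bit, and the value 1 selects labels [1, 1], whose entropy is 0, so the weighted sum is 0.5 * 1 + 0.5 * 0 = 0.5. The difference h_d - h_d_a is the information gain, which is presumably how a caller such as calcBestFeature would compare features, although that function's body is not shown here.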

Callers 1

calcBestFeature · Function · 0.85

Calls 1

calc_H_D · Function · 0.85

Tested by

no test coverage detected