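Question: a snippet that begins with `from math import log`, `import operator`, `from matplotlib.font_manager import FontProperties`, and `import matplotlib.pyplot as plt`, followed by the docstring "Function description: compute the empirical entropy (Shannon entropy) of a given dataset. Parameters: dataSet: the dataset. Returns: shannonEnt: the empirical entropy. Modify:"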
Date: 2023-06-08 12:06:20
Understood. You want to compute the empirical entropy of a given dataset; here is an implementation:
```
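from math import log
import operator  # not used by calcShannonEnt itself
from matplotlib.font_manager import FontProperties  # not used by calcShannonEnt itself
import matplotlib.pyplot as plt  # not used by calcShannonEnt itself


def calcShannonEnt(dataSet):
    """
    Function description: compute the empirical entropy (Shannon entropy) of a given dataset
    Parameters:
        dataSet: the dataset; the last element of each row is its class label
    Returns:
        shannonEnt: the empirical entropy
    Modify:
    """
    numEntries = len(dataSet)  # total number of samples
    labelCounts = {}  # number of occurrences of each label
    for featVec in dataSet:
        currentLabel = featVec[-1]  # label of the current sample
        if currentLabel not in labelCounts:
            labelCounts[currentLabel] = 0
        labelCounts[currentLabel] += 1
    shannonEnt = 0.0  # initialize the empirical entropy
    for key in labelCounts:
        prob = float(labelCounts[key]) / numEntries  # relative frequency of this label
        shannonEnt -= prob * log(prob, 2)  # accumulate -p * log2(p)
    return shannonEnt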
```
The code above computes the empirical entropy of a given dataset.
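To sanity-check the result, it may help to run the same calculation on a tiny dataset. The sketch below is not the code from the answer. It is a compact equivalent built on `collections.Counter` and `math.log2`, and the names `shannon_entropy` and `data_set`, as well as the toy rows, are made up for illustration. It assumes, as `calcShannonEnt` does, that the last element of each row is the class label.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Empirical entropy -sum(p * log2(p)) over the relative label frequencies."""
    total = len(labels)
    counts = Counter(labels)  # label -> number of occurrences
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Hypothetical toy dataset: two feature columns, then the class label
data_set = [
    [1, 1, 'yes'],
    [1, 1, 'yes'],
    [1, 0, 'no'],
    [0, 1, 'no'],
    [0, 1, 'no'],
]

labels = [row[-1] for row in data_set]  # last column holds the label
entropy = shannon_entropy(labels)
print(round(entropy, 4))  # 0.971
```

With two 'yes' and three 'no' labels the probabilities are 0.4 and 0.6, so the entropy is -(0.4·log2 0.4 + 0.6·log2 0.6) ≈ 0.971 bits. A dataset with a single label gives 0, and two equally frequent labels give exactly 1.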