NMI (Normalized Mutual Information) for measuring the association between a feature-selected subset and the original class labels: a complete Python code example
In Python, you can compute NMI with the `normalized_mutual_info_score` function from the `scikit-learn` library. Note that this function compares two 1-D label assignments, so one common approach is to cluster the samples on the selected feature subset and then compute the NMI between the cluster assignments and the original class labels. Below is a complete example, assuming we have a feature matrix and the corresponding target labels:
```python
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.preprocessing import LabelEncoder
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score
import pandas as pd

# Assume you have feature data (DataFrame)
data = pd.DataFrame(..., columns=['feature1', 'feature2', 'feature3'])  # each column is a feature
# and the corresponding class labels (1-D array-like)
labels = ...  # list of class labels, e.g. ['class_0', 'class_1', 'class_2', ...]

# Encode the class labels as integers; normalized_mutual_info_score expects
# two 1-D label arrays, not a one-hot matrix
encoder = LabelEncoder()
labels_encoded = encoder.fit_transform(labels)

# Use SelectKBest to keep the K features with the highest mutual information
# with the class labels
selector = SelectKBest(mutual_info_classif, k=5)
features_selected = selector.fit_transform(data, labels_encoded)

# Cluster the samples using only the selected features, then compute NMI
# between the cluster assignments and the original class labels
n_classes = len(encoder.classes_)
cluster_labels = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit_predict(features_selected)

selected_features_nmi = normalized_mutual_info_score(labels_encoded, cluster_labels)
print(f"NMI between selected features and original classes: {selected_features_nmi:.4f}")
```
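Because the snippet above leaves `data` and `labels` as placeholders, here is a minimal, self-contained sketch of the same pipeline on synthetic data; `make_classification`, the dataset sizes, and `k=4` are illustrative assumptions rather than part of the original example:
```python
# Minimal runnable sketch of the same pipeline on synthetic data
# (make_classification stands in for your own dataset).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score
import pandas as pd

# 500 samples, 10 features (4 informative), 3 classes
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           n_redundant=2, n_classes=3, random_state=42)
data = pd.DataFrame(X, columns=[f'feature{i}' for i in range(10)])

# Keep the 4 features with the highest mutual information with the labels
selector = SelectKBest(mutual_info_classif, k=4)
features_selected = selector.fit_transform(data, y)
print("Selected columns:", data.columns[selector.get_support()].tolist())

# Cluster on the selected subset and compare the clustering with the true classes
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features_selected)
print("NMI:", normalized_mutual_info_score(y, cluster_labels))
```
The NMI value lies in [0, 1]; the closer it is to 1, the better clustering on the selected subset recovers the original class structure.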