iris鸢尾花数据集最全数据分析鸢尾花数据集最全数据分析
# plot直接展示数据的分布情况,kde核密度估计对比直方图来看iris.plot()
iris.plot(kind = 'kde')
# KNNfrom sklearn import neighbors
model_fit_show(neighbors.KNeighborsClassifier(), 'neighbors.KNeighborsClassifier', X, y)
(105, 4) (45, 4) (105,) (45,)
[1 2 2 2 0 0 0 1 2 2 1 2 1 1 1 2 0 2 0 0 1 1 0 0 1 0 2 2 0 0 2 2 1 1 0 1 1
0 1 1 2 1 1 0 0] accuracy of neighbors.KNeighborsClassifier is: 0.9555555555555556
[0]
# Kmeanfrom sklearn import cluster
model_fit_show(cluster.KMeans(n_clusters = 3), 'cluster.KMeans', X, y, cluster = True)
(105, 4) (45, 4) (105,) (45,)
[1 2 1 2 2 2 2 2 0 2 2 0 0 2 0 2 0 2 2 1 0 2 1 0 2 2 0 2 0 1 2 2 0 0 2 0 1
1 1 2 0 1 2 1 0] [2]
# naive bayes# https://scikit-learn.org/dev/modules/classes.html#module-sklearn.naive_bayes# 分别是GaussianNB,MultinomialNB和BernoulliNB。#
GaussianNB:先验为高斯分布的朴素贝叶斯,一般应用于连续值# MultinomialNB:先验为多项式分布的朴素贝叶斯,离散多元值分类# BernoulliNB:先
验为伯努利分布的朴素贝叶斯,离散二值分类# ComplementNB:对MultinomialNB的补充,适用于非平衡数据from sklearn import naive_bayes
model_fit_show(naive_bayes.BernoulliNB(), 'naive_bayes.BernoulliNB', X, y)
model_fit_show(naive_bayes.GaussianNB(), 'naive_bayes.GaussianNB', X, y)
model_fit_show(naive_bayes.MultinomialNB(), 'naive_bayes.MultinomialNB', X, y)
model_fit_show(naive_bayes.ComplementNB(), 'naive_bayes.ComplementNB', X, y)