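The dataset is a CSV file with 10 rows and 8 columns: the first column is the index, columns 2 to 7 are features, and the last column is the label. Using Euclidean distance, run KNN classification on the first 8 samples with K=3 and with K=5 and output the results. Then, what are the predictions for samples 9 and 10 when K=3? And when K=5? Implement this in Python.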
Posted: 2024-03-12 16:47:54
Sure. Based on your requirements, here is a KNN classification implementation:
```python
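import math


# Compute the Euclidean distance between two samples.
# Indices 1..6 hold the six features; index 0 is the row number
# and the last field is the label.
def distance(instance1, instance2):
    total = 0.0
    for i in range(1, 7):
        total += (float(instance1[i]) - float(instance2[i])) ** 2
    return math.sqrt(total)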
# Find the k training samples closest to the test sample.
def get_neighbors(training_set, test_instance, k):
    distances = []
    for sample in training_set:
        dist = distance(test_instance, sample)
        distances.append((sample, dist))
    # Sort by distance, nearest first (the sort is stable, so ties keep file order).
    distances.sort(key=lambda x: x[1])
    return [sample for sample, _ in distances[:k]]
# Take a majority vote among the k neighbors and return the winning label.
def get_response(neighbors):
    class_votes = {}
    for neighbor in neighbors:
        response = neighbor[-1]
        class_votes[response] = class_votes.get(response, 0) + 1
    # On a tie, the label seen first among the neighbors wins.
    sorted_votes = sorted(class_votes.items(), key=lambda x: x[1], reverse=True)
    return sorted_votes[0][0]
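# Load the dataset; each row becomes a list of strings.
# Assumes the file has no header row.
def load_dataset(filename):
    dataset = []
    with open(filename, 'r') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines, e.g. a trailing newline at end of file
            dataset.append(line.split(','))
    return dataset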
# Run KNN classification for every sample in test_set.
def knn(training_set, test_set, k):
    predictions = []
    for test_instance in test_set:
        neighbors = get_neighbors(training_set, test_instance, k)
        predictions.append(get_response(neighbors))
    return predictions
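# Test code
filename = 'data.csv'
dataset = load_dataset(filename)

# Classify the first 8 samples, using those same 8 samples as the training set.
# Each sample is therefore its own nearest neighbor (distance 0) and takes part in the vote.
training_set = dataset[:8]
test_set = dataset[:8]
k = 3
predictions_k3 = knn(training_set, test_set, k)
print('K=3, classification results for the first 8 samples:', predictions_k3)
k = 5
predictions_k5 = knn(training_set, test_set, k)
print('K=5, classification results for the first 8 samples:', predictions_k5)

# Predict the labels of samples 9 and 10 from the first 8 samples
test_set = dataset[8:10]
k = 3
predictions_k3 = knn(training_set, test_set, k)
print('K=3, predictions for samples 9 and 10:', predictions_k3)
k = 5
predictions_k5 = knn(training_set, test_set, k)
print('K=5, predictions for samples 9 and 10:', predictions_k5)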
```
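This assumes the dataset file is named `data.csv`, where the first column is the index, columns 2 through 7 are features, and the last column is the label. We first read the whole dataset, then use the first 8 samples as both the training set and the test set and run KNN with K=3 and K=5, printing the classification results. Note that in this step every sample is its own nearest neighbor, so it always contributes one vote for its own label. We then use samples 9 and 10 as the test set and run KNN with K=3 and K=5 again, still training on the first 8 samples, to get the predictions. The actual predicted labels depend on the contents of your `data.csv`.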
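Since classifying the first 8 samples against a training set that contains them lets each sample vote for itself, one possible alternative is a leave-one-out check: hold out each sample in turn and classify it from the other seven. The sketch below is only an illustration under assumptions. The rows are made-up stand-ins for `data.csv`, and `euclidean`, `knn_predict`, and `leave_one_out` are hypothetical helper names that are not part of the code above.

```python
import math
from collections import Counter

# Synthetic rows standing in for data.csv (index, six features, label).
# These values are invented for illustration only.
rows = [
    [1, 1.0, 1.1, 0.9, 1.0, 1.2, 0.8, "A"],
    [2, 1.2, 0.9, 1.0, 1.1, 1.0, 1.0, "A"],
    [3, 0.8, 1.0, 1.1, 0.9, 0.9, 1.1, "A"],
    [4, 1.1, 1.2, 1.0, 1.0, 1.1, 0.9, "A"],
    [5, 5.0, 5.1, 4.9, 5.0, 5.2, 4.8, "B"],
    [6, 5.2, 4.9, 5.0, 5.1, 5.0, 5.0, "B"],
    [7, 4.8, 5.0, 5.1, 4.9, 4.9, 5.1, "B"],
    [8, 5.1, 5.2, 5.0, 5.0, 5.1, 4.9, "B"],
]


def euclidean(a, b):
    """Euclidean distance over the six feature columns (indices 1..6)."""
    return math.sqrt(sum((float(x) - float(y)) ** 2 for x, y in zip(a[1:7], b[1:7])))


def knn_predict(train, sample, k):
    """Return the majority label among the k training rows nearest to sample."""
    nearest = sorted(train, key=lambda row: euclidean(row, sample))[:k]
    votes = Counter(row[-1] for row in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out(data, k):
    """Classify each row using all the other rows as the training set."""
    return [knn_predict(data[:i] + data[i + 1:], data[i], k) for i in range(len(data))]


loo_k3 = leave_one_out(rows, 3)
loo_k5 = leave_one_out(rows, 5)
print("K=3 leave-one-out:", loo_k3)
print("K=5 leave-one-out:", loo_k5)
```

On this synthetic data both K=3 and K=5 recover the original labels. With a sample held out, only three rows of its own class remain, so a K as large as 7 would let the four rows of the other class outvote them. On a dataset this small, K should probably stay small.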